CN1859511A - Telephone conference voice mixing method - Google Patents

Telephone conference voice mixing method

Info

Publication number
CN1859511A
Authority
CN
China
Prior art keywords
audio mixing
time
sound
data
district
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200510034524
Other languages
Chinese (zh)
Inventor
朱祥文
吴宗武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN 200510034524 priority Critical patent/CN1859511A/en
Publication of CN1859511A publication Critical patent/CN1859511A/en
Pending legal-status Critical Current

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

A teleconference voice mixing method comprises: A. setting a fixed time segment, calculating the time-domain energy of each participant within each fixed time segment, and determining the loudest party and the second-loudest party; B. in the current time segment, mixing the loudest and second-loudest parties of the previous time segment to obtain first mixed data, and mixing the loudest and second-loudest parties of the current time segment to obtain second mixed data; C. dividing the fixed time segment into a mixing zone and a smoothing zone, where in the mixing zone of the current time segment the mixing result is the first mixed data or the second mixed data, and in the smoothing zone of the current time segment the first mixed data decreases over time, the second mixed data increases over time, and the mixing result is the superposition of the two. The method lets the differing mixed data of adjacent time segments transition smoothly, which improves the mixing effect.

Description

A telephone conference voice mixing method
Technical field
The present invention relates to the field of communication technology, and in particular to a voice mixing method that improves voice quality in teleconferences.
Background art
Conference calls often involve multi-party calls, the most common case being two-party mixing. Suppose A, B, C and D take part in a conference call at the same time: if A and B talk simultaneously, C and D hear the voices of both A and B. The networking is shown in Fig. 1. When handling a conference call, the equipment first determines whose voice in the conference is loudest and whose is second loudest, then superimposes the loudest and second-loudest voices and sends the result to the other participants, who are silent or speaking more quietly. This essentially realizes the conference call function. The operation of superimposing the voices is mixing, and the mixing method directly determines the mixing effect in the conference.
Most existing methods calculate the time-domain energy of each participant over a fixed-length time segment (for example 20 ms), determine the loudest channel and the second-loudest channel, and in the next time segment superimpose the sound of those two channels in proportion and send it to the participants. For example, as shown in Fig. 2, suppose three parties A, B and C take part in the same conference. The time-domain energy of each voice channel in the current time segment is calculated first, giving the loudest and second-loudest parties of that segment, say A and B; in the next time segment, mixing is then performed according to the channel numbers A and B, and subsequent processing continues in the same way. With this mixing method, as shown in Fig. 3, the channel numbers of the loudest and second-loudest parties keep changing, so the mixing result of one window can differ considerably from that of the next window at the point where they join. Because the mixed data is switched directly, the sound changes abruptly, an obvious "noise" can be heard, and the mixing quality is poor.
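The existing approach described above can be sketched as follows. This is a minimal illustration, not the equipment's actual implementation: it assumes time-domain energy is the sum of squared samples and that the two channels are superimposed with equal weights, neither of which the text specifies. The function names and sample values are hypothetical.

```python
def segment_energy(samples):
    """Time-domain energy of one segment (assumed: sum of squared samples)."""
    return sum(s * s for s in samples)


def two_loudest(channels):
    """Return the names of the loudest and second-loudest channels.

    `channels` maps a channel name to its samples for one time segment.
    """
    ranked = sorted(
        channels,
        key=lambda name: segment_energy(channels[name]),
        reverse=True,
    )
    return ranked[0], ranked[1]


def mix(channels, pair):
    """Superimpose two channels sample by sample (equal weights assumed)."""
    first, second = pair
    return [a + b for a, b in zip(channels[first], channels[second])]


# Hypothetical samples for one short segment of a three-party conference.
segment = {
    "A": [100, -120, 90, -80],
    "B": [60, -50, 70, -65],
    "C": [5, -3, 4, -6],
}
loudest_pair = two_loudest(segment)
mixed = mix(segment, loudest_pair)
```

In the existing method the pair chosen from one segment is applied to the following segment, so a change of pair switches the output abruptly at the segment boundary.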
Summary of the invention
The technical problem to be solved by the present invention is that, when the channel numbers of the loudest and second-loudest parties change between adjacent time segments, directly switching the mixed data produces obvious noise and degrades voice quality.
The technical solution adopted by the present invention to solve the above problem is:
A telephone conference voice mixing method, comprising the following steps:
A. setting a fixed time segment, calculating the time-domain energy of each participant within the fixed time segment, and determining the loudest channel and the second-loudest channel in each time segment;
B. in the current time segment, mixing according to the loudest channel and second-loudest channel of the previous time segment to obtain first mixed data, and mixing according to the loudest channel and second-loudest channel of the current time segment to obtain second mixed data;
C. dividing the fixed time segment into two intervals, a mixing zone and a smoothing zone; in the mixing zone of the current time segment, the mixing result is the first mixed data or the second mixed data; in the smoothing zone of the current time segment, the first mixed data decreases monotonically over time, the second mixed data increases monotonically over time, and the mixing result is the superposition of the first mixed data and the second mixed data.
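Step C only requires the two contributions to change monotonically. A minimal sketch of that idea follows, assuming (this is not stated in the text) that the fade-in coefficients lie in [0, 1] and that the fade-out coefficient is one minus the fade-in coefficient. The names `smooth_zone` and `is_strictly_increasing` are hypothetical.

```python
def is_strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def smooth_zone(x1, x2, fade_in):
    """Superimpose x1 (fading out) and x2 (fading in) over the smoothing zone.

    `fade_in` holds one coefficient per sample; the coefficient applied to
    x1 is assumed to be 1 - fade_in, so it decreases as fade_in increases.
    """
    if not is_strictly_increasing(fade_in):
        raise ValueError("fade_in coefficients must increase monotonically")
    return [(1 - g) * a + g * b for a, b, g in zip(x1, x2, fade_in)]


# Hypothetical data: a constant first mix fading into silence.
fade_in = [0.0, 0.25, 0.5, 0.75, 1.0]
x1 = [8, 8, 8, 8, 8]
x2 = [0, 0, 0, 0, 0]
out = smooth_zone(x1, x2, fade_in)
```

A linear ramp is used here for simplicity; any strictly increasing coefficient sequence passes the check.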
In the method, within the smoothing zone, the first mixed data decreases linearly over time and the second mixed data increases linearly over time.
In the method, the mixed data in the mixing zone satisfies the following formula:
$$\mathrm{MixOut}(n) = \frac{(M - \mathrm{ramp}) \cdot X_1(n) + \mathrm{ramp} \cdot X_2(n)}{M}$$
where: $X_1(n)$ is the first mixed data;
$X_2(n)$ is the second mixed data;
$M$ is a digital quantity representing the time length of the smoothing zone, and is a positive integer;
ramp is the transition-time variable, which changes linearly over time and ranges from 0 to $M$.
In the method, the time length of the smoothing zone is set to at most 1/2 of the fixed time segment length; when the fixed time segment is 20 ms, the time length of the smoothing zone is at most 10 ms, and the corresponding M is at most 80.
In the method, for a fixed time segment of 20 ms, the optimal value of M is 80.
In the method, when the smoothing zone is placed at the rear of the fixed time segment, the mixing result of the mixing zone is the first mixed data.
In the method, when the smoothing zone is placed at the front of the fixed time segment, the mixing result of the mixing zone is the second mixed data.
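The rear and front placements listed above can be sketched for a single time segment as follows. This is an illustration under stated assumptions rather than the patented implementation: `ramp` is assumed to step from 1 to M, one step per sample, so that the last smoothing-zone sample is pure second mixed data, and the function names are hypothetical.

```python
def crossfade(x1, x2, m):
    """Linear transition: ((m - ramp) * x1 + ramp * x2) / m for each sample.

    ramp is assumed to run from 1 to m, one step per sample.
    """
    return [
        ((m - ramp) * a + ramp * b) / m
        for ramp, (a, b) in enumerate(zip(x1, x2), start=1)
    ]


def mix_segment(x1, x2, m, zone_at_rear=True):
    """Build one segment of output from first (x1) and second (x2) mixed data.

    Rear placement: mixing zone outputs x1, then the smoothing zone follows.
    Front placement: the smoothing zone comes first, then the mixing zone
    outputs x2.
    """
    n = len(x1)
    if len(x2) != n or not 0 < m <= n:
        raise ValueError("x1 and x2 must match in length and 0 < m <= length")
    if zone_at_rear:
        start = n - m
        return list(x1[:start]) + crossfade(x1[start:], x2[start:], m)
    return crossfade(x1[:m], x2[:m], m) + list(x2[m:])


# Hypothetical six-sample segment with a four-sample smoothing zone.
x1 = [4, 4, 4, 4, 4, 4]
x2 = [0, 0, 0, 0, 0, 0]
rear = mix_segment(x1, x2, 4, zone_at_rear=True)
front = mix_segment(x1, x2, 4, zone_at_rear=False)
```

With rear placement the output holds at the first mixed data and then ramps toward the second; with front placement the ramp happens first and the output then holds at the second mixed data.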
The beneficial effect of the present invention is that, when the channel numbers of the loudest and second-loudest parties change between adjacent time segments, the differing mixed data of the adjacent segments joins smoothly, without jumps. This greatly improves the mixing effect, keeps the sound clean and free of noise, and improves voice quality.
Description of drawings
Fig. 1 is a schematic diagram of conference call mixing networking;
Fig. 2 is a schematic diagram of the existing conference call mixing algorithm;
Fig. 3 is a schematic diagram of the noise caused by mixing switches in the existing mixing algorithm;
Fig. 4a and Fig. 4b are schematic diagrams of the mixing algorithm of the present invention with the smoothing zone placed at the rear and at the front, respectively.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and embodiments:
Referring to Fig. 3, with the original mixing method the mixed data obtained in the first time period is MixOut(n) = A(n) + B(n), and the mixed data obtained in the second time period is MixOut(n) = B(n) + C(n). Because the mixed data is switched directly at the transition from time period 1 to time period 2, noise is easily produced. The present invention improves the mixing effect by smoothing the data of two adjacent time periods when the mixing channels change.
Referring to Fig. 4a and Fig. 4b, the mixing method of the present invention likewise sets a fixed time segment. The time-domain energy of each participant within the fixed time segment is calculated first, and the loudest channel and second-loudest channel in each time segment are determined. Then, in the current time segment, the sounds of the loudest and second-loudest channels determined in the previous time segment are superimposed in proportion, that is, mixed, to obtain the first mixed data, and the loudest and second-loudest channels of the current time segment are mixed to obtain the second mixed data. Unlike the prior art, the mixing method of the present invention divides each fixed time segment, by predetermined lengths, into two intervals, a mixing zone and a smoothing zone. The smoothing zone can be placed at the front, as shown in Fig. 4b, or at the rear, as shown in Fig. 4a. In the mixing zone of the current time segment, the mixing result is either the first mixed data or the second mixed data, depending on where the smoothing zone is placed: when the smoothing zone is at the rear of the fixed time segment, as shown in Fig. 4a, the mixing result of the mixing zone is the first mixed data; when the smoothing zone is at the front, as shown in Fig. 4b, it is the second mixed data.
In the smoothing zone of the current time segment, the mixing result is the superposition of the first mixed data and the second mixed data. The first mixed data decreases monotonically over time, which is equivalent to multiplying it by a monotonically decreasing coefficient so that it fades out of the smoothing zone; the second mixed data increases monotonically over time, which is equivalent to multiplying it by a monotonically increasing coefficient so that it fades into the smoothing zone. Monotonic change means that the value only rises or only falls over time, without any fluctuation during the change; that is, the slope of the function over time is always positive or always negative. The simplest case is a function that changes linearly over time. Thanks to the smoothing zone, the differing mixed data of adjacent time segments joins smoothly, without jumps, which greatly improves the mixing effect. Taking the smoothing zone placed at the rear of the fixed time segment as an example, the method is analyzed in detail below:
Referring to Fig. 4a, in time period 0 the loudest and second-loudest parties of that period are determined to be channels A and B; then, in time period 1, channels A and B are mixed to obtain the first mixed data $X_1(n) = A(n) + B(n)$. Likewise, the loudest and second-loudest parties of time period 1 are determined to be channels B and C, and channels B and C are mixed in time period 1 to obtain the second mixed data $X_2(n) = B(n) + C(n)$. The mixing output in the mixing zone of time period 1 is $X_1(n)$. In the smoothing zone at the rear of time period 1, which adjoins time period 2, the first mixed data $X_1(n)$ decreases linearly over time and its contribution reaches 0 at the end of the smoothing zone, that is, it fades out, forming a descending triangle in the smoothing zone; the second mixed data $X_2(n)$ rises linearly over time in the smoothing zone and its contribution equals $X_2(n)$ at the end of the zone, forming a rising triangle. The mixing output MixOut(n) of the smoothing zone is the superposition of the two triangles. Because there is a gradual transition from the A, B mix to the B, C mix, the transition becomes natural and smooth, and a better mixing effect is obtained.
By the same reasoning, when the loudest and second-loudest parties of the next time period (time period 2) differ from those of time period 1, for example when they are D and E, the mixing result in the mixing zone of time period 2 is the mixed data $X_2(n)$ obtained by mixing the loudest and second-loudest parties of time period 1, and the smoothing zone is still at the rear of the fixed time segment. In that smoothing zone, the mixed data $X_2(n)$ decreases linearly over time while the mixed data obtained from the loudest and second-loudest parties of time period 2 rises linearly over time, giving a smooth transition. The mixing result MixOut(n) of the smoothing zone can be expressed by the following formula:
$$\mathrm{MixOut}(n) = \frac{(M - \mathrm{ramp}) \cdot X_1(n) + \mathrm{ramp} \cdot X_2(n)}{M}$$
In the formula, $M$ is a digital quantity representing the time length of the smoothing zone and is a positive integer; ramp is the transition-time variable, which changes linearly over time and ranges from 0 to $M$. As the formula shows, when ramp = 0 the mixing result of the smoothing zone is MixOut(n) = $X_1(n)$, and when ramp = $M$, MixOut(n) = $X_2(n)$. As Fig. 4 and the formula show, the smoothness of the switch between the mixed data of adjacent time periods is determined jointly by ramp and $M$. The time length of the smoothing zone is generally set to at most 1/2 of the fixed time segment length, and the fixed time segment length is generally set to 20 ms, so the time length of the smoothing zone is at most 10 ms and the corresponding digital quantity $M$ is at most 80; in practical applications, $M$ is simply taken as 80. The formula also shows that when the loudest and second-loudest parties of adjacent time periods are the same, that is, when $X_1(n)$ equals $X_2(n)$, the mixing result of the smoothing zone equals the mixing result of the mixing zone of that time period, as shown in time period 2 of Fig. 4.
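The properties stated above can be checked numerically. The sketch below assumes an 8 kHz sampling rate, which the text does not state but which is the rate at which 10 ms corresponds to 80 samples; the names `samples_for` and `mix_out` and the sample values are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # assumption: not stated in the text
SEGMENT_MS = 20        # fixed time segment length given in the text
ZONE_MS = 10           # longest smoothing zone: half of the segment


def samples_for(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples covering `duration_ms` at the given sampling rate."""
    return sample_rate_hz * duration_ms // 1000


M = samples_for(ZONE_MS)                   # 80 under the 8 kHz assumption
SEGMENT_SAMPLES = samples_for(SEGMENT_MS)  # 160 under the same assumption


def mix_out(x1, x2, ramp, m=M):
    """Smoothing-zone formula for one sample: ((m - ramp)*x1 + ramp*x2) / m."""
    return ((m - ramp) * x1 + ramp * x2) / m


start = mix_out(1000, -400, 0)       # ramp = 0 yields the first mixed data
end = mix_out(1000, -400, M)         # ramp = M yields the second mixed data
middle = mix_out(1000, -400, M // 2) # halfway point is the average of the two
same = [mix_out(250, 250, r) for r in range(M + 1)]  # identical inputs
```

When the two inputs are identical the output stays constant for every value of `ramp`, which matches the case where the loudest pair does not change between adjacent time periods.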
It should be understood that those of ordinary skill in the art may make equivalent replacements or changes according to the technical solution and inventive concept of the present invention, and all such changes or replacements shall fall within the protection scope of the appended claims.

Claims (7)

1. A telephone conference voice mixing method, comprising the following steps:
A. setting a fixed time segment, calculating the time-domain energy of each participant within the fixed time segment, and determining the loudest channel and the second-loudest channel in each time segment;
B. in the current time segment, mixing according to the loudest channel and second-loudest channel of the previous time segment to obtain first mixed data, and mixing according to the loudest channel and second-loudest channel of the current time segment to obtain second mixed data;
C. dividing the fixed time segment into two intervals, a mixing zone and a smoothing zone; in the mixing zone of the current time segment, the mixing result is the first mixed data or the second mixed data; in the smoothing zone of the current time segment, the first mixed data decreases monotonically over time, the second mixed data increases monotonically over time, and the mixing result is the superposition of the first mixed data and the second mixed data.
2. The method according to claim 1, characterized in that: in the smoothing zone, the first mixed data decreases linearly over time and the second mixed data increases linearly over time.
3. The method according to claim 2, characterized in that: the mixed data in the mixing zone satisfies the following formula:
$$\mathrm{MixOut}(n) = \frac{(M - \mathrm{ramp}) \cdot X_1(n) + \mathrm{ramp} \cdot X_2(n)}{M}$$
where: $X_1(n)$ is the first mixed data;
$X_2(n)$ is the second mixed data;
$M$ is a digital quantity representing the time length of the smoothing zone, and is a positive integer;
ramp is the transition-time variable, which changes linearly over time and ranges from 0 to $M$.
4. The method according to claim 3, characterized in that: the time length of the smoothing zone is set to at most 1/2 of the fixed time segment length; when the fixed time segment is 20 ms, the time length of the smoothing zone is at most 10 ms, and the corresponding M is at most 80.
5. The method according to claim 4, characterized in that: for a fixed time segment of 20 ms, the optimal value of M is 80.
6. The method according to claim 5, characterized in that: when the smoothing zone is placed at the rear of the fixed time segment, the mixing result of the mixing zone is the first mixed data.
7. The method according to claim 5, characterized in that: when the smoothing zone is placed at the front of the fixed time segment, the mixing result of the mixing zone is the second mixed data.
CN 200510034524 2005-04-30 2005-04-30 Telephone conference voice mixing method Pending CN1859511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200510034524 CN1859511A (en) 2005-04-30 2005-04-30 Telephone conference voice mixing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200510034524 CN1859511A (en) 2005-04-30 2005-04-30 Telephone conference voice mixing method

Publications (1)

Publication Number Publication Date
CN1859511A true CN1859511A (en) 2006-11-08

Family

ID=37298372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200510034524 Pending CN1859511A (en) 2005-04-30 2005-04-30 Telephone conference voice mixing method

Country Status (1)

Country Link
CN (1) CN1859511A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103259943A (en) * 2012-02-21 2013-08-21 深圳市东进软件开发有限公司 PSTN teleconference sound mixing method
CN104167210A (en) * 2014-08-21 2014-11-26 华侨大学 Lightweight class multi-side conference sound mixing method and device
CN107211058A (en) * 2015-02-03 2017-09-26 杜比实验室特许公司 Dialogue-based dynamic meeting segmentation
US9876913B2 (en) 2014-02-28 2018-01-23 Dolby Laboratories Licensing Corporation Perceptual continuity using change blindness in conferencing
CN112885329A (en) * 2021-02-02 2021-06-01 广州广哈通信股份有限公司 Control method and device for improving sound mixing quality and storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103259943A (en) * 2012-02-21 2013-08-21 深圳市东进软件开发有限公司 PSTN teleconference sound mixing method
US9876913B2 (en) 2014-02-28 2018-01-23 Dolby Laboratories Licensing Corporation Perceptual continuity using change blindness in conferencing
CN104167210A (en) * 2014-08-21 2014-11-26 华侨大学 Lightweight class multi-side conference sound mixing method and device
CN107211058A (en) * 2015-02-03 2017-09-26 杜比实验室特许公司 Dialogue-based dynamic meeting segmentation
US10522151B2 (en) 2015-02-03 2019-12-31 Dolby Laboratories Licensing Corporation Conference segmentation based on conversational dynamics
CN107211058B (en) * 2015-02-03 2020-06-16 杜比实验室特许公司 Session dynamics based conference segmentation
CN112885329A (en) * 2021-02-02 2021-06-01 广州广哈通信股份有限公司 Control method and device for improving sound mixing quality and storage medium

Similar Documents

Publication Publication Date Title
US10574828B2 (en) Method for carrying out an audio conference, audio conference device, and method for switching between encoders
CN1859511A (en) Telephone conference voice mixing method
CN104539816B (en) The intelligent sound mixing method and device of a kind of multipartite voice call
CN100505530C (en) Volume control method and system
CN1219264A (en) Speaking speed changing method and device
US9628630B2 (en) Method for improving perceptual continuity in a spatial teleconferencing system
CN1820542A (en) Hearing aid with acoustic feedback suppression
CN1271593C (en) Voice signal detection method
CN102857732B (en) Menu control method, equipment and system in a kind of many pictures video conference
CN1805006A (en) Quick and real-time sound mixing method for multimedia conference
CN103828232A (en) Dynamic range control
CN102664019B (en) DSP sound mixing method and device for full-interactive conference
CN1331883A (en) Methods and appts. for adaptive signal gain control in communications systems
CN1206860C (en) Mixed sound system of intelligent controlled video frequency conference and method of controlling conference course
CN1383657A (en) Telecom system and method with speed recognizer
CN1309237C (en) Method for automatically regulating volume of mobile telephone
WO2008043731A1 (en) Method for operating a hearing aid, and hearing aid
EP2047632A1 (en) Method for carrying out a voice conference, and voice conference system
CN1135786C (en) Method and apparatus for providing multi-party speech connection for use in wireless communication system
CN1845573A (en) Simultaneous interpretation video conference system and method for supporting high capacity mixed sound
CN102301748A (en) Detection Signal Delay Method, Detection Device And Encoder
CN1277401C (en) Mixing method of telephone meeting
CN113299299A (en) Audio processing apparatus, method and computer-readable storage medium
CN114566152A (en) Voice endpoint detection method based on deep learning
EP3949368B1 (en) Scalable voice scene media server

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication