CN117880696A - Sound mixing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN117880696A
Authority
CN
China
Prior art keywords
audio signal
signal
audio
mixing
volume
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211245868.9A
Other languages
Chinese (zh)
Inventor
陈明良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kaidelian Software Technology Co ltd
Original Assignee
Guangzhou Kaidelian Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kaidelian Software Technology Co ltd filed Critical Guangzhou Kaidelian Software Technology Co ltd
Priority to CN202211245868.9A priority Critical patent/CN117880696A/en
Publication of CN117880696A publication Critical patent/CN117880696A/en
Pending legal-status Critical Current

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The application relates to a sound mixing method, a device, a computer device and a storage medium, wherein the method comprises the following steps: acquiring a first audio signal of a near-field microphone and a second audio signal of a far-field microphone; performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal, wherein the third audio signal is the signal obtained by performing the first time delay alignment on the first audio signal, and the fourth audio signal is the signal obtained by performing the first time delay alignment on the second audio signal; detecting a first volume of the third audio signal and a second volume of the fourth audio signal; and comparing the first volume and the second volume with a preset threshold value, and mixing the third audio signal and the fourth audio signal according to the mixing method corresponding to the comparison result to obtain a mixing result, thereby improving the mixing quality of the sound of a teacher and a student.

Description

Sound mixing method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of audio processing technologies, and in particular, to a method and apparatus for mixing audio, a computer device, and a storage medium.
Background
In a remote teaching scene, a recording and broadcasting host, a near-field microphone and a far-field microphone are arranged in a classroom, the near-field microphone collects the sound of a teacher, the far-field microphone collects the sound of a student, the recording and broadcasting host receives the sound of the teacher collected by the near-field microphone and the sound of the student collected by the far-field microphone and mixes the sound of the teacher and the sound of the student, so that the sound of the teacher and the sound of the student can be accurately mixed together to obtain a sound mixing result, the sound mixing result is shared to a network teaching platform, and the sharing of teaching resources is realized.
Because the sound can attenuate in the propagation process, in terms of the sound of the teacher, the signal-to-noise ratio of the sound of the teacher collected by the far-field microphone is low, so that the quality of the sound of the teacher can be influenced when the sound of the teacher collected by the far-field microphone is mixed with the sound of the teacher collected by the near-field microphone.
Disclosure of Invention
Accordingly, an object of the present application is to provide a mixing method, apparatus, computer device, and storage medium, which can improve the mixing quality of sound of a teacher and a student.
According to a first aspect of embodiments of the present application, there is provided a mixing method, including the steps of:
Acquiring a first audio signal of a near-field microphone and a second audio signal of a far-field microphone;
performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal;
detecting a first volume of the third audio signal and a second volume of the fourth audio signal;
comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to the audio mixing method corresponding to the comparison result to obtain an audio mixing result.
According to a second aspect of embodiments of the present application, there is provided a mixing device, including:
the signal acquisition module is used for acquiring a first audio signal of the near-field microphone and a second audio signal of the far-field microphone;
the signal alignment module is used for performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal;
The volume detection module is used for detecting the first volume of the third audio signal and the second volume of the fourth audio signal;
and the signal mixing module is used for comparing the first volume and the second volume with a preset threshold value, and carrying out mixing processing on the third audio signal and the fourth audio signal according to a mixing method corresponding to the comparison result to obtain a mixing result.
According to a third aspect of embodiments of the present application, there is provided a computer device comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the mixing method according to any of the preceding claims.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a mixing method as described in any of the above.
The method comprises the steps of obtaining a first audio signal of a near-field microphone and a second audio signal of a far-field microphone; performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal; detecting a first volume of the third audio signal and a second volume of the fourth audio signal; comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to the audio mixing method corresponding to the comparison result to obtain an audio mixing result, so that the corresponding audio mixing method is determined according to the first volume of the third audio signal and the second volume of the fourth audio signal, and the audio mixing quality of the teacher and the students is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
For a better understanding and implementation, the present invention is described in detail below with reference to the drawings.
Drawings
Fig. 1 is an application scenario schematic diagram of a mixing method according to an embodiment of the present application;
fig. 2 is a flow chart of a mixing method according to an embodiment of the present application;
fig. 3 is a schematic flow chart of step S40 in the audio mixing method according to an embodiment of the present application;
fig. 4 is a flowchart illustrating step S42 in the audio mixing method according to an embodiment of the present application;
fig. 5 is a flowchart illustrating step S43 in the audio mixing method according to an embodiment of the present application;
fig. 6 is a schematic flowchart of step S10 in the audio mixing method according to an embodiment of the present application;
fig. 7 is a block diagram of a sound mixing device according to an embodiment of the present application;
fig. 8 is a schematic block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, of the embodiments of the present application. All other embodiments, based on the embodiments herein, which would be apparent to one of ordinary skill in the art without making any inventive effort, are intended to be within the scope of the present application.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims. In the description of this application, it should be understood that the terms "first," "second," "third," and the like are used merely to distinguish between similar objects and are not necessarily used to describe a particular order or sequence, nor should they be construed to indicate or imply relative importance. The specific meaning of the terms in this application will be understood by those of ordinary skill in the art as the case may be.
Furthermore, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. "and/or", describes an association relationship of an association object, and indicates that there may be three relationships, for example, a and/or B, and may indicate: a exists alone, A and B exist together, and B exists alone. The character "/" generally indicates that the context-dependent object is an "or" relationship.
The application scenario of the audio mixing method in the embodiment of the application can be applied to a remote teaching scenario, and also can be applied to a meeting, a report, a lecture and other scenarios, and in the embodiment of the application, the scheme of the application is described by taking the remote teaching scenario as an illustration.
As shown in fig. 1, an application scenario of the audio mixing method in the embodiment of the present application includes a near-field microphone 100, a plurality of far-field microphones 110, and a recording and playing host 120. The near-field microphone 100 is a microphone carried by a teacher, and may be a microphone worn by the teacher or held by the teacher, for collecting the voice of the teacher teaching. Far-field microphones 110 may be installed at front and rear positions of a student area, for example, on front and rear walls of a classroom, respectively, to collect sounds of front-row students and rear-row students, respectively. The recording and broadcasting host 120 can be arranged behind a classroom, the near-field microphone 100 and the far-field microphone 110 are connected with the recording and broadcasting host 120 through wires or wirelessly, the recording and broadcasting host 120 receives audio signals collected by the near-field microphone 100 and the far-field microphone 110, and mixes teacher sound collected by the far-field microphone 110 with teacher sound collected by the near-field microphone 100, so that the sound of the teacher and the sound of students can be accurately mixed together, a mixing result is obtained, the mixing result is shared to a network teaching platform, and the sharing of teaching resources is realized.
Because the sound can attenuate in the propagation process, in terms of the sound of the teacher teaching, the signal to noise ratio of the teacher teaching sound collected by the far-field microphone 110 is lower than that of the near-field microphone 100, so that when the teacher teaching sound collected by the far-field microphone 110 and the teacher teaching sound collected by the near-field microphone 100 are mixed, the quality of the teacher teaching sound can be reduced, and the remote teaching quality is poor, so that teaching experience is affected.
Example 1
Please refer to fig. 2, which is a flowchart illustrating a method for mixing audio according to an embodiment of the present application. The audio mixing method provided by the embodiment of the application comprises the following steps:
S10: acquiring a first audio signal of the near-field microphone and a second audio signal of the far-field microphone.
The near-field microphone may be a microphone carried by the teacher, specifically, the near-field microphone may be a microphone worn by the teacher or a microphone held by the teacher, and the near-field microphone may also be a microphone disposed near the teacher.
The near-field microphone is used for collecting teaching sounds of a teacher end, the first audio signal collected by the near-field microphone is generally only an audio signal of the teacher, when students answer questions or read aloud, the near-field microphone can collect sounds of the student end, and at the moment, the first audio signal collected by the near-field microphone also comprises the audio signal of the students.
The far-field microphone can be a microphone installed on a wall or a ceiling in the teaching room, and can also be a microphone built in the teaching recording and playing equipment.
The far-field microphone is used for collecting sounds at the student end, which may specifically include the sound of students answering questions or reading aloud in unison; the second audio signal of the far-field microphone is generally an audio signal of the students. When the teacher speaks, the far-field microphone can also collect the teacher's sound, in which case the second audio signal also includes the audio signal of the teacher.
The first audio signal of the near-field microphone and the second audio signal of the far-field microphone may be signals collected from the same sound source; for example, when the teacher is lecturing or the students are reading aloud, both the near-field microphone and the far-field microphone collect the corresponding audio signal. The first audio signal and the second audio signal may also be signals collected from different sound sources; for example, when a student interjects while the teacher is speaking, both microphones collect an audio signal in which the teacher and the student are mixed. Mixing therefore needs to improve the audio quality when there is a single sound source and preserve every voice when there are different sound sources.
S20: performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal.
Because microphones differ in their performance parameters (for example, directivity, which gives a microphone a different response to sounds from different directions), there is an inherent delay difference between the near-field microphone and the far-field microphone when they collect the same audio signal. If this inherent delay difference is not eliminated during mixing, the mixed audio signals are out of sync and the quality of the mix is reduced. To eliminate the inherent delay difference, the first audio signal and the second audio signal are subjected to first delay alignment to obtain the third audio signal and the fourth audio signal.
In the embodiment of the present application, the delay value between the first audio signal and the second audio signal may be obtained according to the start time point of the first audio signal and the start time point of the second audio signal. If the starting time point of the first audio signal is earlier than the starting time point of the second audio signal, zero can be added in a time period corresponding to the time delay value before the first audio signal, and the first time delay alignment is carried out on the first audio signal and the second audio signal, so that a third audio signal and a fourth audio signal are obtained. If the starting time point of the first audio signal is later than the starting time point of the second audio signal, zero can be added in a time period corresponding to the time delay value before the second audio signal, and the first time delay alignment is carried out on the first audio signal and the second audio signal, so that a third audio signal and a fourth audio signal are obtained. After the first audio signal and the second audio signal are aligned, the first audio signal and the second audio signal are in one-to-one correspondence in time, so that the effect of time delay alignment is achieved.
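The zero-padding alignment described above can be sketched as follows. This is a minimal illustration only, not the implementation of the application: the function name, the representation of signals as Python lists, the start times expressed in samples, and the tail padding that equalizes the two lengths are all assumptions introduced here.

```python
def align_by_zero_padding(first, second, first_start, second_start):
    """Prepend zeros to the signal that starts earlier so both start together.

    first, second: lists of samples (hypothetical representation).
    first_start, second_start: start time points, expressed in samples
    (hypothetical inputs; the source does not say how they are measured).
    Returns (third, fourth), the aligned versions of (first, second).
    """
    delay = abs(first_start - second_start)
    if first_start < second_start:
        # The first signal starts earlier: add zeros in front of it.
        third = [0.0] * delay + list(first)
        fourth = list(second)
    elif first_start > second_start:
        # The second signal starts earlier: add zeros in front of it.
        third = list(first)
        fourth = [0.0] * delay + list(second)
    else:
        third, fourth = list(first), list(second)
    # Assumption: pad the tail of the shorter signal so that the samples
    # of the two outputs correspond one-to-one in time.
    length = max(len(third), len(fourth))
    third += [0.0] * (length - len(third))
    fourth += [0.0] * (length - len(fourth))
    return third, fourth
```

For example, with a first signal that starts two samples before the second, two zeros are placed in front of the first signal and the outputs have equal length.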
S30: the first volume of the third audio signal and the second volume of the fourth audio signal are detected.
In the embodiment of the application, the volume value of the first volume and the volume value of the second volume are obtained by detecting the first volume of the third audio signal and the second volume of the fourth audio signal.
S40: comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to the audio mixing method corresponding to the comparison result to obtain an audio mixing result.
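The application does not state how the volume value is measured. One common choice, shown here purely as an assumption, is the root-mean-square level of a frame in decibels relative to full scale; the function name and the floor value are hypothetical.

```python
import math


def rms_volume_db(frame, floor_db=-120.0):
    """Return the RMS level of a frame in dB relative to full scale (1.0).

    Assumption: samples are floats in [-1.0, 1.0]. Silent or empty frames
    return floor_db instead of negative infinity.
    """
    if not frame:
        return floor_db
    mean_square = sum(s * s for s in frame) / len(frame)
    if mean_square <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(mean_square))
```

Under this assumption the first volume would be `rms_volume_db` applied to a frame of the third audio signal, and the second volume the same measure applied to the corresponding frame of the fourth audio signal.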
In the embodiment of the application, the volume value of the first volume is compared with the preset threshold. If the volume value of the first volume is greater than or equal to the preset threshold, it is determined that either the teacher is speaking or the students are reading aloud in unison. The volume value of the second volume is then compared with the preset threshold. If the volume value of the second volume is greater than or equal to the preset threshold, it is determined that the students are reading aloud in unison, and the third audio signal and the fourth audio signal may be mixed to obtain the mixing result. If the volume value of the second volume is smaller than the preset threshold, it is determined that the teacher is speaking, and the third audio signal may be used as the mixing result. If the volume value of the first volume is smaller than the preset threshold, it is determined that the teacher is not speaking and the students are not reading aloud in unison; if a student is merely answering a question, the third audio signal and the fourth audio signal may be mixed to obtain the mixing result, or the fourth audio signal may be used directly as the mixing result.
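The branch structure of step S40 with a single preset threshold can be sketched as below. The function names are hypothetical, and the mixing operation itself (an equal-weight sample-wise average) is an assumption, since the application does not specify how two signals are combined. Where the text allows either mixing or using the fourth audio signal alone, this sketch picks the latter.

```python
def mix_equal(a, b):
    """Assumed mixing operation: equal-weight sample-wise average."""
    return [0.5 * (x + y) for x, y in zip(a, b)]


def choose_mix_single_threshold(third, fourth, first_volume, second_volume,
                                threshold):
    """Select the mixing result by comparing both volumes with one threshold."""
    if first_volume >= threshold:
        if second_volume >= threshold:
            # Both microphones are loud: students reading aloud together.
            return mix_equal(third, fourth)
        # Only the near-field microphone is loud: the teacher is speaking.
        return list(third)
    # Near-field microphone is quiet: a student is answering. The source
    # also permits mixing here; using the fourth signal is one option.
    return list(fourth)
```

The volumes and the threshold only need to share a unit; decibel values are used in the usage below as an example.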
By applying the embodiment of the application, the first audio signal of the near-field microphone and the second audio signal of the far-field microphone are obtained; performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal; detecting a first volume of the third audio signal and a second volume of the fourth audio signal; and comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to the audio mixing method corresponding to the comparison result to obtain an audio mixing result, so that the corresponding audio mixing method is determined according to the first volume of the third audio signal and the second volume of the fourth audio signal, and the audio mixing quality of the teacher and the students is improved.
In an alternative embodiment, referring to fig. 3, the preset threshold includes a first preset threshold and a second preset threshold, step S40 compares the first volume and the second volume with the preset thresholds, and performs mixing processing on the third audio signal and the fourth audio signal according to a mixing method corresponding to the comparison result, so as to obtain a mixing result, including steps S41 to S43, which are specifically as follows:
S41: if the first volume is larger than or equal to a first preset threshold value and the second volume is smaller than a second preset threshold value, performing second time delay alignment on the third audio signal and the fourth audio signal to obtain a fifth audio signal and a sixth audio signal; the fifth audio signal is a signal obtained by performing second time delay alignment on the third audio signal, and the sixth audio signal is a signal obtained by performing second time delay alignment on the fourth audio signal.
The first preset threshold is a volume threshold for the teacher speaking, the second preset threshold is a volume threshold for the students reading aloud in unison, and the second preset threshold is larger than the first preset threshold.
In this embodiment of the present application, if the first volume is greater than or equal to the first preset threshold and the second volume is less than the second preset threshold, it may be determined that the teacher is speaking and the students are not reading aloud in unison. The students may be interjecting or may not be speaking at all.
Because the near-field microphone and the far-field microphone are at different distances from the teacher and from the students, when only the teacher speaks there is a time delay between the teacher's sound in the first audio signal collected by the near-field microphone and the teacher's sound in the second audio signal collected by the far-field microphone. Specifically, the teacher moves around the classroom while teaching; the distance from the teacher to the near-field microphone is fixed, whereas the distance from the teacher to the far-field microphone varies, so a time delay arises between the teacher's sound in the first audio signal and that in the second audio signal.
Likewise, when only students speak, there is a time delay between the student sound in the first audio signal collected by the near-field microphone and the student sound in the second audio signal collected by the far-field microphone. If the first audio signal and the second audio signal were used directly for coherence computation while this delay is present, the accuracy of the coherence computation would be reduced.
In order to improve the accuracy of the coherence computation, second time delay alignment is performed on the third audio signal and the fourth audio signal to obtain a fifth audio signal and a sixth audio signal. Specifically, the delay value between the third audio signal and the fourth audio signal may be obtained from the start time point of the third audio signal and the start time point of the fourth audio signal. If the start time point of the third audio signal is earlier than that of the fourth audio signal, the fourth audio signal may be acquired at a time later than the acquisition time of the third audio signal by the period corresponding to the delay value. For example, if the third audio signal is 1 ms earlier than the fourth audio signal, the fourth audio signal is acquired 1 ms after the third audio signal is acquired. If the start time point of the third audio signal is later than that of the fourth audio signal, the third audio signal may be acquired at a time later than the acquisition time of the fourth audio signal by the period corresponding to the delay value. In either case, the third audio signal and the fourth audio signal are thereby subjected to second time delay alignment to obtain the fifth audio signal and the sixth audio signal.
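One possible reading of the second time delay alignment is sketched below: the signal whose sound starts later is read from a position offset by the delay value, so that each sample of one signal is paired with the sample of the other signal that carries the same sound. The function name, the list representation, the start positions in samples, and the trimming to a common length are assumptions made for illustration.

```python
def align_by_offset(third, fourth, third_start, fourth_start):
    """Read the later-starting signal `delay` samples later than the other.

    third_start, fourth_start: positions, in samples, at which the same
    sound begins in each buffer (hypothetical inputs).
    Returns (fifth, sixth) trimmed to a common length.
    """
    delay = abs(third_start - fourth_start)
    if third_start < fourth_start:
        # Sound reaches the fourth signal later: skip its first `delay` samples.
        fifth, sixth = list(third), list(fourth[delay:])
    elif third_start > fourth_start:
        # Sound reaches the third signal later: skip its first `delay` samples.
        fifth, sixth = list(third[delay:]), list(fourth)
    else:
        fifth, sixth = list(third), list(fourth)
    # Assumption: keep only the span for which both signals have samples.
    length = min(len(fifth), len(sixth))
    return fifth[:length], sixth[:length]
```

Unlike the zero-padding of the first alignment, this variant discards the unmatched samples, which keeps the subsequent coherence computation from comparing sound against padding.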
S42: and calculating the coherence of the fifth audio signal and the sixth audio signal to obtain a coherence result.
Where the coherence of signals refers to the degree of correlation between signals. When the first audio signal and the second audio signal are the same audio signal, specifically, the first audio signal and the second audio signal are both audio signals of a teacher or are both audio signals of a student, the coherence is high. When the first audio signal and the second audio signal are both mixed audio signals, specifically, the first audio signal and the second audio signal are both mixed audio signals of a teacher and a student, the coherence is low.
In the embodiment of the present application, the coherence result is obtained by calculating the coherence of the fifth audio signal and the sixth audio signal, and according to the coherence result, it is determined whether the first audio signal and the second audio signal are the same audio signal.
S43: and obtaining a mixing result of the fifth audio signal and the sixth audio signal according to the coherence result.
In the embodiment of the present application, the coherence result may be a specific value, and different values correspond to different mixing methods. Specifically, there may be a mapping relationship between the value and the mixing method: when the value is greater than or equal to a preset coherence threshold, the corresponding mixing method is to use the fifth audio signal directly as the mixing result; when the value is smaller than the preset coherence threshold, the corresponding mixing method is to mix the fifth audio signal and the sixth audio signal and use the mixed audio signal as the mixing result.
A mapping relationship may also exist between the interval in which the value lies and the mixing method, with different intervals corresponding to different mixing methods. Specifically, when the value lies in a first interval, the corresponding mixing method is to use the fifth audio signal directly as the mixing result. When the value lies in a second interval, the corresponding mixing method is to mix the fifth audio signal and the sixth audio signal and use the mixed audio signal as the mixing result. The values in the first interval are greater than those in the second interval.
By calculating the coherence of the fifth audio signal and the sixth audio signal, it is possible to distinguish whether the first audio signal and the second audio signal are the same audio signal or a mixed audio signal, thereby improving the quality of sound mixing of a teacher and a student.
In an alternative embodiment, referring to fig. 4, step S42 calculates coherence of the fifth audio signal and the sixth audio signal to obtain a coherence result, including steps S421 to S422, specifically as follows:
S421: if the first volume is greater than or equal to the first preset threshold and the second volume is smaller than the second preset threshold, performing time-frequency conversion on the fifth audio signal and the sixth audio signal respectively to obtain a first frequency domain signal corresponding to the fifth audio signal and a second frequency domain signal corresponding to the sixth audio signal.
The first frequency domain signal corresponds to a fifth audio signal, the second frequency domain signal corresponds to a sixth audio signal, the first frequency domain signal and the second frequency domain signal correspond to the same frequency domain, and the frequency domain comprises a plurality of frequency points. The time-frequency transformation method is the prior art and will not be described in detail here.
S422: the coherence result is obtained by dividing the square of the cross-power spectrum of the first frequency domain signal and the second frequency domain signal by the product between the power spectrum of the first frequency domain signal and the power spectrum of the second frequency domain signal.
In the embodiment of the present application, the calculation formula of the coherence result is as follows:
$$\mathrm{coherence}(\omega) = \frac{\left|S_{yx}(\omega)\right|^{2}}{S_{x}(\omega)\,S_{y}(\omega)}$$
wherein $S_{yx}(\omega)$ represents the cross-power spectrum of the first frequency domain signal and the second frequency domain signal at the frequency point $\omega$, $S_{x}(\omega)$ represents the power spectrum of the first frequency domain signal at the frequency point $\omega$, and $S_{y}(\omega)$ represents the power spectrum of the second frequency domain signal at the frequency point $\omega$.
The coherence of the fifth audio signal and the sixth audio signal can be automatically and quickly calculated by the first frequency domain signal and the second frequency domain signal.
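The ratio described in step S422 can be sketched in pure Python as follows. Several points here are assumptions rather than statements of the application: the time-frequency conversion is taken to be a discrete Fourier transform over non-overlapping frames, the spectra are accumulated over several frames (a coherence estimated from a single frame equals 1 at every frequency point regardless of the signals), and the per-frequency values are reduced to one number by a plain average. All function names and the frame length are hypothetical.

```python
import cmath


def dft(frame):
    """Discrete Fourier transform of a real frame (non-negative bins only)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n // 2 + 1)]


def coherence(fifth, sixth, frame_len=32):
    """Squared cross-power spectrum divided by the product of power spectra.

    Spectra are summed over non-overlapping frames; the frame count cancels
    in the ratio, so no explicit normalization is needed.
    """
    bins = frame_len // 2 + 1
    s_x = [0.0] * bins   # power spectrum of the first frequency domain signal
    s_y = [0.0] * bins   # power spectrum of the second frequency domain signal
    s_yx = [0j] * bins   # cross-power spectrum
    usable = min(len(fifth), len(sixth))
    for start in range(0, usable - frame_len + 1, frame_len):
        x = dft(fifth[start:start + frame_len])
        y = dft(sixth[start:start + frame_len])
        for k in range(bins):
            s_x[k] += abs(x[k]) ** 2
            s_y[k] += abs(y[k]) ** 2
            s_yx[k] += y[k] * x[k].conjugate()
    eps = 1e-12  # guards against division by zero in empty bins
    return [abs(s_yx[k]) ** 2 / (s_x[k] * s_y[k] + eps) for k in range(bins)]


def coherence_score(values):
    """Assumed reduction to a single value: average over frequency points."""
    return sum(values) / len(values)
```

A scaled copy of a signal yields values close to 1 at every frequency point that carries energy, while signals whose phase relationship changes from frame to frame yield values close to 0. The direct transform is quadratic in the frame length; a fast Fourier transform would normally replace it in practice.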
In an alternative embodiment, referring to fig. 5, step S43 includes steps S431 to S432 of obtaining a mixing result of the fifth audio signal and the sixth audio signal according to the coherence result, specifically as follows:
S431: and if the coherence result is greater than or equal to a preset coherence threshold, taking the fifth audio signal as a mixing result.
If the first audio signal and the second audio signal are the same audio signal, specifically, if both are audio signals of the teacher, only the first audio signal may be retained and the second audio signal ignored during mixing. This is because the teacher's voice attenuates during propagation, so the signal-to-noise ratio of the first audio signal is higher than that of the second audio signal; mixing the second audio signal into the first would yield lower audio quality than the first audio signal alone.
In the embodiment of the present application, the calculated coherence result is compared with a preset coherence threshold, if the coherence result is greater than or equal to the preset coherence threshold, it indicates that the coherence between the fifth audio signal and the sixth audio signal is high, the fifth audio signal and the sixth audio signal are both audio signals of a teacher, and the fifth audio signal is reserved as a mixing result.
S432: and if the coherence result is smaller than a preset coherence threshold, mixing the fifth audio signal and the sixth audio signal to obtain a first mixed signal, and taking the first mixed signal as a mixed result.
If the first audio signal and the second audio signal are mixed audio signals of a teacher and a student, that is, the teacher speaks and the student inserts a speech, the fifth audio signal and the sixth audio signal can be mixed to obtain a mixing result, so that the speaking sounds of the teacher and the student are ensured to be kept.
In the embodiment of the present application, if the coherence result is smaller than the preset coherence threshold, it indicates that the coherence between the fifth audio signal and the sixth audio signal is low, the fifth audio signal and the sixth audio signal are both mixed audio signals of a teacher and a student, the fifth audio signal and the sixth audio signal are mixed to obtain a first mixed audio signal, and the first mixed audio signal is used as a mixed audio result, so that the sound of the teacher and the student is simultaneously saved.
By comparing the calculated coherence result with a preset coherence threshold, the mixing result of the fifth audio signal and the sixth audio signal can be automatically and quickly determined according to the comparison result.
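As a minimal sketch under stated assumptions, the decision of steps S431 and S432 could look like the following. The function name `mix_by_coherence`, the default threshold of 0.7, and mixing by sample-wise averaging are hypothetical choices; the application does not fix the threshold value or the mixing operation in this passage.

```python
def mix_by_coherence(fifth, sixth, coherence_result, coherence_threshold=0.7):
    """Select the mixing result from a scalar coherence result.

    fifth and sixth are equal-length lists of samples. The threshold value
    and the averaging mix are assumptions made for illustration.
    """
    if coherence_result >= coherence_threshold:
        # S431: high coherence, keep only the near-field (fifth) signal.
        return list(fifth)
    # S432: low coherence, mix both signals (here: sample-wise average).
    return [(a + b) / 2.0 for a, b in zip(fifth, sixth)]
```

Averaging keeps the mixed samples within the range of the inputs; a weighted sum would be an equally plausible choice.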
In an optional embodiment, step S40 of comparing the first volume and the second volume with a preset threshold and performing mixing processing on the third audio signal and the fourth audio signal according to the mixing method corresponding to the comparison result to obtain a mixing result includes step S44, which is as follows:
S44: If the first volume is greater than or equal to a first preset threshold and the second volume is greater than or equal to a second preset threshold, the third audio signal and the fourth audio signal are mixed to obtain a second mixed signal, and the second mixed signal is taken as the mixing result.
In the embodiment of the present application, when the students read aloud in unison, their combined voice is loud, so both the far-field microphone and the near-field microphone pick up the sound of the unison reading. If the first volume is greater than or equal to the first preset threshold and the second volume is greater than or equal to the second preset threshold, it can be determined that the students are reading aloud in unison. At this time, the teacher may or may not be speaking.
The third audio signal and the fourth audio signal are mixed to obtain a second mixed signal, and the second mixed signal is taken as the mixing result, which ensures that the voices of the teacher and the students are preserved during unison reading.
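For illustration, the threshold comparison of step S40 might be sketched as below. Measuring volume as the root mean square (RMS) of the samples, the function names, and the label strings are assumptions. The case in which the first volume is below the first preset threshold is not covered in this passage, so the sketch returns "unspecified" for it rather than guessing.

```python
import math


def rms_volume(signal):
    """Volume as root mean square of the samples (an assumed measure)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def select_mixing_branch(third, fourth, first_threshold, second_threshold):
    """Return a label for the mixing method selected by the two volumes."""
    first_volume = rms_volume(third)    # volume of the near-field signal
    second_volume = rms_volume(fourth)  # volume of the far-field signal
    if first_volume >= first_threshold and second_volume < second_threshold:
        # Second time delay alignment followed by the coherence comparison.
        return "coherence_based"
    if first_volume >= first_threshold and second_volume >= second_threshold:
        # S44: mix the third and fourth audio signals directly.
        return "direct_mix"
    return "unspecified"
```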
In an alternative embodiment, referring to fig. 6, step S10 of acquiring a first audio signal of a near-field microphone and a second audio signal of a far-field microphone includes steps S101 to S102, which are specifically as follows:
S101: acquiring a first audio signal of a near-field microphone and a third audio signal of a far-field microphone;
S102: taking the first audio signal as a reference signal, and carrying out silencing treatment on the third audio signal to obtain a second audio signal of the far-field microphone.
In the embodiment of the present application, the third audio signal of the far-field microphone may be an audio signal of the teacher collected by the far-field microphone, or a mixed audio signal of the teacher and students collected by the far-field microphone. With the first audio signal as the reference signal, silencing treatment is applied to the third audio signal so that the teacher's audio signal is removed from it, yielding the second audio signal, which contains only the students' audio. The silencing treatment may be performed using an echo cancellation method or an adaptive filtering algorithm.
Because the second audio signal obtained by the silencing treatment no longer contains the teacher's audio signal, the second audio signal of the far-field microphone and the first audio signal of the near-field microphone can be mixed directly without reducing the mixing quality of the teacher's and students' voices.
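One possible form of the adaptive filtering mentioned above is a normalized least-mean-squares (NLMS) filter, sketched below. The choice of NLMS, the function name `nlms_cancel`, and the parameter values (8 taps, step size 0.5) are assumptions for illustration; the application names only "an echo cancellation method or an adaptive filtering algorithm".

```python
def nlms_cancel(reference, observed, taps=8, step=0.5, eps=1e-8):
    """Remove the component of `observed` that is predictable from `reference`.

    reference: near-field (first) audio signal used as the reference signal.
    observed:  far-field (third) audio signal to be processed.
    Returns the residual, which plays the role of the second audio signal.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for ref_sample, obs_sample in zip(reference, observed):
        history = [ref_sample] + history[:-1]
        # Estimate the teacher's contribution as seen by the far-field mic.
        estimate = sum(w * h for w, h in zip(weights, history))
        error = obs_sample - estimate
        # Normalized LMS update of the filter weights.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual
```

The number of taps must cover the propagation delay between the two microphones, which is one reason delay alignment and cancellation are usually considered together.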
Example 2
The following are examples of apparatus that may be used to perform the method of example 1 of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method in embodiment 1 of the present application.
Fig. 7 is a schematic structural diagram of a sound mixing device according to an embodiment of the present disclosure. The audio mixing apparatus 5 provided in the embodiment of the present application includes:
a signal acquisition module 51 for acquiring a first audio signal of a near-field microphone and a second audio signal of a far-field microphone;
a signal alignment module 52, configured to perform a first time delay alignment on the first audio signal and the second audio signal, so as to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal;
a volume detection module 53 for detecting a first volume of the third audio signal and a second volume of the fourth audio signal;
the signal mixing module 54 is configured to compare the first volume and the second volume with a preset threshold, and perform mixing processing on the third audio signal and the fourth audio signal according to a mixing method corresponding to the comparison result, so as to obtain a mixing result.
Optionally, the signal acquisition module includes:
a first audio signal acquisition unit configured to acquire a first audio signal of a near-field microphone and a third audio signal of a far-field microphone;
a signal silencing processing unit, configured to carry out silencing treatment on the third audio signal by taking the first audio signal as a reference signal, so as to obtain a second audio signal of the far-field microphone.
Optionally, the signal mixing module includes:
the signal alignment unit is used for performing second time delay alignment on the third audio signal and the fourth audio signal if the first volume is larger than or equal to a first preset threshold value and the second volume is smaller than a second preset threshold value, so as to obtain a fifth audio signal and a sixth audio signal; the fifth audio signal is a signal obtained by performing second time delay alignment on the third audio signal, and the sixth audio signal is a signal obtained by performing second time delay alignment on the fourth audio signal;
a coherence calculating unit for calculating coherence of the fifth audio signal and the sixth audio signal to obtain a coherence result;
and a mixing result obtaining unit for obtaining the mixing result of the fifth audio signal and the sixth audio signal according to the coherence result.
Optionally, the coherence calculating unit includes:
the time-frequency conversion unit is used for respectively performing time-frequency conversion on the fifth audio signal and the sixth audio signal if the first volume is larger than or equal to a first preset threshold value and the second volume is smaller than a second preset threshold value, so as to obtain a first frequency domain signal corresponding to the fifth audio signal and a second frequency domain signal corresponding to the sixth audio signal;
A coherence result obtaining unit, configured to divide the square of the cross power spectrum of the first frequency domain signal and the second frequency domain signal by the product between the power spectrum of the first frequency domain signal and the power spectrum of the second frequency domain signal, to obtain a coherence result.
Optionally, the mixing result obtaining unit includes:
the first judging unit is used for taking the fifth audio signal as a sound mixing result if the coherence result is greater than or equal to a preset coherence threshold value;
and the second judging unit is used for mixing the fifth audio signal and the sixth audio signal to obtain a first mixed signal if the coherence result is smaller than a preset coherence threshold value, and taking the first mixed signal as a mixed result.
Optionally, the signal mixing module includes:
a sound mixing unit, configured to, if the first volume is greater than or equal to a first preset threshold and the second volume is greater than or equal to a second preset threshold, mix the third audio signal and the fourth audio signal to obtain a second mixed signal, and take the second mixed signal as the mixing result.
By applying the embodiment of the application, the first audio signal of the near-field microphone and the second audio signal of the far-field microphone are obtained; performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal; detecting a first volume of the third audio signal and a second volume of the fourth audio signal; and comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to the audio mixing method corresponding to the comparison result to obtain an audio mixing result, so that the corresponding audio mixing method is determined according to the first volume of the third audio signal and the second volume of the fourth audio signal, and the audio mixing quality of the teacher and the students is improved.
Example 3
The following are device embodiments of the present application that may be used to perform the method of embodiment 1 of the present application. For details not disclosed in the apparatus embodiments of the present application, please refer to the method in embodiment 1 of the present application.
Referring to fig. 8, the present application further provides an electronic device 300, which may specifically be a computer, a mobile phone, a tablet computer, an interactive tablet, or the like. In an exemplary embodiment of the present application, the electronic device 300 is an interactive tablet, which may include: at least one processor 301, at least one memory 302, at least one display, at least one network interface 303, a user interface 304, and at least one communication bus 305.
The user interface 304 is mainly used to provide an input interface for the user and to acquire data input by the user. Optionally, the user interface may also include a standard wired interface and a wireless interface.
The network interface 303 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
The communication bus 305 is used to enable connection and communication between these components.
The processor 301 may include one or more processing cores. The processor connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory and by invoking data stored in the memory. Optionally, the processor may be implemented in hardware in at least one of the following forms: digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), or programmable logic array (Programmable Logic Array, PLA). The processor may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed by the display; the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor and may instead be implemented as a separate chip.
The Memory 302 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory includes a non-transitory computer readable medium (non-transitory computer-readable storage medium). The memory may be used to store instructions, programs, code sets, or instruction sets. The memory may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described respective method embodiments, etc.; the storage data area may store data or the like referred to in the above respective method embodiments. The memory may optionally also be at least one storage device located remotely from the aforementioned processor. The memory as a computer storage medium may include an operating system, a network communication module, a user interface module, and an operating application program.
The processor may be configured to call an application program of the mixing method stored in the memory and to execute the method steps of embodiment 1 above. For the specific execution process, refer to the description of embodiment 1, which is not repeated here.
Example 4
The present application further provides a computer-readable storage medium on which a computer program is stored, the instructions of which are adapted to be loaded by a processor to execute the method steps of embodiment 1 above. For the specific execution process, refer to the description of embodiment 1, which is not repeated here. The device in which the storage medium resides may be an electronic device such as a personal computer, a notebook computer, a smart phone, or a tablet computer.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The above-described apparatus embodiments are merely illustrative, in which components illustrated as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. A method of mixing sound, the method comprising the steps of:
acquiring a first audio signal of a near-field microphone and a second audio signal of a far-field microphone;
performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal;
detecting a first volume of the third audio signal and a second volume of the fourth audio signal;
comparing the first volume and the second volume with a preset threshold value, and performing audio mixing processing on the third audio signal and the fourth audio signal according to an audio mixing method corresponding to the comparison result to obtain an audio mixing result.
2. The mixing method according to claim 1, characterized in that:
the preset threshold comprises a first preset threshold and a second preset threshold;
the step of comparing the first volume and the second volume with a preset threshold, and performing audio mixing processing on the third audio signal and the fourth audio signal according to an audio mixing method corresponding to the comparison result to obtain an audio mixing result, comprises:
if the first volume is greater than or equal to the first preset threshold and the second volume is less than the second preset threshold, performing second time delay alignment on the third audio signal and the fourth audio signal to obtain a fifth audio signal and a sixth audio signal; the fifth audio signal is a signal obtained by performing second time delay alignment on the third audio signal, and the sixth audio signal is a signal obtained by performing second time delay alignment on the fourth audio signal;
calculating coherence of the fifth audio signal and the sixth audio signal to obtain a coherence result;
and obtaining a mixing result of the fifth audio signal and the sixth audio signal according to the coherence result.
3. The mixing method according to claim 2, characterized in that:
the step of obtaining a mixing result of the fifth audio signal and the sixth audio signal according to the coherence result includes:
if the coherence result is greater than or equal to a preset coherence threshold, taking the fifth audio signal as a sound mixing result;
and if the coherence result is smaller than the preset coherence threshold, mixing the fifth audio signal and the sixth audio signal to obtain a first mixed signal, and taking the first mixed signal as a mixed result.
4. A mixing method according to any one of claims 2 to 3, characterized in that:
the step of calculating the coherence of the fifth audio signal and the sixth audio signal to obtain a coherence result includes:
if the first volume is greater than or equal to the first preset threshold and the second volume is less than the second preset threshold, performing time-frequency conversion on the fifth audio signal and the sixth audio signal respectively to obtain a first frequency domain signal corresponding to the fifth audio signal and a second frequency domain signal corresponding to the sixth audio signal;
and dividing the square of the cross power spectrum of the first frequency domain signal and the second frequency domain signal by the product between the power spectrum of the first frequency domain signal and the power spectrum of the second frequency domain signal to obtain a coherence result.
5. The mixing method according to claim 1, characterized in that:
the step of comparing the first volume and the second volume with a preset threshold, and performing audio mixing processing on the third audio signal and the fourth audio signal according to an audio mixing method corresponding to the comparison result to obtain an audio mixing result, comprises:
and if the first volume is larger than or equal to the first preset threshold value and the second volume is larger than or equal to the second preset threshold value, mixing the third audio signal and the fourth audio signal to obtain a second mixed signal, and taking the second mixed signal as a mixed result.
6. A mixing method according to any one of claims 1 to 3 or claim 5, wherein:
the step of acquiring the first audio signal of the near-field microphone and the second audio signal of the far-field microphone comprises the following steps:
acquiring a first audio signal of a near-field microphone and a third audio signal of a far-field microphone;
and taking the first audio signal as a reference signal, and carrying out silencing treatment on the third audio signal to obtain a second audio signal of the far-field microphone.
7. A mixing device, characterized by comprising:
the signal acquisition module is used for acquiring a first audio signal of the near-field microphone and a second audio signal of the far-field microphone;
the signal alignment module is used for performing first time delay alignment on the first audio signal and the second audio signal to obtain a third audio signal and a fourth audio signal; the third audio signal is a signal obtained by performing first time delay alignment on the first audio signal, and the fourth audio signal is a signal obtained by performing first time delay alignment on the second audio signal;
the volume detection module is used for detecting the first volume of the third audio signal and the second volume of the fourth audio signal;
and the signal mixing module is used for comparing the first volume and the second volume with a preset threshold value, and carrying out mixing processing on the third audio signal and the fourth audio signal according to a mixing method corresponding to the comparison result to obtain a mixing result.
8. The apparatus of claim 7, wherein the signal mixing module comprises:
a signal alignment unit, configured to perform second time delay alignment on the third audio signal and the fourth audio signal if the first volume is greater than or equal to the first preset threshold and the second volume is smaller than the second preset threshold, so as to obtain a fifth audio signal and a sixth audio signal; the fifth audio signal is a signal obtained by performing second time delay alignment on the third audio signal, and the sixth audio signal is a signal obtained by performing second time delay alignment on the fourth audio signal;
a coherence calculating unit, configured to calculate coherence of the fifth audio signal and the sixth audio signal, and obtain a coherence result;
and a mixing result obtaining unit, configured to obtain a mixing result of the fifth audio signal and the sixth audio signal according to the coherence result.
9. A computer device, comprising: a processor, a memory and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 6.
CN202211245868.9A 2022-10-12 2022-10-12 Sound mixing method, device, computer equipment and storage medium Pending CN117880696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211245868.9A CN117880696A (en) 2022-10-12 2022-10-12 Sound mixing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117880696A true CN117880696A (en) 2024-04-12

Family

ID=90588837

Country Status (1)

Country Link
CN (1) CN117880696A (en)


Patent Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050114121A1 (en) * 2003-11-26 2005-05-26 Inria Institut National De Recherche En Informatique Et En Automatique Perfected device and method for the spatialization of sound
JP2008311876A (en) * 2007-06-13 2008-12-25 Funai Electric Co Ltd Television set with telephone function, television system and method for removing noise signal
WO2011026518A1 (en) * 2009-09-03 2011-03-10 Robert Bosch Gmbh Delay unit for a conference audio system, method for delaying audio input signals, computer program and conference audio system
US20180295463A1 (en) * 2015-10-12 2018-10-11 Nokia Technologies Oy Distributed Audio Capture and Mixing
EP3214858A1 (en) * 2016-03-03 2017-09-06 Thomson Licensing Apparatus and method for determining delay and gain parameters for calibrating a multi channel audio system
US20180053512A1 (en) * 2016-08-22 2018-02-22 Intel Corporation Reverberation compensation for far-field speaker recognition
CN109727607A (en) * 2017-10-31 2019-05-07 腾讯科技(深圳)有限公司 Delay time estimation method, device and electronic equipment
US10867619B1 (en) * 2018-09-20 2020-12-15 Apple Inc. User voice detection based on acoustic near field
WO2020069310A1 (en) * 2018-09-28 2020-04-02 Knowles Electronics, Llc Synthetic nonlinear acoustic echo cancellation systems and methods
CN109658935A (en) * 2018-12-29 2019-04-19 苏州思必驰信息科技有限公司 The generation method and system of multichannel noisy speech
CN114424583A (en) * 2019-09-23 2022-04-29 杜比实验室特许公司 Hybrid near-field/far-field speaker virtualization
CN110970045A (en) * 2019-11-15 2020-04-07 北京达佳互联信息技术有限公司 Mixing processing method, mixing processing device, electronic equipment and storage medium
CN111385780A (en) * 2020-01-17 2020-07-07 北京塞宾科技有限公司 Bluetooth audio signal transmission method and device
WO2021170061A1 (en) * 2020-02-28 2021-09-02 华为技术有限公司 Wireless sound amplification system and terminal
CN111402868A (en) * 2020-03-17 2020-07-10 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and computer readable storage medium
CN111583950A (en) * 2020-04-21 2020-08-25 珠海格力电器股份有限公司 Audio processing method and device, electronic equipment and storage medium
CN111866664A (en) * 2020-07-20 2020-10-30 深圳市康冠商用科技有限公司 Audio processing method, device, equipment and computer readable storage medium
CN111883156A (en) * 2020-07-22 2020-11-03 Oppo(重庆)智能科技有限公司 Audio processing method and device, electronic equipment and storage medium
CN212785857U (en) * 2020-07-28 2021-03-23 深圳大趋智能科技有限公司 Microphone array-based self-gain sound amplification device
CN112489670A (en) * 2020-12-01 2021-03-12 广州华多网络科技有限公司 Time delay estimation method and device, terminal equipment and computer readable storage medium
CN112887875A (en) * 2021-01-22 2021-06-01 平安科技(深圳)有限公司 Conference system voice data acquisition method and device, electronic equipment and storage medium
CN113259762A (en) * 2021-04-07 2021-08-13 广州虎牙科技有限公司 Audio processing method and device, electronic equipment and computer readable storage medium
CN113555030A (en) * 2021-07-29 2021-10-26 杭州萤石软件有限公司 Audio signal processing method, device and equipment
CN113658579A (en) * 2021-09-18 2021-11-16 重庆紫光华山智安科技有限公司 Audio signal processing method and device, electronic equipment and readable storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JUN LIAO: "Virtual mixer: Real-time audio mixing across clients and the cloud for multiparty conferencing", 《2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)》, 30 August 2012 (2012-08-30) *
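左荣荣: "Design and Implementation of an Intelligent Voice Dialogue App Based on Speech Noise Reduction", 《China Master's Theses Full-text Database, Information Science and Technology》, 15 March 2022 (2022-03-15) *
樊星: "Research on a Fast Real-Time Adaptive Audio Mixing Scheme for Multimedia Conferencing", 《Journal of Software》, 17 March 2005 (2005-03-17) *
郭威: "Signal Enhancement Method for Embedded Speech Recognition in Reverberant Environments", 《Application Research of Computers》, 15 December 2010 (2010-12-15) *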

Similar Documents

Publication Publication Date Title
CN109074816B (en) Far field automatic speech recognition preprocessing
US8233352B2 (en) Audio source localization system and method
EP2926572B1 (en) Collaborative sound system
US9918174B2 (en) Wireless exchange of data between devices in live events
CN112017681B (en) Method and system for enhancing directional voice
JP6163468B2 (en) Sound quality evaluation apparatus, sound quality evaluation method, and program
CN109658935B (en) Method and system for generating multi-channel noisy speech
US10755728B1 (en) Multichannel noise cancellation using frequency domain spectrum masking
CN109524013B (en) Voice processing method, device, medium and intelligent equipment
CN101454825A (en) Method and apparatus for extracting and changing the reverberant content of an input signal
JP2017530396A (en) Method and apparatus for enhancing a sound source
CN110956976B (en) Echo cancellation method, device and equipment and readable storage medium
CN111863011B (en) Audio processing method and electronic equipment
Kirsch et al. Spatial resolution of late reverberation in virtual acoustic environments
CN111145773A (en) Sound field restoration method and device
Liu et al. Robust speech recognition in reverberant environments by using an optimal synthetic room impulse response model
CN117880696A (en) Sound mixing method, device, computer equipment and storage medium
CN116312570A (en) Voice noise reduction method, device, equipment and medium based on voiceprint recognition
CN111312244B (en) Voice interaction system and method for sand table
CN113066504A (en) Audio transmission method, device and computer storage medium
CN110265048B (en) Echo cancellation method, device, equipment and storage medium
CN113496699A (en) Voice processing method, device, storage medium and terminal
CN117118956B (en) Audio processing method, device, electronic equipment and computer readable storage medium
JP6126053B2 (en) Sound quality evaluation apparatus, sound quality evaluation method, and program
WO2023245700A1 (en) Audio energy analysis method and related apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination