CN112951262B - Audio recording method and device, electronic equipment and storage medium - Google Patents

Publication number
CN112951262B
Authority
CN
China
Prior art keywords
frequency domain
frequency
signal
filtering
background noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110208491.9A
Other languages
Chinese (zh)
Other versions
CN112951262A (en)
Inventor
孙云飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202110208491.9A priority Critical patent/CN112951262B/en
Publication of CN112951262A publication Critical patent/CN112951262A/en
Application granted granted Critical
Publication of CN112951262B publication Critical patent/CN112951262B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain

Abstract

The embodiments of the present disclosure relate to an audio recording method and device, electronic equipment and a storage medium. The audio recording method may include: respectively collecting a background noise signal and a first audio signal, wherein the background noise signal and the first audio signal are both time domain signals; obtaining a frequency domain feature of the background noise signal, wherein the frequency domain feature comprises N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, N being a positive integer; performing time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal; and performing frequency domain filtering on the first frequency domain signal according to the frequency domain feature to obtain a second audio signal.

Description

Audio recording method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of audio technologies, and in particular, to an audio recording method and apparatus, an electronic device, and a storage medium.
Background
Much recording software in the related art obtains a background noise signal by recording the background noise, calculates the maximum amplitude or average amplitude of that signal, and then subtracts this amplitude from the recorded audio when recording an audio file. Noise removal based on the maximum or average amplitude of the background noise signal may cause part of the amplitude of the target audio signal to be treated as noise and filtered out, causing considerable distortion of the target audio signal.
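The distortion described above can be illustrated with a minimal sketch. The source does not specify exactly how the related art subtracts the amplitude, so the function `time_domain_subtract`, the noise level, and the test tone below are all assumptions made for illustration only.

```python
import math


def time_domain_subtract(samples, noise_level):
    """Assumed related-art behaviour: shrink every sample's magnitude by a
    fixed noise level in the time domain, flooring the result at zero."""
    out = []
    for s in samples:
        magnitude = max(abs(s) - noise_level, 0.0)
        out.append(math.copysign(magnitude, s))
    return out


# Hypothetical quiet target tone (peak amplitude 0.2) and a noise level of 0.25.
target = [0.2 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
cleaned = time_domain_subtract(target, noise_level=0.25)

# Every sample of the quiet target is below the noise level, so it is erased.
print(max(abs(c) for c in cleaned))
```

In this sketch the whole target tone is removed because its amplitude never exceeds the subtracted level, which is the kind of loss the Background attributes to amplitude-based removal.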
Disclosure of Invention
The disclosure provides an audio recording method and device, an electronic device and a storage medium.
A first aspect of the embodiments of the present disclosure provides an audio recording method, including:
respectively collecting a background noise signal and a first audio signal; wherein the background noise signal and the first audio signal are both time domain signals;
obtaining a frequency domain feature of the background noise signal, wherein the frequency domain feature comprises: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer;
performing time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
and according to the frequency domain characteristics, carrying out frequency domain filtering on the first frequency domain signal to obtain a second audio signal.
Based on the above scheme, the obtaining the frequency domain characteristics of the background noise signal includes:
performing time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal;
and determining N filtering frequency bands and the frequency domain amplitude of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal.
Based on the above scheme, the determining N filtering frequency bands and frequency domain amplitudes of the filtering frequency bands according to the frequency domain amplitude of each frequency band of the second frequency domain signal includes:
integrating the frequency domain amplitude of each frequency band of the second frequency domain signal on a frequency domain to obtain an integral value;
and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
Based on the above solution, the determining N filtering frequency bands and frequency domain amplitudes of the filtering frequency bands according to the frequency domain amplitudes of each frequency band of the second frequency domain signal includes:
selecting N frequency bands with the largest frequency domain average amplitude from the second frequency domain signals to obtain N filtering frequency bands;
determining a frequency domain amplitude for the filtered frequency band based on the second frequency domain signal.
Based on the above solution, the performing frequency domain filtering on the first audio signal according to the frequency domain characteristics to obtain a second audio signal includes:
and at each filtering frequency band, subtracting the frequency amplitude corresponding to the filtering frequency band from the first frequency domain signal to obtain the second audio signal.
Based on the above scheme, the collecting of the background noise signal includes:
acquiring the background noise signal before recording the first audio signal;
or,
acquiring the background noise signal in a recording gap of the first audio signal.
A second aspect of the embodiments of the present disclosure provides an audio recording apparatus, including:
the acquisition module is used for respectively acquiring a background noise signal and a first audio signal; wherein the background noise signal and the first audio signal are both time domain signals;
an obtaining module, configured to obtain a frequency domain feature of the background noise signal in a frequency domain, where the frequency domain feature includes: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer;
the conversion module is used for carrying out time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
and the filtering module is used for carrying out frequency domain filtering on the first frequency domain signal according to the frequency domain characteristics to obtain a second audio signal.
Based on the above scheme, the obtaining module is specifically configured to perform time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal; and determining N filtering frequency bands and the frequency domain amplitude of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal.
Based on the above scheme, the obtaining module specifically performs frequency domain amplitude integration on each frequency band of the second frequency domain signal in a frequency domain to obtain an integral value; and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
Based on the above scheme, the obtaining module is specifically configured to select N frequency bands with the largest frequency domain average amplitude from the second frequency domain signal, so as to obtain N filtering frequency bands; determining a frequency domain amplitude of the filtered frequency band based on the second frequency domain signal.
Based on the above scheme, the filtering module specifically subtracts, at each filtering frequency band, the frequency amplitude corresponding to the filtering frequency band from the first frequency domain signal to obtain the second audio signal.
Based on the above scheme, the acquiring module is specifically configured to acquire the background noise signal before recording the first audio signal; alternatively, the background noise signal is acquired in the time gap between the two first audio signals.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a memory for storing processor-executable instructions;
a processor coupled to the memory;
wherein the processor is configured to perform the audio recording method as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having instructions thereon, which when executed by a processor of a computer, enable the computer to perform the audio recording method as described above.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the embodiments of the present disclosure, during recording, the background noise is not removed from the recorded first audio signal directly according to the maximum or average time domain amplitude of the background noise signal. Instead, the first audio signal is converted to the frequency domain and filtered there according to the frequency domain features of the background noise signal, rather than having an amplitude value subtracted directly in the time domain, so that the distortion caused by filtering during recording is reduced and the recording quality is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flow diagram illustrating an audio recording method according to an exemplary embodiment;
Fig. 2 is a schematic diagram illustrating a time domain waveform of background noise according to an exemplary embodiment;
Fig. 3 is a schematic flow diagram illustrating an audio recording method according to an exemplary embodiment;
Fig. 4 is a waveform diagram illustrating background noise in the frequency domain according to an exemplary embodiment;
Fig. 5A is a time domain waveform diagram of a first audio signal according to an exemplary embodiment;
Fig. 5B is a time domain waveform diagram of a second audio signal obtained by frequency domain filtering according to an exemplary embodiment;
Fig. 5C is a time domain waveform diagram of an audio signal after filtering by the time domain average amplitude of the background noise signal according to an exemplary embodiment;
Fig. 6 is a schematic diagram illustrating the structure of an audio recording apparatus according to an exemplary embodiment;
Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices consistent with certain aspects of the present disclosure, as detailed in the appended claims.
As shown in fig. 1, an embodiment of the present disclosure provides an audio recording method, including:
S110: respectively collecting a background noise signal and a first audio signal; wherein the background noise signal and the first audio signal are both time domain signals;
S120: obtaining a frequency domain feature of the background noise signal, wherein the frequency domain feature comprises: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer;
S130: performing time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
S140: performing frequency domain filtering on the first frequency domain signal according to the frequency domain feature to obtain a second audio signal.
The embodiments of the present disclosure can be applied to various electronic devices that support audio recording. The electronic device includes, but is not limited to, a terminal; for example, the terminal may be a mobile phone, a tablet computer, a wearable device, or a simple voice recorder.
In the embodiments of the present disclosure, a background noise signal is collected. The background noise signal here may include at least: an audio signal collected from the ambient noise of the environment in which the first audio signal is captured. A typical background noise signal may include a Gaussian white noise signal.
In the embodiments of the present disclosure, the background noise signal may be: an audio signal of ambient noise that is present in the acquisition environment of the first audio signal for a relatively long time.
For example, in a windy environment, such an ambient noise signal may include the sound signal obtained by capturing wind noise.
As another example, at the roadside, such an ambient noise signal may include the sound signal obtained by capturing vehicle noise.
The acquisition of the sound is typically performed as a time domain signal, and thus the background noise signal and the first audio signal are both time domain signals.
In an embodiment of the disclosure, the first audio signal may be: a mixed signal of an audio signal collected for a target sound and a sound signal collected for a background noise.
Fig. 2 is a schematic diagram showing the amplitude distribution of a background noise signal in the time domain. As can be seen from fig. 2, in some cases, the amplitude of the background noise signal is relatively smoothly distributed in the time domain, and the waveform pattern does not change much in the time domain, i.e., has relatively high stability.
After the background noise signal is acquired, it is converted into a frequency domain signal by time-frequency domain conversion. For example, a Fourier transform is used to map the time domain signal of the background noise to the frequency domain, so as to obtain a frequency domain signal.
Illustratively, the time domain signal is converted to a frequency domain signal using the following formula:
$$F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt$$
where $f(t)$ is the signal in the time domain, $F(\omega)$ is the signal in the frequency domain, and $t$ is the time point in the time domain.
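For sampled audio, the continuous transform is replaced in practice by a discrete Fourier transform. The following is a minimal sketch using only the Python standard library; the naive O(N²) `dft` helper, the 64 Hz sample rate, and the 8 Hz test tone are assumptions for illustration, and a real recorder would more likely use an FFT routine such as `numpy.fft.rfft`.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform:
    X[k] = sum over n of x[n] * exp(-2*pi*i*k*n / N)."""
    n_total = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, x in enumerate(samples))
        for k in range(n_total)
    ]


sample_rate = 64  # Hz, hypothetical; one second of audio gives 1 Hz per bin
tone = [math.sin(2 * math.pi * 8 * n / sample_rate) for n in range(sample_rate)]

spectrum = dft(tone)
# Only the first half of the bins is needed for a real-valued signal.
magnitudes = [abs(c) for c in spectrum[: sample_rate // 2]]
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
print(peak_bin)  # bin index of the strongest frequency component
```

With one second of signal the bin index equals the frequency in hertz, so the strongest bin corresponds to the 8 Hz tone.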
In S120, the frequency domain signal is analyzed to obtain its frequency domain features. The frequency domain features may include: N filtering frequency bands of the background noise signal that are to be filtered, and the frequency domain amplitude of each filtering frequency band, where N is a positive integer.
The filtering frequency band here is: the frequency bands in the first audio signal where frequency filtering is required.
The frequency domain amplitude of a filtering frequency band may be obtained in any of the following ways:
first: sampling the frequency domain amplitudes within the filtering frequency band, such that the line connecting the sampled amplitudes closely resembles the original amplitude curve of the filtering frequency band;
second: taking the average of the frequency amplitudes of the filtering frequency band;
third: taking the median of the frequency amplitudes of the filtering frequency band;
fourth: taking the minimum of the frequency amplitudes of the filtering frequency band;
fifth: taking the maximum of the frequency amplitudes of the filtering frequency band.
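The options listed above can be sketched as a single helper that reduces the magnitudes inside one band to a representative value. The function name `band_amplitude`, the `mode` strings, and the sample magnitudes are hypothetical; the `"per_bin"` mode is only one possible reading of the first option, keeping the original amplitude of every bin.

```python
import statistics


def band_amplitude(magnitudes, lo, hi, mode="mean"):
    """Summarise the magnitudes in bins lo..hi (inclusive).

    mode "per_bin" returns the amplitudes unchanged (assumed reading of the
    first option); the other modes return a single representative value.
    """
    band = magnitudes[lo : hi + 1]
    if mode == "per_bin":
        return list(band)
    reducers = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "min": min,
        "max": max,
    }
    return reducers[mode](band)


# Hypothetical magnitude spectrum of a background noise signal.
noise_magnitudes = [0.1, 0.4, 0.9, 0.5, 0.1]
for mode in ("mean", "median", "min", "max"):
    print(mode, band_amplitude(noise_magnitudes, 1, 3, mode))
```

Choosing the minimum subtracts the least and is the most conservative toward the target signal, while choosing the maximum removes the most noise at a higher risk of distortion.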
The value of N may be any positive integer, for example, 1, 2, 3, 4, or 5.
In one embodiment, N is preferably between 2 and 4: a smaller value may leave the noise insufficiently filtered, while a larger value increases the amount of filtering computation because more frequency bands must be processed.
In short, by analyzing the frequency domain signal, it is possible to determine the frequency bands in which the background noise signal is concentrated, the positions of its amplitude extrema in the frequency domain, and other characteristics of the signal as reflected in the frequency domain.
In the embodiment of the present disclosure, the frequency domain feature extracted in S120 may be the frequency domain feature that most interferes with the first audio signal.
In summary, in the embodiments of the present disclosure, the frequency domain features of the background noise signal used to filter the first audio signal are the one or more frequency components of the background noise signal with the highest amplitude in the frequency domain, that is, the components of the background noise signal that interfere most with the target signal to be recorded. The interfering components may form one continuous frequency range or several discretely distributed frequency ranges. Filtering the first frequency domain signal according to these features therefore removes the components of the background noise that interfere most with the target signal. Because the filtering is performed by frequency, it acts precisely on the interfered frequency bands of the first audio signal and leaves the parts outside the frequency domain features unfiltered, so the frequency components of the target signal there are not affected. The target signal here may be the audio signal that needs to be recorded, such as the user's voice.
Step S130 may include: converting the first audio signal acquired in the time domain into the frequency domain, for example by Fourier transform, to obtain the first frequency domain signal. The first frequency domain signal is then filtered using the frequency domain features of the background noise signal to obtain the filtered first audio signal, that is, the second audio signal in the form of a frequency domain signal.
After the frequency domain features of the background noise signal are obtained, frequency domain filtering is performed on the first frequency domain signal with reference to those features; once the filtering is complete, the filtered first audio signal, that is, the second audio signal, is obtained. The second audio signal is then converted from the frequency domain back to the time domain to obtain its corresponding time domain signal. This time domain signal is the first audio signal with the background noise filtered out, and it is output so that the user can hear the sound.
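The return from the frequency domain to the time domain can be sketched with an inverse discrete Fourier transform. This is a minimal illustration using naive `dft` and `idft` helpers written for this example (both hypothetical names, standard library only); it shows only that an unmodified spectrum converts back to the original samples, which is the property the final output step relies on.

```python
import cmath
import math


def dft(samples):
    """Naive forward DFT: X[k] = sum_n x[n] * exp(-2*pi*i*k*n / N)."""
    n_total = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, x in enumerate(samples))
        for k in range(n_total)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each time sample."""
    n_total = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * n / n_total)
             for k, c in enumerate(spectrum)) / n_total).real
        for n in range(n_total)
    ]


# Hypothetical test signal made of two tones.
signal = [
    math.sin(2 * math.pi * 3 * n / 32) + 0.5 * math.cos(2 * math.pi * 7 * n / 32)
    for n in range(32)
]
restored = idft(dft(signal))
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(max_error)  # rounding error only
```

In the method described here, the spectrum would be modified in the filtering frequency bands between the forward and inverse steps.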
The first audio signal is a superposition of the background noise signal and the target signal. The second audio signal is obtained by frequency filtering the first audio signal, so the amplitude ratio of the target signal to the background noise signal in the second audio signal is larger than that in the first audio signal. That is, the signal-to-noise ratio of the second audio signal is higher than the signal-to-noise ratio of the first audio signal.
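One way to write this relationship down, assuming the common amplitude-ratio definition of signal-to-noise ratio in decibels (the source speaks only of an amplitude ratio, so the decibel form and the symbols $A_{\mathrm{target}}$ and $A_{\mathrm{noise}}$ are assumptions), is:

$$\mathrm{SNR} = 20\log_{10}\frac{A_{\mathrm{target}}}{A_{\mathrm{noise}}}\ \mathrm{dB}, \qquad \mathrm{SNR}_{\text{second}} > \mathrm{SNR}_{\text{first}}$$

Under this definition, filtering that halves the noise amplitude while leaving the target amplitude unchanged would raise the ratio by $20\log_{10}2 \approx 6.02$ dB.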
In the embodiments of the present disclosure, when recording, the electronic device does not remove the background noise from the recorded first audio signal directly according to the maximum or average time domain amplitude of the background noise signal. Instead, it converts the first audio signal to the frequency domain and filters it according to the frequency domain features of the background noise signal, rather than subtracting an amplitude value from the first audio signal directly in the time domain. This reduces the distortion caused by filtering during recording and improves the recording quality.
In some embodiments, as shown in fig. 3, the S120 of the embodiments of the present disclosure may include:
S121: performing time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal;
S122: determining N filtering frequency bands and the frequency domain amplitude of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal.
Fig. 4 is an amplitude waveform diagram of a background noise signal in the frequency domain. The second frequency domain signal comprises frequency bands whose widths, that is, the differences between their upper and lower frequency limits, may differ. For example, in one embodiment, a frequency band may be associated with a local maximum of the second frequency domain signal, so that one peak in Fig. 4 corresponds to one frequency band. The difference between the upper and lower limits therefore varies from band to band.
In one embodiment, S122 may include:
integrating the frequency domain amplitude of each frequency band of the second frequency domain signal on a frequency domain to obtain an integral value;
and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
The frequency domain amplitude of each frequency band is integrated over frequency to obtain an integral value, and the one or more frequency bands of the background noise signal with the largest integral values are taken as the filtering frequency bands; the frequency domain amplitude used for filtering is then obtained from the second frequency domain signal in subsequent processing. The integral value corresponds to the area of each frequency band shown in Fig. 4. Calculated by integral area, the frequency bands whose amplitude peaks lie at 1171 Hz, 1813 Hz and 2654 Hz are the filtering frequency bands.
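The integral-based selection can be sketched as follows. The source states only that one peak corresponds to one band; cutting bands at local minima, using the trapezoidal rule for the area, and the function names and sample spectrum are all assumptions for illustration.

```python
def split_bands_at_minima(magnitudes):
    """Split bin indices into bands, one per peak, cutting at local minima.
    Each band is an inclusive (lo, hi) pair of bin indices."""
    cuts = [0]
    for i in range(1, len(magnitudes) - 1):
        if magnitudes[i] < magnitudes[i - 1] and magnitudes[i] <= magnitudes[i + 1]:
            cuts.append(i)
    cuts.append(len(magnitudes) - 1)
    return [(cuts[j], cuts[j + 1]) for j in range(len(cuts) - 1)]


def band_area(magnitudes, band, bin_width_hz=1.0):
    """Trapezoidal integral of the magnitude over one band."""
    lo, hi = band
    return sum(
        (magnitudes[i] + magnitudes[i + 1]) / 2 * bin_width_hz
        for i in range(lo, hi)
    )


def top_n_bands_by_area(magnitudes, n, bin_width_hz=1.0):
    """Return the n bands with the largest integrated magnitude, in bin order."""
    bands = split_bands_at_minima(magnitudes)
    ranked = sorted(
        bands, key=lambda b: band_area(magnitudes, b, bin_width_hz), reverse=True
    )
    return sorted(ranked[:n])


# Hypothetical noise spectrum with three peaks of different size.
noise_magnitudes = [0, 1, 4, 1, 0, 2, 0, 3, 6, 3, 0]
print(top_n_bands_by_area(noise_magnitudes, n=2))
```

Here the three bands have areas 6, 2 and 12, so with N = 2 the first and third bands are kept as filtering frequency bands.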
In other embodiments, the S122 may include:
selecting N frequency bands with the largest frequency domain average amplitude from the second frequency domain signals to obtain N filtering frequency bands;
determining a frequency domain amplitude of the filtered frequency band based on the second frequency domain signal.
In this embodiment, the filtering frequency bands are selected simply by computing averages. Compared with selection by integral value, the resulting filtering frequency bands are mostly the same, while the computation is simpler and lighter.
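A minimal sketch of the average-based selection, assuming the candidate bands are already known as inclusive bin ranges (the function name `top_n_bands_by_mean`, the band list, and the sample spectrum are hypothetical):

```python
from statistics import fmean


def top_n_bands_by_mean(magnitudes, bands, n):
    """Return the n bands (inclusive bin ranges) with the largest mean
    magnitude, listed in bin order."""
    ranked = sorted(
        bands,
        key=lambda band: fmean(magnitudes[band[0] : band[1] + 1]),
        reverse=True,
    )
    return sorted(ranked[:n])


# Hypothetical noise spectrum and candidate bands.
noise_magnitudes = [0, 1, 4, 1, 0, 2, 0, 3, 6, 3, 0]
candidate_bands = [(0, 4), (4, 6), (6, 10)]
print(top_n_bands_by_mean(noise_magnitudes, candidate_bands, n=2))
```

For this spectrum the band means are 1.2, about 0.67 and 2.4, so the first and third bands are selected.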
In still other embodiments, the S122 may include:
according to the frequency domain amplitude of the second frequency domain signal, taking the frequency band with the frequency domain average amplitude larger than a preset threshold value as the N frequency bands to be filtered;
and acquiring frequency domain amplitude values of the N frequency bands to be filtered.
If the frequency amplitude of the background noise signal in a certain frequency band is small, the volume of that frequency component is small and can be ignored. Therefore, in the embodiments of the present disclosure, the frequency components with larger volume can be selected for frequency filtering simply by screening against the preset threshold.
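The threshold screening can be sketched in the same style. The function name `bands_above_threshold`, the candidate bands, and the threshold values are hypothetical; note that with this variant N is not fixed in advance but follows from how many bands pass the threshold.

```python
from statistics import fmean


def bands_above_threshold(magnitudes, bands, threshold):
    """Keep the bands (inclusive bin ranges) whose mean magnitude exceeds
    the preset threshold."""
    return [
        band
        for band in bands
        if fmean(magnitudes[band[0] : band[1] + 1]) > threshold
    ]


# Hypothetical noise spectrum and candidate bands.
noise_magnitudes = [0, 1, 4, 1, 0, 2, 0, 3, 6, 3, 0]
candidate_bands = [(0, 4), (4, 6), (6, 10)]
print(bands_above_threshold(noise_magnitudes, candidate_bands, threshold=1.0))
```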
There are many ways to select the filtering frequency band, and the above provides several options, and the specific implementation is not limited to any of the above.
In summary, in the embodiments of the present disclosure, the filtering frequency bands are a subset of the frequency bands occupied by the background noise signal.
In some embodiments, the S140 may include:
at each filtering frequency band, subtracting the frequency amplitude corresponding to the filtering frequency band from the first frequency domain signal to obtain the second audio signal.
In the embodiments of the present disclosure, the filtered second audio signal is obtained by taking the difference of the frequency amplitudes at the corresponding frequencies. This filters the background noise signal effectively while retaining the target signal in the first audio signal more completely, so the distortion caused by filtering is small.
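The per-band subtraction can be sketched on a list of complex frequency bins. Keeping the phase of each bin and flooring the magnitude at zero are assumptions, since the source does not say how phase or negative results are handled; `subtract_in_bands` and the sample spectrum are hypothetical.

```python
import cmath


def subtract_in_bands(spectrum, band_amplitudes):
    """Subtract a noise amplitude from the magnitude of every bin inside
    each filtering band.

    spectrum: complex bins of the first frequency domain signal.
    band_amplitudes: {(lo, hi): amplitude} with inclusive bin ranges.
    Phase is kept unchanged and magnitudes are floored at zero (assumed).
    """
    out = list(spectrum)
    for (lo, hi), amplitude in band_amplitudes.items():
        for k in range(lo, hi + 1):
            magnitude, phase = cmath.polar(out[k])
            out[k] = cmath.rect(max(magnitude - amplitude, 0.0), phase)
    return out


# Hypothetical four-bin spectrum; bins 1 and 2 form one filtering band.
spectrum = [1 + 0j, 3j, -5 + 0j, 0.5 + 0j]
filtered = subtract_in_bands(spectrum, {(1, 2): 2.0})
print([round(abs(c), 6) for c in filtered])
```

Bins outside the filtering bands pass through untouched, which is what lets the target signal survive there. For a real-valued recording, the mirrored negative-frequency bins would need the same treatment before converting back to the time domain.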
In one embodiment, the S110 may include:
acquiring the background noise signal before recording the first audio signal;
or,
and collecting the background noise signal in the recording interval of the first audio signal.
For example, taking voice recording as an example, one implementation is: before formal recording starts, a short segment of background noise is recorded to obtain its frequency domain features, which are then used for the frequency filtering of the voice recording. Another implementation is: during a recording, human speech cannot be gapless in the time domain; for example, a period in which the human voice is silent is determined through Voice Activity Detection (VAD), and the sound detected in that period can be regarded as the background noise.
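The second implementation can be sketched with a simple energy threshold standing in for VAD. This is not a real voice activity detector and is not taken from the source; the frame length, RMS threshold, function names, and synthetic signals are all assumptions for illustration.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(s * s for s in samples[i : i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def silent_frames(samples, frame_len, rms_threshold):
    """Indices of frames whose RMS is below the threshold; these frames are
    treated as recording gaps containing only background noise."""
    return [
        index
        for index, rms in enumerate(frame_rms(samples, frame_len))
        if rms < rms_threshold
    ]


# Hypothetical recording: quiet hiss, then speech-like tone, then hiss again.
hiss = [0.01 * (-1) ** n for n in range(100)]
speech = [0.8 * math.sin(2 * math.pi * n / 10) for n in range(100)]
recording = hiss + speech + hiss

gaps = silent_frames(recording, frame_len=50, rms_threshold=0.05)
print(gaps)  # frame indices that could be used to estimate the noise
```

The samples in the detected gap frames could then be transformed to the frequency domain to refresh the frequency domain features of the background noise.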
Fig. 5A is a time-domain waveform diagram of a first audio signal acquired by an electronic device, the first audio signal including: a background noise signal and a target signal.
Fig. 5B is a time domain waveform diagram of a second audio signal obtained after frequency domain filtering is performed by using the audio recording method provided by the embodiment of the present disclosure.
Fig. 5C is a time domain waveform diagram of an audio signal after time domain average amplitude filtering with a background noise signal.
Comparing Fig. 5A and Fig. 5B shows that, because the amplitude of the target signal is relatively large, the time domain waveform changes little between Fig. 5A and Fig. 5B after the background noise signal is filtered out, so the distortion of the target signal is reduced while the background noise is filtered out as far as possible.
Comparing fig. 5A, 5B, and 5C, it can be seen that the time domain waveform diagram shown in fig. 5C has a larger amplitude loss and thus introduces a larger distortion than that shown in fig. 5A and 5B.
(1) A time domain signal of the background noise is collected; Fig. 2 is a time domain waveform diagram of this noise signal. As Fig. 2 shows, the waveform does not change much over time, so it is reasonable to characterize the background noise signal once and then apply the result throughout the recording. That is, the background noise may be collected before recording and used for noise cancellation of all audio recorded in the subsequent recording session.
(2) To convert the noise waveform from the time domain to the frequency domain, a Fourier transform is used to calculate the frequency domain signal of the background noise signal.
$$F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt$$
where $f(t)$ is the time domain signal, $F(\omega)$ is the frequency domain signal, and $t$ is the time point in the time domain. This formula can be used to convert a time domain audio signal to a frequency domain signal.
(3) In general, for fast denoising, only the three frequency bands with the largest frequency domain amplitude areas in the frequency domain signal F(ω) of the background noise signal need to be selected; for example, in Fig. 4 these are the frequency bands corresponding to the frequencies 1171 Hz, 1813 Hz and 2654 Hz.
Recording starts immediately after the frequency domain features of the background noise signal for that period are obtained. The resulting recording is a mixture of the background noise signal and the original sound. Similarly, a Fourier transform is applied to the mixed signal to convert it into the frequency domain, and low-frequency filtering is then performed, with the filtering frequencies taken from the frequency domain features of Fig. 4, to obtain the final waveform. For visual comparison, three time domain waveforms are shown in Figs. 5A to 5C.
FIG. 5A is a waveform diagram of an audio file in the time domain having a mixture of recorded speech and background noise signal waveforms;
FIG. 5B is a time domain waveform diagram obtained by denoising based on the frequency domain features of the background noise signal;
FIG. 5C is a time domain waveform diagram obtained by denoising based on the amplitude values of the time domain background noise signal.
A comparison of fig. 5A to 5C shows that, compared with the method of the embodiments of the present disclosure, which eliminates the background noise signal based on frequency domain features, the method that eliminates noise using the time domain waveform amplitude damages the waveform more and causes more serious distortion.
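The contrast between the two approaches can be illustrated on a synthetic signal. The sketch below is only an illustration under strong assumptions: the "speech" and "noise" are single sine tones, the noise in the recording is identical to the profiled noise, and the time domain rule (shrinking every sample by the noise peak amplitude) is an assumed interpretation, since the source does not define that method in detail.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def rms_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


N = 64
speech = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]       # stand-in for the wanted sound
noise = [0.3 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]  # stationary background noise
mixed = [s + v for s, v in zip(speech, noise)]

# Frequency domain: subtract the noise magnitude in the noise bins, keep the phase.
noise_mag = [abs(c) for c in dft(noise)]
noise_bins = [k for k in range(N) if noise_mag[k] > 1e-6]
spectrum = dft(mixed)
for k in noise_bins:
    reduced = max(abs(spectrum[k]) - noise_mag[k], 0.0)
    spectrum[k] = cmath.rect(reduced, cmath.phase(spectrum[k]))
freq_denoised = idft(spectrum)

# Time domain (assumed interpretation): shrink every sample by the noise peak amplitude.
noise_peak = max(abs(v) for v in noise)
time_denoised = [math.copysign(max(abs(x) - noise_peak, 0.0), x) for x in mixed]

freq_error = rms_error(freq_denoised, speech)
time_error = rms_error(time_denoised, speech)
```

In this idealized case the frequency domain result matches the wanted tone almost exactly, while the time domain rule also removes part of the wanted signal. Real recordings will not separate this cleanly, so the numbers here say nothing about the quality shown in fig. 5B and 5C.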
As shown in fig. 6, an embodiment of the present disclosure provides an audio recording apparatus, which may be applied to an electronic device, where the apparatus includes:
an acquisition module 610, configured to acquire a background noise signal and a first audio signal, respectively; wherein the background noise signal and the first audio signal are both time domain signals;
an obtaining module 620, configured to obtain a frequency-domain feature of the background noise signal in a frequency domain, where the frequency-domain feature includes: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer;
a conversion module 630, configured to perform time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
and a filtering module 640, configured to perform frequency domain filtering on the first frequency domain signal according to the frequency domain characteristic to obtain a second audio signal.
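The frequency-domain feature passed from the obtaining module to the filtering module can be pictured as a small record holding the N bands and one amplitude per band. The following is a hypothetical data structure for illustration only; the class name, field names, and the example values are assumptions and do not come from the source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FrequencyDomainFeature:
    """N filtering bands as (low_hz, high_hz) pairs and one amplitude per band."""

    bands: Tuple[Tuple[float, float], ...]
    amplitudes: Tuple[float, ...]

    def __post_init__(self):
        if len(self.bands) == 0:
            raise ValueError("N must be a positive integer")
        if len(self.bands) != len(self.amplitudes):
            raise ValueError("each filtering band needs exactly one amplitude")
        for low, high in self.bands:
            if not 0 <= low < high:
                raise ValueError("band edges must satisfy 0 <= low < high")

    @property
    def n(self) -> int:
        """Number of filtering bands (the N of the description)."""
        return len(self.bands)


# Hypothetical values: three narrow bands with made-up amplitudes.
feature = FrequencyDomainFeature(
    bands=((1150.0, 1190.0), (1790.0, 1830.0), (2630.0, 2680.0)),
    amplitudes=(0.8, 0.5, 0.4),
)
```

Keeping the bands and amplitudes together in one immutable value makes it straightforward to compute the feature once and reuse it for every recording in a session.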
In one embodiment, the acquisition module 610, the obtaining module 620, the conversion module 630, and the filtering module 640 may all be program modules; the program modules may be executed by the processor to perform operations associated with the various modules described above.
In another embodiment, the acquisition module 610, the obtaining module 620, the conversion module 630, and the filtering module 640 may all be combined software and hardware modules; such combined modules may include various programmable arrays, such as a complex programmable array or a field programmable array.
In yet another embodiment, the acquisition module 610, the obtaining module 620, the conversion module 630, and the filtering module 640 may all be pure hardware modules; the pure hardware modules may include, but are not limited to, application specific integrated circuits.
In an embodiment, the obtaining module 620 is specifically configured to perform time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal, and to determine the N filtering frequency bands and the frequency domain amplitude of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal.
In an embodiment, the obtaining module 620 is specifically configured to integrate the frequency domain amplitude over each frequency band of the second frequency domain signal to obtain an integral value, to select the N frequency bands with the largest integral values as the N filtering frequency bands, and to acquire the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
In an embodiment, the obtaining module 620 is specifically configured to select the N frequency bands with the largest frequency domain average amplitude from the second frequency domain signal to obtain the N filtering frequency bands, and to determine the frequency domain amplitude of each filtering frequency band based on the second frequency domain signal.
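The two selection criteria above (largest integral and largest average amplitude) can be sketched as follows. This is an illustrative sketch with made-up magnitudes: bands are represented as half-open bin ranges `(start, stop)`, the integral is approximated by a rectangle sum, and all function names are hypothetical.

```python
def band_integral(magnitudes, band, bin_width_hz):
    """Rectangle-rule integral of the magnitude spectrum over one band."""
    start, stop = band
    return sum(magnitudes[start:stop]) * bin_width_hz


def band_mean(magnitudes, band):
    """Average magnitude over one band."""
    start, stop = band
    return sum(magnitudes[start:stop]) / (stop - start)


def select_bands(bands, score, n):
    """Return the n bands with the highest score, ordered by frequency."""
    return sorted(sorted(bands, key=score, reverse=True)[:n])


# Hypothetical magnitude spectrum (10 bins, 100 Hz per bin) and band layout.
BIN_WIDTH_HZ = 100.0
mags = [1.0] * 6 + [5.0, 0.0, 0.0, 3.0]
bands = [(0, 6), (6, 7), (7, 9), (9, 10)]

by_integral = select_bands(
    bands, lambda b: band_integral(mags, b, BIN_WIDTH_HZ), n=2
)
by_mean = select_bands(bands, lambda b: band_mean(mags, b), n=2)
```

When all bands have the same width the two criteria rank bands identically, because the integral is then the mean times a constant. With unequal widths, as in this example, a wide band of low amplitude can win on the integral while losing on the average.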
In an embodiment, the filtering module 640 is specifically configured to, at each of the filtering frequency bands, subtract a frequency amplitude corresponding to the filtering frequency band from the first frequency-domain signal to obtain the second audio signal.
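One way to write this subtraction explicitly is given below. The symbols are introduced here for illustration and are assumptions: X(w) denotes the first frequency domain signal, Y(w) the filtered spectrum, B_k the k-th filtering frequency band, and A_k its frequency domain amplitude. Clamping the result at zero and keeping the phase of X(w) unchanged are common choices in spectral subtraction, not details stated in the source.

```latex
|Y(w)| =
\begin{cases}
\max\bigl(|X(w)| - A_k,\ 0\bigr), & w \in B_k,\ k = 1, \dots, N \\
|X(w)|, & \text{otherwise}
\end{cases}
\qquad
\angle Y(w) = \angle X(w)
```

The second audio signal is then obtained by transforming Y(w) back to the time domain.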
In an embodiment, the acquisition module 610 is specifically configured to acquire the background noise signal before recording the first audio signal, or to acquire the background noise signal in the time gap between two first audio signals.
An embodiment of the present disclosure provides an electronic device, including:
a memory for storing processor-executable instructions;
a processor connected with the memory;
wherein the processor is configured to execute the audio recording method provided by any of the above technical solutions.
The memory may include various types of storage media, which are non-transitory computer storage media capable of retaining the information stored thereon after the communication device is powered off.
Here, the electronic device may include: a terminal and/or a server.
The processor may be connected to the memory via a bus or the like and is used to read an executable program stored in the memory, for example to perform at least one of the methods shown in fig. 1 and/or fig. 3.
Fig. 7 is a block diagram illustrating a mobile electronic device 800 according to an example embodiment. For example, the electronic device 800 may be a mobile phone, a mobile computer, or the like.
Referring to fig. 7, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational state, such as a shooting state or a video state. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational state, such as a call state, a recording state, and a voice recognition state. The received audio signal may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the electronic device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the electronic device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The disclosed embodiments provide a non-transitory computer-readable storage medium, wherein instructions of the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the audio recording method provided by any of the foregoing embodiments, e.g., at least the method as shown in fig. 1 and/or fig. 3 may be performed.
The audio recording method comprises the following steps: respectively collecting a background noise signal and a first audio signal, wherein the background noise signal and the first audio signal are both time domain signals; acquiring a frequency domain characteristic of the background noise signal, wherein the frequency domain characteristic comprises: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer; performing time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal; and according to the frequency domain characteristics, carrying out frequency domain filtering on the first frequency domain signal to obtain a second audio signal.
It is to be understood that the acquiring the frequency domain feature of the background noise signal includes: performing time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal; and determining N filtering frequency bands and the frequency domain amplitude of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal.
It can be understood that, the determining N filtering frequency bands and the frequency domain amplitudes of the filtering frequency bands according to the frequency domain amplitudes of the frequency bands of the second frequency domain signal includes: integrating the frequency domain amplitude of each frequency band of the second frequency domain signal on a frequency domain to obtain an integral value; and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
It can be understood that, the determining N filtering frequency bands and the frequency domain amplitudes of the filtering frequency bands according to the frequency domain amplitudes of the frequency bands of the second frequency domain signal includes: selecting N frequency bands with the largest frequency domain average amplitude from the second frequency domain signals to obtain N filtering frequency bands; and determining the frequency domain amplitude corresponding to the filtering frequency band based on the second frequency domain signal.
As can be understood, the frequency-domain filtering the first audio signal according to the frequency-domain characteristics to obtain a second audio signal includes: at each filtering frequency band, subtracting the frequency amplitude corresponding to the filtering frequency band from the first frequency domain signal to obtain the second audio signal.
As can be appreciated, the acquiring background noise includes: acquiring the background noise signal before recording the first audio signal; or, in the recording interval of the first audio signal, collecting the background noise signal.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (12)

1. An audio recording method, comprising:
respectively collecting a background noise signal and a first audio signal; wherein the background noise signal and the first audio signal are both time domain signals;
obtaining a frequency domain feature of the background noise signal, wherein the frequency domain feature comprises: N filtering frequency bands for filtering background noise and the frequency domain amplitude of each filtering frequency band, wherein N is a positive integer; the obtaining of the frequency domain feature of the background noise signal includes: performing time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal; determining N filtering frequency bands and the frequency domain amplitude value of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal;
performing time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
and according to the frequency domain characteristics, carrying out frequency domain filtering on the first frequency domain signal to obtain a second audio signal.
2. The method of claim 1, wherein determining N of the filtered frequency bands and the frequency domain amplitudes of the filtered frequency bands according to the frequency domain amplitudes of the frequency bands of the second frequency domain signal comprises:
integrating the frequency domain amplitude of each frequency band of the second frequency domain signal on a frequency domain to obtain an integral value;
and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
3. The method of claim 1, wherein determining N of the filtered frequency bands and the frequency domain amplitudes of the filtered frequency bands according to the frequency domain amplitudes of the frequency bands of the second frequency domain signal comprises:
selecting N frequency bands with the largest frequency domain average amplitude from the second frequency domain signals to obtain N filtering frequency bands;
and determining the frequency domain amplitude corresponding to the filtering frequency band based on the second frequency domain signal.
4. The method of any of claims 1 to 3, wherein the frequency-domain filtering the first audio signal according to the frequency-domain feature to obtain a second audio signal comprises:
and at each filtering frequency band, subtracting the frequency amplitude corresponding to the filtering frequency band from the first frequency domain signal to obtain the second audio signal.
5. The method of any one of claims 1 to 3, wherein the acquiring background noise comprises:
acquiring the background noise signal before recording the first audio signal;
or,
and acquiring the background noise signal in the recording gap of the first audio signal.
6. An audio recording apparatus, comprising:
the acquisition module is used for respectively acquiring a background noise signal and a first audio signal; wherein the background noise signal and the first audio signal are both time domain signals;
an obtaining module, configured to obtain a frequency domain feature of the background noise signal in a frequency domain, where the frequency domain feature includes: N filtering frequency bands for filtering background noise and the frequency domain amplitude value of each filtering frequency band, wherein N is a positive integer; the obtaining module is specifically configured to perform time-frequency domain conversion on the background noise signal to obtain a second frequency domain signal; determining N filtering frequency bands and the frequency domain amplitude value of each filtering frequency band according to the frequency domain amplitude of each frequency band of the second frequency domain signal;
the conversion module is used for carrying out time-frequency domain conversion on the first audio signal to obtain a first frequency domain signal;
and the filtering module is used for carrying out frequency domain filtering on the first frequency domain signal according to the frequency domain characteristics to obtain a second audio signal.
7. The apparatus according to claim 6, wherein the obtaining module is specifically configured to perform frequency domain amplitude integration on each frequency band of the second frequency domain signal in a frequency domain to obtain an integrated value; and selecting N frequency bands with the maximum integral value as N filtering frequency bands, and acquiring the frequency domain amplitude of each filtering frequency band in the second frequency domain signal.
8. The apparatus according to claim 6, wherein the obtaining module is specifically configured to select N frequency bands with a largest frequency domain average amplitude from the second frequency domain signal, so as to obtain N filtering frequency bands; determining a frequency domain amplitude for the filtered frequency band based on the second frequency domain signal.
9. The apparatus according to any one of claims 6 to 8, wherein the filtering module is specifically configured to subtract, at each of the filtering frequency bands, a frequency amplitude corresponding to the filtering frequency band from the first frequency-domain signal to obtain the second audio signal.
10. The apparatus according to any of the claims 6 to 8, wherein the acquisition module is specifically configured to acquire the background noise signal before recording the first audio signal; alternatively, the background noise signal is acquired in the time gap between two of the first audio signals.
11. An electronic device, comprising:
a memory for storing processor-executable instructions;
a processor coupled to the memory;
wherein the processor is configured to perform the audio recording method of any of claims 1 to 5.
12. A non-transitory computer-readable storage medium, instructions in which, when executed by a processor of a computer, enable the computer to perform the audio recording method of any one of claims 1 to 5.
CN202110208491.9A 2021-02-24 2021-02-24 Audio recording method and device, electronic equipment and storage medium Active CN112951262B (en)

Publications (2)

Publication Number Publication Date
CN112951262A CN112951262A (en) 2021-06-11
CN112951262B true CN112951262B (en) 2023-03-10

Family

ID=76246055

Country Status (1)

Country Link
CN (1) CN112951262B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant