CN111627459A - Audio processing method and device, computer readable storage medium and electronic equipment - Google Patents


Publication number
CN111627459A
CN111627459A (application number CN202010524145.7A)
Authority
CN
China
Prior art keywords
audio
signals
frequency domain
frequency
sections
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010524145.7A
Other languages
Chinese (zh)
Other versions
CN111627459B (en)
Inventor
刘益帆 (Liu Yifan)
徐银海 (Xu Yinhai)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ansheng Haolang Technology Co ltd
Original Assignee
Beijing Ansheng Haolang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ansheng Haolang Technology Co ltd filed Critical Beijing Ansheng Haolang Technology Co ltd
Publication of CN111627459A publication Critical patent/CN111627459A/en
Application granted granted Critical
Publication of CN111627459B publication Critical patent/CN111627459B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272: Voice signal separating
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Tone Control, Compression And Expansion, Limiting Amplitude (AREA)

Abstract

An audio processing method and apparatus, a computer-readable storage medium, and an electronic device are disclosed. The audio processing method includes: acquiring an audio signal; determining a screening condition corresponding to the audio signal, the screening condition being used to screen preset frequency points in the audio signal; and screening the preset frequency points in the audio signal based on the screening condition and the audio signal. According to the audio processing method provided by the embodiments of the present disclosure, the preset frequency points in an audio signal are screened out using a screening condition, which provides a precondition for separating the audio information corresponding to those frequency points from the audio signal. Moreover, when the audio information corresponding to the preset frequency points is noise information, the embodiments of the present disclosure provide a basis for processing the noise in a timely, fast, and targeted manner.

Description

Audio processing method and device, computer readable storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of signal processing technologies, and in particular, to an audio processing method, an audio processing apparatus, a computer-readable storage medium, and an electronic device.
Background
Existing acoustic signal processing techniques, especially for sound, usually employ complex algorithms to construct a signal matrix, and perform iterative or complex filtering processing or algorithm suppression on the signal to process narrow-band components or other specific components in the acoustic signal. Due to the complexity of the algorithm, most of the existing acoustic processing methods need to rely on a special signal processing chip, the system overhead is high, and the hardware cost is high.
Therefore, there is an urgent need for a convenient acoustic signal processing method that supports product sound quality shaping and active noise reduction.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides an audio processing method and device, a computer readable storage medium and an electronic device.
In one aspect, an embodiment of the present disclosure provides an audio processing method, including: acquiring an audio signal; determining a screening condition corresponding to the audio signal, wherein the screening condition is used for screening a preset frequency point in the audio signal; and screening preset frequency points in the audio signals based on the screening conditions and the audio signals.
In an embodiment of the present disclosure, the screening of the preset frequency points in the audio signal based on the screening condition and the audio signal includes: performing time domain transformation operation on the audio signal to generate a frequency domain signal corresponding to the audio signal; and screening preset frequency points in the audio signal based on the screening condition and the frequency domain signal.
In an embodiment of the present disclosure, before performing the time domain transform operation on the audio signal to generate the frequency domain signal corresponding to the audio signal, the method further includes: performing a truncation and segmentation operation on the audio signal to generate at least two segments of audio clip signals. The performing of the time domain transform operation on the audio signal to generate the frequency domain signal corresponding to the audio signal then includes: performing Fourier transform operations on the at least two segments of audio clip signals respectively to determine the frequency domain signals corresponding to the at least two segments of audio clip signals. And the screening of the preset frequency points in the audio signal based on the screening condition and the frequency domain signal includes: respectively determining the preset frequency points corresponding to the at least two segments of audio clip signals based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals.
In an embodiment of the present disclosure, determining, based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals, the preset frequency points corresponding to the at least two segments respectively includes: determining the frequency domain curves corresponding to the at least two segments of audio clip signals based on their respective frequency domain signals; for each audio clip signal, sorting all frequency points of the corresponding frequency domain curve in ascending or descending order of the absolute value of the vertical coordinate in the frequency domain; and performing a truncation and screening operation on the sorted frequency points to determine the preset frequency points corresponding to that audio clip signal.
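The sorting-and-truncation screening described above can be sketched as follows; the DFT implementation, function names, window length, and the choice of how many points to keep are illustrative assumptions, not taken from the patent:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform; returns coefficients C_k, k = 0..N-1."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def screen_by_sorting(coeffs, keep):
    """Sort all frequency points by |C_k| in descending order and keep
    the top `keep` points as the preset frequency points."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return sorted(order[:keep])

# One audio clip: a strong tone at bin 3 plus a much weaker tone at bin 11.
N = 64
clip = [math.sin(2 * math.pi * 3 * n / N) + 0.01 * math.cos(2 * math.pi * 11 * n / N)
        for n in range(N)]
preset = screen_by_sorting(dft(clip), keep=2)  # the strong tone's conjugate bin pair
```

For a real-valued clip the screened points come in conjugate pairs (here bins 3 and 61), which is why `keep` is even in this sketch.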
In an embodiment of the present disclosure, determining, based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals, the preset frequency points corresponding to the at least two segments of audio clip signals respectively includes: determining frequency domain curves corresponding to the at least two sections of audio clip signals respectively based on the frequency domain signals corresponding to the at least two sections of audio clip signals respectively; determining a frequency domain average value corresponding to at least two sections of audio fragment signals based on the frequency domain curves corresponding to the at least two sections of audio fragment signals; and determining the preset frequency points corresponding to the at least two sections of audio clip signals based on the frequency domain average values corresponding to the at least two sections of audio clip signals and the derivative information of the frequency domain curves corresponding to the at least two sections of audio clip signals.
In an embodiment of the present disclosure, determining a preset frequency point corresponding to at least two segments of audio segment signals based on a frequency domain average value corresponding to each of the at least two segments of audio segment signals and derivative information of a frequency domain curve corresponding to each of the at least two segments of audio segment signals includes: determining a first frequency point with a zero first derivative value corresponding to at least two sections of audio segment signals based on derivative information of frequency domain curves corresponding to the at least two sections of audio segment signals; filtering the first frequency points corresponding to the at least two sections of audio clip signals based on the first frequency points corresponding to the at least two sections of audio clip signals and the frequency domain average values corresponding to the at least two sections of audio clip signals to determine second frequency points corresponding to the at least two sections of audio clip signals; filtering the second frequency points corresponding to the at least two sections of audio clip signals respectively based on the adjacent frequency points of the second frequency points corresponding to the at least two sections of audio clip signals respectively so as to determine the third frequency points corresponding to the at least two sections of audio clip signals respectively; and determining the third frequency points corresponding to the at least two sections of audio clip signals as the preset frequency points corresponding to the at least two sections of audio clip signals.
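The three-stage screening above (first derivative zero, then average-based filtering, then neighbor-based filtering) can be sketched for one audio clip's magnitude curve. The patent does not fix how the neighbor-based filtering works, so the dominance ratio below is purely an assumption, as are all names:

```python
def screen_preset_bins(mags, neighbor_ratio=3.0):
    """Screen preset frequency points on a frequency domain magnitude curve.

    Stage 1: first frequency points  - first derivative is zero (local maxima).
    Stage 2: second frequency points - magnitude above the curve's average.
    Stage 3: third frequency points  - peak clearly dominates its neighbors
             (an assumed reading of the neighbor-based filtering).
    """
    mean = sum(mags) / len(mags)
    first = [i for i in range(1, len(mags) - 1)
             if mags[i - 1] <= mags[i] >= mags[i + 1]]
    second = [i for i in first if mags[i] > mean]
    third = [i for i in second
             if mags[i] > neighbor_ratio * max(mags[i - 1], mags[i + 1])]
    return third

# A sharp narrowband peak at bin 2 survives all three stages;
# the broad, low bump around bin 6 is filtered out.
peaks = screen_preset_bins([0.1, 0.1, 5.0, 0.1, 0.1, 0.2, 0.3, 0.2, 0.1])
```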
In an embodiment of the present disclosure, determining, based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals, the preset frequency points corresponding to the at least two segments of audio clip signals respectively includes: determining frequency domain curves corresponding to the at least two sections of audio clip signals respectively based on the frequency domain signals corresponding to the at least two sections of audio clip signals respectively; determining a fourth frequency point with an N-order derivative value of zero corresponding to at least two sections of audio segment signals based on derivative information of frequency domain curves corresponding to the at least two sections of audio segment signals, wherein N is an integer greater than or equal to 2; and determining the fourth frequency points corresponding to the at least two sections of audio clip signals as the preset frequency points corresponding to the at least two sections of audio clip signals.
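On a sampled frequency domain curve, an N-th derivative can only be approximated; the sketch below uses repeated finite differences (my own discretization, not specified by the patent) to mark frequency points where the order-th difference vanishes:

```python
def nth_derivative_zero_bins(mags, order=2):
    """Approximate frequency points where the N-th derivative (N >= 2) of the
    frequency domain curve is zero, via repeated finite differences."""
    d = list(mags)
    for _ in range(order):
        d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
    return [i for i in range(len(d)) if d[i] == 0]

# On this zig-zag magnitude curve the second difference is zero at two points.
bins2 = nth_derivative_zero_bins([0, 1, 0, -1, 0, 1, 0], order=2)
```

Note that each differencing step shortens the array by one sample, so the returned indices are offset toward the low-frequency end relative to the original curve.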
In an embodiment of the present disclosure, after the preset frequency points in the audio signal are screened based on the screening condition and the audio signal, the method further includes: and determining audio information corresponding to the preset frequency points based on the audio signals.
In an embodiment of the present disclosure, determining audio information corresponding to a preset frequency point based on an audio signal includes: determining a first frequency domain coefficient corresponding to a preset frequency point based on the audio signal; and generating audio information corresponding to the preset frequency point based on the first frequency domain coefficient.
In an embodiment of the present disclosure, there are a plurality of preset frequency points, and before the audio information corresponding to the preset frequency points is generated based on the first frequency domain coefficients, the method further includes: determining an average value of the frequency domain coefficients based on the first frequency domain coefficients corresponding to the plurality of preset frequency points. Generating the audio information corresponding to the preset frequency points based on the first frequency domain coefficients then includes: determining the audio information corresponding to each of the plurality of preset frequency points based on the first frequency domain coefficients and the frequency domain coefficient average value.
In an embodiment of the present disclosure, determining, based on the first frequency domain coefficient and the frequency domain coefficient average value, audio information corresponding to each of a plurality of preset frequency points includes: replacing the first frequency domain coefficients corresponding to the multiple preset frequency points by the frequency domain coefficient average value to generate second frequency domain coefficients corresponding to the multiple preset frequency points; and calculating the audio information corresponding to the multiple preset frequency points based on the second frequency domain coefficients corresponding to the multiple preset frequency points.
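The coefficient replacement step above can be sketched in a few lines; the patent specifies only that the average replaces the first coefficients, so the function names and the restriction of the inverse transform to the preset points are illustrative assumptions:

```python
import cmath

def second_coefficients(coeffs, preset_bins):
    """Replace the first frequency domain coefficients at the preset frequency
    points with their average value, producing the second coefficients."""
    avg = sum(coeffs[i] for i in preset_bins) / len(preset_bins)
    out = list(coeffs)
    for i in preset_bins:
        out[i] = avg
    return out

def preset_audio(coeffs, preset_bins, N):
    """Audio information of the preset frequency points: an inverse transform
    restricted to those points, using the averaged (second) coefficients."""
    second = second_coefficients(coeffs, preset_bins)
    return [sum(second[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in preset_bins).real
            for n in range(N)]

# Coefficients 1.0 and 3.0 at the preset points are both replaced by their mean.
avg_replaced = second_coefficients([0.0, 1.0, 0.0, 3.0], preset_bins=[1, 3])
```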
In an embodiment of the present disclosure, after determining audio information corresponding to a preset frequency point based on an audio signal, the method further includes: and separating the audio information corresponding to the preset frequency points from the audio signals.
In another aspect, an embodiment of the present disclosure provides an audio processing apparatus, including: an acquisition module configured to acquire an audio signal; a first determining module configured to determine a screening condition corresponding to the audio signal, the screening condition being used to screen preset frequency points in the audio signal; and a screening module configured to screen the preset frequency points in the audio signal based on the screening condition and the audio signal.
In another aspect, an embodiment of the present disclosure provides a computer-readable storage medium, which stores a computer program for executing the audio processing method mentioned in any of the above embodiments.
In another aspect, an embodiment of the present disclosure provides an electronic device, including: a processor and a memory for storing processor executable instructions, the processor being configured to perform the audio processing method as mentioned in any of the embodiments above.
According to the audio processing method provided by the embodiments of the present disclosure, the preset frequency points in an audio signal are screened out using a screening condition, which provides a precondition for separating the audio information corresponding to those frequency points from the audio signal. Moreover, when the audio information corresponding to the preset frequency points is noise information, the embodiments of the present disclosure provide a basis for processing the noise in a timely, fast, and targeted manner.
In another aspect, an embodiment of the present disclosure provides an audio separation method, which includes: acquiring an audio signal; converting the audio signal into a frequency domain signal; screening a part of the frequency points in the frequency domain signal as narrowband frequency points; and outputting, respectively, the audio signal corresponding to the narrowband frequency points and the remaining audio signal.
Optionally, in the audio separation method, the narrow-band frequency point is a maximum value point in the frequency domain signal or a frequency point in the frequency domain signal where a high-order derivative value is 0.
Optionally, in the audio separation method, the remaining audio signal is the audio signal corresponding to the wideband frequency points, obtained by subtracting the audio signal corresponding to the narrowband frequency points from the audio signal obtained before truncation.
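The optional step above amounts to a per-sample subtraction once the narrowband component has been reconstructed; a minimal sketch (names assumed):

```python
def wideband_remainder(original, narrowband):
    """Remaining (wideband) audio: the signal obtained before truncation
    minus the separated narrowband component, sample by sample."""
    return [a - b for a, b in zip(original, narrowband)]

rest = wideband_remainder([3.0, 4.0, 5.0], [1.0, 1.0, 2.0])
```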
The present disclosure also proposes another audio separation method, comprising the steps of: acquiring an audio signal; truncating the audio signal into at least two segments; converting each segment into a frequency domain signal; screening a part of the frequency points in the frequency domain signals as narrowband frequency points; acquiring the frequency domain coefficients corresponding to the narrowband frequency points in each segment's frequency domain signal, and computing the average value of those frequency domain coefficients; and outputting, respectively, the audio signal corresponding to the narrowband frequency points and the remaining audio signal according to that average value.
Optionally, in the audio separation method, the narrow-band frequency point is a maximum value point in the frequency domain signal or a frequency point in the frequency domain signal where a high-order derivative value is 0.
Optionally, in the audio separation method, the remaining audio signal is the audio signal corresponding to the wideband frequency points, obtained by subtracting the audio signal corresponding to the narrowband frequency points from the audio signal obtained before truncation.
The disclosed embodiments are used in the acoustic field to separate specific sound components so that those components can be processed individually. In particular, when the narrowband frequency points are noise frequency points, the sound frequencies to which the human ear is relatively sensitive can be separated quickly, with minimal system overhead and computation cost, through screening of frequency point extrema and processing of the frequency domain coefficients corresponding to those frequency points. The sound can then be processed for the specific frequency band in subsequent processing. The method processes acoustic signals in a simple manner, and can further reduce the requirements on the system's computing unit while ensuring processing efficiency and reducing the phase delay caused by computation. It conveniently separates the broadband and narrowband components of sound and can be widely applied to product sound quality shaping or scenarios such as active noise reduction, with narrowband or broadband signals distinguished and processed in subsequent signal processing.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic flowchart illustrating an audio processing method according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic flowchart illustrating a process of screening preset frequency points in an audio signal based on a screening condition and the audio signal according to an exemplary embodiment of the present disclosure.
Fig. 3 is a schematic flowchart illustrating a process of screening preset frequency points in an audio signal based on a screening condition and the audio signal according to another exemplary embodiment of the present disclosure.
Fig. 4 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio segment signals respectively based on a screening condition and frequency domain signals corresponding to the at least two segments of audio segment signals respectively according to an exemplary embodiment of the present disclosure.
Fig. 5 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio clip signals based on frequency domain average values corresponding to the at least two segments of audio clip signals and first derivative information of frequency domain curves corresponding to the at least two segments of audio clip signals according to an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio segment signals respectively based on a screening condition and frequency domain signals corresponding to the at least two segments of audio segment signals respectively according to another exemplary embodiment of the present disclosure.
Fig. 7 is a flowchart illustrating an audio processing method according to another exemplary embodiment of the present disclosure.
Fig. 8 is a schematic flowchart illustrating a process of determining audio information corresponding to a preset frequency point based on an audio signal according to an exemplary embodiment of the present disclosure.
Fig. 9 is a schematic flowchart illustrating a process of determining audio information corresponding to a preset frequency point based on an audio signal according to another exemplary embodiment of the present disclosure.
Fig. 10 is a schematic flowchart illustrating a process of determining audio information corresponding to each of a plurality of preset frequency bins based on a first frequency domain coefficient and a frequency domain coefficient average value according to an exemplary embodiment of the present disclosure.
Fig. 11 is a flowchart illustrating an audio separation method according to another exemplary embodiment of the present disclosure.
Fig. 12 is a schematic structural diagram of an audio processing apparatus according to an exemplary embodiment of the present disclosure.
Fig. 13 is a schematic structural diagram of a screening module according to an exemplary embodiment of the present disclosure.
Fig. 14 is a schematic structural diagram of an audio processing apparatus according to another exemplary embodiment of the present disclosure.
Fig. 15 is a block diagram illustrating an architecture of a sound processing system according to an exemplary embodiment of the present disclosure.
Fig. 16 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
The preferred embodiments of the present disclosure will be described below with reference to the accompanying drawings, and it should be understood that the preferred embodiments described herein are merely for purposes of illustrating and explaining the present disclosure and are not intended to limit the present disclosure.
Fig. 1 is a schematic flowchart illustrating an audio processing method according to an exemplary embodiment of the present disclosure. As shown in fig. 1, an audio processing method provided by an embodiment of the present disclosure includes the following steps.
Step 5: acquiring an audio signal.
Step 10: determining a screening condition corresponding to the audio signal, wherein the screening condition is used for screening a preset frequency point in the audio signal.
It should be noted that the preset frequency point mentioned in step 10 may be determined according to the actual situation, which is not limited in the embodiments of the present disclosure.
Illustratively, the preset frequency point mentioned in step 10 refers to a narrowband frequency point. It should be understood that the specific frequency range corresponding to a narrowband frequency point can be set according to the actual situation. For example, in an embodiment of the present disclosure, the narrowband frequency points are determined based on the derivative information of the frequency domain curve corresponding to the audio signal (see the embodiments shown in fig. 5 and fig. 6 of the present disclosure described below).
Step 20: screening the preset frequency points in the audio signal based on the screening condition and the audio signal.
In the practical application process, firstly, the audio signal is obtained, the screening condition corresponding to the audio signal is determined, and then the preset frequency point in the audio signal is screened based on the screening condition and the audio signal.
According to the audio processing method provided by the embodiments of the present disclosure, the preset frequency points in an audio signal are screened out using a screening condition, which provides a precondition for separating the audio information corresponding to those frequency points from the audio signal. Moreover, when the audio information corresponding to the preset frequency points is noise information, the embodiments of the present disclosure provide a basis for processing the noise in a timely, fast, and targeted manner.
Fig. 2 is a schematic flowchart illustrating a process of screening preset frequency points in an audio signal based on a screening condition and the audio signal according to an exemplary embodiment of the present disclosure. The embodiment shown in fig. 2 of the present disclosure is extended on the basis of the embodiment shown in fig. 1 of the present disclosure, and the differences between the embodiment shown in fig. 2 and the embodiment shown in fig. 1 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 2, in the audio processing method provided in the embodiment of the present disclosure, the step of screening the preset frequency points in the audio signal based on the screening condition and the audio signal includes the following steps.
Step 25: performing a time domain transform operation on the audio signal to generate a frequency domain signal corresponding to the audio signal.
Illustratively, a frequency domain signal corresponding to the audio signal can be generated by performing a Fourier transform operation on the audio signal. That is, the time domain transform operation is a Fourier transform operation. The formulas of the Fourier transform operation (forward and inverse) are as follows:

C_k = (1/N) Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N}

x(n) = Σ_{k} C_k e^{j2πkn/N}

wherein x(n) represents the audio signal, C_k represents the frequency domain coefficient corresponding to each frequency point, N represents the window scale of the Fourier transform, and e^{-j2πkn/N} characterizes the calculation factor of the Fourier transform. Illustratively, the generated frequency domain signal includes a plurality of different frequency points (i.e., f_1, f_2, …, f_M), and each frequency point has a corresponding frequency domain coefficient C_k, where k = 1, 2, …, M. It should be understood that C_k is a complex number containing both an amplitude value and a phase angle.

Note that the frequency domain coefficient C_k is also expressed in the form C(k) in other embodiments of the present disclosure; both notations represent the frequency domain coefficients, also referred to as key coefficients.
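As a numeric sanity check of the transform pair described above (a sketch only; the patent prescribes no implementation, and the 1/N normalization on the forward transform is an assumption consistent with the formulas as reconstructed here):

```python
import cmath

def dft(x):
    # C_k = (1/N) * sum_{n=0}^{N-1} x(n) * e^{-j 2*pi*k*n/N}
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def idft(C):
    # x(n) = sum_k C_k * e^{+j 2*pi*k*n/N}
    N = len(C)
    return [sum(C[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real
            for n in range(N)]

x = [0.0, 1.0, 0.0, -1.0]
roundtrip = idft(dft(x))  # should reproduce x up to rounding error
```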
Step 26: screening the preset frequency points in the audio signal based on the screening condition and the frequency domain signal.
According to the audio processing method provided by the embodiment of the disclosure, the audio signal is subjected to time domain transformation operation to generate a frequency domain signal corresponding to the audio signal, and the preset frequency point in the audio signal is screened based on the screening condition and the frequency domain signal, so that the purpose of screening the preset frequency point in the audio signal based on the screening condition and the audio signal is achieved.
Fig. 3 is a schematic flowchart illustrating a process of screening preset frequency points in an audio signal based on a screening condition and the audio signal according to another exemplary embodiment of the present disclosure. The embodiment shown in fig. 3 of the present disclosure is extended on the basis of the embodiment shown in fig. 2 of the present disclosure, and the differences between the embodiment shown in fig. 3 and the embodiment shown in fig. 2 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 3, in the audio processing method provided by the embodiment of the present disclosure, before the step of performing a time domain transform operation on the audio signal to generate a frequency domain signal corresponding to the audio signal, the following steps are further included.
Step 24: performing a truncation and segmentation operation on the audio signal to generate at least two segments of audio clip signals.
Illustratively, the truncation and segmentation operation divides the audio signal into three audio clip signals of equal length. It should be noted that dividing the audio signal into three equal-length segments appropriately reduces the error of the frequency domain coefficients corresponding to the determined preset frequency points without increasing the computational load on the system.
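The equal-length truncation above can be sketched in a few lines; the drop-remainder behavior is my assumption, since the patent does not say how leftover samples beyond a multiple of the segment count are handled:

```python
def split_into_clips(x, m=3):
    """Truncate and segment an audio signal into m equal-length audio clip
    signals; trailing samples beyond a multiple of m are dropped."""
    L = len(x) // m
    return [x[i * L:(i + 1) * L] for i in range(m)]

clips = split_into_clips(list(range(10)), m=3)  # the 10th sample is dropped
```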
On this basis, the performing of the time domain transform operation on the audio signal to generate the frequency domain signal corresponding to the audio signal includes the following step.
Step 251: performing Fourier transform operations on the at least two segments of audio clip signals respectively to determine the frequency domain signals corresponding to the at least two segments of audio clip signals.
And the step of screening the preset frequency points in the audio signal based on the screening condition and the frequency domain signal includes the following step.
Step 261: respectively determining the preset frequency points corresponding to the at least two segments of audio clip signals based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals.
In practical application, the audio signal and the screening condition corresponding to the audio signal are first determined. The audio signal is then subjected to the truncation and segmentation operation to generate at least two segments of audio segment signals. Next, a Fourier transform operation is performed on each of the at least two segments of audio segment signals to determine the frequency domain signals corresponding to the at least two segments of audio segment signals, and the preset frequency points corresponding to the at least two segments of audio segment signals are determined respectively based on the screening condition and those frequency domain signals.
According to the audio processing method provided by the embodiment of the disclosure, at least two sections of audio clip signals are generated by intercepting and segmenting the audio signals, and each generated section of audio clip signal is subjected to Fourier transform operation respectively, so that the preset frequency points corresponding to each section of audio clip signal are determined respectively, errors of the determined preset frequency points are effectively reduced, and then preconditions are provided for subsequent operations such as accurate audio separation and accurate noise elimination.
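The per-segment Fourier transform of step 251 can be sketched as a naive discrete Fourier transform (this formulation is assumed; the patent's formulas (1) and (2) are not reproduced in this passage):

```python
import cmath
import math

def dft(segment):
    """Naive discrete Fourier transform: one complex coefficient per bin."""
    n_samples = len(segment)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
                for n, x in enumerate(segment))
            for k in range(n_samples)]

# A single-cycle cosine over 8 samples concentrates its energy
# in bins 1 and 7; all other bin magnitudes are (numerically) zero.
seg = [math.cos(2 * math.pi * n / 8) for n in range(8)]
mags = [abs(c) for c in dft(seg)]
```

In practice an FFT would replace the quadratic-time loop, but the coefficients it produces are the same.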
Fig. 4 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio segment signals respectively based on a screening condition and frequency domain signals corresponding to the at least two segments of audio segment signals respectively according to an exemplary embodiment of the present disclosure. The embodiment shown in fig. 4 of the present disclosure is extended on the basis of the embodiment shown in fig. 3 of the present disclosure, and the differences between the embodiment shown in fig. 4 and the embodiment shown in fig. 3 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 4, in the audio processing method provided in the embodiment of the present disclosure, the step of respectively determining the preset frequency points corresponding to at least two sections of audio clip signals based on the screening condition and the frequency domain signals corresponding to at least two sections of audio clip signals includes the following steps.
Step 2611, determine frequency domain curves corresponding to the at least two sections of audio clip signals based on the frequency domain signals corresponding to the at least two sections of audio clip signals.
Illustratively, the frequency points in the frequency domain signal corresponding to each segment of audio segment signal are sequentially connected to generate a frequency domain curve corresponding to the segment of audio segment signal.
Step 2612, determine the frequency domain average value of at least two sections of audio fragment signals based on the frequency domain curve corresponding to at least two sections of audio fragment signals.
Illustratively, for each audio segment signal, the arithmetic mean of the absolute values of the ordinate of all the frequency points in the frequency domain coordinate is calculated to obtain the frequency domain mean value corresponding to the audio segment signal.
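The frequency domain average of step 2612 is then just the arithmetic mean of the absolute ordinate values; a minimal sketch with made-up curve ordinates:

```python
def frequency_domain_mean(ordinates):
    """Arithmetic mean of the absolute ordinate values of a frequency curve."""
    return sum(abs(v) for v in ordinates) / len(ordinates)

curve = [0.5, -2.0, 3.0, -0.5]       # hypothetical curve ordinates
mean = frequency_domain_mean(curve)  # (0.5 + 2.0 + 3.0 + 0.5) / 4 = 1.5
```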
Step 2613, determine the preset frequency point corresponding to each of the at least two segments of audio segment signals based on the frequency domain average value corresponding to each of the at least two segments of audio segment signals and the derivative information of the frequency domain curve corresponding to each of the at least two segments of audio segment signals.
According to the audio processing method provided by the embodiment of the disclosure, the accuracy of the determined preset frequency point is effectively ensured by means of determining the preset frequency point corresponding to each section of audio segment signal by means of the frequency domain average value corresponding to each section of audio segment signal and the derivative information of the frequency domain curve.
In another embodiment of the present disclosure, the step of determining, based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals, the preset frequency points corresponding to the at least two segments of audio clip signals respectively includes: determining frequency domain curves corresponding to the at least two sections of audio clip signals respectively based on the frequency domain signals corresponding to the at least two sections of audio clip signals respectively; aiming at each section of audio clip signal, arranging all frequency points in a frequency domain curve corresponding to the audio clip signal in an ascending order or a descending order according to the magnitude of the absolute value of a vertical coordinate (namely according to the magnitude of the absolute value of an amplitude value) in a frequency domain coordinate; and performing interception and screening operation based on the rearranged frequency points to determine the preset frequency point corresponding to the audio clip signal.
For example, a frequency domain curve of an audio segment signal includes 100 frequency points, the 100 frequency points are arranged in a descending order according to the magnitude of the absolute value of the amplitude, and then the 11 th frequency point after the descending order is used as a screening threshold, so that the first 10 frequency points are selected as preset frequency points, and the last 90 frequency points are filtered out.
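The sort-and-truncate screen described above can be sketched as follows (the magnitudes and the choice of k are illustrative, not from the patent):

```python
def top_k_bins(magnitudes, k):
    """Keep the indices of the k bins with the largest absolute amplitude."""
    order = sorted(range(len(magnitudes)),
                   key=lambda i: abs(magnitudes[i]), reverse=True)
    return sorted(order[:k])

mags = [1.0, 9.0, -8.0, 0.5, 7.0]
picked = top_k_bins(mags, k=2)   # bins 1 (|9.0|) and 2 (|-8.0|) survive
```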
It should be noted that the screening threshold may be selected according to actual situations, and this is not uniformly limited in the embodiments of the present disclosure. In addition, the method for determining the preset frequency point mentioned in the embodiment of the present disclosure is not limited to be applied to the audio clip signal mentioned in the embodiment shown in fig. 4, and can also be directly applied to the audio signal mentioned in the embodiment shown in fig. 1.
The embodiment of the disclosure can quickly and accurately screen out the preset frequency point, and further improve the speed and accuracy of the audio processing method.
Fig. 5 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio clip signals based on frequency domain average values corresponding to the at least two segments of audio clip signals and derivative information of frequency domain curves corresponding to the at least two segments of audio clip signals according to an exemplary embodiment of the present disclosure. The embodiment shown in fig. 5 of the present disclosure is extended on the basis of the embodiment shown in fig. 4 of the present disclosure, and the differences between the embodiment shown in fig. 5 and the embodiment shown in fig. 4 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 5, in the audio processing method provided in the embodiment of the present disclosure, the step of determining the preset frequency points corresponding to at least two segments of audio segment signals based on the frequency domain average values corresponding to the at least two segments of audio segment signals and the derivative information of the frequency domain curves corresponding to the at least two segments of audio segment signals includes the following steps.
Step 26131, determining a first frequency point where the first derivative value of each of the at least two segments of audio segment signals is zero based on the derivative information of the frequency domain curves corresponding to each of the at least two segments of audio segment signals.
Step 26132, filter the first frequency point corresponding to each of the at least two sections of audio clip signals based on the first frequency point corresponding to each of the at least two sections of audio clip signals and the frequency domain average value corresponding to each of the at least two sections of audio clip signals, so as to determine the second frequency point corresponding to each of the at least two sections of audio clip signals.
Illustratively, for each of the at least two mentioned segments of audio segment signals, a first frequency point corresponding to the segment of audio segment signal is filtered based on a frequency domain average value corresponding to the segment of audio segment signal to filter out a first frequency point whose frequency domain coefficient is smaller than the frequency domain average value, and a first frequency point whose frequency domain coefficient is greater than or equal to the frequency domain average value is reserved as a second frequency point. Thus, a second frequency point corresponding to the section of audio segment signal is determined.
Step 26133, filtering the second frequency points corresponding to the at least two segments of audio clip signals based on the adjacent frequency points of the second frequency points corresponding to the at least two segments of audio clip signals, so as to determine the third frequency points corresponding to the at least two segments of audio clip signals.
Illustratively, for each second frequency point in each audio segment signal, when the first derivative of the adjacent frequency point before the second frequency point is a positive number and the first derivative of the adjacent frequency point after the second frequency point is a negative number, the second frequency point is reserved, otherwise, the second frequency point is filtered. And performing the filtering operation on all the second frequency points in the section of audio clip signal to generate a third frequency point corresponding to the section of audio clip signal. That is, for the segment of audio segment signal, the second frequency point meeting the retention condition is determined as the third frequency point corresponding to the segment of audio segment signal.
In step 26134, the preset frequency points corresponding to the at least two segments of audio clip signals are determined based on the third frequency points corresponding to the at least two segments of audio clip signals.
Illustratively, the third frequency points corresponding to the at least two segments of audio clip signals are determined as the preset frequency points corresponding to the at least two segments of audio clip signals.
Illustratively, the preset frequency point mentioned in step 26134 refers to a narrow-band frequency point. In other words, the frequency points in each audio segment signal that satisfy the screening conditions mentioned in the above steps are the narrow-band frequency points.
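Steps 26131 through 26133 can be combined into one discrete sketch: a bin is kept if it is a local peak (the difference before it is positive and the difference after it is negative, a discrete stand-in for the first derivative crossing zero) and its magnitude is at least the frequency domain average. All names and values here are illustrative:

```python
def screen_narrow_bins(mags):
    """Keep bins that are discrete local peaks at or above the curve mean."""
    mean = sum(abs(m) for m in mags) / len(mags)
    kept = []
    for i in range(1, len(mags) - 1):
        rising = mags[i] - mags[i - 1] > 0    # difference before the bin
        falling = mags[i + 1] - mags[i] < 0   # difference after the bin
        if rising and falling and abs(mags[i]) >= mean:
            kept.append(i)
    return kept

curve = [0.0, 1.0, 0.2, 5.0, 0.1, 0.3, 0.2]
peaks = screen_narrow_bins(curve)   # bins 1 and 3 pass; bin 5 is below mean
```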
According to the audio processing method provided by the embodiment of the disclosure, the purpose of determining whether the frequency point is a preset frequency point is achieved by means of the first derivative value, the frequency domain average value and the adjacent frequency point information corresponding to the frequency point in each section of audio segment signal. The method and the device for processing the audio frequency can further improve the accuracy of the determined preset frequency point, and further improve the accuracy of subsequent audio frequency separation or noise processing.
Fig. 6 is a schematic flow chart illustrating a process of determining preset frequency points corresponding to at least two segments of audio segment signals respectively based on a screening condition and frequency domain signals corresponding to the at least two segments of audio segment signals respectively according to another exemplary embodiment of the present disclosure. The embodiment shown in fig. 6 of the present disclosure is extended on the basis of the embodiment shown in fig. 4 of the present disclosure, and the differences between the embodiment shown in fig. 6 and the embodiment shown in fig. 4 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 6, in the audio processing method provided by the embodiment of the disclosure, after the step of determining the frequency domain curves corresponding to the at least two segments of audio clip signals based on the frequency domain signals corresponding to the at least two segments of audio clip signals, the following steps are further included.
Step 2614, determine a fourth frequency point where the respective N-order derivative values of the at least two segments of audio segment signals are zero based on the derivative information of the frequency domain curves corresponding to the at least two segments of audio segment signals, where N is an integer greater than or equal to 2.
Step 2615, determine the fourth frequency point corresponding to at least two sections of audio fragment signals as the preset frequency point corresponding to at least two sections of audio fragment signals.
According to the audio processing method provided by the embodiment of the disclosure, the purpose of determining whether the frequency point is a preset frequency point is achieved by means of the high-order derivative value corresponding to the frequency point in each section of audio clip signal. Compared with the embodiment shown in fig. 5, the embodiment of the present disclosure can further simplify the calculation steps on the premise of ensuring the accuracy of the determined preset frequency point.
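A hedged sketch of the higher-order variant of steps 2614 and 2615: on a sampled curve, "N-order derivative value is zero" can be approximated by finding where the discrete N-th difference is zero or changes sign (here N = 2). This discretization is our assumption; the patent works on a fitted curve:

```python
def nth_difference(values, order):
    """Discrete stand-in for the N-th derivative of a sampled curve."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

def zero_crossings(diffs):
    """Indices where the differenced sequence is zero or changes sign."""
    return [i for i in range(len(diffs) - 1)
            if diffs[i] == 0 or diffs[i] * diffs[i + 1] < 0]

curve = [0, 1, 4, 9, 4, 1, 0]       # a single peak at index 3
crossings = zero_crossings(nth_difference(curve, 2))
```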
Fig. 7 is a flowchart illustrating an audio processing method according to another exemplary embodiment of the present disclosure. The embodiment shown in fig. 7 of the present disclosure is extended on the basis of the embodiment shown in fig. 1 of the present disclosure, and the differences between the embodiment shown in fig. 7 and the embodiment shown in fig. 1 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 7, in the audio processing method provided in the embodiment of the present disclosure, after the step of screening the preset frequency bins in the audio signal based on the screening condition and the audio signal, the following steps are further included.
Step 30, determining audio information corresponding to the preset frequency points based on the audio signal.
It should be understood that the preset frequency point is determined based on the frequency domain signal corresponding to the audio signal, and the audio information corresponding to the preset frequency point can be determined based on the preset frequency point and the audio signal.
Fig. 8 is a schematic flowchart illustrating a process of determining audio information corresponding to a preset frequency point based on an audio signal according to an exemplary embodiment of the present disclosure. As shown in fig. 8, in the audio processing method provided in the embodiment of the present disclosure, the step of determining the audio information corresponding to the preset frequency point based on the audio signal includes the following steps.
Step 31, determining a first frequency domain coefficient corresponding to the preset frequency point based on the audio signal.
Illustratively, the first frequency domain coefficients mentioned in step 31 refer to frequency domain coefficients corresponding to preset frequency bins determined based on frequency domain signals generated by performing a fourier transform operation on the audio signals (see formula (1) and formula (2) above).
Step 32, generating audio information corresponding to the preset frequency point based on the first frequency domain coefficient.
Illustratively, the audio information corresponding to the preset frequency bin is determined based on the following formulas (3) and (4). It should be understood that the relevant parameters in equations (3) and (4) can be seen in the explanations relating to equations (1) and (2) above.
N_k(n) = |C_k|·cos(2πf_k·n + θ_k)    (3)

W(n) = x(n) − Σ_{k=1}^{M} N_k(n)    (4)

In formulas (3) and (4), N_k(n) represents the audio information corresponding to the k-th preset frequency point, M represents the number of preset frequency points, W(n) represents the audio information remaining after the audio information N_k(n) corresponding to the preset frequency points is separated from the audio signal x(n), and θ_k characterizes the amount of adjustment of the phase of the signal.
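Formulas (3) and (4) can be transcribed directly (the amplitude, frequency, and phase values below are made-up test values, not from the patent):

```python
import math

def narrow_component(c_mag, f_k, theta_k, n):
    """Formula (3): N_k(n) = |C_k| * cos(2*pi*f_k*n + theta_k)."""
    return c_mag * math.cos(2 * math.pi * f_k * n + theta_k)

def residue(x, bins):
    """Formula (4): W(n) = x(n) minus the sum of all N_k(n)."""
    return [x_n - sum(narrow_component(c, f, th, n) for c, f, th in bins)
            for n, x_n in enumerate(x)]

bins = [(1.0, 0.25, 0.0)]   # one preset point: |C_k|=1, f_k=0.25, theta_k=0
x = [narrow_component(1.0, 0.25, 0.0, n) + 0.1 for n in range(8)]
w = residue(x, bins)        # only the 0.1 offset is left over
```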
The audio processing method provided by the embodiment of the disclosure achieves the purpose of determining the audio information corresponding to the preset frequency point by using the frequency domain coefficient corresponding to the preset frequency point, and further provides a precondition for subsequent audio separation operations (such as wideband and narrowband separation operations).
Fig. 9 is a schematic flowchart illustrating a process of determining audio information corresponding to a preset frequency point based on an audio signal according to another exemplary embodiment of the present disclosure. The embodiment shown in fig. 9 of the present disclosure is extended on the basis of the embodiment shown in fig. 8 of the present disclosure, and the differences between the embodiment shown in fig. 9 and the embodiment shown in fig. 8 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 9, in the audio processing method provided in the embodiment of the present disclosure, before the step of generating audio information corresponding to a preset frequency point based on a first frequency domain coefficient, the following steps are further included.
Step 33, determining a frequency domain coefficient average value based on the first frequency domain coefficients corresponding to the multiple preset frequency points.
Illustratively, the first frequency domain coefficients corresponding to each of the multiple preset frequency bins are subjected to arithmetic averaging to obtain a frequency domain coefficient average value.
The step of generating audio information corresponding to the preset frequency point based on the first frequency domain coefficient comprises the following steps.
Step 321, determining audio information corresponding to each of the plurality of preset frequency points based on the first frequency domain coefficient and the frequency domain coefficient average value.
Fig. 10 is a schematic flowchart illustrating a process of determining audio information corresponding to each of a plurality of preset frequency bins based on a first frequency domain coefficient and a frequency domain coefficient average value according to an exemplary embodiment of the present disclosure. As shown in fig. 10, in the audio processing method provided in the embodiment of the present disclosure, the step of determining the audio information corresponding to each of the multiple preset frequency bins based on the first frequency domain coefficient and the frequency domain coefficient average value includes the following steps.
Step 3211, the frequency domain coefficient average value is used to replace a first frequency domain coefficient corresponding to each of the multiple preset frequency points, so as to generate a second frequency domain coefficient corresponding to each of the multiple preset frequency points.
Exemplarily, the second frequency domain coefficient corresponding to each of the multiple preset frequency points is the calculated frequency domain coefficient average value.
Step 3212, audio information corresponding to each of the plurality of preset frequency bins is calculated based on the second frequency domain coefficients corresponding to each of the plurality of preset frequency bins.
The audio processing method provided by the embodiment of the disclosure can further reduce the calculation error, and further improve the accuracy of the audio information corresponding to the determined preset frequency point.
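Steps 33, 3211 and 3212 above amount to averaging the first coefficients and reusing the average as every second coefficient; a minimal sketch with made-up coefficient values:

```python
def averaged_coefficients(first_coeffs):
    """Replace each first coefficient with their mean (the second coefficient)."""
    mean = sum(first_coeffs) / len(first_coeffs)
    return [mean] * len(first_coeffs)

second = averaged_coefficients([2.0, 4.0, 6.0])  # each entry becomes 4.0
```

The audio information is then synthesized from these second coefficients exactly as in formula (3).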
Fig. 11 is a flowchart illustrating an audio separation method according to another exemplary embodiment of the present disclosure. As shown in fig. 11, the audio separation method mainly includes the steps of:
firstly, acquiring an audio signal;
secondly, converting the audio signal into a frequency domain signal;
thirdly, screening a part of frequency points in the frequency domain signals as narrow-band frequency points;
and fourthly, respectively outputting the audio signals corresponding to the narrow-frequency points and the rest audio signals.
From the audio signals corresponding to the narrow-band frequency points obtained in the above steps, the noise frequencies to which human ears are relatively sensitive can be selected (that is, the audio signals corresponding to the narrow-band frequency points are assumed to be the noise to which human ears are relatively sensitive), so that specific processing can subsequently be performed on these sensitive frequencies. This reduces the operation overhead while preserving, as much as possible, the effect of the optimized processing on the sound signal. For example, active cancellation of noise signals may be performed only for the sensitive frequency points: a corresponding cancellation signal is output for each screened narrow-band frequency point, the amplitude of the cancellation signal being the same as or close to the amplitude given by the frequency domain coefficient of that sensitive frequency point, but with opposite phase. When the two sound wave signals propagate in space they superpose and cancel each other, thereby realizing active noise reduction or shaping the sound quality of the product.
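The cancellation idea can be sketched numerically: an anti-phase tone of equal amplitude sums to (numerical) silence with the original tone. The frequency, amplitude, and phase values are illustrative:

```python
import math

def tone(amplitude, freq, phase, n):
    """A sampled cosine tone at one frequency point."""
    return amplitude * math.cos(2 * math.pi * freq * n + phase)

f_k, amp, phase = 0.05, 1.0, 0.3
noise = [tone(amp, f_k, phase, n) for n in range(16)]
anti = [tone(amp, f_k, phase + math.pi, n) for n in range(16)]  # opposite phase
mixed = [a + b for a, b in zip(noise, anti)]  # superposition cancels
```

In a physical system the superposition happens acoustically in air rather than in a buffer, and the match of amplitude and phase is only approximate, so the residue is attenuated rather than exactly zero.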
In another implementation, the separation of the audio signal may be performed as follows.
In a first step, an audio signal is acquired. For example, the sound signal is converted into an electrical signal x (n) by an audio sensing device 1, such as a microphone, a sound sensor, etc., to obtain an audio signal that can be processed by a signal processing unit.
Second, the signal processing unit cuts the electrical signal x(n) into at least two segments, such as x_1(n) and x_2(n); typically, each segment is a steady-state signal.
Third, each segment of the electrical signal, x_1(n) and x_2(n), is separately converted into a frequency domain signal, X_1(f) and X_2(f) respectively, wherein

X_i(f_k) = C_k = Σ_{n=0}^{N−1} x_i(n)·e^(−j2πkn/N),

C_k representing the frequency domain coefficient corresponding to each frequency point, as a complex number; k denoting the index of the frequency point, whose frequency corresponds to f_k; and e^(−j2πkn/N) characterizing the calculation factor of the Fourier transform.
In one implementation, the formula

C_{k,i} = (2/N)·Σ_{n=0}^{N−1} x_i(n)·e^(−j2πkn/N)

can be used to calculate the frequency domain coefficients and perform audio separation, to improve the operation efficiency; where i = 1, 2, … denotes the index of the signal segment.
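A sketch of the per-segment coefficient calculation above; the 2/N normalization is our assumption (the original formula image is not legible), chosen so that |C_k| recovers the amplitude used in N_k(n) = |C_k|·cos(...):

```python
import cmath
import math

def segment_coefficients(segment):
    """C_{k,i} = (2/N) * sum_n x_i(n) * exp(-j*2*pi*k*n/N).

    The 2/N factor makes |C_k| equal the peak amplitude of a pure
    cosine component, matching formula (3)'s synthesis.
    """
    n_samples = len(segment)
    return [2 / n_samples *
            sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
                for n, x in enumerate(segment))
            for k in range(n_samples)]

x1 = [math.cos(2 * math.pi * n / 4) for n in range(4)]  # segment i = 1
c1 = segment_coefficients(x1)   # |c1[1]| recovers the unit amplitude
```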
Fourth, a part of the frequency points in the frequency domain signals are screened as narrow-band frequency points. For example, the maximum value points, or the points at which a high-order derivative is 0, in the frequency domain signal are screened as narrow-band frequency points; alternatively, some frequency points are directly selected as narrow-band frequency points according to experience. The frequency domain coefficient corresponding to a narrow-band frequency point is denoted C_{k,i,narrow}.
Fifth, according to the frequency domain coefficients C_{k,i,narrow}, the audio signals corresponding to the narrow-band frequency points and the remaining audio signals are output respectively. For example, after the screening and determination process, the audio signal corresponding to each narrow-band frequency point is output as N_k(n) = |C_{k,i,narrow}|·cos(2πf_k·n + θ_k), and the remaining electrical signal is output as

W(n) = x(n) − Σ_{k=1}^{M} N_k(n),

wherein the remaining electrical signal refers to the audio information left after the audio information N_k(n) corresponding to the preset frequency points has been separated from the audio signal x(n), and θ_k indicates the amount of adjustment of the phase of the signal.
The maximum value points are selected by fitting a curve through the frequency points of each frequency domain signal X_i(f), solving the first derivative, the second derivative, or a higher-order derivative of the fitted curve, and judging by means of the zero crossing points or the values of the derivatives.
When some frequency points are selected as narrow-band frequency points according to experience, the method includes: obtaining the frequency domain coefficients corresponding to the narrow-band frequency points in each segment of frequency domain signal, obtaining the average value of those frequency domain coefficients, and screening each frequency point according to the average value. For example, active cancellation of noise signals may be performed only for the sensitive frequency points (frequency points with specific frequencies): a corresponding cancellation signal is output for each screened sensitive frequency point, the amplitude of the cancellation signal being the same as or close to the amplitude given by the frequency domain coefficient of that frequency point, but with opposite phase. When the two sound wave signals propagate in space they superpose and cancel each other, thereby realizing active noise reduction or shaping the sound quality of the product.
Fig. 12 is a schematic structural diagram of an audio processing apparatus according to an exemplary embodiment of the present disclosure. As shown in fig. 12, an audio processing apparatus provided in an embodiment of the present disclosure includes:
an obtaining module 50, configured to obtain an audio signal;
the first determining module 100 is configured to determine a screening condition corresponding to an audio signal, where the screening condition is used to screen a preset frequency point in the audio signal;
and the screening module 200 is configured to screen a preset frequency point in the audio signal based on the screening condition and the audio signal.
Fig. 13 is a schematic structural diagram of a screening module according to an exemplary embodiment of the present disclosure. The embodiment shown in fig. 13 of the present disclosure is extended on the basis of the embodiment shown in fig. 12 of the present disclosure, and the differences between the embodiment shown in fig. 13 and the embodiment shown in fig. 12 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 13, in the audio processing apparatus provided in the embodiment of the present disclosure, the filtering module 200 includes:
a segmenting unit 240, configured to perform a truncation segmentation operation on the audio signal to generate at least two segments of audio segment signals;
a generating unit 250, configured to perform a time domain transform operation on the audio signal to generate a frequency domain signal corresponding to the audio signal;
the generating unit 250 is further configured to perform fourier transform operation on the at least two segments of audio segment signals, respectively, to determine frequency domain signals corresponding to the at least two segments of audio segment signals;
the screening unit 260 is configured to screen a preset frequency point in the audio signal based on the screening condition and the frequency domain signal;
the screening unit 260 is further configured to determine, based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio clip signals, the preset frequency points corresponding to the at least two segments of audio clip signals, respectively.
In an embodiment of the present disclosure, the screening unit 260 is further configured to: determine a frequency domain curve corresponding to each of the at least two segments of audio segment signals based on the frequency domain signals corresponding to each of the at least two segments of audio segment signals; determine a frequency domain average value corresponding to each of the at least two segments of audio segment signals based on the corresponding frequency domain curves; and determine a preset frequency point corresponding to each of the at least two segments of audio segment signals based on the frequency domain average value and the derivative information of the frequency domain curve corresponding to each of the at least two segments of audio segment signals.
In an embodiment of the present disclosure, the screening unit 260 is further configured to: determine, for each of the at least two segments of audio segment signals, a first frequency point at which the first derivative value is zero, based on the derivative information of the corresponding frequency domain curve; filter the first frequency points corresponding to the at least two segments of audio segment signals based on those first frequency points and the corresponding frequency domain average values, to determine the second frequency points corresponding to the at least two segments of audio segment signals; filter the second frequency points based on the frequency points adjacent to them, to determine the third frequency points corresponding to the at least two segments of audio segment signals; and determine the preset frequency points corresponding to the at least two segments of audio segment signals based on those third frequency points.
In an embodiment of the present disclosure, the screening unit 260 is further configured to determine a fourth frequency point, where the N-order derivative value of each of the at least two segments of audio segment signals is zero, based on the derivative information of the frequency domain curve corresponding to each of the at least two segments of audio segment signals, and determine a preset frequency point corresponding to each of the at least two segments of audio segment signals based on the fourth frequency point corresponding to each of the at least two segments of audio segment signals.
Fig. 14 is a schematic structural diagram of an audio processing apparatus according to another exemplary embodiment of the present disclosure. The embodiment shown in fig. 14 of the present disclosure is extended on the basis of the embodiment shown in fig. 12 of the present disclosure, and the differences between the embodiment shown in fig. 14 and the embodiment shown in fig. 12 are emphasized below, and the descriptions of the same parts are omitted.
As shown in fig. 14, the audio processing apparatus provided in the embodiment of the present disclosure further includes:
the second determining module 300 is configured to determine, based on the audio signal, audio information corresponding to a preset frequency point.
In an embodiment of the present disclosure, the second determining module 300 is further configured to determine a first frequency domain coefficient corresponding to the preset frequency bin based on the audio signal, and generate audio information corresponding to the preset frequency bin based on the first frequency domain coefficient.
In an embodiment of the present disclosure, the second determining module 300 is further configured to determine a first frequency domain coefficient corresponding to a preset frequency point based on the audio signal, determine a frequency domain coefficient average value based on the first frequency domain coefficient corresponding to each of the plurality of preset frequency points, and determine audio information corresponding to each of the plurality of preset frequency points based on the first frequency domain coefficient and the frequency domain coefficient average value.
In an embodiment of the present disclosure, the second determining module 300 is further configured to replace the first frequency domain coefficient corresponding to each of the multiple preset frequency points with the frequency domain coefficient average value to generate a second frequency domain coefficient corresponding to each of the multiple preset frequency points, and calculate the audio information corresponding to each of the multiple preset frequency points based on the second frequency domain coefficient corresponding to each of the multiple preset frequency points.
It should be understood that the operations and functions of the obtaining module 50, the first determining module 100, the screening module 200, and the second determining module 300 included in the audio processing apparatus provided in fig. 12 to 14, and the segmenting unit 240, the generating unit 250, and the screening unit 260 included in the screening module 200 may refer to the audio processing method provided in fig. 1 to 11, and are not described herein again to avoid repetition.
Fig. 15 is a block diagram illustrating an architecture of a sound processing system according to an exemplary embodiment of the present disclosure. It should be understood that the audio processing method mentioned in the above embodiments can be applied to the sound processing system shown in fig. 15. As shown in fig. 15, the system includes:
the audio sensing device 1 is used for converting a sound signal into an electric signal;
the signal processing unit is connected with the audio sensing device, receives the electric signal, processes the electric signal and outputs a control signal to the acoustic conversion device 3;
the acoustic conversion device 3 is controlled by the signal processing unit to output corresponding sound;
wherein the signal processing unit is arranged to perform the steps of:
segmenting the electric signal into at least two sections, each section being a steady-state signal;
converting the electric signals into frequency domain signals by a Fourier transform device 21, for example a Digital Signal Processing (DSP) chip such as a Fourier transformer or a fast Fourier transformer;
screening the maximum values in the frequency domain signals by a digital signal processing device 22, such as a digital operation device or a filter device;
acquiring the frequency domain coefficients corresponding to the narrow-frequency points in each section of frequency domain signal, and respectively calculating the average value of the frequency domain coefficients of each narrow-frequency point;
and respectively outputting, according to the average values of the frequency domain coefficients obtained in the previous step, the audio signals corresponding to the narrow-frequency points and the remaining audio signals. The frequency domain coefficient corresponding to each narrow frequency point is then confirmed to generate a signal that cancels or enhances the electric signal at that frequency point, and this cancelling or enhancing electric signal drives the acoustic conversion device to output the corresponding sound.
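The segment-transform-screen-average steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the segment count, the single-strongest-bin peak test, and all names are hypothetical, not the patent's exact procedure.

```python
import numpy as np

def narrowband_pipeline(x, fs, n_seg=4):
    """Illustrative pipeline: segment the signal, FFT each segment,
    screen the maximum of the averaged magnitude spectrum (a stand-in
    for the narrow-frequency screening), and average the frequency
    domain coefficients of that bin across segments."""
    segs = np.array_split(np.asarray(x, dtype=float), n_seg)  # step 1: segment
    n = min(len(s) for s in segs)
    spectra = np.array([np.fft.rfft(s[:n]) for s in segs])    # step 2: FFT
    mags = np.abs(spectra)
    peak_bin = int(mags.mean(axis=0).argmax())                # step 3: screen maximum
    avg_coef = spectra[:, peak_bin].mean()                    # step 4: average coefficients
    freq = peak_bin * fs / n                                  # bin index -> frequency in Hz
    return freq, avg_coef
```

Feeding the pipeline a pure tone recovers that tone's frequency, since each segment's spectrum peaks at the same bin.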
The signal that cancels the separated electric signal of a frequency point is a signal whose amplitude is close or equal to that of the narrow frequency corresponding to the frequency domain coefficient of the electric signal at that frequency point, but whose phase is opposite. When the cancellation signal and the sound are superposed in space, they cancel each other, so that the sound within the transmission range of the cancellation signal is reduced. The enhancement signal of the electric signal of a frequency point is a signal with the same frequency as that frequency point but with a changed phase or amplitude, so that the sound is enhanced.
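The cancellation described above can be verified numerically: a signal with equal amplitude and frequency but phase shifted by π sums with the original narrow-frequency component to (nearly) zero. The sample rate, frequency, amplitude, and phase below are arbitrary illustration values.

```python
import numpy as np

fs = 8000                      # sample rate (illustrative)
f_k, amp, theta = 200.0, 0.7, 0.3
n = np.arange(fs) / fs         # one second of sample times
# Narrow-frequency component of the sound.
tone = amp * np.cos(2 * np.pi * f_k * n + theta)
# Cancellation signal: same amplitude and frequency, opposite phase.
anti = amp * np.cos(2 * np.pi * f_k * n + theta + np.pi)
# Superposition "in space": the two cancel up to floating-point error.
residual = tone + anti
```

Since cos(x + π) = -cos(x), the residual is zero apart from rounding noise, which is the destructive interference the active noise reduction relies on.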
The signal processing unit calculates the average value of the frequency domain coefficients corresponding to the signal mainly in order to further reduce errors. Cutting the signal into several segments does not change the rough distribution of the frequency domain signal, but it inevitably introduces operation errors and changes in the signal distribution. Therefore, the frequency domain coefficient corresponding to each frequency point in each section of frequency domain signal can be replaced by the average value, so that the amplitude of the narrow frequency corresponding to the frequency domain coefficient is as close as possible to the component of that frequency in the actual sound, ensuring a better sound quality processing effect. The step of calculating the average value of the frequency domain coefficients corresponding to the signal may specifically include:
acquiring the frequency domain coefficients corresponding to the narrow-band frequency points in each section of frequency domain signal, and respectively calculating the average value C̄_(k,narrow) of the frequency domain coefficients of each narrow-band frequency point;
replacing the frequency domain coefficient corresponding to each narrow-band frequency point with the average value C̄_(k,narrow);
and outputting, according to the replaced average value C̄_(k,narrow), the electric signal of each narrow-band frequency point: N_k(n) = |C̄_(k,narrow)| cos(2π f_k n + θ_k).
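The averaging and replacement step above can be sketched as follows. This is an illustrative Python sketch; taking the phase θ_k from the averaged complex coefficient is an assumption, since the formula above only specifies the magnitude |C̄_(k,narrow)|.

```python
import numpy as np

def synthesize_narrow(Ck, f_k, fs, n_samples):
    """Average the per-segment frequency domain coefficients C_{k,i} of a
    narrow frequency point, replace them with the average C_bar, and
    output N_k(n) = |C_bar| * cos(2*pi*f_k*n + theta_k). Taking theta_k
    from the averaged coefficient is a hypothetical choice."""
    C_bar = np.mean(np.asarray(Ck, dtype=complex))  # replace per-segment coefficients
    theta_k = np.angle(C_bar)
    n = np.arange(n_samples) / fs                   # sample times
    return np.abs(C_bar) * np.cos(2 * np.pi * f_k * n + theta_k)

# Example: three segments measured the same 100 Hz component with small errors.
y = synthesize_narrow([1.0 + 0.1j, 0.9, 1.1 + 0.05j], 100.0, 8000, 16)
```

Averaging across segments smooths out the per-segment estimation error the paragraph above describes, so the output amplitude tracks the true component more closely than any single segment's coefficient.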
The advantages of the technical scheme of the present disclosure are mainly reflected in the following. First, a Fourier transform converts the time domain signal of the sound into a frequency domain signal, and each frequency point and its corresponding frequency domain coefficient value are obtained. Narrow frequency points can be separated by observation (for example, peaks and spikes apparent in the spectrum signal are screened out by manual observation through a spectrometer) or by the separation algorithm of the present disclosure (specifically, by means of derivatives). By separating wide and narrow frequencies, the present disclosure can screen out the narrow frequency range to which human ears are sensitive for the subsequent acoustic signal processing step, so that the subsequent operation unit processes and cancels the narrow frequency signals with emphasis. Therefore, the present disclosure can greatly reduce the operation overhead of the system by simplifying the calculation method while ensuring the sound processing effect, and can reduce the requirement on the arithmetic device, thereby fundamentally reducing the cost of sound quality construction or active noise reduction related products.
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 16. Fig. 16 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
As shown in fig. 16, the electronic device 70 includes one or more processors 701 and a memory 702.
The processor 701 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 70 to perform desired functions.
Memory 702 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 701 to implement the audio processing methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as audio signals may also be stored in the computer-readable storage medium.
In one example, the electronic device 70 may further include: an input device 703 and an output device 704, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
The input device 703 may include, for example, a keyboard, a mouse, and the like.
The output device 704 may output various information to the outside, including audio information corresponding to the determined preset frequency point, and the like. The output means 704 may comprise, for example, a display, a communication network, a remote output device connected thereto, and the like.
Of course, for simplicity, only some of the components of the electronic device 70 relevant to the present disclosure are shown in fig. 16, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 70 may include any other suitable components, depending on the particular application.
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the audio processing method according to various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
The computer program product may write program code for carrying out operations of embodiments of the present disclosure in any combination of one or more programming languages, including an object oriented programming language such as Java or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in an audio processing method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. However, the advantages and effects mentioned in the present disclosure are merely examples, not limitations, and should not be considered essential to the various embodiments of the present disclosure. Furthermore, the specific details disclosed above are provided only for purposes of illustration and ease of understanding, and the present disclosure is not limited to those specific details.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that connections, arrangements, or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, or configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to."
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
Those of ordinary skill in the art will understand that: although the present disclosure has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (15)

1. An audio processing method, comprising:
acquiring an audio signal;
determining a screening condition corresponding to the audio signal, wherein the screening condition is used for screening a preset frequency point in the audio signal;
and screening the preset frequency points in the audio signals based on the screening conditions and the audio signals.
2. The audio processing method according to claim 1, wherein the filtering the preset frequency points in the audio signal based on the filtering condition and the audio signal comprises:
performing time domain transformation operation on the audio signal to generate a frequency domain signal corresponding to the audio signal;
and screening the preset frequency points in the audio signals based on the screening conditions and the frequency domain signals.
3. The audio processing method of claim 2, wherein before performing a time-domain transform operation on the audio signal to generate a frequency-domain signal corresponding to the audio signal, further comprising:
intercepting and segmenting the audio signal to generate at least two segments of audio segment signals;
performing time domain transformation operation on the audio signal to generate a frequency domain signal corresponding to the audio signal, including:
respectively carrying out Fourier transform operation on the at least two sections of audio clip signals to determine frequency domain signals corresponding to the at least two sections of audio clip signals;
and, screening the preset frequency points in the audio signal based on the screening condition and the frequency domain signal, including:
and respectively determining the preset frequency points corresponding to the at least two sections of audio clip signals based on the screening condition and the frequency domain signals corresponding to the at least two sections of audio clip signals.
4. The audio processing method according to claim 3, wherein determining the preset frequency bins corresponding to the at least two segments of audio segment signals respectively based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio segment signals respectively comprises:
determining frequency domain curves corresponding to the at least two sections of audio fragment signals based on the frequency domain signals corresponding to the at least two sections of audio fragment signals;
aiming at each section of audio clip signal, arranging all frequency points in a frequency domain curve corresponding to the audio clip signal in an ascending order or a descending order according to the magnitude of the absolute value of a vertical coordinate in a frequency domain coordinate;
and performing interception and screening operation based on the rearranged frequency points to determine the preset frequency points corresponding to the audio clip signals.
5. The audio processing method according to claim 3, wherein determining the preset frequency bins corresponding to the at least two segments of audio segment signals respectively based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio segment signals respectively comprises:
determining frequency domain curves corresponding to the at least two sections of audio fragment signals based on the frequency domain signals corresponding to the at least two sections of audio fragment signals;
determining a frequency domain average value corresponding to each of the at least two segments of audio segment signals based on the frequency domain curves corresponding to each of the at least two segments of audio segment signals;
and determining the preset frequency points corresponding to the at least two sections of audio clip signals based on the frequency domain average values corresponding to the at least two sections of audio clip signals and the derivative information of the frequency domain curves corresponding to the at least two sections of audio clip signals.
6. The audio processing method according to claim 5, wherein determining the preset frequency bins corresponding to the at least two segments of audio segment signals based on the frequency domain average values corresponding to the at least two segments of audio segment signals and the derivative information of the frequency domain curves corresponding to the at least two segments of audio segment signals comprises:
determining a first frequency point with a zero first derivative value corresponding to the at least two sections of audio segment signals based on the derivative information of the frequency domain curves corresponding to the at least two sections of audio segment signals;
filtering the first frequency points corresponding to the at least two sections of audio clip signals based on the first frequency points corresponding to the at least two sections of audio clip signals and the frequency domain average values corresponding to the at least two sections of audio clip signals to determine second frequency points corresponding to the at least two sections of audio clip signals;
filtering the second frequency points corresponding to the at least two sections of audio clip signals respectively based on the adjacent frequency points of the second frequency points corresponding to the at least two sections of audio clip signals respectively so as to determine third frequency points corresponding to the at least two sections of audio clip signals respectively;
and determining the third frequency points corresponding to the at least two sections of audio clip signals as the preset frequency points corresponding to the at least two sections of audio clip signals.
7. The audio processing method according to claim 3, wherein determining the preset frequency bins corresponding to the at least two segments of audio segment signals respectively based on the screening condition and the frequency domain signals corresponding to the at least two segments of audio segment signals respectively comprises:
determining frequency domain curves corresponding to the at least two sections of audio fragment signals based on the frequency domain signals corresponding to the at least two sections of audio fragment signals;
determining a fourth frequency point with an N-order derivative value of zero corresponding to the at least two sections of audio clip signals based on the derivative information of the frequency domain curves corresponding to the at least two sections of audio clip signals, wherein N is an integer greater than or equal to 2;
and determining the fourth frequency points corresponding to the at least two sections of audio clip signals as the preset frequency points corresponding to the at least two sections of audio clip signals.
8. The audio processing method according to any one of claims 1 to 7, further comprising, after the filtering the preset frequency bins in the audio signal based on the filtering condition and the audio signal:
and determining audio information corresponding to the preset frequency point based on the audio signal.
9. The audio processing method according to claim 8, wherein determining the audio information corresponding to the preset frequency bin based on the audio signal comprises:
determining a first frequency domain coefficient corresponding to the preset frequency point based on the audio signal;
and generating the audio information corresponding to the preset frequency point based on the first frequency domain coefficient.
10. The audio processing method according to claim 9, wherein the number of the preset frequency bins is multiple, and before the audio information corresponding to the preset frequency bins is generated based on the first frequency domain coefficients, the method further comprises:
determining a frequency domain coefficient average value based on the first frequency domain coefficients corresponding to the multiple preset frequency points respectively;
generating the audio information corresponding to the preset frequency point based on the first frequency domain coefficient includes:
and determining the audio information corresponding to the plurality of preset frequency points based on the first frequency domain coefficient and the frequency domain coefficient average value.
11. The audio processing method according to claim 10, wherein determining the audio information corresponding to each of the plurality of preset frequency bins based on the first frequency-domain coefficient and the frequency-domain coefficient average value comprises:
replacing the first frequency domain coefficients corresponding to the multiple preset frequency points with the frequency domain coefficient average value to generate second frequency domain coefficients corresponding to the multiple preset frequency points;
and calculating the audio information corresponding to the plurality of preset frequency points based on the second frequency domain coefficients corresponding to the plurality of preset frequency points.
12. The audio processing method according to claim 8, wherein after determining the audio information corresponding to the predetermined frequency bin based on the audio signal, further comprising:
and separating the audio information corresponding to the preset frequency point from the audio signal.
13. An audio processing apparatus, comprising:
the acquisition module is used for acquiring an audio signal;
the first determining module is used for determining a screening condition corresponding to the audio signal, wherein the screening condition is used for screening a preset frequency point in the audio signal;
and the screening module is used for screening the preset frequency points in the audio signals based on the screening conditions and the audio signals.
14. A computer-readable storage medium storing a computer program for executing the audio processing method of any of the above claims 1-12.
15. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor for performing the audio processing method of any of the above claims 1-12.
CN202010524145.7A 2019-09-19 2020-06-10 Audio processing method and device, computer readable storage medium and electronic equipment Active CN111627459B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910884313 2019-09-19
CN2019108843130 2019-09-19

Publications (2)

Publication Number Publication Date
CN111627459A true CN111627459A (en) 2020-09-04
CN111627459B CN111627459B (en) 2023-07-18

Family

ID=72259348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010524145.7A Active CN111627459B (en) 2019-09-19 2020-06-10 Audio processing method and device, computer readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111627459B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120245927A1 (en) * 2011-03-21 2012-09-27 On Semiconductor Trading Ltd. System and method for monaural audio processing based preserving speech information
CN104681034A (en) * 2013-11-27 2015-06-03 杜比实验室特许公司 Audio signal processing method
CN107928673A (en) * 2017-11-06 2018-04-20 腾讯科技(深圳)有限公司 Acoustic signal processing method, device, storage medium and computer equipment
CN108806712A (en) * 2018-04-27 2018-11-13 深圳市沃特沃德股份有限公司 Reduce the method and apparatus of frequency domain treating capacity
CN108877831A (en) * 2018-08-28 2018-11-23 山东大学 Blind source separating fast method and system based on multi-standard fusion frequency point screening
CN109616138A (en) * 2018-12-27 2019-04-12 山东大学 Voice signal blind separating method and ears hearing assistance system based on segmentation frequency point selection
CN110111811A (en) * 2019-04-18 2019-08-09 腾讯音乐娱乐科技(深圳)有限公司 Audio signal detection method, device and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037810A (en) * 2020-09-25 2020-12-04 杭州朗和科技有限公司 Echo processing method, device, medium and computing equipment
CN112037810B (en) * 2020-09-25 2023-10-03 杭州网易智企科技有限公司 Echo processing method, device, medium and computing equipment
CN112931482A (en) * 2021-01-15 2021-06-11 国网山西省电力公司晋城供电公司 Shaft tower drives bird ware
CN112931482B (en) * 2021-01-15 2022-06-07 国网山西省电力公司晋城供电公司 Shaft tower drives bird ware

Also Published As

Publication number Publication date
CN111627459B (en) 2023-07-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant