CN112133332B - Method, device and equipment for playing audio - Google Patents


Info

Publication number: CN112133332B
Authority: CN (China)
Prior art keywords: frequency point, frequency, gain, gain adjustment, determining
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202011011871.5A
Other languages: Chinese (zh)
Other versions: CN112133332A (en)
Inventor: 闫震海
Current Assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Original Assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Application filed by: Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority application: CN202011011871.5A
Publications: CN112133332A (application), CN112133332B (grant)

Classifications

    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 20/00 - Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B 20/10 - Digital recording or reproducing
    • G11B 20/10009 - Improvement or modification of read or write signals
    • G11B 20/10018 - Improvement or modification of read or write signals: analog processing for digital recording or reproduction
    • G11B 20/10027 - Improvement or modification of read or write signals: analog processing for digital recording or reproduction, adjusting the signal strength during recording or reproduction, e.g. variable gain amplifiers

Abstract

The application discloses a method, an apparatus, and a device for playing audio, belonging to the field of internet technologies. The method comprises the following steps: acquiring the frequency-amplitude response of an audio playing device when playing detected audio data, and determining the amplitude gain of the audio playing device at each frequency point based on the frequency-amplitude response; determining the mean value of the amplitude gains of the frequency points within a preset frequency range, and taking this mean value as the reference amplitude gain corresponding to the audio playing device; determining, based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain, the gain adjustment proportion at each frequency point, to form a gain adjustment proportion vector; and adjusting target audio data based on the gain adjustment proportion vector and playing the adjusted target audio data through the audio playing device. The method and apparatus mitigate the problem that energy is increased or reduced when the audio playing device plays audio signals at individual frequency points or frequency bands.

Description

Method, device and equipment for playing audio
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method, an apparatus, and a device for playing audio.
Background
The audio playing device may process and play the audio file through the audio playback system, which may include decoding of the audio file, transmission of an audio signal, conversion of the audio signal from a digital signal to an analog signal, playing of the audio signal by a sound generating unit, and the like.
In the course of implementing the present application, the inventors found that the related art has at least the following problems:
Because the sound generating unit is a hardware device, such as the vibrating sound generating unit of an earphone or a loudspeaker, it may have certain hardware defects, so that energy is increased or reduced when audio signals at individual frequency points or frequency bands are played.
Disclosure of Invention
The embodiments of the application provide a method, an apparatus, and a device for playing audio, which can alleviate the problem that energy is increased or reduced when an audio playing device plays audio signals at individual frequency points or frequency bands. The technical solution is as follows:
in one aspect, a method of playing audio is provided, the method comprising:
acquiring frequency amplitude response of audio playing equipment when playing detected audio data, and determining amplitude gain of the audio playing equipment at each frequency point based on the frequency amplitude response;
determining the average value of the amplitude gain of each frequency point in the preset frequency range, and determining the average value of the amplitude gain as the reference amplitude gain corresponding to the audio playing equipment;
determining the gain adjustment proportion of the audio playing equipment at each frequency point based on the amplitude gain of the audio playing equipment at each frequency point and the reference amplitude gain to form a gain adjustment proportion vector;
when target audio data to be played through the audio playing device are acquired, the target audio data are adjusted based on the gain adjustment proportion vector to obtain adjusted target audio data, and the adjusted target audio data are played through the audio playing device.
Optionally, the determining the average value of the amplitude gains at the frequency points in the preset frequency range includes:
dividing each frequency point of the preset frequency range in a logarithmic domain based on a first division interval to obtain a plurality of frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each frequency point set to obtain a mean value corresponding to each frequency point set;
and averaging the average values corresponding to the plurality of frequency point sets to obtain the amplitude gain average value.
Optionally, the determining the average value of the amplitude gain at each frequency point in the preset frequency range includes:
on the basis of a first division interval, performing equal interval division on each frequency point in the preset frequency range in a logarithmic domain to obtain a plurality of first frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each first frequency point set to obtain a first average value corresponding to each first frequency point set;
fitting the first mean values to obtain fitting results corresponding to the first mean values;
determining the amplitude gain corresponding to each frequency point after fitting processing based on the fitting result corresponding to each first mean value;
dividing each frequency point of the preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets, wherein the division density corresponding to the second division interval is greater than that corresponding to the first division interval;
averaging the amplitude gain corresponding to each frequency point in each second frequency point set to obtain a second average value corresponding to each second frequency point set;
and averaging the second average values corresponding to the plurality of second frequency point sets to obtain the amplitude gain average value.
Optionally, the determining a gain adjustment ratio of the audio playing device at each frequency point based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain includes:
and for each frequency point, determining the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point as the gain adjustment proportion corresponding to the frequency point.
Optionally, the determining a gain adjustment ratio of the audio playing device at each frequency point based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain includes:
for each frequency point, determining a frequency section to which the frequency point belongs, and determining a maximum gain adjustment proportion corresponding to the frequency point according to a corresponding relation between a preset frequency interval and the maximum gain adjustment proportion, wherein the frequency section comprises a low-frequency section, a medium-frequency section and a high-frequency section;
when the amplitude gain corresponding to the frequency point is larger than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is larger than or equal to the reciprocal of the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, determining the reciprocal of the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point;
when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than or equal to the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; and if the ratio is larger than the maximum gain adjustment proportion, determining the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point.
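The clamping rule above can be sketched in Python as follows. This is an illustrative sketch only: the band boundaries and the per-band maximum gain adjustment proportions in `MAX_RATIO` and `band_of` are hypothetical values not given in the text.

```python
import numpy as np

# Hypothetical per-band maximum gain adjustment proportions
# (the patent does not specify concrete numbers).
MAX_RATIO = {"low": 2.0, "mid": 3.0, "high": 2.0}

def band_of(freq_hz):
    # Illustrative boundaries for the low, medium, and high frequency sections.
    if freq_hz < 250:
        return "low"
    if freq_hz < 4000:
        return "mid"
    return "high"

def gain_adjustment(ref_gain, gain, freq_hz):
    """Ratio of reference gain to measured gain, clamped to
    [1/max_ratio, max_ratio] for the frequency section of freq_hz."""
    m = MAX_RATIO[band_of(freq_hz)]
    return float(np.clip(ref_gain / gain, 1.0 / m, m))
```

When the measured gain deviates too far from the reference, the adjustment saturates at the band's limit instead of over-amplifying or over-attenuating that bin.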
Optionally, the adjusting the target audio data based on the gain adjustment proportion vector to obtain adjusted target audio data includes:
converting the target audio data into a frequency domain to obtain frequency domain data corresponding to the target audio data;
multiplying the frequency domain data by the gain adjustment proportion vector, and converting the multiplication result into a time domain to obtain the adjusted target audio data; or, alternatively,
converting the gain adjustment proportion vector into a time domain to obtain a converted gain adjustment vector;
and performing convolution processing on the converted gain adjustment vector and the target audio data to obtain the adjusted target audio data.
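The two equivalent application methods above can be sketched with numpy; the function names are illustrative, and the gain vector is assumed to be the full conjugate-symmetric vector with one entry per FFT bin.

```python
import numpy as np

def adjust_freq_domain(samples, gain_vector):
    # Transform to the frequency domain, scale each bin by its
    # gain adjustment proportion, then transform back.
    spectrum = np.fft.fft(samples, n=len(gain_vector))
    return np.real(np.fft.ifft(spectrum * gain_vector))

def adjust_time_domain(samples, gain_vector):
    # Convert the (conjugate-symmetric) gain vector into a time-domain
    # filter kernel, then convolve it with the audio samples.
    kernel = np.real(np.fft.ifft(gain_vector))
    return np.convolve(samples, kernel)
```

With an all-ones gain vector both paths leave the signal unchanged, which is a quick sanity check of the equivalence.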
In another aspect, an apparatus for playing audio is provided, the apparatus including:
the acquisition module is used for acquiring frequency amplitude response of the audio playing equipment when playing the detected audio data and determining amplitude gain of the audio playing equipment at each frequency point based on the frequency amplitude response;
the determining module is configured to determine a mean value of the amplitude gains of the frequency points in the preset frequency range, and determine the mean value of the amplitude gains as a reference amplitude gain corresponding to the audio playing device; determining the gain adjustment proportion of the audio playing equipment at each frequency point based on the amplitude gain of the audio playing equipment at each frequency point and the reference amplitude gain to form a gain adjustment proportion vector;
and the adjusting module is used for adjusting the target audio data based on the gain adjustment proportion vector when the target audio data to be played through the audio playing equipment is obtained, obtaining the adjusted target audio data, and playing the adjusted target audio data through the audio playing equipment.
Optionally, the determining module is configured to:
dividing each frequency point of the preset frequency range in a logarithmic domain based on a first division interval to obtain a plurality of frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each frequency point set to obtain a mean value corresponding to each frequency point set;
and averaging the average values corresponding to the plurality of frequency point sets to obtain the amplitude gain average value.
Optionally, the determining module is configured to:
on the basis of a first division interval, performing equal interval division on each frequency point in the preset frequency range in a logarithmic domain to obtain a plurality of first frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each first frequency point set to obtain a first average value corresponding to each first frequency point set;
fitting the first mean values to obtain fitting results corresponding to the first mean values;
determining the amplitude gain corresponding to each frequency point after fitting processing based on the fitting result corresponding to each first mean value;
dividing each frequency point of the preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets, wherein the division density corresponding to the second division interval is greater than that corresponding to the first division interval;
averaging the amplitude gain corresponding to each frequency point in each second frequency point set to obtain a second average value corresponding to each second frequency point set;
and averaging the second average values corresponding to the plurality of second frequency point sets to obtain the amplitude gain average value.
Optionally, the determining module is configured to:
and for each frequency point, determining the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point as the gain adjustment proportion corresponding to the frequency point.
Optionally, the determining module is configured to:
for each frequency point, determining a frequency section to which the frequency point belongs, and determining a maximum gain adjustment proportion corresponding to the frequency point according to a corresponding relation between a preset frequency interval and the maximum gain adjustment proportion, wherein the frequency section comprises a low-frequency section, a medium-frequency section and a high-frequency section;
when the amplitude gain corresponding to the frequency point is larger than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is larger than or equal to the reciprocal of the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, determining the reciprocal of the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point;
when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than or equal to the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; and if the ratio is larger than the maximum gain adjustment proportion, determining the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point.
Optionally, the adjusting module is configured to:
converting the target audio data into a frequency domain to obtain frequency domain data corresponding to the target audio data;
multiplying the frequency domain data by the gain adjustment proportion vector, and converting the multiplication result into a time domain to obtain the adjusted target audio data; or, alternatively,
converting the gain adjustment proportion vector into a time domain to obtain a converted gain adjustment vector;
and performing convolution processing on the converted gain adjustment vector and the target audio data to obtain the adjusted target audio data.
In yet another aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction is stored, and is loaded and executed by the processor to implement the operations performed by the method for playing audio as described above.
In yet another aspect, a computer-readable storage medium having at least one instruction stored therein is provided, which is loaded and executed by a processor to implement the operations performed by the method for playing audio as described above.
The technical solution provided by the embodiments of the application has the following beneficial effects:
The amplitude gain of the audio playing device at each frequency point is detected and the corresponding reference amplitude gain is determined; the gain adjustment proportion of each frequency point is then determined from the reference amplitude gain, forming the corresponding gain adjustment proportion vector. Before the audio playing device plays audio data, the audio data are adjusted according to the gain adjustment proportion vector, so that the amplitude at each frequency point is pre-compensated according to the actual gain of the audio playing device at that frequency point. This alleviates the problem that energy is increased or reduced when the audio playing device plays audio signals at individual frequency points or frequency bands.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a method for playing audio according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an apparatus for playing audio according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The method for playing the audio provided by the embodiment of the application can be realized by a terminal, and the terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer, various intelligent devices and the like. The terminal may have a processor and a memory, where the memory may store an execution program, processing data, and the like corresponding to the method for playing audio provided in the embodiment of the present application, and the processor may process the execution program and the processing data stored in the memory to implement the method for playing audio provided in the embodiment of the present application. The terminal can also be provided with an earphone interface, a loudspeaker, a Bluetooth module, a WiFi module and the like, and can play or transmit data such as audio and the like.
An audio playback system is the system in an audio playing device that converts an audio file into mechanical vibrations of a sound generating unit. This includes transcoding the audio file, digital-to-analog conversion of the audio signal, conversion of the electrical signal into mechanical vibration, and the like. The audio playing device may be a single device, for example, a speaker that can store audio, or it may be composed of a plurality of sub-devices, for example, a mobile phone together with a headset, or a mobile phone together with a Bluetooth speaker. Ideally, when the audio playing device plays an audio file through the audio playback system, the actually played audio signal should have the same gain in every frequency band as the audio signal corresponding to the played audio file; that is, the ratio of the amplitude of the signal output by the audio playback system to the amplitude of the signal input to it should be the same at every frequency point. However, since the sound generating unit is a hardware device, it is affected by various factors when generating sound by vibration, and certain errors may exist, so that the audio signal output by the audio playback system may have errors at individual frequency points. The audio playback system therefore cannot perfectly restore the audio signal corresponding to the audio file, which degrades the experience of a user listening to audio on the audio playing device.
The method for playing audio provided by the embodiments of the application can set a corresponding adjustment vector according to the audio playback system and compensate the audio signal in advance when an audio file is played through the audio playback system, which alleviates the problem that energy is increased or reduced when the audio playing device plays audio signals at specific frequency points or frequency bands.
Fig. 1 is a flowchart of a method for playing audio according to an embodiment of the present application. Referring to fig. 1, the embodiment includes:
step 101, obtaining a frequency amplitude response of the audio playing device when playing the detected audio data, and determining an amplitude gain of the audio playing device at each frequency point based on the frequency amplitude response.
The frequency amplitude response comprises amplitude gains of audio signals played by the audio playing device corresponding to the frequency points, and the frequency amplitude response corresponding to the audio playing device can be obtained by performing Fourier transform on a detected time domain impulse response of the audio playing device when the detected audio data is played.
In an implementation, the time domain impulse response of the audio playing device when playing the detected audio data may be the time domain impulse response of the audio playback system in the audio playing device, which may be determined from the ratio of the detected signal output by the audio playback system to the signal input to it. The output signal can be obtained by recording, with a separate sound pickup device, the audio corresponding to the detected audio data played by the audio playing device; the input signal can be obtained by transcoding the played detected audio. After the time domain impulse response of the audio playback system is obtained, it may be Fourier transformed to obtain the frequency-amplitude response of the audio playing device, that is, the amplitude gain at each frequency point, as follows:
HAbs(w)=abs(fft(h(n),Nfft));
where h(n) is the time domain impulse response of the audio playback system. The function abs() takes the amplitude (modulus) of a complex number, the function fft() performs a fast Fourier transform of Nfft points on the time domain signal, w is the frequency index, and HAbs(w) is the amplitude response at frequency w. Here the amplitude response is the amplitude gain.
Since the Fourier transform of a real signal has conjugate symmetry, only part of the amplitude response HAbs of the playback system needs to be retained, i.e., the frequency index only needs to take 0 to Nfft/2. The retained frequency points are: Freq = [0, fs/Nfft, 2*fs/Nfft, 3*fs/Nfft, …, fs/2], where fs is the sampling rate. The processing for each frequency point in the subsequent steps 102-103 refers to the processing for each frequency point in Freq.
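The computation in step 101 can be sketched with numpy as follows; the helper name is illustrative, mirroring HAbs(w) = abs(fft(h(n), Nfft)) and the retained half-spectrum Freq.

```python
import numpy as np

def amplitude_response(h, nfft, fs):
    # HAbs(w) = abs(fft(h(n), Nfft))
    h_abs = np.abs(np.fft.fft(h, nfft))
    # By conjugate symmetry of a real signal, keep bins 0 .. Nfft/2 only.
    half = h_abs[: nfft // 2 + 1]
    # Freq = [0, fs/Nfft, 2*fs/Nfft, ..., fs/2]
    freq = np.arange(nfft // 2 + 1) * fs / nfft
    return half, freq
```

An ideal playback system has a unit impulse as its impulse response, which yields a flat amplitude gain of 1 at every retained frequency point.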
Step 102, determining an average value of the amplitude gains of the frequency points in the preset frequency range, and determining the average value of the amplitude gains as a reference amplitude gain corresponding to the audio playing device.
In practice, when the audio played by the audio playing device is recorded through a sound pickup device, a large error may exist between the recorded audio signal and the audio actually played in the low and high frequency bands; that is, the amplitude of the recorded audio signal in the low or high frequency band deviates from the amplitude of the audio signal actually played. Therefore, when the amplitude gain mean value is calculated, it can be determined only from the amplitude gains corresponding to the frequency points in the middle frequency band (i.e., the preset frequency range). The determined mean value of the amplitude gains is then taken as the reference amplitude gain corresponding to the audio playing device.
Optionally, since the perception capability of the human ear to the sound frequency has a logarithmic feature, when calculating the average value of the amplitude gain of each frequency point in the preset frequency range, each frequency point in the preset frequency range may be divided in a logarithmic domain to obtain different frequency segments, and then the average value of each frequency point in each frequency segment is averaged to obtain the reference amplitude gain. The corresponding processing may be as follows: dividing each frequency point of a preset frequency range in a logarithmic domain based on a first division interval to obtain a plurality of frequency point sets; averaging the amplitude gain corresponding to each frequency point in each frequency point set to obtain a mean value corresponding to each frequency point set; and averaging the average values corresponding to the plurality of frequency point sets to obtain an amplitude gain average value.
In an implementation, the division of the frequency points of the preset frequency range in the logarithmic domain according to the first division interval may be performed in either of two ways.
The first method: convert the first division interval into division boundaries in the real domain according to the characteristics of the logarithmic domain; that is, the number of frequency points contained in successive division intervals grows approximately exponentially. The division boundary between adjacent frequency segments can be determined, for example, by the following formula (1):
fcband(K) = (10^(1/N))^(K+0.5)    formula (1);
where fcband(K) is the division frequency point (i.e., division boundary) between the K-th and the (K+1)-th frequency segments, and N represents the density of the whole band division, i.e., it determines the first division interval; N can be preset by a skilled person, and a larger value represents a finer division.
After the division frequency points are obtained, the frequency points of the preset frequency range may be grouped according to the division frequency points that fall within the preset frequency range, to obtain a plurality of frequency point sets. The center frequency point of the k-th set may be written as fc(k) = (10^(1/N))^k.
The second method: take the logarithm of each frequency point in the preset frequency range, for example, the base-10 logarithm, to obtain a logarithmic value for each frequency point; divide these logarithmic values at a fixed numerical interval (the first division interval) to obtain a plurality of division values; for each division value n, take 10 to the power n as a division frequency point (i.e., division boundary); and then divide the frequency points of the preset frequency range by these boundaries to obtain a plurality of frequency point sets.
After the plurality of frequency point sets are obtained, the amplitude gains of the frequency points within each set may be averaged to obtain the mean value corresponding to that set, and the mean values corresponding to the plurality of sets are then averaged to obtain the amplitude gain mean value.
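The second division method and the two-stage averaging can be sketched as follows. The frequency range [f_lo, f_hi], the log-domain step, and the function name are all illustrative values, not taken from the patent.

```python
import numpy as np

def reference_gain(freqs, gains, f_lo=200.0, f_hi=6000.0, step=0.1):
    """Divide [f_lo, f_hi] at a fixed interval `step` in the log10 domain,
    average the amplitude gains inside each resulting frequency point set,
    then average the per-set means (all parameter values are illustrative)."""
    log_edges = np.arange(np.log10(f_lo), np.log10(f_hi) + step, step)
    edges = 10.0 ** log_edges  # division boundaries back in Hz
    set_means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        if mask.any():  # skip sets that contain no measured frequency points
            set_means.append(gains[mask].mean())
    return float(np.mean(set_means))
```

Averaging per set before averaging across sets weights each logarithmic band equally, matching the logarithmic perception of frequency by the human ear mentioned above.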
In another possible scheme, in order to obtain a more accurate amplitude gain average value, fitting processing may be performed on the amplitude gain of each frequency point, where the fitting processing may be smoothing processing, and then the amplitude gain average value is determined according to the amplitude gain after the smoothing processing, and the corresponding processing may be as follows:
step 1021, based on the first division interval, equally dividing each frequency point in the preset frequency range in the logarithmic domain to obtain a plurality of first frequency point sets.
Step 1022, averaging the amplitude gains corresponding to the frequency points in each first frequency point set to obtain a first average value corresponding to each first frequency point set.
The processing in steps 1021 and 1022 is the same as the processing described above for obtaining the mean value corresponding to each frequency point set, and is not repeated here.
And 1023, fitting the first mean values to obtain fitting results corresponding to the first mean values.
In an implementation, after the plurality of first mean values are obtained, they may be subjected to fitting processing. For example, with frequency as the abscissa and the first mean value as the ordinate, a smooth curve through the first mean values is determined by cubic spline interpolation. Averaging within each set first and then fitting the first mean values reduces the influence on the fitting result of amplitude gains that carry large measurement errors at individual frequency points.
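The fitting step can be sketched as follows, with piecewise-linear interpolation (np.interp) standing in for the cubic spline interpolation mentioned above; the helper name is illustrative, and interpolation is done over log-frequency to match the logarithmic band division.

```python
import numpy as np

def fitted_gain_per_bin(all_freqs, center_freqs, first_means):
    # Interpolate the per-set first means back onto every frequency point.
    # Linear interpolation over log10(frequency) is a dependency-free
    # stand-in for the cubic spline fit described in the text.
    return np.interp(np.log10(all_freqs), np.log10(center_freqs), first_means)
```

At a set's center frequency the fitted curve passes through that set's first mean, and between centers it varies smoothly, which is the property the fitting step relies on.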
And 1024, determining the amplitude gain after fitting processing corresponding to each frequency point based on the fitting result corresponding to each first mean value, and dividing each frequency point in a preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets.
Wherein the second predetermined division density is greater than the first predetermined division density.
In implementation, after obtaining the smooth curve corresponding to the first average value, the amplitude gain corresponding to each frequency point on the smooth curve may be determined, and then dividing each frequency point of the preset frequency range in the logarithmic domain in the step 102 may be performed in the same processing procedure through two dividing methods, and each frequency point of the preset frequency range may be divided through two dividing methods, so as to obtain a plurality of second frequency point sets.
When the division is performed according to the first method, the value N in formula (1) may be set larger; the other processing steps are the same as in the first method and are not repeated here. When the division is performed according to the second method, the division interval may be set smaller than the first division interval; the other processing steps are the same as in the second method and are not repeated here.
Step 1025, averaging the amplitude gains corresponding to the frequency points in each second frequency point set to obtain a second mean value corresponding to each second frequency point set, and averaging the second mean values corresponding to the plurality of second frequency point sets to obtain the amplitude gain mean value.
In implementation, after the plurality of second frequency point sets are obtained, the amplitude gains corresponding to the frequency points in each second frequency point set may be averaged to obtain the second mean value corresponding to that set, and the second mean values corresponding to the plurality of second frequency point sets may then be averaged to obtain the amplitude gain mean value.
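The second-stage averaging of step 1025 may be sketched as follows. The finer group count and the fitted gain values are hypothetical; groups that happen to contain no bins are skipped so they do not distort the mean:

```python
import numpy as np

# Hypothetical bins in the preset range and their fitted amplitude gains
# (here a constant 0.8, standing in for the output of step 1024).
freqs = np.linspace(100.0, 10000.0, 512)
fitted_gains = np.full(512, 0.8)

# Second-stage division: finer than the first stage (smaller log-domain
# interval, i.e. greater division density).
n_groups = 64
edges = np.logspace(np.log10(freqs[0]), np.log10(freqs[-1]), n_groups + 1)
idx = np.clip(np.digitize(freqs, edges) - 1, 0, n_groups - 1)

# Second mean per non-empty set, then the mean of those second means
# gives the amplitude gain mean value (the reference amplitude gain).
second_means = np.array(
    [fitted_gains[idx == g].mean() for g in range(n_groups) if np.any(idx == g)]
)
reference_gain = second_means.mean()
assert np.isclose(reference_gain, 0.8)
```

Averaging per log-domain set before the final average weights each portion of the logarithmic frequency axis equally, rather than letting the densely sampled high frequencies dominate.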
Step 103, determining the gain adjustment proportion of the audio playing device at each frequency point based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain, and forming a gain adjustment proportion vector.
And for each frequency point, determining the ratio of the reference amplitude gain to the amplitude gain corresponding to each frequency point as the gain adjustment proportion corresponding to each frequency point.
In an implementation, after the amplitude gain of the audio playing device at each frequency point is determined, the ratio of the reference amplitude gain to the amplitude gain at each frequency point may be determined as the gain adjustment proportion for that frequency point, and the gain adjustment proportions of the frequency points then form a vector. Since the frequency points processed in steps 101 to 102 are the frequency points in Freq, they cover only half of the frequency points obtained by sampling. Therefore, after the gain adjustment proportions of these frequency points are obtained, a complete gain adjustment proportion vector can be constructed from them using the fact that a real-valued signal is conjugate-symmetric in the frequency domain, as follows:
compenAll = [compenHalf; flip(conj(compenAll'))];
where the function flip() reverses the order of the elements of a vector, the function conj() takes the complex conjugate, compenHalf (a name introduced here for readability) denotes the vector formed by the gain adjustment proportions of the frequency points, and compenAll' denotes that vector with its first and last elements removed. After compenAll is obtained, an inverse Fourier transform may be performed on compenAll to obtain the gain adjustment proportion vector compenT in the time domain, as follows:
compenT = circshift(ifft(compenAll, Nfft), Nfft/2);
where the function circshift() cyclically shifts the vector by Nfft/2 points, and the function ifft() denotes the inverse fast Fourier transform. The variable Nfft denotes the degree of subdivision of the frequency domain; the larger the value of Nfft, the more points the frequency domain contains.
In the actual calculation of the compensation vector, Nfft may take a larger value to obtain a more refined result. Optionally, after the gain adjustment proportion vector is obtained, it may be truncated: with the length of the time-domain impulse response set to L, windowed truncation is performed on the vector compenT, as follows:
compenTWin = hanning(L) .* compenT'';
where the function hanning(L) denotes a Hanning window of length L, the symbol .* denotes point-wise multiplication between vectors, and compenT'' denotes the L points of the vector compenT taken symmetrically around its center point (L/2 points on each side), finally forming a gain adjustment proportion vector of length L.
Optionally, a maximum adjustment proportion may be set for the frequency points of each frequency band, and the adjustment proportion vector is then obtained according to the maximum adjustment proportion corresponding to each band. The corresponding processing is as follows:
and 1031, determining the frequency section to which the frequency point belongs for each frequency point, and determining the maximum gain adjustment proportion corresponding to the frequency point according to the corresponding relation between the preset frequency interval and the maximum gain adjustment proportion.
Wherein the frequency bands include a low frequency band, a mid frequency band, and a high frequency band.
In implementation, the respective frequency points may be divided into a low frequency band, a middle frequency band, and a high frequency band. The frequency point of the intermediate frequency band may be each frequency point within the preset frequency range. Before determining the gain adjustment ratio corresponding to each frequency point, the frequency segment to which the corresponding frequency point belongs may be determined. And then determining the maximum gain adjustment proportion of the frequency point according to the corresponding relation between the preset frequency interval and the maximum gain adjustment proportion.
For example, the frequency range 0-fs/2 can be divided into 3 parts according to the frequency limits fLow and fHigh: [0, fLow], [fLow, fHigh], and [fHigh, fs/2], which are the low frequency band, the mid frequency band, and the high frequency band respectively; the corresponding maximum adjustment proportions are maxLow, maxMid, and maxHigh.
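The three-band division may be sketched as follows; fs, fLow, fHigh, and the FFT size are illustrative values, not values from the embodiment:

```python
import numpy as np

# Hypothetical sample rate and band limits.
fs = 48000
fLow, fHigh = 200.0, 8000.0

# Frequency of each bin from 0 to fs/2 for a 1024-point FFT.
freqs = np.fft.rfftfreq(1024, d=1.0 / fs)

# Masks for [0, fLow), [fLow, fHigh), and [fHigh, fs/2].
low_band = freqs < fLow
mid_band = (freqs >= fLow) & (freqs < fHigh)
high_band = freqs >= fHigh

# Every bin belongs to exactly one band.
assert np.all(low_band.astype(int) + mid_band + high_band == 1)
```

A per-band maximum such as maxLow, maxMid, or maxHigh can then be looked up for each bin through these masks.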
Step 1032, determining the gain adjustment proportion corresponding to the frequency point according to the magnitude relationship between the amplitude gain corresponding to the frequency point and the reference amplitude gain, which can be divided into two cases:
Case one: when the amplitude gain corresponding to the frequency point is greater than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is greater than or equal to the reciprocal of the maximum gain adjustment proportion, the ratio is determined as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, the reciprocal of the maximum gain adjustment proportion is determined as the gain adjustment proportion corresponding to the frequency point.
In implementation, when the amplitude gain corresponding to the frequency point is greater than the reference amplitude gain, the relationship between the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point and the reciprocal of the maximum gain adjustment proportion may be determined. If the ratio is greater than or equal to the reciprocal of the maximum gain adjustment proportion, the ratio is determined as the gain adjustment proportion. If the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, the reciprocal of the maximum gain adjustment proportion is determined as the gain adjustment proportion corresponding to the frequency point.
Taking the frequency band [fLow, fHigh] as an example, if HAbsNew(w) is greater than meanLowHigh, the amplitude response of the frequency point needs to be reduced. Here compenMid is the gain adjustment proportion for the mid frequency band, HAbsNew(w) is the amplitude gain corresponding to frequency point w, and meanLowHigh is the reference amplitude gain.
When meanLowHigh/HAbsNew(w) >= 1/maxMid, compenMid(w) = meanLowHigh/HAbsNew(w);
when meanLowHigh/HAbsNew(w) < 1/maxMid, compenMid(w) = 1/maxMid.
Case two: when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than the maximum gain adjustment proportion, the ratio is determined as the gain adjustment proportion corresponding to the frequency point; if the ratio is greater than or equal to the maximum gain adjustment proportion, the maximum gain adjustment proportion is determined as the gain adjustment proportion corresponding to the frequency point.
In implementation, when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, the relationship between the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point and the maximum gain adjustment proportion may be determined. If the ratio is smaller than the maximum gain adjustment proportion, the ratio is determined as the gain adjustment proportion corresponding to the frequency point; if the ratio is greater than or equal to the maximum gain adjustment proportion, the maximum gain adjustment proportion is determined as the gain adjustment proportion corresponding to the frequency point.
If HAbsNew(w) is smaller than meanLowHigh, the amplitude response of the frequency point needs to be enhanced (compensated).
When meanLowHigh/HAbsNew(w) < maxMid, compenMid(w) = meanLowHigh/HAbsNew(w);
when meanLowHigh/HAbsNew(w) >= maxMid, compenMid(w) = maxMid.
Similarly, the gain adjustment proportions compenLow and compenHigh in the frequency bands [0, fLow] and [fHigh, fs/2] can be calculated respectively. The compensation quantities of the three frequency bands are combined to form a gain adjustment proportion vector compenAll covering the whole frequency range, namely: compenAll = [compenLow, compenMid, compenHigh].
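Taken together, cases one and two amount to clipping the raw ratio meanLowHigh/HAbsNew(w) to the interval [1/maxMid, maxMid], which may be sketched as follows with hypothetical gains:

```python
import numpy as np

# Hypothetical values: reference gain, measured mid-band gains, band limit.
meanLowHigh = 1.0                                 # reference amplitude gain
HAbsNew = np.array([0.25, 0.8, 1.0, 2.0, 10.0])   # amplitude gain per bin
maxMid = 3.0                                      # max adjustment proportion

# Case one (HAbsNew > reference) bounds the ratio below by 1/maxMid;
# case two (HAbsNew < reference) bounds it above by maxMid.
compenMid = np.clip(meanLowHigh / HAbsNew, 1.0 / maxMid, maxMid)
# Bins needing more boost or cut than the band allows are held at the limit.
```

With these numbers the raw ratios 4.0 and 0.1 fall outside the allowed range and are limited to maxMid and 1/maxMid respectively, while the in-range ratios pass through unchanged.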
Step 104, when the target audio data to be played through the audio playing device is obtained, adjusting the target audio data based on the gain adjustment proportion vector to obtain adjusted target audio data, and playing the adjusted target audio data through the audio playing device.
In implementation, after obtaining the gain adjustment scale vector corresponding to the audio playing device, the gain adjustment scale vector may be set in the audio playing application program corresponding to the audio playing device. When the target audio data is played through the audio playing device, the target audio data may be obtained first, then the target audio data is preprocessed according to the gain adjustment proportion vector to obtain adjusted target audio data, and then the audio playing device may play the adjusted target audio data.
The processing of adjusting the target audio data according to the gain adjustment scale vector may be as follows:
the first method is as follows: converting the target audio data into a frequency domain to obtain frequency domain data corresponding to the target audio data; and multiplying the frequency domain data by the gain adjustment vector, and converting the multiplication result into a time domain to obtain the adjusted target audio data.
In implementation, when the gain adjustment scale vector is frequency domain data, the target audio data may be subjected to fourier transform to obtain frequency domain data corresponding to the target audio data, then the frequency domain data corresponding to the target audio data is multiplied by the gain adjustment scale vector to obtain a multiplication result, and then the multiplication result is converted into time domain data to obtain the adjusted target audio data.
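The first mode may be sketched as follows; the test tone and the per-bin gain vector are hypothetical, chosen so the effect of the adjustment is easy to verify:

```python
import numpy as np

# Hypothetical audio frame: a pure tone on FFT bin 5.
Nfft = 256
x = np.sin(2 * np.pi * 5 * np.arange(Nfft) / Nfft)

# Hypothetical frequency-domain gain adjustment vector (one value per
# bin from DC to Nyquist); attenuate the bin carrying the tone by half.
gain = np.ones(Nfft // 2 + 1)
gain[5] = 0.5

# Transform to the frequency domain, multiply point-wise by the gain
# vector, and transform back to the time domain.
X = np.fft.rfft(x, Nfft)
y = np.fft.irfft(X * gain, Nfft)

# The tone sits entirely on bin 5, so the output is the input scaled by 0.5.
assert np.allclose(y, 0.5 * x, atol=1e-9)
```

Because the tone occupies exactly one bin, halving that bin's gain halves the signal; for broadband audio each bin is scaled by its own adjustment proportion.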
The second method comprises the following steps: converting the gain adjustment vector into the time domain to obtain a converted gain adjustment vector; and performing convolution processing on the converted gain adjustment vector and the target audio data to obtain the adjusted target audio data.
In implementation, when the gain adjustment scale vector is frequency-domain data, the gain adjustment scale vector may be converted into a time-domain gain adjustment scale vector through fourier transform, and then the time-domain gain adjustment scale vector is convolved with the target audio data to obtain adjusted target audio data.
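The second mode may be sketched as follows; the flat gain spectrum is a hypothetical example chosen so that the result of the convolution is easy to check:

```python
import numpy as np

# Hypothetical frequency-domain gain adjustment vector: a flat 0.5 gain
# over all bins from DC to Nyquist.
Nfft = 64
gain_half = np.full(Nfft // 2 + 1, 0.5)

# Convert to a time-domain gain adjustment vector via the inverse FFT;
# a flat spectrum yields a scaled unit impulse.
compenT = np.fft.irfft(gain_half, Nfft)

# Convolve the time-domain vector with hypothetical audio samples and
# keep the first len(x) output samples.
x = np.ones(128)
y = np.convolve(x, compenT)[: len(x)]

# A flat 0.5 gain simply scales the signal by 0.5.
assert np.allclose(y[Nfft:], 0.5)
```

For a non-flat gain vector the impulse response is longer than one sample, and the convolution applies the per-frequency adjustment without an explicit FFT of the audio.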
According to the embodiment of the application, the corresponding reference amplitude gain is determined by detecting the amplitude gain of the audio playing device at each frequency, then the gain adjustment proportion of each frequency point is determined according to the reference amplitude gain, and the corresponding gain adjustment proportion vector is determined. Therefore, before the audio playing equipment plays the audio data, the audio data are adjusted according to the gain adjustment proportion vector, the amplitude of each frequency point corresponding to the audio data can be adjusted in advance according to the actual gain of each frequency point by the audio playing equipment, and the problem that the energy is increased or reduced when the audio playing equipment plays the audio signals of individual frequency points or frequency bands can be solved.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 2 is a device for playing audio according to an embodiment of the present application, where the device may be a terminal in the above embodiment, and the device includes:
an obtaining module 210, configured to obtain a frequency-amplitude response of an audio playing device when playing detected audio data, and determine, based on the frequency-amplitude response, an amplitude gain of the audio playing device at each frequency point;
a determining module 220, configured to determine a mean value of the amplitude gains of the frequency points in the preset frequency range, and determine the mean value of the amplitude gains as a reference amplitude gain corresponding to the audio playing device; determining the gain adjustment proportion of the audio playing equipment at each frequency point based on the amplitude gain of the audio playing equipment at each frequency point and the reference amplitude gain to form a gain adjustment proportion vector;
the adjusting module 230 is configured to, when target audio data to be played by the audio playing device is acquired, adjust the target audio data based on the gain adjustment proportion vector to obtain adjusted target audio data, and play the adjusted target audio data by the audio playing device.
Optionally, the determining module 220 is configured to:
dividing each frequency point of the preset frequency range in a logarithmic domain based on a first division interval to obtain a plurality of frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each frequency point set to obtain a mean value corresponding to each frequency point set;
and averaging the average values corresponding to the plurality of frequency point sets to obtain the amplitude gain average value.
Optionally, the determining module 220 is configured to:
on the basis of a first division interval, performing equal interval division on each frequency point in the preset frequency range in a logarithmic domain to obtain a plurality of first frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each first frequency point set to obtain a first average value corresponding to each first frequency point set;
fitting the first mean values to obtain fitting results corresponding to the first mean values;
determining the amplitude gain corresponding to each frequency point after fitting processing based on the fitting result corresponding to each first mean value;
dividing each frequency point of the preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets, wherein the second preset division density is greater than the first preset division density;
averaging the amplitude gain corresponding to each frequency point in each second frequency point set to obtain a second average value corresponding to each second frequency point set;
and averaging second average values corresponding to the multiple frequency point sets to obtain the amplitude gain average value.
Optionally, the determining module 220 is configured to:
and for each frequency point, determining the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point as the gain adjustment proportion corresponding to the frequency point.
Optionally, the determining module 220 is configured to:
for each frequency point, determining a frequency section to which the frequency point belongs, and determining a maximum gain adjustment proportion corresponding to the frequency point according to a corresponding relation between a preset frequency interval and the maximum gain adjustment proportion, wherein the frequency section comprises a low-frequency section, a medium-frequency section and a high-frequency section;
when the amplitude gain corresponding to the frequency point is greater than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is greater than or equal to the reciprocal of the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, determining the reciprocal of the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point;
when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; and if the ratio is greater than or equal to the maximum gain adjustment proportion, determining the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point.
Optionally, the adjusting module 230 is configured to:
converting the target audio data into a frequency domain to obtain frequency domain data corresponding to the target audio data;
multiplying the frequency domain data by the gain adjustment vector, and converting the multiplication result into the time domain to obtain adjusted target audio data; or,
converting the gain adjustment vector into a time domain to obtain a converted gain adjustment vector;
and performing convolution processing on the converted gain adjustment vector and the target audio data to obtain adjusted target audio data.
It should be noted that: in the device for playing audio provided in the foregoing embodiment, when playing audio, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for playing audio and the method for playing audio provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and will not be described herein again.
Fig. 3 shows a block diagram of a terminal 300 according to an exemplary embodiment of the present application. The terminal 300 may be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 300 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and the like.
Generally, the terminal 300 includes: a processor 301 and a memory 302.
The processor 301 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 301 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 301 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 301 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 301 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 302 may include one or more computer-readable storage media, which may be non-transitory. Memory 302 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 302 is used to store at least one instruction for execution by processor 301 to implement a method of playing audio as provided by method embodiments herein.
In some embodiments, the terminal 300 may further include: a peripheral interface 303 and at least one peripheral. The processor 301, memory 302 and peripheral interface 303 may be connected by a bus or signal lines. Each peripheral may be connected to the peripheral interface 303 by a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 304, touch display screen 305, camera 306, audio circuitry 307, positioning components 308, and power supply 309.
The peripheral interface 303 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 301 and the memory 302. In some embodiments, processor 301, memory 302, and peripheral interface 303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 301, the memory 302 and the peripheral interface 303 may be implemented on a separate chip or circuit board, which is not limited by the embodiment.
The Radio Frequency circuit 304 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 304 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 304 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 304 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 304 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 304 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 305 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 305 is a touch display screen, the display screen 305 also has the ability to capture touch signals on or over the surface of the display screen 305. The touch signal may be input to the processor 301 as a control signal for processing. At this point, the display screen 305 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 305 may be one, providing the front panel of the terminal 300; in other embodiments, the display screens 305 may be at least two, respectively disposed on different surfaces of the terminal 300 or in a folded design; in still other embodiments, the display 305 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 300. Even further, the display screen 305 may be arranged in a non-rectangular irregular figure, i.e. a shaped screen. The Display screen 305 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 306 is used to capture images or video. Optionally, camera assembly 306 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 306 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuitry 307 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 301 for processing or inputting the electric signals to the radio frequency circuit 304 to realize voice communication. The microphones may be provided in plural numbers, respectively, at different portions of the terminal 300 for the purpose of stereo sound collection or noise reduction. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 301 or the radio frequency circuitry 304 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 307 may also include a headphone jack.
The positioning component 308 is used to locate the current geographic location of the terminal 300 to implement navigation or LBS (Location Based Service). The positioning component 308 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 309 is used to supply power to the various components in the terminal 300. The power source 309 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 309 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 300 also includes one or more sensors 310. The one or more sensors 310 include, but are not limited to: acceleration sensor 311, gyro sensor 312, pressure sensor 313, fingerprint sensor 314, optical sensor 315, and proximity sensor 316.
The acceleration sensor 311 may detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 300. For example, the acceleration sensor 311 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 301 may control the touch display screen 305 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 311. The acceleration sensor 311 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 312 may detect a body direction and a rotation angle of the terminal 300, and the gyro sensor 312 may cooperate with the acceleration sensor 311 to acquire a 3D motion of the user on the terminal 300. The processor 301 may implement the following functions according to the data collected by the gyro sensor 312: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 313 may be disposed on a side bezel of the terminal 300 and/or an underlying layer of the touch display screen 305. When the pressure sensor 313 is disposed on the side frame of the terminal 300, the holding signal of the user to the terminal 300 can be detected, and the processor 301 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 313. When the pressure sensor 313 is disposed at the lower layer of the touch display screen 305, the processor 301 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 305. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 314 is used for collecting a fingerprint of the user, and the processor 301 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 314, or the fingerprint sensor 314 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, processor 301 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 314 may be disposed on the front, back, or side of the terminal 300. When a physical button or a vendor Logo is provided on the terminal 300, the fingerprint sensor 314 may be integrated with the physical button or the vendor Logo.
The optical sensor 315 is used to collect the ambient light intensity. In one embodiment, the processor 301 may control the display brightness of the touch screen display 305 based on the ambient light intensity collected by the optical sensor 315. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 305 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 305 is turned down. In another embodiment, the processor 301 may also dynamically adjust the shooting parameters of the camera head assembly 306 according to the ambient light intensity collected by the optical sensor 315.
A proximity sensor 316, also known as a distance sensor, is typically provided on the front panel of the terminal 300. The proximity sensor 316 is used to collect the distance between the user and the front surface of the terminal 300. In one embodiment, when the proximity sensor 316 detects that the distance between the user and the front surface of the terminal 300 gradually decreases, the processor 301 controls the touch display screen 305 to switch from the bright-screen state to the off-screen state; when the proximity sensor 316 detects that the distance between the user and the front surface of the terminal 300 gradually increases, the processor 301 controls the touch display screen 305 to switch from the off-screen state to the bright-screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 3 does not limit the terminal 300: the terminal may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including instructions executable by a processor in a terminal to perform the method of playing audio in the above embodiments. The computer-readable storage medium may be non-transitory; for example, it may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (9)

1. A method of playing audio, the method comprising:
acquiring frequency amplitude response of audio playing equipment when playing detected audio data, and determining amplitude gain of the audio playing equipment at each frequency point based on the frequency amplitude response;
determining the average value of the amplitude gain of each frequency point in a preset frequency range, and determining the average value of the amplitude gain as a reference amplitude gain corresponding to the audio playing equipment;
determining the gain adjustment proportion of the audio playing equipment at each frequency point based on the amplitude gain of the audio playing equipment at each frequency point and the reference amplitude gain to form a gain adjustment proportion vector;
when target audio data to be played through the audio playing equipment are acquired, adjusting the target audio data based on the gain adjustment proportion vector to obtain adjusted target audio data, and playing the adjusted target audio data through the audio playing equipment;
the adjusting the target audio data based on the gain adjustment proportion vector to obtain adjusted target audio data includes:
converting the target audio data into a frequency domain to obtain frequency domain data corresponding to the target audio data;
multiplying the frequency domain data by the gain adjustment proportion vector, and converting the multiplication result into a time domain to obtain adjusted target audio data; or, alternatively,
converting the gain adjustment proportion vector into a time domain to obtain a converted gain adjustment proportion vector;
and performing convolution processing on the converted gain adjustment proportion vector and the target audio data to obtain adjusted target audio data.
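As a non-authoritative illustration of claim 1's frequency-domain route, the sketch below computes the reference amplitude gain as the plain mean of the per-frequency-point gains, forms the gain adjustment proportion vector as reference/gain, and applies it by multiplying the rFFT spectrum of the audio and transforming back. The function names and the choice of a real FFT with `len(audio)//2 + 1` bins are assumptions for the sketch, not details from the patent.

```python
import numpy as np

def gain_adjustment_vector(amplitude_gains):
    # Reference amplitude gain: mean of the gains over all frequency points.
    reference = np.mean(amplitude_gains)
    # Per-frequency-point gain adjustment proportion: reference / measured gain.
    return reference / amplitude_gains

def equalize(audio, ratios):
    # Frequency-domain route of claim 1: convert to the frequency domain,
    # multiply by the gain adjustment proportion vector, convert back.
    # len(ratios) must equal len(audio) // 2 + 1 (number of rFFT bins).
    spectrum = np.fft.rfft(audio)
    return np.fft.irfft(spectrum * ratios, n=len(audio))
```

With a flat measured response (all gains equal), every ratio is 1 and the audio passes through unchanged, which is a quick sanity check for the wiring.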
2. The method of claim 1, wherein the determining the average of the amplitude gain at each frequency point in the preset frequency range comprises:
dividing each frequency point of the preset frequency range in a logarithmic domain based on a first division interval to obtain a plurality of frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each frequency point set to obtain a mean value corresponding to each frequency point set;
and averaging the average values corresponding to the plurality of frequency point sets to obtain the amplitude gain average value.
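The three steps of claim 2 can be sketched as follows. This is only an illustrative reading, assuming the "first division interval" is a fixed step in the base-10 logarithm of frequency (the 0.1 default here is hypothetical): bin the frequency points at equal log-domain intervals, average the gains within each frequency point set, then average the per-set means.

```python
import numpy as np

def log_binned_mean(freqs, gains, f_lo, f_hi, step_log10=0.1):
    # Bin edges equally spaced in log10(frequency); step_log10 plays the
    # role of the "first division interval" of claim 2.
    edges = 10 ** np.arange(np.log10(f_lo), np.log10(f_hi) + step_log10, step_log10)
    set_means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        if mask.any():
            # Mean of the amplitude gains within one frequency point set.
            set_means.append(gains[mask].mean())
    # Mean of the per-set means: the amplitude gain average value.
    return float(np.mean(set_means))
```

Because the bins are equal in the log domain, a gain peak confined to a narrow high-frequency region does not dominate the reference the way it would with a plain linear-frequency mean.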
3. The method of claim 1, wherein the determining the average of the amplitude gain at each frequency point in the preset frequency range comprises:
on the basis of a first division interval, performing equal interval division on each frequency point in the preset frequency range in a logarithmic domain to obtain a plurality of first frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each first frequency point set to obtain a first average value corresponding to each first frequency point set;
fitting the first mean values to obtain fitting results corresponding to the first mean values;
determining the amplitude gain corresponding to each frequency point after fitting processing based on the fitting result corresponding to each first mean value;
dividing each frequency point of the preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets, wherein the second division interval is greater than the first division interval;
averaging the amplitude gain corresponding to each frequency point in each second frequency point set to obtain a second average value corresponding to each second frequency point set;
and averaging second average values corresponding to the plurality of second frequency point sets to obtain the amplitude gain average value.
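A possible reading of claim 3 is sketched below: fine log-domain bins produce first means, a smooth curve is fitted through them, the fitted curve supplies a gain at every original frequency point, and a second, coarser log-domain binning of those fitted gains yields the final average. The polynomial fit in log-frequency and all step sizes and the degree are assumptions of this sketch; the patent does not commit to a particular fitting method.

```python
import numpy as np

def smoothed_reference_gain(freqs, gains, first_step=0.05, second_step=0.2, deg=3):
    logf = np.log10(freqs)
    # First pass: fine bins ("first division interval") in the log domain.
    e1 = np.arange(logf.min(), logf.max() + first_step, first_step)
    centers, means = [], []
    for lo, hi in zip(e1[:-1], e1[1:]):
        m = (logf >= lo) & (logf < hi)
        if m.any():
            centers.append((lo + hi) / 2)
            means.append(gains[m].mean())          # first mean per set
    # Fit the first means (here: a polynomial in log-frequency) and
    # evaluate a fitted amplitude gain at every original frequency point.
    coeffs = np.polyfit(centers, means, deg)
    fitted = np.polyval(coeffs, logf)
    # Second pass: coarser bins ("second division interval" > first).
    e2 = np.arange(logf.min(), logf.max() + second_step, second_step)
    second_means = [fitted[(logf >= lo) & (logf < hi)].mean()
                    for lo, hi in zip(e2[:-1], e2[1:])
                    if ((logf >= lo) & (logf < hi)).any()]
    return float(np.mean(second_means))
```

The fitting step suppresses narrow measurement ripples before the coarser averaging, so the reference gain tracks the broad trend of the response rather than individual notches.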
4. The method of claim 1, wherein the determining the gain adjustment ratio of the audio playing device at each frequency point based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain comprises:
and for each frequency point, determining the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point as the gain adjustment proportion corresponding to the frequency point.
5. The method of claim 1, wherein the determining the gain adjustment ratio of the audio playing device at each frequency point based on the amplitude gain of the audio playing device at each frequency point and the reference amplitude gain comprises:
for each frequency point, determining a frequency section to which the frequency point belongs, and determining a maximum gain adjustment proportion corresponding to the frequency point according to a corresponding relation between a preset frequency interval and the maximum gain adjustment proportion, wherein the frequency section comprises a low-frequency section, a medium-frequency section and a high-frequency section;
when the amplitude gain corresponding to the frequency point is larger than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is larger than or equal to the reciprocal of the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, determining the reciprocal of the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point;
when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; and if the ratio is larger than the maximum gain adjustment proportion, determining the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point.
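Claim 5's band-limited clamping can be sketched as below. The band edges (250 Hz, 4 kHz) and the per-band maximum proportions in `BAND_MAX` are hypothetical placeholders: the claim only requires that a preset correspondence from low/mid/high frequency sections to maximum gain adjustment proportions exists, and that the ratio reference/gain is clamped to [1/max, max] accordingly.

```python
# Hypothetical per-band limits on the gain adjustment proportion for the
# low, medium, and high frequency sections of claim 5.
BAND_MAX = {"low": 2.0, "mid": 4.0, "high": 2.0}

def band_of(freq_hz):
    # Assumed band edges; the patent leaves the exact split to configuration.
    if freq_hz < 250:
        return "low"
    if freq_hz < 4000:
        return "mid"
    return "high"

def clamped_ratio(freq_hz, gain, reference):
    max_ratio = BAND_MAX[band_of(freq_hz)]
    ratio = reference / gain
    if gain > reference:
        # Attenuation side: ratio < 1, so clamp from below at 1 / max_ratio.
        return max(ratio, 1.0 / max_ratio)
    # Boost side: ratio >= 1, so clamp from above at max_ratio.
    return min(ratio, max_ratio)
```

Capping the boost per band keeps a deep notch in the measured response from demanding an extreme (and likely audible or clipping-prone) gain at a single frequency point.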
6. An apparatus for playing audio, the apparatus comprising:
the acquisition module is used for acquiring frequency amplitude response of the audio playing equipment when playing the detected audio data and determining amplitude gain of the audio playing equipment at each frequency point based on the frequency amplitude response;
the determining module is used for determining the average value of the amplitude gain of each frequency point in a preset frequency range, and determining the average value of the amplitude gain as a reference amplitude gain corresponding to the audio playing equipment; determining the gain adjustment proportion of the audio playing equipment at each frequency point based on the amplitude gain of the audio playing equipment at each frequency point and the reference amplitude gain to form a gain adjustment proportion vector;
and the adjusting module is used for adjusting the target audio data based on the gain adjustment proportion vector when the target audio data to be played through the audio playing equipment is obtained, obtaining the adjusted target audio data, and playing the adjusted target audio data through the audio playing equipment.
7. The apparatus of claim 6, wherein the determining module is configured to:
on the basis of a first division interval, performing equal interval division on each frequency point in the preset frequency range in a logarithmic domain to obtain a plurality of first frequency point sets;
averaging the amplitude gain corresponding to each frequency point in each first frequency point set to obtain a first average value corresponding to each first frequency point set;
fitting the first mean values to obtain fitting results corresponding to the first mean values;
determining the amplitude gain corresponding to each frequency point after fitting processing based on the fitting result corresponding to each first mean value;
dividing each frequency point of the preset frequency range in a logarithmic domain based on a second division interval to obtain a plurality of second frequency point sets, wherein the second division interval is greater than the first division interval;
averaging the amplitude gain corresponding to each frequency point in each second frequency point set to obtain a second average value corresponding to each second frequency point set;
and averaging second average values corresponding to the plurality of second frequency point sets to obtain the amplitude gain average value.
8. The apparatus of claim 6, wherein the determining module is configured to:
for each frequency point, determining a frequency section to which the frequency point belongs, and determining a maximum gain adjustment proportion corresponding to the frequency point according to a corresponding relation between a preset frequency interval and the maximum gain adjustment proportion, wherein the frequency section comprises a low-frequency section, a medium-frequency section and a high-frequency section;
when the amplitude gain corresponding to the frequency point is larger than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is larger than or equal to the reciprocal of the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; if the ratio is smaller than the reciprocal of the maximum gain adjustment proportion, determining the reciprocal of the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point;
when the amplitude gain corresponding to the frequency point is smaller than the reference amplitude gain, if the ratio of the reference amplitude gain to the amplitude gain corresponding to the frequency point is smaller than the maximum gain adjustment proportion, determining the ratio as the gain adjustment proportion corresponding to the frequency point; and if the ratio is larger than the maximum gain adjustment proportion, determining the maximum gain adjustment proportion as the gain adjustment proportion corresponding to the frequency point.
9. A computer device comprising a processor and a memory, wherein at least one instruction is stored in the memory, and wherein the at least one instruction is loaded and executed by the processor to perform operations performed by the method of playing audio of any one of claims 1 to 5.
CN202011011871.5A 2020-09-23 2020-09-23 Method, device and equipment for playing audio Active CN112133332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011011871.5A CN112133332B (en) 2020-09-23 2020-09-23 Method, device and equipment for playing audio


Publications (2)

Publication Number Publication Date
CN112133332A CN112133332A (en) 2020-12-25
CN112133332B true CN112133332B (en) 2022-04-12

Family

ID=73839168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011011871.5A Active CN112133332B (en) 2020-09-23 2020-09-23 Method, device and equipment for playing audio

Country Status (1)

Country Link
CN (1) CN112133332B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113784157B (en) * 2021-08-31 2023-11-03 江西创成微电子有限公司 Live broadcast audio signal processing method and sound card
CN113938807A (en) * 2021-10-26 2022-01-14 深圳市科奈信科技有限公司 Earphone balance degree adjusting method and device

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6687068B2 (en) * 2001-03-27 2004-02-03 International Business Machines Corporation Data reproduction apparatus and method
CN101651872A (en) * 2008-08-15 2010-02-17 深圳富泰宏精密工业有限公司 Multipurpose radio communication device and audio regulation method used by same
CN101399523B (en) * 2008-09-05 2010-08-18 宇龙计算机通信科技(深圳)有限公司 Gain control method and system for playing audio signal
CN103780214A (en) * 2012-10-24 2014-05-07 华为终端有限公司 Method and device for adjusting audio equalizer
CN103700386B (en) * 2013-12-16 2017-09-29 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN104053120B (en) * 2014-06-13 2016-03-02 福建星网视易信息系统有限公司 A kind of processing method of stereo audio and device
CN104505106B (en) * 2014-11-20 2018-06-19 深圳市金立通信设备有限公司 A kind of terminal
EP3369176A4 (en) * 2015-10-28 2019-10-16 DTS, Inc. Spectral correction of audio signals
CN106713794B (en) * 2016-11-29 2019-12-24 青岛海信电器股份有限公司 Method for adjusting audio balance and audio system for providing balance adjustment
CN109658952B (en) * 2018-12-13 2020-10-09 歌尔科技有限公司 Audio signal processing method, device and storage medium
CN110491366B (en) * 2019-07-02 2021-11-09 招联消费金融有限公司 Audio smoothing method and device, computer equipment and storage medium
CN111599369A (en) * 2020-03-26 2020-08-28 普联技术有限公司 Audio data generation method and device
CN111525902B (en) * 2020-06-02 2021-05-11 北京快鱼电子股份公司 Audio amplitude limiting method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant