CN114546325A - Audio processing method, electronic device and readable storage medium - Google Patents


Info

Publication number
CN114546325A
CN114546325A (application CN202011331956.1A; granted publication CN114546325B)
Authority
CN
China
Prior art keywords
sound effect
user
audio
sound
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011331956.1A
Other languages
Chinese (zh)
Other versions
CN114546325B (en)
Inventor
苏霞
林宇轩
陈翼翼
张晓玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202011331956.1A priority Critical patent/CN114546325B/en
Priority claimed from CN202011331956.1A external-priority patent/CN114546325B/en
Priority to PCT/CN2021/131621 priority patent/WO2022111381A1/en
Publication of CN114546325A publication Critical patent/CN114546325A/en
Application granted granted Critical
Publication of CN114546325B publication Critical patent/CN114546325B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F3/16 Sound input; Sound output (G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements)
    • G06N3/02 Neural networks (G06N3/00 Computing arrangements based on biological models)
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods
    • G11B20/10 Digital recording or reproducing (G11B20/00 Signal processing not specific to the method of recording or reproducing; circuits therefor)
    • G11B20/10009 Improvement or modification of read or write signals
    • G11B20/10018 Analog processing for digital recording or reproduction

Abstract

An embodiment of the application provides an audio processing method, an electronic device, and a readable storage medium. The method includes: receiving an audio playback request input by a user, where the audio playback request is used to request that audio be played; acquiring sound effect parameters according to the user's sound effect setting information or historical audio playback information; and playing the audio with the acquired sound effect parameters. With this method, the terminal device can play audio with different sound effects, which meets the user's demand for diverse audio sound effects and improves the user experience.

Description

Audio processing method, electronic device and readable storage medium
Technical Field
Embodiments of the present disclosure relate to audio processing technologies, and in particular, to an audio processing method, an electronic device, and a readable storage medium.
Background
With the increasing intelligence of terminal devices, users can use a terminal device as a learning machine, a game machine, an audio/video player, and so on. When the terminal device plays audio, the sound effect of the audio is the style of the audio as perceived by the user.
At present, a terminal device plays audio with a single fixed sound effect, so the only sound effect the user perceives is that fixed one. This cannot meet the user's demand for diverse audio sound effects.
Disclosure of Invention
Embodiments of the present application provide an audio processing method, an electronic device, and a readable storage medium, which can play audio with different sound effects and thereby improve the user experience.
In a first aspect, the subject executing the audio processing method may be a terminal device or a chip in the terminal device; the following description takes the terminal device as the executing subject. In the audio processing method, when receiving an audio playback request input by a user, the terminal device acquires sound effect parameters according to sound effect setting information or information about the audio the user has played historically, where the audio playback request is used to request that audio be played. On the one hand, the user can set a sound effect on the terminal device, and the terminal device can store the setting information of that sound effect. It should be understood that the setting information of the sound effect may include the sound effect set by the user, the setting time, and similar information. On the other hand, the information about the audio the user has played historically may include the audio itself or the sound effect tags corresponding to that audio, where a sound effect tag is used to characterize a sound effect.
In the embodiment of the application, the terminal device can acquire the sound effect parameters for playback according to the setting information of the sound effect or the information about the user's historically played audio, and then play the audio with those sound effect parameters. The sound effect parameters may include at least one of the following: dynamic range control (DRC) parameters, equalizer (EQ) parameters, and active noise cancellation (ANC) parameters.
In the embodiment of the application, on the one hand, the user can set different sound effects, so the terminal device can acquire the sound effect parameters corresponding to those different sound effects. On the other hand, the audio the user has played historically may correspond to different sound effects, so the terminal device can obtain the corresponding sound effect parameters from the information about the user's historically played audio. Combining the two, the terminal device can play audio with the sound effect parameters corresponding to different sound effects, and thereby play audio with different sound effects, which meets the user's demand for diverse audio sound effects and improves the user experience. In addition, in the embodiment of the application, the terminal device can play audio with the sound effect the user prefers by taking into account the user's sound effect settings or the user's historical audio playback information, which further improves the user experience.
In a possible implementation manner, before the terminal device acquires the sound effect parameters according to the setting information of the sound effect or the information about the user's historically played audio, it may also determine, according to the setting information of the sound effect, whether the user has set a sound effect.
In one implementation, because the setting information of the sound effect can include the sound effect set by the user and the setting time, the terminal device can determine whether the user has set a sound effect according to the current time and the setting times of the sound effects. If the setting time closest to the current time does not correspond to a sound effect set by the user, it is determined that the user has not set a sound effect.
In another implementation, the setting information of the sound effect includes the sound effect most recently set by the user. If the most recently set sound effect is "none", it is determined that the user has not set a sound effect; if the most recently set sound effect is any one of the preset sound effects, it is determined that the user has set a sound effect. It should be understood that in this implementation the preset sound effects may be a super bass sound effect, a clear human voice sound effect, a warm and soft sound effect, and a clear melody sound effect. The sound effect set by the user may be any one of the preset sound effects, or a sound effect customized by the user.
In another implementation, when the user sets a sound effect, the user may also specify at least one application program to which the sound effect applies; this at least one application program is the application program associated with the sound effect (referred to as the at least one first application program). The terminal device may determine the application program through which the user requests to play audio; if the at least one first application program includes that application program, it is determined that the user has set a sound effect, and if not, it is determined that the user has not set a sound effect.
If it is determined that the user has set a sound effect, the terminal device acquires, according to the setting information of the sound effect, the sound effect parameters corresponding to the set sound effect; if it is determined that the user has not set a sound effect, the terminal device obtains, according to the information about the audio the user has played historically within a preset time period, the sound effect parameters corresponding to the user's preferred sound effect. In one possible implementation, the set sound effect may be used as the user's preferred sound effect. To distinguish the preferred sound effect obtained from the setting information of the sound effect from the preferred sound effect obtained from the information about the user's historically played audio within the preset time period, the former is described herein as the set sound effect.
In the embodiment of the application, the terminal device can determine, according to the setting information of the sound effect, whether the user has set a sound effect, and then acquire the corresponding sound effect parameters in different ways, which better fits the user's needs and improves the user experience. For example, if the user has set a sound effect, the set sound effect is the sound effect the user wants, so the sound effect parameters corresponding to the set sound effect are obtained from the setting information of the sound effect. If the user has not set a sound effect, the user has no special requirement for the sound effect; in this case the terminal device can predict the sound effect parameters corresponding to the user's preferred sound effect according to the information about the audio the user has played historically within the preset time period, which can also improve the user experience.
It should be noted that in the embodiment of the application, the user's preferred sound effect can be obtained from the information about the audio the user has played historically within a preset time period, so the sound effect can be adjusted at any time as the user's preference changes, which is more intelligent.
In a possible implementation manner, the terminal device may set a storage duration for the setting information of the sound effect, where the storage duration is a period of time starting from the moment the user sets the sound effect. While the setting information of the sound effect is within the storage duration, the terminal device can acquire the sound effect parameters corresponding to the set sound effect according to the setting information of the sound effect. Once the setting information of the sound effect exceeds the storage duration, the terminal device can acquire the user's preferred sound effect, or the sound effect parameters corresponding to the preferred sound effect, according to the information about the audio the user has played historically within a preset time period. It should be understood that the storage duration may be predefined or set by the user when setting the sound effect.
In the embodiment of the application, by applying a storage duration to the sound effect set by the user, the terminal device can adjust the sound effect parameters in time when the user's preferred sound effect changes, and then play audio with the sound effect parameters corresponding to the preferred sound effect. This is more intelligent, better fits the user's needs, and can improve the user experience. A simplified sketch of this decision logic is given below.
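The following is a minimal sketch of the decision logic described above, assuming the setting information and a history-based predictor are available. All names (SoundEffectSetting, predict_preferred_effect, etc.) are illustrative assumptions, not identifiers from the patent.

```python
# Minimal sketch of the decision logic described above. All names are
# illustrative assumptions, not identifiers from the patent.
import time
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class SoundEffectSetting:
    effect: Optional[str]          # e.g. "super bass", or None if cancelled
    set_at: float                  # timestamp at which the user set the effect
    storage_duration: float        # seconds for which the setting stays valid
    associated_apps: tuple = ()    # first application(s) the effect applies to

def choose_effect_params(setting: Optional[SoundEffectSetting],
                         requesting_app: str,
                         history_audio: Sequence,
                         param_set: dict,
                         predict_preferred_effect: Callable) -> dict:
    """Return the sound effect parameters to use for this playback request."""
    now = time.time()
    effect_is_set = (
        setting is not None
        and setting.effect is not None
        and (not setting.associated_apps or requesting_app in setting.associated_apps)
        and now - setting.set_at <= setting.storage_duration   # within storage duration
    )
    if effect_is_set:
        # Use the parameters of the sound effect the user explicitly set.
        return param_set[setting.effect]
    # Otherwise predict the preferred effect from recently played audio.
    preferred = predict_preferred_effect(history_audio)
    return param_set[preferred]
```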
The following describes the process by which the terminal device acquires the sound effect parameters, using several possible implementation manners:
First manner: the terminal device acquires the sound effect parameters according to the setting information of the sound effect. The setting information of the sound effect includes the set sound effect, and the terminal device can acquire the sound effect parameters corresponding to the set sound effect according to a sound effect parameter set and the set sound effect. The sound effect parameter set includes the sound effect parameters corresponding to each sound effect; in the embodiment of the application, the terminal device can take the sound effect parameters in the sound effect parameter set whose sound effect is the same as the set sound effect as the sound effect parameters corresponding to the set sound effect.
Second manner: the terminal device acquires the sound effect parameters according to the information about the audio the user has played historically. Here that information is the audio played historically within a preset time period. In the embodiment of the application, the terminal device can input the user's historically played audio into a sound effect prediction model to obtain the user's preferred sound effect, and then obtain the sound effect parameters corresponding to the preferred sound effect according to the sound effect parameter set and the user's preferred sound effect. The terminal device can take the sound effect parameters in the sound effect parameter set whose sound effect is the same as the preferred sound effect as the sound effect parameters corresponding to the preferred sound effect.
Third manner: the terminal device acquires the sound effect parameters according to the information about the audio the user has played historically. Here that information consists of the sound effect tags of the audio played historically within a preset time period, where a sound effect tag is used to characterize a sound effect. It should be understood that the terminal device can collect the user's historically played audio, input it into a sound effect recognition model, and obtain the sound effect tags of that audio. In the embodiment of the application, the terminal device can take the sound effect corresponding to the most numerous sound effect tags as the user's preferred sound effect, and then obtain the sound effect parameters corresponding to the preferred sound effect according to the sound effect parameter set and the user's preferred sound effect. For the manner in which the terminal device acquires the sound effect parameters according to the sound effect parameter set and the user's preferred sound effect, refer to the related description in the second manner.
Fourth manner: the terminal device acquires the sound effect parameters according to the information about the audio the user has played historically. Here that information is the audio played historically within a preset time period. In the embodiment of the application, the terminal device can input the user's historically played audio into a sound effect parameter prediction model to obtain the sound effect parameters corresponding to the user's preferred sound effect. Compared with the first three manners, the terminal device can acquire the sound effect parameters directly, without first obtaining the user's preferred sound effect, which can improve the audio processing rate.
After the terminal device acquires the sound effect parameters in any of the above manners, it can modify the current sound effect parameters to the acquired sound effect parameters, or select the acquired sound effect parameters from multiple preset groups of sound effect parameters, where each group of preset sound effect parameters corresponds to one sound effect. The terminal device can then play audio with the sound effect parameters. A minimal sketch of the third manner is given below.
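A minimal sketch of the third manner, assuming the sound effect recognition model is already available as a callable; the model interface and the structure of the parameter set are illustrative assumptions.

```python
# Minimal sketch of the third manner: tag the history, take the most
# frequent effect, and look up its parameters in the parameter set.
from collections import Counter

def preferred_effect_from_history(history_audio, recognize_effect):
    """Tag each historically played clip and return the most frequent effect."""
    tags = [recognize_effect(clip) for clip in history_audio]
    if not tags:
        return None
    # The effect whose tag occurs most often is taken as the preferred effect.
    return Counter(tags).most_common(1)[0][0]

def params_for_preferred_effect(history_audio, recognize_effect, effect_param_set):
    """Look up DRC/EQ/ANC parameters for the user's preferred sound effect."""
    preferred = preferred_effect_from_history(history_audio, recognize_effect)
    return effect_param_set.get(preferred)   # e.g. {"drc": [...], "eq": [...], "anc": [...]}
```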
In the first to third manners, the sound effect parameter set used by the terminal device is preset in the terminal device. The following describes the process of obtaining the sound effect parameter set, taking a server as the executing subject:
First manner: the server acquires standard audio of a first sound effect and a first frequency response of that standard audio, where the first sound effect is each sound effect in turn. The standard audio of the first sound effect can be used as the basis for identifying whether other audio has the first sound effect. The server can use a Fourier transform in a simulation tool to convert the wav file of the standard audio of the first sound effect into a frequency response curve, thereby obtaining the first frequency response of the standard audio of the first sound effect. The server can use the simulation tool to simulate the DRC module, the EQ module, and the ANC module in the terminal device so as to generate DRC parameters, EQ parameters, and ANC parameters for different sound effect parameters. The server can continuously adjust the sound effect parameters in the simulation tool and process the standard audio of the first sound effect with the adjusted sound effect parameters to obtain a second frequency response of the standard audio of the first sound effect. The server obtains the difference between the first frequency response and each second frequency response, and then takes the sound effect parameters corresponding to a second frequency response whose difference from the first frequency response is smaller than a preset difference as the sound effect parameters of the first sound effect, thereby obtaining the sound effect parameter set. A sketch of this frequency response comparison is given below.
Second manner: the server can randomly generate multiple groups of sound effect parameters and input them into a sound effect classification scoring model to obtain, for each group, a score indicating how well it matches the first sound effect. The server takes the group of sound effect parameters with the highest score for the first sound effect as the sound effect parameters of the first sound effect, thereby obtaining the sound effect parameter set.
Compared with the first manner, in the second manner the server does not need to acquire the standard audio corresponding to the first sound effect in advance. It should be understood that when the server cannot obtain the standard audio corresponding to the first sound effect, it can still obtain the sound effect parameters corresponding to that sound effect from randomly generated sound effect parameters, so the second manner is more widely applicable. A sketch of this scoring-based selection is given after this paragraph.
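A minimal sketch of the scoring-based selection in the second manner, assuming a trained scoring model is available as a callable that maps a parameter group to a per-effect score; the sampling ranges and model interface are illustrative assumptions.

```python
# Minimal sketch: randomly sample parameter groups and keep the one the
# classification scoring model rates highest for the target sound effect.
import random

def best_params_by_score(effect_name, score_model, n_candidates=1000):
    """Keep the randomly generated parameter group scored highest for the effect."""
    best_params, best_score = None, float("-inf")
    for _ in range(n_candidates):
        candidate = {
            "drc_gain": random.uniform(-12.0, 12.0),
            "eq_center_hz": random.uniform(20.0, 16000.0),
            "eq_q": random.uniform(0.1, 10.0),
            "anc_gain": random.uniform(-24.0, 0.0),
        }
        score = score_model(candidate, effect_name)  # score of belonging to the effect
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params
```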
In a second aspect, an embodiment of the present application provides an electronic device for playing audio, where the electronic device includes a sound effect component. The electronic device is configured to receive an audio playback request input by a user and acquire sound effect parameters according to sound effect setting information or the user's historical audio playback information, where the audio playback request is used to request that audio be played; the sound effect component is configured to play the audio with the sound effect parameters.
In a possible implementation manner, the electronic device is further configured to determine, according to the setting information of the sound effect, whether the user has set a sound effect; specifically, if it is determined that the user has set a sound effect, the electronic device acquires, according to the setting information of the sound effect, the sound effect parameters corresponding to the set sound effect, and if it is determined that the user has not set a sound effect, it obtains, according to the information about the audio the user has played historically within a preset time period, the sound effect parameters corresponding to the user's preferred sound effect.
In a possible implementation manner, the setting information of the sound effect includes the set sound effect and at least one first application program associated with the set sound effect; the electronic device is specifically configured to determine whether the at least one first application program includes the application program through which the user requests to play audio, and if so, to determine that the user has set a sound effect.
In a possible implementation manner, the electronic device is further configured to, if it is determined that the setting information of the sound effect has exceeded its storage duration, obtain the sound effect parameters corresponding to the user's preferred sound effect according to the information about the audio the user has played historically within a preset time period.
In a possible implementation manner, the electronic device is specifically configured to acquire the sound effect parameters corresponding to the set sound effect according to a sound effect parameter set and the set sound effect, where the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
In a possible implementation manner, the electronic device is specifically configured to obtain the user's preferred sound effect according to the information about the audio the user has played historically, and to obtain the sound effect parameters corresponding to the preferred sound effect according to the sound effect parameter set and the user's preferred sound effect, where the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
In a possible implementation manner, the information about the audio the user has played historically is the historically played audio itself; the electronic device is specifically configured to input the user's historically played audio into a sound effect prediction model and acquire the user's preferred sound effect.
In a possible implementation manner, the information about the audio the user has played historically consists of the sound effect tags of that audio, where a sound effect tag is used to characterize a sound effect; the electronic device is specifically configured to take the sound effect corresponding to the most numerous sound effect tags as the user's preferred sound effect.
In a possible implementation manner, the electronic device is further configured to collect the user's historically played audio, input it into a sound effect recognition model, and obtain the sound effect tags of that audio.
In a possible implementation manner, the information about the audio the user has played historically is the historically played audio itself; the electronic device is further configured to input the user's historically played audio into a sound effect parameter prediction model and obtain the sound effect parameters corresponding to the user's preferred sound effect.
In a possible implementation manner, the electronic device is further configured to modify the current sound effect parameters to the acquired sound effect parameters, or to select the acquired sound effect parameters from multiple preset groups of sound effect parameters, where each group of sound effect parameters corresponds to one sound effect.
In one possible implementation, the sound effect component includes at least one of the following: a dynamic range control (DRC) module, an equalizer (EQ) module, and an active noise cancellation (ANC) module, where the sound effect parameters of the DRC module are DRC parameters, the sound effect parameters of the EQ module are EQ parameters, and the sound effect parameters of the ANC module are ANC parameters.
In one possible implementation manner, the electronic device in the embodiment of the present application may further include a processor and a memory, where the memory is configured to store computer-executable program code, and the program code includes instructions; when the processor executes the instructions, the instructions cause the electronic device to perform the method provided by the first aspect or any possible implementation of the first aspect.
In a third aspect, an embodiment of the present application provides an electronic device for playing audio, which includes units, modules, or circuits for performing the method provided in the first aspect or each possible implementation manner of the first aspect. The electronic device playing the audio may be a terminal device, or may be a module applied to a terminal device, for example a chip applied to a terminal device.
In a fourth aspect, embodiments of the present application provide a computer program product containing instructions, which when executed on a computer, cause the computer to perform the method of the first aspect or the various possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform the method of the first aspect or each possible implementation manner of the first aspect.
An embodiment of the application provides an audio processing method, an electronic device, and a readable storage medium. The method includes: receiving an audio playback request input by a user, where the audio playback request is used to request that audio be played; acquiring sound effect parameters according to sound effect setting information or the user's historical audio playback information; and playing the audio with the sound effect parameters. In the embodiment of the application, different sound effect parameters can be used to play audio, so the user perceives the audio as being played with different sound effects, which meets the user's demand for diverse audio sound effects and improves the user experience.
Drawings
FIG. 1 is a simplified flowchart of training a sound effect recognition model;
fig. 2 is a schematic view of an interface change of a terminal device according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a setup page provided by an embodiment of the present application;
fig. 4 is a flowchart illustrating an embodiment of an audio processing method according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 6 is a schematic view of another interface change of the terminal device according to the embodiment of the present application;
fig. 7 is a schematic flowchart of another embodiment of an audio processing method according to an embodiment of the present application;
fig. 8 is a schematic flow chart illustrating a process of acquiring a sound effect parameter set according to an embodiment of the present disclosure;
FIG. 9 is a schematic flow chart illustrating audio processing by a simulation tool according to an embodiment of the present application;
FIG. 10 is a schematic flow chart illustrating audio processing by the simulation tool according to the embodiment of the present application;
fig. 11 is another schematic flow chart illustrating a process of acquiring a sound-effect parameter set according to an embodiment of the present disclosure;
fig. 12 is another schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
At present, a terminal device plays audio with a fixed sound effect, but as users' demand for diverse audio sound effects grows, a fixed sound effect can no longer meet their needs. In the embodiment of the present application, the sound effects may include, but are not limited to: super bass, clear human voice, warm and soft, and clear melody. The super bass sound effect is characterized by a large proportion of mid-low frequencies in the audio, giving the user a sense of impact. The clear human voice sound effect highlights the vocal part of the audio and attenuates the background. The warm and soft sound effect balances the high and low frequencies of the whole audio and is comfortable to listen to. The clear melody sound effect highlights the background audio and attenuates the vocal part. The sound effect of the audio is related to the sound effect parameters in the terminal device; because the sound effect parameters in a current terminal device are preset, the terminal device can only realize one sound effect when playing audio with those preset parameters. The embodiment of the present application provides an audio processing method that changes the sound effect of the audio by changing the sound effect parameters in the terminal device, and can therefore provide the user with audio in different sound effects, meeting the demand for diverse sound effects and improving the user experience.
It should be understood that the sound effect parameters in the embodiment of the present application may include, but are not limited to: dynamic range control (DRC) parameters, equalizer (EQ) parameters, active noise cancellation (ANC) parameters, abnormal noise cancellation parameters, the low frequency gain of a noise threshold, subwoofer intensity, subwoofer center frequency, 3D intensity, and 3D effect center frequency. The DRC parameters may include: the number of frequency bands of the audio signal, the cut-off frequencies of the bands, the gain of the audio signal, the compression ratio, the amplitude threshold, the compression speed, the gain duration, and the noise floor threshold. The equalizer may consist of multiple filters, so the equalizer parameters may include the parameters of those filters: the filter type, center frequency, gain, and Q value, where the Q value is related to the frequency of the audio signal. The ANC parameters may include: the filter type, center frequency, full-band gain, Q value, and single-band gain. A sketch of one possible way to group these parameters is given below.
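The following is a minimal sketch of how the sound effect parameters listed above could be grouped in code. The field names and container layout are illustrative assumptions; the patent names the quantities but not a concrete data layout.

```python
# Minimal sketch of a container for DRC, EQ and ANC parameters.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class DrcParams:
    num_bands: int
    cutoff_hz: float
    gain_db: float
    compression_ratio: float
    amplitude_threshold: float
    compression_speed: float
    gain_duration_ms: float
    noise_floor_threshold: float

@dataclass
class FilterParams:
    filter_type: int
    center_hz: float
    gain_db: float
    q: float

@dataclass
class SoundEffectParams:
    drc: DrcParams
    eq: List[FilterParams] = field(default_factory=list)    # e.g. 8 EQ filters
    anc: List[Tuple[int, float, float, float, float]] = field(default_factory=list)
    # each ANC tuple: (filter type, center frequency, full-band gain, Q, single-band gain)
```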
The terms in the examples of the present application are explained here:
a sound effect identification model: for identifying the audio effects of the audio. The terminal equipment can input the audio into the audio recognition model, and the audio recognition model can output the audio effect of the audio. For example, if the audio a is input into the audio recognition model, and the audio recognition model may output the audio "subwoofer", the terminal device may determine that the audio of the audio a is "subwoofer". In an embodiment, the sound effect recognition model may be obtained by training a data set through a Deep Learning (DL) method according to a long-short-term memory artificial neural network (LSTM) structure, a Convolutional Neural Network (CNN) structure, a Recurrent Neural Network (RNN) structure, or other neural network structures. The data set for training the sound effect recognition model can be a large number of audios and a sound effect label of each audio. The sound effect tag is used for representing the sound effect of the audio, and the sound effect tag can be overweight bass, clear human voice and the like. Alternatively, the sound effect label may be represented by a number, such as "0", "1". Where "0" represents "subwoofer" and "1" represents "clear human voice".
The following briefly describes how the terminal device trains the sound effect recognition model, using the LSTM network structure as an example. The LSTM network structure includes an input layer, at least one hidden layer, and an output layer. The input layer receives the data set and distributes the data to the neurons of the hidden layer. The neurons of the hidden layer compute on the data and pass the result to the output layer. The output layer outputs the result. FIG. 1 is a simplified flowchart of training a sound effect recognition model. As shown in FIG. 1, a method for training a sound effect recognition model in an embodiment of the present application may include:
s101, initializing weight values of neurons of all hidden layers in an LSTM network structure.
For example, the terminal device may initialize the weight values of the neurons of the respective hidden layers to random weight values that follow a Gaussian distribution.
S102, dividing the preprocessed data set into N batches.
In the embodiment of the present application, the data set used by the terminal device is a preprocessed data set in which the data have a mean of 0 and a variance of 1. Illustratively, the data set may include multiple audio clips and a sound effect tag for each clip, and the audio in the data set may be waveform (wav) files. Taking the preprocessing performed by the terminal device as an example, the terminal device may convert a wav file into a spectrogram, such as a mel spectrogram, to obtain the spectral values (such as mel features) of the audio, and then normalize the spectral values to obtain data with a mean of 0 and a variance of 1.
The terminal device may divide the preprocessed data set into N batches in order to iteratively train the LSTM network structure with the N batches of data. For example, the terminal device may divide the data set into N equal batches by data amount, where N is an integer greater than 1. A sketch of this preprocessing step is given below.
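A minimal sketch of the preprocessing described above (wav to mel spectrogram to zero-mean, unit-variance normalization), assuming the librosa package is available; the parameter values are illustrative assumptions.

```python
# Minimal sketch of preprocessing and batching the training data set.
import numpy as np
import librosa

def preprocess_wav(path, sr=16000, n_mels=64):
    """Return a normalized mel spectrogram with mean 0 and variance 1."""
    samples, sr = librosa.load(path, sr=sr)                  # read the wav file
    mel = librosa.feature.melspectrogram(y=samples, sr=sr, n_mels=n_mels)
    log_mel = np.log(mel + 1e-6)                             # log-compress the spectrum
    return (log_mel - log_mel.mean()) / (log_mel.std() + 1e-8)

def split_into_batches(items, n_batches):
    """Split the preprocessed data set into N roughly equal batches."""
    return [items[i::n_batches] for i in range(n_batches)]
```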
S103, inputting the data of the ith batch into an LSTM network structure to obtain the cross entropy loss of the data of the ith batch.
Illustratively, at the beginning of training, that is, when i is 1, the terminal device inputs the 1st batch of data into the LSTM network structure, which may output the cross entropy loss of the 1st batch of data. It should be understood that the cross entropy loss characterizes the similarity between the sound effect tags predicted by the terminal device with the LSTM network structure and the real sound effect tags of the audio. The smaller the cross entropy loss, the more accurate the weight values of the neurons of each hidden layer in the LSTM network structure. Here i is an integer greater than or equal to 1 and less than or equal to N.
S104, updating the weight values of the neurons of each hidden layer in the LSTM network structure according to the cross entropy loss of the ith batch of data.
For example, the terminal device may update the initial weight values of the neurons of each hidden layer according to the cross entropy loss of the 1st batch of data. From that loss, the terminal device can determine the error between 100% and the similarity of the sound effect tags predicted by the LSTM network structure to the real sound effect tags, and then update the weight values of the neurons of each hidden layer according to that error. For example, the terminal device may update the weight values using gradient descent or stochastic gradient descent.
S105, judging whether i is smaller than N. If yes, add 1 to i and execute S103. If not, go to S106.
The terminal device may determine whether i is less than N to determine whether all N batches of data in the data set have been used for training. If i is less than N, the terminal device may add 1 to i and continue to execute S103. For example, when i is 1 and N is 10, the terminal device determines that i is less than N, and then inputs the 2nd batch of data into the LSTM network structure with the updated weight values to obtain the cross entropy loss of the 2nd batch of data. Similarly, the terminal device may update the weight values of the neurons in each hidden layer by gradient descent or stochastic gradient descent according to the cross entropy loss of the 2nd batch of data. Iteration continues in this way until i is equal to N, when the terminal device inputs the Nth batch of data into the LSTM network structure and updates the weight values of the neurons of each hidden layer according to the cross entropy loss of the Nth batch of data. When i is equal to N, the terminal device may perform S106 described below.
S106, determining whether the training has converged according to the target cross entropy loss and the cross entropy loss of the Nth batch of data. If the training has converged, S107 is executed; if not, the process returns to S102.
The user can preset the target cross entropy loss as the convergence criterion for training. When the terminal device obtains the cross entropy loss of the Nth batch of data, it may determine whether the training has converged according to that loss and the target cross entropy loss. If the cross entropy loss of the Nth batch of data is less than or equal to the target cross entropy loss, the sound effect tags predicted with the LSTM network structure are close to the real sound effect tags of the audio, and the terminal device can determine that the training has converged. If the cross entropy loss of the Nth batch of data output by the LSTM network structure is greater than the target cross entropy loss, it is determined that the training has not converged. In that case, the terminal device returns to S102, that is, it again divides the preprocessed data set into N batches and continues to train the LSTM network structure with the N batches of data until the training converges.
S107, ending.
If the terminal device determines that the training has converged, the training ends and the sound effect recognition model is obtained. The sound effect recognition model is the LSTM network structure after the terminal device has updated the weight values of the neurons of each hidden layer according to the cross entropy loss of the Nth batch of data. A sketch of the training loop described in S101-S107 is given below.
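A minimal sketch of the S101-S107 training loop, assuming PyTorch and the preprocessing sketched earlier; the model dimensions, learning rate, and target loss are illustrative assumptions, not values from the patent.

```python
# Minimal sketch of training an LSTM-based sound effect recognition model.
import torch
import torch.nn as nn

class EffectRecognizer(nn.Module):
    def __init__(self, n_mels=64, hidden=128, n_effects=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_mels, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_effects)       # output layer over effect tags

    def forward(self, x):                              # x: (batch, time, n_mels)
        _, (h, _) = self.lstm(x)
        return self.out(h[-1])

def train(batches, target_loss=0.1, max_epochs=50):
    model = EffectRecognizer()                         # S101: weights initialized here
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # stochastic gradient descent
    criterion = nn.CrossEntropyLoss()
    for _ in range(max_epochs):                        # repeat S102-S106 until convergence
        loss = None
        for features, labels in batches:               # S103: feed the i-th batch
            loss = criterion(model(features), labels)  # cross entropy loss of the batch
            optimizer.zero_grad()
            loss.backward()                            # S104: update hidden-layer weights
            optimizer.step()
        if loss is not None and loss.item() <= target_loss:   # S106: convergence check
            break
    return model                                       # S107: the trained recognition model
```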
Sound effect prediction model: used to predict the user's preferred sound effect from the audio the user has played historically. The terminal device can input the user's historically played audio into the sound effect prediction model, and the model outputs the user's preferred sound effect. The training manner of the sound effect prediction model can be the same as that of the sound effect recognition model, and the data set for training it is likewise a large number of audio clips and a sound effect tag for each clip. The audio played historically by the user may be songs, the audio in video files, broadcast audio, recordings, and so on.
Sound effect parameter prediction model: used to predict the sound effect parameters corresponding to the user's preferred sound effect from the audio the user has played historically. The terminal device can input the user's historically played audio into the sound effect parameter prediction model, and the model outputs the sound effect parameters corresponding to the user's preferred sound effect. The training manner of the sound effect parameter prediction model may be the same as that of the sound effect recognition model; the difference is that the data set for training the sound effect parameter prediction model consists of a large number of audio clips, the sound effect tag of each clip, and the sound effect parameters corresponding to each sound effect tag (or to each audio clip).
The sound effect parameter prediction model also differs from the sound effect prediction model in structure. In the embodiment of the present application, a mapping layer can be added after the last hidden layer of the LSTM network structure; the mapping layer maps sound effects to sound effect parameters, so that the sound effect parameter prediction model can predict the sound effect parameters corresponding to the user's preferred sound effect. A sketch of this structure is given below.
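A minimal sketch of adding a mapping layer after the last LSTM hidden layer so the network outputs sound effect parameters directly. The layer sizes are illustrative assumptions; this reuses the PyTorch style of the earlier sketch.

```python
# Minimal sketch of the sound effect parameter prediction model.
import torch.nn as nn

class EffectParamPredictor(nn.Module):
    def __init__(self, n_mels=64, hidden=128, n_effects=4, n_params=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_mels, hidden_size=hidden, batch_first=True)
        self.effect_head = nn.Linear(hidden, n_effects)   # sound effect logits
        self.mapping = nn.Linear(n_effects, n_params)     # maps effects to parameters

    def forward(self, x):                                 # x: (batch, time, n_mels)
        _, (h, _) = self.lstm(x)
        effect_logits = self.effect_head(h[-1])
        return self.mapping(effect_logits)                # predicted sound effect parameters
```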
Frequency response: also called frequency response curve, refers to the variation curve of gain with frequency.
Sound effect classification scoring model: used to obtain, from randomly generated sound effect parameters, a score for each sound effect to which those parameters may belong. The terminal device can input the randomly generated sound effect parameters into the sound effect classification scoring model, and the model outputs the score of the parameters for each sound effect. The higher the score, the closer the sound effect obtained when the terminal device plays audio with those parameters is to the corresponding sound effect. Similarly, the terminal device can input audio whose sound effect is to be determined into the sound effect classification scoring model, and the model outputs the score of that audio for each sound effect. The training manner of the sound effect classification scoring model can be the same as that of the sound effect recognition model; the difference is that the data set for training the sound effect classification scoring model consists of a large number of audio clips, the sound effect tag of each clip, and the sound effect parameters corresponding to each sound effect tag (or to each audio clip).
In an embodiment, a separate sound effect classification scoring model can be trained for each sound effect, giving one scoring model per sound effect. In the embodiment of the present application, the randomly generated sound effect parameters can be input into each of these scoring models, which likewise yields the score of the parameters for each sound effect. For example, if the embodiment includes a scoring model for "super bass" and a scoring model for "clear human voice", the randomly generated sound effect parameters can be input into both models to obtain the score of those parameters for "super bass" and the score for "clear human voice". Unlike the single scoring model, the data set for the per-sound-effect scoring model in this embodiment can be the audio of that sound effect and the sound effect parameters corresponding to that sound effect. A sketch of the per-effect scoring is given below.
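A minimal sketch of scoring randomly generated parameters with one scoring model per sound effect, assuming each model is a callable returning a score; the dictionary-of-models interface is an illustrative assumption.

```python
# Minimal sketch: score one parameter group against every per-effect model.
def score_against_all_effects(params, scoring_models):
    """Return {effect name: score} for one group of randomly generated parameters."""
    return {effect: model(params) for effect, model in scoring_models.items()}

def pick_effect(params, scoring_models):
    """The effect whose scoring model rates these parameters highest."""
    scores = score_against_all_effects(params, scoring_models)
    return max(scores, key=scores.get)
```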
Tag database: used to store the sound effect tags of audio, such as the NSynth Dataset. Illustratively, the tag database may store the sound effect tag "super bass" of audio A, together with audio A or an identifier of audio A corresponding to that tag. The identifier of audio A may be the name of audio A, an audio feature of audio A, and so on, and uniquely indicates audio A.
In an embodiment, a "sound effect setting" control may be displayed on a settings page of the terminal device, or on the interface of a settings option within the settings page. Alternatively, a "sound effect setting" control may be displayed on the settings page of an application. When the user needs to set a sound effect, the user can select the sound effect through the "sound effect setting" control on the settings interface of the terminal device or of the application; the terminal device then plays audio with that sound effect. Fig. 2 is a schematic view of an interface change of a terminal device according to an embodiment of the present application. Fig. 2 takes as an example a "sound effect setting" control included in setting option 4, such as the "sound and vibration" option, of the settings page. Interface 201 is the settings page of the terminal device, which may include multiple setting options, such as setting option 1 to setting option 7. When the user needs to set the sound effect, the user can tap setting option 4 ("sound and vibration"), and interface 201 jumps to interface 202. Interface 202 is the settings page of "sound and vibration"; a "sound effect setting" control is displayed on interface 202, and when the user taps it, interface 202 jumps to interface 203. Interface 203 is the sound effect settings page, on which a variety of selectable sound effects are displayed, such as super bass, clear human voice, warm and soft, clear melody, and vintage vocals. The user can tap the control of the corresponding sound effect to select it. Illustratively, the user may tap the "clear human voice" control to select the sound effect "clear human voice". Optionally, interface 203 may also display the characteristics of each sound effect.
It should be appreciated that after the user selects a sound effect, the terminal device may record the selected sound effect. Optionally, a "none" control may be displayed on interface 203; if the user taps it, the terminal device cancels the previously selected sound effect and records that the user has not currently selected a sound effect. The terminal device may store the setting information of the sound effect, for example in the user's operation log, which may be kept in the memory of the terminal device. The setting information of the sound effect may include the sound effect tag and the setting time. For example, the setting information of the sound effect stored in the terminal device may be as shown in Table One below. Note that Table One is only one format in which the terminal device stores the setting information of the sound effect; the terminal device may also store it in an extensible markup language (XML) format or a database format.
Table One

Sound effect tag   | Setting time
Clear human voice  | January 30, 2020, 8:00
None               | February 3, 2020, 10:00
Super bass         | May 1, 2020, 21:00
As shown in Table One, the user set the sound effect to "clear human voice" at 8:00 on January 30, 2020, cancelled the set sound effect at 10:00 on February 3, 2020, and then set the sound effect to "super bass" at 21:00 on May 1, 2020. In one possible implementation, the terminal device may also store only the setting information of the sound effect most recently set by the user, for example "super bass" in Table One and its setting time "21:00, May 1, 2020". A sketch of one possible storage format is given below.
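A minimal sketch of storing the setting information of Table One as a simple JSON record in the user's operation log. The file name and field names are illustrative assumptions; as noted above, an XML format or a database could be used instead.

```python
# Minimal sketch of persisting sound effect setting records.
import json

sound_effect_settings = [
    {"effect_tag": "clear human voice", "set_at": "2020-01-30T08:00"},
    {"effect_tag": None,                "set_at": "2020-02-03T10:00"},
    {"effect_tag": "super bass",        "set_at": "2020-05-01T21:00"},
]

with open("sound_effect_settings.json", "w", encoding="utf-8") as log:
    json.dump(sound_effect_settings, log, ensure_ascii=False, indent=2)
```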
In an embodiment, when the user selects a sound effect on the sound effect settings page, the user may also select the application programs to which the sound effect applies; the selected application programs are described as first application programs. It should be understood that a first application program is an application program associated with the sound effect selected by the user, and it can be any application program in the terminal device that can play audio, such as, but not limited to, a music playing application, a video application, or a social application. Fig. 3 is a schematic diagram of a settings page provided in an embodiment of the present application. Unlike interface 203, the sound effect settings page shown in Fig. 3 may also display the identifiers of application programs, where the identifier of an application program may be its icon or name. As shown in Fig. 3, the user chooses to apply the sound effect "super bass" in "application 1" and "application 2", so "application 1" and "application 2" may be referred to as first application programs. In this embodiment, the setting information of the sound effect stored in the terminal device may further include the first application programs associated with the sound effect, as shown in Table Two below. It should be understood that Table Two is an example of one format in which the terminal device stores the setting information of the sound effect.
Table Two

Sound effect tag   | Setting time              | First application program(s) associated with the sound effect
Clear human voice  | January 30, 2020, 8:00    | Application 1
None               | February 3, 2020, 10:00   | None
Super bass         | May 1, 2020, 21:00        | Application 1 and application 2
The following embodiments may be combined with each other, and the same or similar concepts or processes are not described repeatedly. Fig. 4 is a flowchart illustrating an embodiment of an audio processing method according to the present application. As shown in Fig. 4, an audio processing method provided in an embodiment of the present application may include:
S401, receiving an audio playback request input by a user, and determining whether the user has set a sound effect, where the audio playback request is used to request that audio be played.
The user can click or perform other operations on the interface of the terminal device to interact with it and input the audio playback request. Alternatively, the user may interact with the terminal device by voice to input the audio playback request. The embodiment of the present application places no limitation on the way the user requests the terminal device to play audio. The audio playback request is used to request that audio be played.
When the terminal device receives an audio playback request input by the user, it can determine whether the user has set a sound effect. In one possible implementation, the user may set a sound effect on a settings page as shown in Fig. 2 above. In the embodiment of the application, optionally, the terminal device may determine whether the user has set a sound effect according to the setting information of the sound effect, such as Table One. For example, the terminal device may determine that the user has set the sound effect "super bass".
In one possible implementation, the user may set a sound effect on the settings page shown in Fig. 3 above. The sound effect setting information may include the sound effect set by the user and at least one first application program associated with it, and the sound effect set by the user may be referred to as the set sound effect. The terminal device determines the application program through which the user requests to play audio, and determines, from the stored sound effect setting information such as Table Two, whether the user has set a sound effect for that application program. The terminal device may determine whether the at least one first application program in the setting information of the sound effect includes the application program through which the user requests to play audio. If it does, the terminal device determines that the user has set a sound effect, that is, a sound effect has been set for that application program; if it does not, the terminal device determines that the user has not set a sound effect for that application program. For example, if the user requests to play audio in application 1, the terminal device may determine from Table Two that the user has set the sound effect "super bass" for application 1. A sketch of this check is given below.
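A minimal sketch of the per-application check described above, assuming the setting information records of Table Two; the record fields and helper name are illustrative assumptions.

```python
# Minimal sketch: find the sound effect set for the requesting application.
def effect_set_for_app(setting_records, requesting_app):
    """Return the sound effect set for this app, or None if no effect applies."""
    # Walk the records from newest to oldest and use the latest applicable one.
    for record in sorted(setting_records, key=lambda r: r["set_at"], reverse=True):
        if record["effect_tag"] is None:
            return None                              # the user cancelled the sound effect
        if requesting_app in record.get("associated_apps", ()):
            return record["effect_tag"]
    return None

records = [
    {"effect_tag": "clear human voice", "set_at": "2020-01-30T08:00",
     "associated_apps": ("application 1",)},
    {"effect_tag": None, "set_at": "2020-02-03T10:00", "associated_apps": ()},
    {"effect_tag": "super bass", "set_at": "2020-05-01T21:00",
     "associated_apps": ("application 1", "application 2")},
]
print(effect_set_for_app(records, "application 1"))   # -> "super bass"
```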
S402, if the sound effect is determined to be set by the user, the set sound effect is used as the preference sound effect of the user.
If the terminal equipment determines that the user sets the sound effect through the setting information of the sound effect, the terminal equipment can take the sound effect set by the user as the preference sound effect of the user. Wherein, the preference sound effect of the user can be understood as the sound effect of the user's favorite sound effect. For example, the terminal device may use "subwoofer" as a preferred sound effect for the user.
In one embodiment, the above S401-S402 may be replaced by: and when an audio playing request input by a user is received, according to the setting information of the sound effect, taking the sound effect set by the user in the setting information of the sound effect as the preference sound effect of the user. In this way, the terminal device can query the setting information of the sound effect when receiving the audio playing request input by the user, and the sound effect set by the user in the setting information of the sound effect is used as the preference sound effect of the user.
S403, acquiring sound effect parameters corresponding to the preference sound effect, and adjusting the current sound effect parameters to the sound effect parameters corresponding to the preference sound effect.
It should be understood that a sound effect parameter set may be stored in the memory of the terminal device in advance. The sound effect parameter set includes the sound effect parameters corresponding to each of a plurality of sound effects. Optionally, the sound effect parameter set may include sound effect tags and the sound effect parameters corresponding to each tag, where a sound effect tag indicates a sound effect. For the interpretation of the sound effect parameters, reference may be made to the related description above. The following description takes sound effect parameters that include DRC parameters, EQ parameters, and ANC parameters as an example. The sound effect parameter set stored in the terminal device may be represented by Table Three below, which is one example of a storage format for the sound effect parameter set.
It should be understood that in the embodiment of the present application, if the sound effect that has been set by the user is not used as the preference sound effect of the user, S402 and S403 may be replaced by: and if the sound effect set by the user is determined, obtaining sound effect parameters corresponding to the set sound effect, and adjusting the current sound effect parameters to the sound effect parameters corresponding to the set sound effect.
It should be understood that the sound effect parameters corresponding to the sound effect preferred by the user may also be referred to as "sound effect parameters", and in the embodiment of the present application, the sound effect parameters corresponding to the sound effect preferred by the user are taken as an example to be described so as to be distinguished from the sound effect parameters in the sound effect parameter set.
Table Three
[The content of Table Three (the sound effect parameter set) is provided as images in the original publication; the parameter format it uses is described below.]
Taking the sound effect "super bass" as an example, in the DRC parameter "[2,2000,2.1,0.8,1000,1.1,10,0.1]" the values in the square brackets respectively represent the number of frequency bands of the audio signal, the cutoff frequency of the frequency bands, the gain of the audio signal, the compression ratio, the amplitude threshold, the compression speed, the gain duration, and the background noise threshold. The EQ parameters include the parameters of 8 filters, with each filter's parameters enclosed in parentheses; taking the parameters (2,1000,2.1,3.5) as an example, the values in the parentheses respectively represent the type, center frequency, gain, and Q value of the filter. The ANC parameters include the parameters of 16 filters, likewise separated by parentheses; taking the parameters (4,43.0,7.5,4630,0.0) as an example, the values respectively represent the type, center frequency, full-band gain, Q value, and single-band gain of the filter.
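As an illustration of how such a parameter set could be held in memory, the following Python sketch stores the example values above keyed by sound effect tag. The class and variable names (SoundEffectParams, SOUND_EFFECT_PARAMETER_SET) and the truncated filter lists are assumptions for illustration only; the patent does not specify a concrete data structure.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SoundEffectParams:
    # DRC: [number of bands, cutoff frequency, gain, compression ratio,
    #       amplitude threshold, compression speed, gain duration, background noise threshold]
    drc: List[float]
    # EQ: one (filter type, center frequency, gain, Q value) tuple per filter
    eq: List[Tuple[float, float, float, float]]
    # ANC: one (filter type, center frequency, full-band gain, Q value, single-band gain) tuple per filter
    anc: List[Tuple[float, float, float, float, float]]

# Hypothetical "Table Three"-style parameter set keyed by sound effect tag (entries abbreviated)
SOUND_EFFECT_PARAMETER_SET = {
    "super bass": SoundEffectParams(
        drc=[2, 2000, 2.1, 0.8, 1000, 1.1, 10, 0.1],
        eq=[(2, 1000, 2.1, 3.5), (3, 1200, 2.4, 3.6)],        # remaining filters omitted
        anc=[(4, 43.0, 7.5, 4630, 0.0), (3, 0, 4, 1200, 0)],  # remaining filters omitted
    ),
    # "clear voice": SoundEffectParams(...), other sound effects omitted
}

print(SOUND_EFFECT_PARAMETER_SET["super bass"].drc)
```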
As shown in Table Three, if the user's preference sound effect is "super bass", the terminal device may determine the sound effect parameters corresponding to "super bass" according to the sound effect parameter set. Based on the user's preference sound effect, the terminal device can adjust its current sound effect parameters to the sound effect parameters corresponding to the preference sound effect. The current sound effect parameters may be the sound effect parameters corresponding to the sound effect last set by the user. For example, as shown in Table One above, if the current time is 20:00 on May 3, 2020, the terminal device may determine that the current sound effect parameters are the parameters used when the user has not set any sound effect; if the current time is 20:00 on May 6, 2020, the terminal device may determine that the current sound effect parameters are the parameters corresponding to "clear voice".
In an embodiment, fig. 5 is a schematic structural diagram of a terminal device provided in the embodiment of the present application. As shown in fig. 5, the terminal device in the embodiment of the present application may include a digital-to-analog converter, an analog-to-digital converter, and a sound effect component. The sound effect component may include at least one of the following modules: a DRC module, an EQ module, an ANC module. In the embodiment of the present application, the sound effect component may include a DRC module, an EQ module, and an ANC module. The sound effect component can be connected to the digital-to-analog converter and the analog-to-digital converter respectively; the digital-to-analog converter can be connected to a speaker in the terminal device or to an external device (such as an earphone), and the analog-to-digital converter can be connected to a microphone in the terminal device.
The sound effect component is used to adjust the audio signal so as to change the sound effect corresponding to the audio signal. The DRC module is used to compress or expand the audio signal so that the sound in the audio sounds softer or louder, that is, to adjust the amplitude of the audio signal. The EQ module is used to correct the amplitude-frequency and phase-frequency characteristics of the transmission channel of the audio signal, so as to compensate the audio signal and reduce interference on it. The ANC module is used to generate an anti-phase sound wave equal to the external noise and cancel that noise, thereby achieving noise reduction. The digital-to-analog converter is used to convert a digital audio signal into an analog audio signal for output, and the analog-to-digital converter is used to convert an input analog audio signal into a digital audio signal. In other embodiments of the present application, the terminal device may include more or fewer components for processing the audio signal than shown in the drawing; fig. 5 does not constitute a structural limitation on the terminal device. It should be understood that fig. 5 takes a sound effect component that includes a DRC module, an EQ module, and an ANC module as an example, and the sound effect component may also include other modules for processing the audio signal.
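To make the division of labour among the modules concrete, the following Python sketch models the sound effect component of fig. 5 as a chain of three modules. The interfaces and the trivial processing bodies (a clip standing in for DRC, a flat gain standing in for EQ, a simple subtraction standing in for ANC) are illustrative assumptions; real DRC, EQ, and ANC processing is far more elaborate.

```python
import numpy as np

class DRCModule:
    """Compresses or expands the audio signal, i.e. adjusts its amplitude."""
    def __init__(self, drc_params):
        self.params = drc_params

    def process(self, signal: np.ndarray) -> np.ndarray:
        # Placeholder: a hard limiter stands in for multi-band dynamic range control
        return np.clip(signal, -0.8, 0.8)

class EQModule:
    """Corrects the amplitude-frequency and phase-frequency characteristics of the channel."""
    def __init__(self, eq_params):
        self.params = eq_params

    def process(self, signal: np.ndarray) -> np.ndarray:
        # Placeholder: a flat gain stands in for a bank of parametric filters
        return signal * 1.0

class ANCModule:
    """Superimposes an anti-phase estimate of the external noise to cancel it."""
    def __init__(self, anc_params):
        self.params = anc_params

    def process(self, signal: np.ndarray, noise_estimate: np.ndarray) -> np.ndarray:
        return signal - noise_estimate

class SoundEffectComponent:
    """Chains the three modules, mirroring the structure described for fig. 5."""
    def __init__(self, drc_params, eq_params, anc_params):
        self.drc, self.eq, self.anc = DRCModule(drc_params), EQModule(eq_params), ANCModule(anc_params)

    def process(self, signal, noise_estimate):
        return self.anc.process(self.eq.process(self.drc.process(signal)), noise_estimate)
```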
The sound effect parameters of the DRC module are the DRC parameters, those of the EQ module are the EQ parameters, and those of the ANC module are the ANC parameters; the parameters of the modules in the sound effect component together determine the sound effect with which the terminal device plays audio. In a possible implementation manner, preset code and the current sound effect parameters are stored in the terminal device. The preset code may be code written by developers that causes the terminal device to apply the sound effect parameters when playing audio, and it may be stored in a system installation package in the terminal device. The terminal device can modify the current sound effect parameters into the sound effect parameters corresponding to the preference sound effect, thereby adjusting the current sound effect parameters to the sound effect parameters corresponding to the preference sound effect.
Alternatively, in another possible implementation manner, multiple groups of sound effect parameters are pre-stored in the terminal device, and each group includes DRC parameters, EQ parameters, and ANC parameters. At least one parameter differs between any two groups, and each group corresponds to one sound effect. After the terminal device determines the user's preference sound effect, it can select the target group corresponding to the preference sound effect from the multiple groups, thereby obtaining the sound effect parameters corresponding to the preference sound effect. Unlike the previous implementation manner, the terminal device here selects the sound effect parameters corresponding to the preference sound effect from the multiple groups without modifying any sound effect parameters. Optionally, each group of sound effect parameters has a corresponding identifier indicating its sound effect; the identifier may be a number or a sound effect tag. Illustratively, if the current sound effect parameters have identifier 1, representing the sound effect "clear voice", and the preference sound effect is "super bass", the terminal device may determine that the sound effect parameters representing "super bass" have identifier 2.
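The two implementation manners can be contrasted in a short sketch: either the single stored set of current sound effect parameters is overwritten, or one of several preset groups is selected by its identifier. The dictionary layout, the identifier values, and the abbreviated parameter values are assumptions for illustration.

```python
# Hypothetical preset groups (values abbreviated); each group corresponds to one sound effect
PRESET_GROUPS = {
    1: {"effect": "clear voice", "drc": [2, 1500, 1.2], "eq": [], "anc": []},
    2: {"effect": "super bass",  "drc": [2, 2000, 2.1], "eq": [], "anc": []},
}

# Manner 1: a single stored parameter set is modified in place
current_params = dict(PRESET_GROUPS[1])   # current sound effect parameters ("clear voice")
current_params = dict(PRESET_GROUPS[2])   # overwritten with the preference sound effect's parameters

# Manner 2: nothing is modified; the matching group is simply selected by its identifier
def select_group(identifier: int) -> dict:
    return PRESET_GROUPS[identifier]

active = select_group(2)        # "super bass" is identified as 2, as in the example above
print(active["effect"])
```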
In one embodiment, the above S401-S403 may be replaced by: when an audio playing request input by the user is received, obtaining, according to the sound effect setting information, the sound effect parameters corresponding to the user's preference sound effect. In a possible implementation manner, when the terminal device stores the sound effect setting information in Table One or Table Two, it may add the sound effect parameters corresponding to the sound effect set by the user to Table One or Table Two according to the sound effect parameter set; for example, Table One may then be replaced by Table Four:
Table Four
[The content of Table Four is provided as images in the original publication.]
In this manner, when the terminal device receives the audio playing request input by the user, it queries the sound effect setting information and can then take the sound effect parameters corresponding to the sound effect set by the user as the sound effect parameters corresponding to the user's preference sound effect.
S404, playing the audio by adopting the sound effect parameters corresponding to the preference sound effect.
After the sound effect parameters are adjusted, the terminal equipment can play audio by adopting the sound effect parameters corresponding to the preference sound effect.
After the user sets the sound effect, the set sound effect can be displayed on the interface of the application program associated with it. Alternatively, when the user, in the pull-down status bar of the terminal device, taps to trigger an application program associated with the sound effect (for example, triggers music playing), the sound effect set by the user may be displayed in the pull-down status bar. The following description takes as an example the case where, after the user sets the sound effect, the set sound effect is displayed on the interface of the associated application program when the user first opens it. Fig. 6 is a schematic view of another interface change of the terminal device according to the embodiment of the present application. Interface 601 is a music playing page of application 1 (e.g., a music playing application), on which a music list 601a and a music playing bar 601b are displayed. Music list 601a may include a plurality of song names, and music playing bar 601b may include an identification 601c of a song, which may be the song's name, such as song B, and a playing control 601d. Playing control 601d is used to trigger the terminal device to play song B. It should be appreciated that song B, being the song played by the application the last time the user exited it, may be located at the top of music list 601a.
When the user selects song B in music list 601a, or taps music playing bar 601b, interface 601 may jump to the playing page of song B, interface 602, or play song B directly. Displayed on interface 602 are a song option 602a, information 602b about song B, the user-set sound effect 602c (e.g., "super bass"), a playing progress bar 602d, a rewind (previous) control 602e, a pause control 602f, and a fast-forward (next) control 602g. The information 602b about song B may include the name of song B, the artist of song B, and the lyrics of song B, which are represented by reference numerals in fig. 6. Song option 602a is associated with interface 602, meaning that when the user selects the song option in the menu bar, the terminal device jumps to interface 602. The user can thus see the sound effect on the playing page of song B.
In the embodiment of the present application, one possible implementation manner in which the terminal device plays audio with the sound effect parameters corresponding to the preference sound effect is: the terminal device executes the preset code so that it plays the audio with the sound effect parameters corresponding to the preference sound effect. Another possible implementation manner is: after determining the identifier of the sound effect parameters corresponding to the preference sound effect, the terminal device executes the preset code so that it plays the audio with the sound effect parameters corresponding to that identifier.
In the embodiment of the application, the user can set a sound effect in advance, and the sound effect set by the user is the user's preference sound effect. The terminal device can adjust the sound effect parameters to those corresponding to the preference sound effect and then play audio with them, thereby achieving diversified sound effects and improving user experience.
In the above embodiment, the user needs to set the sound effect in advance so that the terminal device can play audio with that sound effect. In another embodiment, when receiving an audio playing request input by the user, the terminal device can obtain the user's preference sound effect according to the information of the audio played by the user historically within a preset time period, and then play the audio with the sound effect parameters corresponding to that preference sound effect. This saves the user from manually setting a sound effect and improves user experience. The process may refer to the relevant description in S405.
In an embodiment, as shown in fig. 4, after S401 described above, the audio processing method provided in an embodiment of the present application may further include:
S405, if it is determined that the user has not set a sound effect, obtaining the user's preference sound effect according to the information of the audio played by the user historically within a preset time period.
It should be understood that S402 and S405 are alternatively performed steps.
The preset time period may be a period of time before the moment at which the user inputs the audio playing request, and may be, but is not limited to, a day, a week, or a month. The audio historically played by the user may include, but is not limited to: music, songs, broadcasts, and the audio in videos played by the user on the terminal device. The information of the audio historically played by the user may be either the historically played audio itself or the sound effect tags of that audio. It should be understood that the terminal device may store the audio as the user plays it. Alternatively, when the user plays audio, the terminal device can collect it and input it into the sound effect recognition model to obtain the sound effect of the played audio, and then store the corresponding sound effect tag. In a possible implementation manner, the terminal device may, according to the current time, delete the information of historically played audio that falls before the preset time period, so as to save the memory space of the terminal device.
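A minimal sketch of such pruning, assuming the history is kept as timestamped records and the preset time period is one week; the record format and the period length are illustrative assumptions.

```python
import time

# Assumed preset time period of one week; the embodiment allows a day, a week, or a month
PRESET_PERIOD_SECONDS = 7 * 24 * 3600

# Hypothetical history store: a list of (timestamp, sound effect tag) records
play_history = [
    (time.time() - 10 * 24 * 3600, "clear voice"),   # older than the preset time period
    (time.time() - 2 * 24 * 3600, "super bass"),
]

def prune_history(history, now=None):
    """Delete records that fall before the preset time period to save memory space."""
    now = time.time() if now is None else now
    return [(ts, tag) for ts, tag in history if now - ts <= PRESET_PERIOD_SECONDS]

play_history = prune_history(play_history)
print(len(play_history))   # 1: only the recent record remains
```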
In the embodiment of the application, if the terminal device determines, according to the sound effect setting information, that the user has not set a sound effect, it can obtain the user's preference sound effect according to the information of the audio the user has played historically. In a possible implementation manner, when that information is the historically played audio itself, the terminal device may input the historically played audio into the sound effect prediction model to obtain the preference sound effect predicted by the model. In another possible implementation manner, when that information is the sound effect tags of the historically played audio, the terminal device may take the sound effect corresponding to the most numerous sound effect tag as the user's preference sound effect.
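For the tag-counting case, the selection of the most numerous sound effect tag can be sketched as follows; the example tag list is hypothetical.

```python
from collections import Counter

# Hypothetical sound effect tags of the audio the user played within the preset time period
history_tags = ["super bass", "clear voice", "super bass", "3D surround", "super bass"]

# The sound effect whose tag occurs most often is taken as the user's preference sound effect
preference_sound_effect, count = Counter(history_tags).most_common(1)[0]
print(preference_sound_effect, count)   # super bass 3
```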
It should be noted that, when the information of the audio historically played by the user is the historically played audio itself, in a possible implementation the terminal device may input that audio into the sound effect parameter prediction model to predict the sound effect parameters corresponding to the user's preference sound effect. Unlike obtaining the preference sound effect from the sound effect prediction model, here the terminal device directly obtains the sound effect parameters corresponding to the preference sound effect from the sound effect parameter prediction model. Fig. 7 is a flowchart illustrating an audio processing method according to another embodiment of the present application. In this manner, as shown in fig. 7, the audio processing method provided in the embodiment of the present application may further include:
S701, if it is determined that the user has not set a sound effect, obtaining the sound effect parameters corresponding to the user's preference sound effect according to the information of the audio played by the user historically.
It should be understood that "S402-S403" and S701 are alternatively performed steps, and the terminal device may perform S701 after performing S401 and may perform S404 after performing S701.
In a possible implementation manner, when the terminal device receives an audio playing request input by a user, the sound effect parameters corresponding to the preference sound effect of the user can be acquired according to the information of the user history playing audio within a preset time period. The process may refer to the related description of S701 above.
In the embodiment of the application, the terminal device can obtain the user's preference sound effect, or the sound effect parameters corresponding to it, according to the information of the audio the user has played historically, and then play audio according to the user's preference sound effect. This achieves diversified sound effects while sparing the user from setting a sound effect manually. In addition, because the information of the historically played audio is taken over a preset time period, the sound effect can be adjusted at any time as the user's preference changes, which is more intelligent.
In an embodiment, the terminal device may set a saving duration for the sound effect set by the user, that is, the user's setting information corresponds to a saving duration, which is a period of time counted from the moment the user sets the sound effect. While the sound effect set by the user is within the saving duration, the terminal device may play audio with that sound effect; for example, the terminal device may execute S401, S402, S403, and S404. If the sound effect set by the user has exceeded the saving duration, the terminal device may obtain the user's preference sound effect, or the sound effect parameters corresponding to it, according to the information of the audio played by the user historically within the preset time period, and then play audio with the sound effect parameters corresponding to the preference sound effect; for example, the terminal device may execute S401, S405, S403 (or S406), and S404. Illustratively, as shown in Table One above, the setting time of "super bass" is 21:00 on May 1, 2020, and the saving duration is 5 days. Between 21:00 on May 1, 2020 and 21:00 on May 6, 2020, audio is played with the user-set sound effect "super bass", since the set sound effect is within the saving duration. After 21:00 on May 6, 2020, the set sound effect is no longer within the saving duration, and the terminal device may obtain the user's preference sound effect, or the corresponding sound effect parameters, according to the information of the audio played by the user historically within the preset time period before 21:00 on May 6, 2020, and then play audio with the sound effect parameters corresponding to the preference sound effect.
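A minimal sketch of the saving-duration check, using the setting time and duration from the example above; the function name and the fallback comments are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Values from the example above: sound effect set at 21:00 on May 1, 2020, saving duration 5 days
set_time = datetime(2020, 5, 1, 21, 0)
saving_duration = timedelta(days=5)

def within_saving_duration(now: datetime) -> bool:
    """True while the user-set sound effect is still within its saving duration."""
    return now <= set_time + saving_duration

print(within_saving_duration(datetime(2020, 5, 3, 20, 0)))   # True  -> play with the set sound effect
print(within_saving_duration(datetime(2020, 5, 7, 9, 0)))    # False -> fall back to the history-based preference
```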
Consider a scenario in which the user sets the sound effect "super bass" and associates it with application 1. When the user uses other application programs, the terminal device does not play audio with "super bass", and the user's preferred sound effect may change over time. If the user forgets to turn off the sound effect on the setting page, the terminal device will still play audio with "super bass" whenever the user uses application 1, which troubles the user. In the embodiment of the application, because a saving duration is attached to the sound effect setting made by the user, the terminal device can adjust the sound effect parameters in time when the user's preference sound effect changes, and then play audio with the sound effect parameters corresponding to the preference sound effect. This manner is more intelligent, fits the user's needs better, and can improve user experience.
In the above embodiments, the sound effect parameter set shown in Table Three may be stored in the terminal device in advance, that is, preset in the terminal device. The following describes the process of acquiring the sound effect parameter set. Fig. 8 is a schematic flow chart illustrating a process of acquiring a sound effect parameter set according to an embodiment of the present disclosure. As shown in fig. 8, the method for acquiring a sound effect parameter set according to the embodiment of the present application may include:
S801, acquiring standard audio of a first sound effect and a first frequency response of the standard audio of the first sound effect.
It should be understood that this embodiment is described by taking a server as the execution subject that acquires the sound effect parameter set; the execution subject may also be another electronic device with computing capability, such as a computer or a terminal device. The standard audio of the first sound effect refers to the standard audio of each of the various sound effects, the various sound effects being those included in the sound effect parameter set. The standard audio of the first sound effect may be audio set in advance for the first sound effect, and it can serve as a basis for identifying whether other audio has the first sound effect.
In a possible implementation manner, the server may obtain standard audio corresponding to the first sound effect from the tag database. It should be understood that a large number of audios, and sound effect tags for each audio, may be included in the tag database. For example, the server may select the audio of the sound effect tag of the first sound effect as the standard audio of the first sound effect according to the sound effect tag in the tag database.
Although the standard audio of the first sound effect can be obtained in the above manner, the tag database may contain many audios under the same sound effect tag. To improve the reference accuracy of the standard audio of the first sound effect, in a possible implementation manner the server may input test audio into the sound effect classification scoring model to obtain a score indicating the degree to which the test audio belongs to the first sound effect. The test audio may be locally stored audio, audio crawled from the network, or audio recorded by developers. The server may take the test audio with the highest score as the standard audio of the first sound effect.
After obtaining the standard audio of the first sound effect, the server can send it to the terminal device. Alternatively, a developer can import the standard audio of the first sound effect into the terminal device, and the terminal device can play it to obtain a wav file of the standard audio of the first sound effect. The server can then use a simulation tool to obtain the first frequency response of the standard audio of the first sound effect from the wav file of the standard audio. The simulation tool may use a Fourier transform to convert the wav file of the standard audio into a frequency response curve, i.e. the first frequency response of the standard audio, as shown in fig. 9. It should be understood that a frequency response may take the form of a frequency response curve, and that the terminal device playing the standard audio may be a device in the test phase.
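One way such a frequency response could be computed from the wav file is sketched below with a Fourier transform, assuming Python with numpy and scipy; the file name is hypothetical, and the patent does not specify the simulation tool beyond its use of the Fourier transform.

```python
import numpy as np
from scipy.io import wavfile

def frequency_response(wav_path: str):
    """Convert a wav file into a magnitude frequency response curve via a Fourier transform."""
    rate, data = wavfile.read(wav_path)
    if data.ndim > 1:                         # mix a stereo file down to mono
        data = data.mean(axis=1)
    spectrum = np.fft.rfft(data)
    freqs = np.fft.rfftfreq(len(data), d=1.0 / rate)     # abscissa: frequency
    gains_db = 20 * np.log10(np.abs(spectrum) + 1e-12)   # ordinate: gain in dB (offset avoids log(0))
    return freqs, gains_db

# freqs, first_response = frequency_response("standard_audio_super_bass.wav")  # hypothetical file
```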
S802, adjusting the sound effect parameters, processing the standard audio of the first sound effect according to the adjusted sound effect parameters, and acquiring a second frequency response of the standard audio of the first sound effect.
The simulation tool includes a simulation module of the sound effect component shown in fig. 5, and the simulation module can simulate the DRC parameters of the DRC module, the EQ parameters of the EQ module, and the ANC parameters of the ANC module in the sound effect component. In the embodiment of the application, the server can continuously adjust the sound effect parameters in the simulation tool and process the standard audio of the first sound effect with the adjusted sound effect parameters. Specifically, the server may process the first frequency response with the adjusted sound effect parameters and thereby obtain a second frequency response of the standard audio of the first sound effect, so as to determine whether the second frequency response is close to the first frequency response, as shown in fig. 10.
It should be understood that, in the embodiment of the present application, the server may modify the sound effect parameters of each module in the simulation module. Optionally, the server may determine an adjustment sequence in the sound effect parameters of each module in the simulation module according to the priority of the parameters in the sound effect parameters. Illustratively, the priority of the parameters is from high to low, the EQ parameter, the DRC parameter and the ANC parameter. The server may first adjust the EQ parameters and leave the DRC parameters and ANC parameters unchanged. After the EQ parameters are adjusted within the preset adjustment range, the server may keep the EQ parameters and ANC parameters unchanged, and adjust the DRC parameters. After the DRC parameters are adjusted within the preset adjustment range, the server may adjust the ANC parameters while keeping the EQ parameters and DRC parameters unchanged. The simulation module can be used for processing the first frequency response once when the server adjusts the sound effect parameters in the simulation module once so as to obtain a second frequency response of the standard audio of the first sound effect. The server continuously adjusts the sound effect parameters, and then can obtain the second frequency response of the standard audio of the first sound effect corresponding to the multiple groups of sound effect parameters.
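The prioritized adjustment order can be sketched as follows. The one-dimensional candidate ranges and the toy simulate() function are assumptions standing in for the simulation module and for the real multi-valued EQ/DRC/ANC parameters.

```python
# Hypothetical, heavily reduced one-dimensional search ranges; the real EQ/DRC/ANC
# parameters are the vectors described for Table Three
EQ_RANGE  = [0.5, 1.0, 1.5]
DRC_RANGE = [0.8, 1.0]
ANC_RANGE = [0.0, 0.3]

def simulate(first_response, eq, drc, anc):
    """Stand-in for the simulation module: derive a second frequency response
    from the first one under the given parameters."""
    return [g * eq * drc + anc for g in first_response]

def sweep(first_response):
    """Adjust EQ first, then DRC, then ANC (priority from high to low),
    keeping the other parameters unchanged, and record every second response."""
    results = []
    for eq in EQ_RANGE:                                  # DRC and ANC unchanged
        results.append(((eq, 1.0, 0.0), simulate(first_response, eq, 1.0, 0.0)))
    for drc in DRC_RANGE:                                # EQ and ANC unchanged
        results.append(((1.0, drc, 0.0), simulate(first_response, 1.0, drc, 0.0)))
    for anc in ANC_RANGE:                                # EQ and DRC unchanged
        results.append(((1.0, 1.0, anc), simulate(first_response, 1.0, 1.0, anc)))
    return results

second_responses = sweep([1.0, 4.0, 6.0, 7.0, 8.0])      # example first response (gains)
```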
S803, taking the sound effect parameters corresponding to a second frequency response whose difference from the first frequency response is smaller than a preset difference as the sound effect parameters of the first sound effect, so as to obtain the sound effect parameter set.
The server can obtain the second frequency responses produced by processing the first frequency response with different sound effect parameters, and then obtain the difference between each second frequency response and the first frequency response. This difference represents how similar the sound effect of the standard audio played with the corresponding sound effect parameters is to the first sound effect: the smaller the difference, the closer that sound effect is to the first sound effect; the larger the difference, the farther it is from the first sound effect. In the embodiment of the application, the server can take the sound effect parameters corresponding to a second frequency response whose difference from the first frequency response is smaller than the preset difference as the sound effect parameters of the first sound effect. For different sound effects, the corresponding sound effect parameters can be obtained in the same way, thereby yielding the sound effect parameter set. Optionally, if there are multiple second frequency responses whose difference from the first frequency response is smaller than the preset difference, the sound effect parameters corresponding to the second frequency response with the smallest difference may be used as the sound effect parameters of the first sound effect. It should be noted that the preset difference may be predefined by developers.
It should be understood that the first frequency response and the second frequency response are both frequency response curves. In the embodiment of the application, the server may obtain the mean of the absolute values of the differences between the ordinates of the first frequency response and the second frequency response at the same abscissas, and take that mean as the difference between the first frequency response and the second frequency response. The abscissa of a frequency response curve is frequency and the ordinate is gain. Illustratively, if the ordinate values on the first frequency response curve at certain frequencies are [1, 4, 6, 7, 8] and the ordinate values on the second frequency response curve at the same frequencies are [3, 2, 4, 5, 6], the difference between the first frequency response and the second frequency response is the mean of the absolute values of the gain differences, i.e. 2.
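This difference measure is a mean absolute error over the sampled gains, as the worked example shows; a minimal sketch:

```python
import numpy as np

def response_difference(first_response, second_response):
    """Mean of the absolute gain differences at the same frequencies."""
    a = np.asarray(first_response, dtype=float)
    b = np.asarray(second_response, dtype=float)
    return float(np.mean(np.abs(a - b)))

# The worked example from the text: the difference is 2
print(response_difference([1, 4, 6, 7, 8], [3, 2, 4, 5, 6]))   # 2.0
```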
It should be understood that, after acquiring the sound-effect parameter set, the server may preset the sound-effect parameter set in the terminal device, for example, the sound-effect parameter set shown in table three above may be stored in a memory of the terminal device.
Fig. 11 is another schematic flow chart illustrating obtaining a sound-effect parameter set according to an embodiment of the present disclosure. As shown in fig. 11, the method for acquiring a sound effect parameter set according to the embodiment of the present application may include:
S1101, randomly generating a plurality of groups of sound effect parameters, and inputting the plurality of groups of sound effect parameters into the sound effect classification scoring model to obtain the score of each group of sound effect parameters belonging to the first sound effect.
Each group of sound effect parameters may include DRC parameters, EQ parameters, and ANC parameters, and at least one sound effect parameter differs between different groups. The server can input the multiple groups of sound effect parameters into the sound effect classification scoring model, and the model can output, for each group, the score of that group belonging to the first sound effect. The higher the score, the closer the sound effect of audio played with that group of parameters is to the first sound effect. It should be understood that the first sound effect is used to characterize each of the various sound effects in turn.
Illustratively, suppose one group of sound effect parameters randomly generated by the server is "DRC parameters: [2,2000,2.1,0.8,1000,1.1,10,0.1]; EQ parameters: [(2,1000,2.1,3.5), (3,1200,2.4,3.6), (2,1800,2.1,3.5), (1,800,0.1,3.5), (2,500,4.9,1.5), (0,1788,2.3,3.2), (2,3000,-2.8,3.5), (2,5000,2.9,3.5)]; ANC parameters: [(4,43.0,7.5,4630,0.0), (3,0,4,1200,0), (4,22.5,1.5,8540,0.0), (3,0,4,1200,0), (4,-56.0,6.0,8820,0.0), (3,0,4,1200,0), (4,-23.5,3.5,15030,0.0), (3,0,4,1200,0), (2,-42.5,7.0,15700,0.0), (3,0,4,1200,0), (4,11.5,8.0,8890,0.0), (3,0,4,1200,0), (4,-1.5,4.0,15210,0.0), (3,0,4,1200,0), (4,-11.0,6.0,2530,0.0), (3,0,4,1200,0)]". The sound effect classification scoring model may output a score vector such as (0.72, 0.05, 0.06, ...), where each value represents the score of this group of sound effect parameters belonging to one sound effect: the score of the group belonging to the sound effect "super bass" is 0.72, the score of the group belonging to the sound effect "clear voice" is 0.05, and so on. In this example, the score for "super bass" is the highest, so the sound effect corresponding to this group of sound effect parameters is closest to the sound effect "super bass".
S1102, taking the group of sound effect parameters with the highest score of belonging to the first sound effect as the sound effect parameters of the first sound effect, so as to obtain the sound effect parameter set.
In the embodiment of the application, the server can obtain the group of sound effect parameters with the highest score of belonging to the first sound effect and take that group as the sound effect parameters of the first sound effect. For example, if the highest score among the groups of sound effect parameters for the sound effect "super bass" is 0.98, the group corresponding to 0.98 is taken as the sound effect parameters of "super bass". In this way, the server can obtain the sound effect parameter set.
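Putting S1101 and S1102 together, the random-search procedure might look as follows; the parameter ranges, the label list, and the placeholder scoring function (a stand-in for the trained sound effect classification scoring model) are all assumptions.

```python
import random

SOUND_EFFECTS = ["super bass", "clear voice", "3D surround"]   # illustrative label order

def random_param_group():
    """Randomly generate one group of (abbreviated) DRC/EQ/ANC parameters."""
    return {
        "drc": [random.uniform(0.5, 3.0) for _ in range(8)],
        "eq":  [random.uniform(-6.0, 6.0) for _ in range(8)],
        "anc": [random.uniform(-6.0, 6.0) for _ in range(16)],
    }

def score_group(group):
    """Placeholder for the trained sound effect classification scoring model:
    returns one score per sound effect (here random, ignoring the group's content)."""
    scores = [random.random() for _ in SOUND_EFFECTS]
    total = sum(scores)
    return [s / total for s in scores]

def best_group_for(effect: str, n_groups: int = 100):
    """S1101/S1102: generate groups, score them, keep the one scoring highest for the target effect."""
    idx = SOUND_EFFECTS.index(effect)
    groups = [random_param_group() for _ in range(n_groups)]
    return max(groups, key=lambda g: score_group(g)[idx])

params_super_bass = best_group_for("super bass")
```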
Compared with the method shown in fig. 8, the method of acquiring the sound effect parameter set shown in fig. 11 does not require the server to obtain the standard audio corresponding to the first sound effect in advance. That is to say, even when the server cannot acquire the standard audio corresponding to the first sound effect, it can still obtain the sound effect parameters corresponding to each sound effect from randomly generated sound effect parameters, so the method shown in fig. 11 has wider applicability.
In the embodiment of the present application, an execution subject for executing the audio processing method may be a terminal device, a chip or a processor in the terminal device, or the like. It should be understood that the terminal device in the embodiment of the present application may be referred to as a User Equipment (UE), a mobile terminal (mobile terminal), a terminal (terminal), and the like. The terminal device may be a Personal Digital Assistant (PDA), a handheld device with a wireless communication function, a computing device, a vehicle-mounted device, or a wearable device, a Virtual Reality (VR) terminal device, an Augmented Reality (AR) terminal device, a wireless terminal in industrial control (industrial control), a wireless terminal in self driving (self driving), a wireless terminal in smart city (smart city), a wireless terminal in smart home (smart home), or the like. The form of the terminal device is not particularly limited in the embodiment of the present application.
Fig. 12 is another schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 12, the terminal apparatus 1200 may include: processor 1210, memory 1220, communications module 1230, display screen 1240, sensors 1250, audio module 1260. It is to be understood that the structure illustrated in fig. 12 does not constitute a specific limitation of the terminal apparatus 1200. In other embodiments of the present application, terminal device 1200 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware. The interface connection relationship between the modules in the embodiment of the present application is only schematically illustrated, and does not limit the structure of the terminal device 1200. In other embodiments of the present application, the terminal device 1200 may also adopt different interface connection manners or a combination of multiple interface connection manners in the foregoing embodiments.
Processor 1210 may include one or more processing units, such as: processor 1210 may include an Application Processor (AP), a Digital Signal Processor (DSP), a Display Processing Unit (DPU), and/or a neural-Network Processing Unit (NPU), among others. The different processing units may be separate devices or may be integrated into one or more processors. In some embodiments, terminal device 1200 can also include one or more processors 1210. The processor may be, among other things, the neural center and the command center of the terminal device 1200. In some embodiments, processor 1210 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, and/or a Universal Serial Bus (USB) interface, and/or the like. The USB interface is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface may be used to connect a charger to charge the terminal device 1200, and may also be used to transmit data between the terminal device 1200 and a peripheral device. And the earphone can also be used for connecting an earphone and playing audio through the earphone.
Memory 1220 may be used to store one or more computer programs, including instructions. Processor 1210 may cause terminal device 1200 to execute various functional applications, data processing, and the like by executing the above-described instructions stored in memory 1220. The memory 1220 may include a program storage area and a data storage area. Wherein, the storage program area can store an operating system; the storage program area may also store one or more application programs (e.g., a gallery, contacts, etc.), and the like. In some embodiments, processor 1210 may cause terminal device 1200 to perform various functional applications and data processing by executing instructions stored in memory 1220 and/or instructions stored in a memory disposed in processor 1210.
The communication module 1230 may provide communication modules including 2G/3G/4G/5G, etc. applied to the terminal device 1200, and/or communication modules including Wireless Local Area Networks (WLAN), bluetooth, Global Navigation Satellite System (GNSS), Frequency Modulation (FM), NFC, infrared technology (IR), etc. applied to the terminal device 1200. The communication module 1230 is used for implementing communication between the terminal device 1200 and other devices.
The terminal device 1200 can implement a display function by a Graphics Processing Unit (GPU), a display screen 1240, an application processor, and the like. The GPU may interface a display screen 1240 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 1210 may include one or more GPUs that execute instructions to generate or change display information.
The display screen 1240 is used to display images, video, and the like. The display screen 1240 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), and the like. In some embodiments, the terminal device 1200 may include 1 or N display screens 1240, N being a positive integer greater than 1.
The sensors 1250 may include a pressure sensor 1250A, a gyro sensor 1250B, an acceleration sensor 1250C, a distance sensor 1250D, a fingerprint sensor 1250E, a touch sensor 1250F, and the like.
The terminal device 1200 can implement audio functions, such as music playing and recording, through the audio module 1260, the speaker 1260A, the receiver 1260B, the microphone 1260C, the earphone interface 1260D, and the application processor. The audio module 1260 is used to convert digital audio information into an analog audio signal for output, and also to convert an analog audio input into a digital audio signal. The audio module 1260 may also be used to encode and decode audio signals. In some embodiments, the audio module 1260 may be disposed in the processor 1210, or some functional modules of the audio module 1260 may be disposed in the processor 1210. The speaker 1260A, also called a "horn", is used to convert an audio electrical signal into a sound signal; the terminal device 1200 can play music or take a hands-free call through the speaker 1260A. The receiver 1260B, also called an "earpiece", is used to convert an audio electrical signal into a sound signal; when the terminal device 1200 receives a call or voice information, the voice can be heard by bringing the receiver 1260B close to the ear. The microphone 1260C, also called a "mike", converts a sound signal into an electrical signal; when making a call or sending voice information, the user can input a sound signal into the microphone 1260C by speaking close to it. The terminal device 1200 may be provided with at least one microphone 1260C. In other embodiments, the terminal device 1200 may be provided with two microphones 1260C to implement a noise reduction function in addition to collecting sound signals, or with three, four, or more microphones 1260C to collect sound signals, reduce noise, identify sound sources, perform directional recording, and so on. The earphone interface 1260D is used to connect a wired earphone; it may be the USB interface, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The term "plurality" in the embodiments of the present application means two or more. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship; in the formula, the character "/" indicates that the preceding and following related objects are in a relationship of "division".
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for convenience of description and distinction and are not intended to limit the scope of the embodiments of the present application. It should be understood that, in the embodiment of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

Claims (25)

1. An audio processing method, comprising:
receiving an audio playing request input by a user, and acquiring sound effect parameters according to sound effect setting information or historical audio playing information of the user, wherein the audio playing request is used for requesting to play audio;
and playing the audio by adopting the sound effect parameters.
2. The method of claim 1, wherein before obtaining the sound effect parameters according to the setting information of the sound effect or the information of the audio played by the user history, the method further comprises:
determining whether the user sets the sound effect or not according to the setting information of the sound effect;
the method for acquiring sound effect parameters according to the setting information of the sound effect or the information of the audio played by the user history comprises the following steps:
if the sound effect set by the user is determined, obtaining sound effect parameters corresponding to the set sound effect according to the setting information of the sound effect;
and if the user is determined not to set the sound effect, obtaining sound effect parameters corresponding to the sound effect preferred by the user according to the information of the user historical playing audio within a preset time period.
3. The method of claim 2, wherein the setting information of the sound effects comprises the set sound effects and at least one first application program associated with the set sound effects, and the determining whether the user has set sound effects according to the setting information of the sound effects comprises:
determining whether the at least one first application includes an application for which the user requests to play audio;
if yes, determining that the sound effect is set by the user.
4. A method according to claim 2 or 3, characterized in that the method further comprises:
and if the setting information of the sound effect exceeds the storage time length, obtaining sound effect parameters corresponding to the sound effect preferred by the user according to the information of the user history playing audio in the preset time period.
5. The method according to any one of claims 2-4, wherein the obtaining of sound effect parameters corresponding to the set sound effects according to the setting information of the sound effects comprises:
according to the sound effect parameter set and the set sound effect, obtaining the sound effect parameters corresponding to the set sound effect, wherein the sound effect parameter set comprises the sound effect parameters corresponding to the sound effects.
6. The method according to any one of claims 2 to 4, wherein the obtaining of sound effect parameters corresponding to the sound effect preferred by the user according to the information of the user history playing audio in the preset time period comprises:
acquiring the preference sound effect of the user according to the information of the user history playing audio;
according to the sound effect parameter set and the preference sound effect of the user, obtaining the sound effect parameters corresponding to the preference sound effect, wherein the sound effect parameter set comprises the sound effect parameters corresponding to the sound effects.
7. The method according to claim 6, wherein the information of the user history playing audio is user history playing audio, and the obtaining of the user preference sound effect according to the information of the user history playing audio comprises:
and inputting the historical playing audio of the user into a sound effect prediction model to obtain the preference sound effect of the user.
8. The method according to claim 6, wherein the information of the user history playing audio is a sound effect label of the user history playing audio, the sound effect label is used for representing a sound effect, and the obtaining of the user preference sound effect according to the information of the user history playing audio comprises:
and taking the sound effect corresponding to the sound effect label with the largest quantity as the preference sound effect of the user.
9. The method of claim 8, further comprising:
and collecting the historical playing audio of the user, inputting the historical playing audio of the user to a sound effect recognition model, and obtaining a sound effect label of the historical playing audio of the user.
10. The method according to any one of claims 1 to 4, wherein the information of the user history playing audio is the user history playing audio, and the obtaining the sound effect parameter according to the information of the user history playing audio comprises:
and inputting the historical playing audio of the user into a sound effect parameter prediction model, and acquiring sound effect parameters corresponding to the sound effect preferred by the user.
11. The method according to any one of claims 1-10, wherein before playing the audio using the sound-effect parameters, further comprising:
modifying the current sound effect parameter into the sound effect parameter; or,
and selecting the sound effect parameters from the preset multiple groups of sound effect parameters, wherein each group of sound effect parameters corresponds to one sound effect.
12. The method according to any of claims 1-11, wherein the sound-effect parameters comprise at least one of: dynamic range control DRC parameters, equalizer EQ parameters, active noise reduction ANC parameters.
13. An electronic device for playing audio, comprising:
the electronic equipment is used for receiving an audio playing request input by a user and acquiring a sound effect parameter according to sound effect setting information or user history audio playing information, wherein the audio playing request is used for requesting to play audio;
and the sound effect component is used for adopting the sound effect parameters to play the audio.
14. The electronic device of claim 13,
wherein the electronic device is further configured to determine whether the user has set the sound effect according to the setting information of the sound effect;
the method comprises the steps that specifically, if the sound effect set by the user is determined, sound effect parameters corresponding to the set sound effect are obtained according to the setting information of the sound effect; and if the user is determined not to set the sound effect, obtaining sound effect parameters corresponding to the sound effect preferred by the user according to the information of the user historical playing audio within a preset time period.
15. The electronic device according to claim 14, wherein the setting information of the sound effects comprises the set sound effects and at least one first application program associated with the set sound effects;
and the electronic device is specifically configured to determine whether the at least one first application program includes the application program in which the user requests to play audio, and if so, determine that the sound effect has been set by the user.
16. The electronic device of claim 14 or 15,
wherein the electronic device is further configured to: if the setting information of the sound effect exceeds the storage time length, obtain sound effect parameters corresponding to the sound effect preferred by the user according to the information of the user history playing audio in the preset time period.
17. The electronic device of any one of claims 14-16,
the sound effect parameter setting method comprises the steps of specifically acquiring sound effect parameters corresponding to set sound effects according to a sound effect parameter set and the set sound effects, and including sound effect parameters corresponding to the sound effects in the sound effect parameter set.
18. The electronic device of any of claims 14-16,
the method is specifically used for acquiring the preference sound effect of the user according to the information of the user history playing audio; according to the sound effect parameter set and the preference sound effect of the user, obtaining the sound effect parameters corresponding to the preference sound effect, wherein the sound effect parameter set comprises the sound effect parameters corresponding to the sound effects.
19. The electronic device of claim 18, wherein the information of the user history playing audio is the user history playing audio;
the method is specifically used for inputting the user historical playing audio into a sound effect prediction model to obtain the preference sound effect of the user.
20. The electronic device of claim 18, wherein the information of the user history playing audio is an audio effect tag of the user history playing audio, and the audio effect tag is used for representing an audio effect;
wherein the electronic device is specifically configured to take the sound effect corresponding to the audio effect tag with the largest number as the preference sound effect of the user.
21. The electronic device of claim 20,
wherein the electronic device is further configured to collect the user history playing audio, input the user history playing audio into a sound effect recognition model, and obtain the audio effect tag of the user history playing audio.
22. The electronic device of any of claims 13-16, wherein the information of the user historically played audio is user historically played audio;
wherein the electronic device is further configured to input the user history playing audio into a sound effect parameter prediction model to obtain sound effect parameters corresponding to the preference sound effect of the user.
23. The electronic device of any one of claims 13-22,
the system is also used for modifying the current sound effect parameter into the sound effect parameter; or selecting the sound effect parameters from multiple preset groups of sound effect parameters, wherein each group of sound effect parameters corresponds to one sound effect.
24. The electronic device of any of claims 13-23 wherein the sound assembly comprises at least one of: the system comprises a dynamic range control DRC module, an equalizer EQ module and an active noise reduction ANC module, wherein the sound effect parameter of the DRC module is a DRC parameter, the sound effect parameter of the EQ module is an EQ parameter, and the sound effect parameter of the ANC module is an ANC parameter.
25. A computer-readable storage medium having stored thereon computer instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1-12.
CN202011331956.1A 2020-11-24 2020-11-24 Audio processing method, electronic device, and readable storage medium Active CN114546325B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011331956.1A CN114546325B (en) 2020-11-24 Audio processing method, electronic device, and readable storage medium
PCT/CN2021/131621 WO2022111381A1 (en) 2020-11-24 2021-11-19 Audio processing method, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011331956.1A CN114546325B (en) 2020-11-24 Audio processing method, electronic device, and readable storage medium

Publications (2)

Publication Number Publication Date
CN114546325A true CN114546325A (en) 2022-05-27
CN114546325B CN114546325B (en) 2024-04-16



Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1976114A1 (en) * 2007-03-13 2008-10-01 Vestel Elektronik Sanayi ve Ticaret A.S. Automatic equalizer adjustment method
CN105959483A (en) * 2016-06-16 2016-09-21 广东欧珀移动通信有限公司 Audio stream processing method and mobile terminal
CN106488311A (en) * 2016-11-09 2017-03-08 微鲸科技有限公司 Audio method of adjustment and user terminal
CN108989871A (en) * 2018-06-27 2018-12-11 广州视源电子科技股份有限公司 Parameter adjusting method, device, readable storage medium storing program for executing and video playback apparatus
CN109271128A (en) * 2018-09-04 2019-01-25 Oppo广东移动通信有限公司 Audio setting method, device, electronic equipment and storage medium
CN111556198A (en) * 2020-04-24 2020-08-18 深圳传音控股股份有限公司 Sound effect control method, terminal equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1976114A1 (en) * 2007-03-13 2008-10-01 Vestel Elektronik Sanayi ve Ticaret A.S. Automatic equalizer adjustment method
CN105959483A (en) * 2016-06-16 2016-09-21 广东欧珀移动通信有限公司 Audio stream processing method and mobile terminal
CN106488311A (en) * 2016-11-09 2017-03-08 微鲸科技有限公司 Audio method of adjustment and user terminal
CN108989871A (en) * 2018-06-27 2018-12-11 广州视源电子科技股份有限公司 Parameter adjusting method, device, readable storage medium storing program for executing and video playback apparatus
CN109271128A (en) * 2018-09-04 2019-01-25 Oppo广东移动通信有限公司 Audio setting method, device, electronic equipment and storage medium
CN111556198A (en) * 2020-04-24 2020-08-18 深圳传音控股股份有限公司 Sound effect control method, terminal equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116743913A (en) * 2022-09-02 2023-09-12 荣耀终端有限公司 Audio processing method and device
CN116743913B (en) * 2022-09-02 2024-03-19 荣耀终端有限公司 Audio processing method and device
CN116453492A (en) * 2023-06-16 2023-07-18 成都小唱科技有限公司 Method and device for switching jukebox airport scenes, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2022111381A1 (en) 2022-06-02

Similar Documents

Publication Publication Date Title
CN110870201B (en) Audio signal adjusting method, device, storage medium and terminal
CN107509153B (en) Detection method and device of sound playing device, storage medium and terminal
CN108845673B (en) Sound-to-haptic effect conversion system using mapping
US20170316718A1 (en) Converting Audio to Haptic Feedback in an Electronic Device
US11514923B2 (en) Method and device for processing music file, terminal and storage medium
CN103440862A (en) Method, device and equipment for synthesizing voice and music
CN113823250B (en) Audio playing method, device, terminal and storage medium
CN101271722A (en) Music broadcasting method and device
US11133024B2 (en) Biometric personalized audio processing system
CN108449506A (en) Voice communication data processing method, device, storage medium and mobile terminal
US11611840B2 (en) Three-dimensional audio systems
CN107371102A (en) Control method, device and the storage medium and mobile terminal of audio broadcast sound volume
CN108449502A (en) Voice communication data processing method, device, storage medium and mobile terminal
CN114245271A (en) Audio signal processing method and electronic equipment
WO2020228226A1 (en) Instrumental music detection method and apparatus, and storage medium
CN107483732A (en) Method for controlling volume, device and the storage medium and mobile terminal of mobile terminal
CN114546325B (en) Audio processing method, electronic device, and readable storage medium
CN114546325A (en) Audio processing method, electronic device and readable storage medium
CN114501297A (en) Audio processing method and electronic equipment
CN112307161B (en) Method and apparatus for playing audio
CN115273808A (en) Sound processing method, storage medium and electronic device
CN112748897A (en) Volume debugging method, device and equipment for vehicle-mounted system
KR102650763B1 (en) Psychoacoustic enhancement based on audio source directivity
WO2022228174A1 (en) Rendering method and related device
CN117014539B (en) Volume adjusting method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant