CN114546325B - Audio processing method, electronic device, and readable storage medium - Google Patents


Publication number
CN114546325B
CN114546325B (application CN202011331956.1A)
Authority
CN
China
Prior art keywords
sound effect
audio
user
sound
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011331956.1A
Other languages
Chinese (zh)
Other versions
CN114546325A (en)
Inventor
苏霞
林宇轩
陈翼翼
张晓玲
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202011331956.1A
Priority to PCT/CN2021/131621 (WO2022111381A1)
Publication of CN114546325A
Application granted
Publication of CN114546325B
Legal status: Active

Classifications

    • G06F 3/16 Sound input; sound output (under G06F 3/00, input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements)
    • G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology; G06N 3/045 Combinations of networks; G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs; G06N 3/08 Learning methods (under G06N 3/00, computing arrangements based on biological models)
    • G11B 20/10018 Analog processing for digital recording or reproduction (under G11B 20/00, signal processing not specific to the method of recording or reproducing; G11B 20/10 digital recording or reproducing; G11B 20/10009 improvement or modification of read or write signals)

Abstract

An embodiment of the application provides an audio processing method, an electronic device, and a readable storage medium. The method includes: receiving an audio playing request input by a user, where the audio playing request is used to request audio playback; acquiring sound effect parameters according to sound effect setting information or the user's historical audio playing information; and playing the audio with the acquired sound effect parameters. With this method, the terminal device can play audio with different sound effects, meeting users' demand for diverse audio sound effects and improving user experience.

Description

Audio processing method, electronic device, and readable storage medium
Technical Field
Embodiments of the present application relate to audio processing technology, and in particular, to an audio processing method, an electronic device, and a readable storage medium.
Background
With the increasing intelligence of terminal devices, a user can use a terminal device as a learning machine, a game machine, an audio/video player, and so on. When the terminal device plays audio, the sound effect of the audio is the style of the audio as perceived by the user.
At present, a terminal device plays audio with one fixed sound effect, so the only sound effect the user perceives is that fixed one. This cannot meet users' demand for diverse audio sound effects.
Disclosure of Invention
The embodiment of the application provides an audio processing method, electronic equipment and a readable storage medium, which can adopt different sound effects to play audio and improve user experience.
In a first aspect, an embodiment of the present application provides an audio processing method. The execution body of the method may be a terminal device or a chip in the terminal device; the following description takes the terminal device as the execution body. In the audio processing method, when receiving an audio playing request input by a user, the terminal device acquires sound effect parameters according to sound effect setting information or information of the user's audio playing history, where the audio playing request is used to request audio playback. On one hand, the user can set sound effects on the terminal device, and the terminal device can store the setting information of the sound effects. It should be understood that the setting information may include the sound effect set by the user and information such as the setting time. On the other hand, the information of the user's historically played audio may include the audio itself, or a sound effect tag corresponding to each historically played audio, where the sound effect tag is used to represent a sound effect.
In the embodiment of the application, the terminal device can acquire the sound effect parameters of the played audio according to the sound effect setting information or the information of the user's historically played audio, and then play the audio with those parameters. The sound effect parameters may include at least one of: dynamic range control (DRC) parameters, equalizer (EQ) parameters, and active noise cancellation (ANC) parameters.
In the embodiment of the application, on one hand, the user can set different sound effects, so the terminal device can acquire the parameters corresponding to different sound effects. On the other hand, the user's historically played audio can correspond to different sound effects, so the terminal device can also acquire the parameters corresponding to different sound effects from the information of that audio. Combining the two aspects, the terminal device can play audio with parameters corresponding to different sound effects, so that audio can be played with different sound effects, meeting users' demand for sound effect diversity and improving user experience. In addition, because the terminal device takes into account the user's own sound effect settings or listening history, audio can be played with the sound effect the user prefers, further improving user experience.
In one possible implementation, before acquiring the sound effect parameters according to the setting information or the information of the user's historically played audio, the terminal device may also determine, from the setting information, whether the user has set a sound effect.
In one implementation, because the setting information may include the sound effect set by the user and the setting time, the terminal device may determine whether the user has set a sound effect according to the current time and the setting times. If the setting time closest to the current time corresponds to a user-set sound effect, it is determined that the user has set a sound effect; if it does not, it is determined that the user has not.
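The decision just described ("does the most recent setting correspond to a user-set sound effect?") can be sketched in a few lines. This is an illustrative sketch only, not the patent's implementation; the record structure and field names are assumptions.

```python
from datetime import datetime

# Hypothetical record of the user's sound effect settings. Each entry stores
# the effect chosen (None stands for the user choosing "no sound effect")
# and the time it was set; the field names are illustrative only.
setting_history = [
    {"sound_effect": "super bass", "set_time": datetime(2021, 3, 1, 9, 0)},
    {"sound_effect": None,         "set_time": datetime(2021, 3, 5, 20, 0)},
]

def user_has_set_sound_effect(history):
    # The setting time closest to the current time decides: if that entry
    # corresponds to a user-set effect, the user is considered to have set one.
    if not history:
        return False
    latest = max(history, key=lambda rec: rec["set_time"])
    return latest["sound_effect"] is not None
```

With the records above, the most recent setting is "none", so `user_has_set_sound_effect(setting_history)` returns `False`.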
In another implementation, the setting information includes the most recently set sound effect. If the most recently set sound effect is "none", it is determined that the user has not set a sound effect; if it is any one of the preset sound effects, it is determined that the user has set one. It should be appreciated that in this implementation the preset sound effects may be "super bass", "clear human voice", "warm and soft", and "clear melody". The sound effect set by the user may be any one of the preset sound effects, or the user may set the sound effect to "none".
In another implementation, when setting a sound effect the user may associate it with at least one first application program. The terminal device may determine which application program the user's audio playing request comes from; if the at least one first application program includes that application program, it is determined that the user has set a sound effect, and otherwise that the user has not.
If it is determined that the user has set a sound effect, the terminal device acquires the sound effect parameters corresponding to the set sound effect from the setting information; if it is determined that the user has not, it acquires the sound effect parameters corresponding to the user's preferred sound effect according to the information of the user's historically played audio within a preset time period. In one possible implementation, the set sound effect may itself be treated as the user's preferred sound effect. To distinguish the preferred sound effect obtained from the setting information from the one obtained from the user's listening history within the preset time period, the former is described as the set sound effect.
In the embodiment of the application, the terminal device can determine whether the user has set a sound effect according to the setting information, and then acquire the corresponding sound effect parameters in different ways, which better meets the user's needs and improves user experience. For example, if the user has set a sound effect, that sound effect is the one the user wants, so the corresponding parameters are acquired from the setting information. If the user has not set one, there is no specific requirement; in that case the terminal device can predict the parameters corresponding to the user's preferred sound effect from the user's listening history within the preset time period, which can also improve user experience.
It should be noted that in the embodiment of the present application, the user's preferred sound effect may be obtained from the information of the user's historically played audio within the preset time period, so the sound effect can follow changes in the user's preference at any time, which is more intelligent.
In one possible implementation, the terminal device may set a storage duration for the sound effect setting information, where the storage duration is a period of time starting from when the user set the sound effect. While the setting information is within the storage duration, the terminal device can acquire the sound effect parameters corresponding to the set sound effect from the setting information. However, if the setting information exceeds the storage duration, the terminal device acquires the user's preferred sound effect, or the corresponding sound effect parameters, according to the information of the user's historically played audio within the preset time period. It should be appreciated that the storage duration may be predefined or set by the user when setting the sound effect.
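The storage-duration check described above can be sketched as follows. The function name, record layout, and 30-day duration are hypothetical; the patent only says the duration may be predefined or user-chosen.

```python
from datetime import datetime, timedelta

def setting_expired(set_time, storage_duration, now):
    # Within the storage duration: use the set sound effect's parameters.
    # Beyond it: fall back to the user's listening history instead.
    return now - set_time > storage_duration

set_time = datetime(2021, 3, 1, 9, 0)
storage_duration = timedelta(days=30)  # predefined or chosen by the user
```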
In the embodiment of the application, by setting a storage duration for the user-set sound effect, the terminal device can adjust the sound effect parameters in time when the user's preference changes, and then play audio with the parameters corresponding to the preferred sound effect. This approach is more intelligent, fits the user's needs more closely, and can improve user experience.
The process of obtaining sound effect parameters by the terminal device is described below from the following possible implementation manners:
First mode: the terminal device acquires the sound effect parameters according to the sound effect setting information. The setting information includes the set sound effect, and the terminal device can acquire the corresponding parameters according to the sound effect parameter set and the set sound effect. In this embodiment, the terminal device may take, from the sound effect parameter set, the parameters corresponding to the sound effect that is the same as the set sound effect as the parameters of the set sound effect.
Second mode: the terminal device acquires sound effect parameters according to the information of the user's historically played audio, where that information is the audio played within a preset time period. In the embodiment of the application, the terminal device may input the user's historically played audio into a sound effect prediction model to obtain the user's preferred sound effect, and then acquire the corresponding parameters according to the sound effect parameter set and the preferred sound effect. Here the terminal device may take, from the sound effect parameter set, the parameters corresponding to the sound effect that is the same as the preferred sound effect as the parameters of the preferred sound effect.
Third mode: the terminal device acquires sound effect parameters according to the information of the user's historically played audio, where that information is the sound effect tags of the audio played within a preset time period, and a sound effect tag is used to represent a sound effect. It should be understood that the terminal device may collect the user's historically played audio and input it into a sound effect recognition model to obtain the sound effect tags. In the embodiment of the application, the terminal device can take the sound effect corresponding to the most frequent sound effect tag as the user's preferred sound effect, and then acquire the corresponding parameters according to the sound effect parameter set and the preferred sound effect; for this step, refer to the related description in the second mode.
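The third mode's "most frequent tag wins" step can be illustrated as follows. The tag strings and parameter values below are invented for illustration; they are not from the patent.

```python
from collections import Counter

# Hypothetical sound effect tags produced by the recognition model for the
# user's historically played audio within the preset time period.
history_tags = ["super bass", "clear human voice", "super bass",
                "clear melody", "super bass"]

# Hypothetical sound effect parameter set: one parameter group per effect
# (values are placeholders).
sound_effect_params = {
    "super bass":        {"eq_low_gain_db": 6.0,  "drc_ratio": 4.0},
    "clear human voice": {"eq_low_gain_db": -2.0, "drc_ratio": 2.0},
    "clear melody":      {"eq_low_gain_db": 0.0,  "drc_ratio": 1.5},
}

# The effect whose tag occurs most often is taken as the preferred effect,
# and its parameter group is looked up in the parameter set.
preferred = Counter(history_tags).most_common(1)[0][0]
params = sound_effect_params[preferred]
```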
Fourth mode: the terminal device acquires sound effect parameters according to the information of the user's historically played audio, where that information is the audio played within a preset time period. In this embodiment, the terminal device may input the user's historically played audio into a sound effect parameter prediction model to directly obtain the parameters corresponding to the user's preferred sound effect. Compared with the first three modes, the terminal device does not need to first obtain the preferred sound effect itself, which can further improve the audio processing rate.
After acquiring the sound effect parameters in any of the above modes, the terminal device can either modify the current sound effect parameters to the acquired ones, or select the acquired parameters from multiple preset groups of sound effect parameters, where each preset group corresponds to one sound effect. The terminal device then plays the audio with the sound effect parameters.
In the first to third modes, the sound effect parameter set used by the terminal device is preset in it. The following describes how the sound effect parameter set is obtained, taking a server as the execution body:
the first way is: the server acquires standard audio of the first sound effect and first frequency response of the standard audio of the first sound effect. Wherein the first sound effect is each sound effect. The standard audio of the first sound effect may be used as a basis for identifying whether other audio is the first sound effect. The server can adopt Fourier transformation in the simulation tool to convert the wav file of the standard audio of the first sound effect into a frequency response curve, so as to obtain the first frequency response of the standard audio of the first sound effect. The server may employ a simulation tool to simulate DRC, EQ, and ANC modules in the terminal device to generate DRC, EQ, and ANC parameters with different sound effect parameters. The server can continuously adjust the sound effect parameters in the simulation tool, and further process the standard audio of the first sound effect with the adjusted sound effect parameters to obtain the second frequency response of the standard audio of the first sound effect. The server obtains the difference value between the first frequency response and the second frequency response of the standard audio of the first sound effect, and further takes the sound effect parameter corresponding to the second frequency response of which the difference value of the first frequency response is smaller than the preset difference value as the sound effect parameter of the first sound effect to obtain a sound effect parameter set.
Second mode: the server may randomly generate multiple groups of sound effect parameters and input them into a sound effect classification scoring model to obtain, for each group, a score of belonging to the first sound effect. The server takes the group with the highest score for the first sound effect as the parameters of the first sound effect, thereby obtaining the sound effect parameter set.
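The selection step of the second mode reduces to a maximum over model scores. The candidate parameter groups and scores below are invented placeholders standing in for the scoring model's output for one target sound effect.

```python
# Hypothetical (parameter group, score) pairs: the scores stand in for the
# sound effect classification scoring model's output for one target effect.
scored_candidates = [
    ({"drc_ratio": 1.5, "eq_low_gain_db": 0.0}, 0.21),
    ({"drc_ratio": 4.0, "eq_low_gain_db": 6.0}, 0.93),
    ({"drc_ratio": 2.0, "eq_low_gain_db": 3.0}, 0.58),
]

# The group with the highest score of belonging to the first sound effect
# becomes that effect's entry in the sound effect parameter set.
best_params, best_score = max(scored_candidates, key=lambda c: c[1])
```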
Both modes can produce the sound effect parameter set. Compared with the first mode, the second mode does not require the server to obtain standard audio for the first sound effect in advance. It should be understood that, when the server cannot obtain the standard audio corresponding to the first sound effect, it can still derive the corresponding sound effect parameters from randomly generated ones, so the second mode has wider applicability.
In a second aspect, embodiments of the present application provide an electronic device for playing audio, the electronic device including a sound effect component. The electronic device is configured to receive an audio playing request input by a user and acquire sound effect parameters according to sound effect setting information or the user's historical audio playing information, where the audio playing request is used to request audio playback; and the sound effect component is configured to play the audio with the sound effect parameters.
In a possible implementation manner, the electronic device is further configured to determine, according to the setting information of the sound effect, whether the user has set the sound effect; the method is particularly used for acquiring sound effect parameters corresponding to the set sound effect according to the setting information of the sound effect if the sound effect set by the user is determined; if the user is determined not to set the sound effect, obtaining sound effect parameters corresponding to the preference sound effect of the user according to the information of the user history playing audio in the preset time period.
In one possible implementation, the setting information of the sound effect includes the set sound effect and at least one first application program associated with it; the electronic device is specifically configured to determine whether the at least one first application program includes the application program through which the user requests audio playback, and if so, to determine that the user has set a sound effect.
In one possible implementation, the electronic device is further configured to, if it determines that the sound effect setting information exceeds the storage duration, acquire the sound effect parameters corresponding to the user's preferred sound effect according to the information of the user's historically played audio within a preset time period.
In one possible implementation manner, the electronic device is specifically configured to obtain, according to a sound effect parameter set and the set sound effect, a sound effect parameter corresponding to the set sound effect, where the sound effect parameter set includes sound effect parameters corresponding to each sound effect.
In a possible implementation manner, the electronic device is specifically configured to obtain a preference sound effect of the user according to information of the user playing audio historically; and acquiring sound effect parameters corresponding to the preference sound effect according to the sound effect parameter set and the preference sound effect of the user, wherein the sound effect parameter set comprises sound effect parameters corresponding to each sound effect.
In one possible implementation manner, the information of the user history playing audio is the user history playing audio; the electronic equipment is specifically used for inputting the historical playing audio of the user into the sound effect prediction model to acquire the preference sound effect of the user.
In one possible implementation, the information of the user's historically played audio is the sound effect tags of that audio, where a sound effect tag is used to represent a sound effect; the electronic device is specifically configured to take the sound effect corresponding to the most frequent sound effect tag as the user's preferred sound effect.
In one possible implementation manner, the electronic device is further configured to collect the user historical play audio, input the user historical play audio to a sound effect recognition model, and obtain a sound effect tag of the user historical play audio.
In one possible implementation, the information of the user's historically played audio is the audio itself; the electronic device is further configured to input the user's historically played audio into a sound effect parameter prediction model to obtain the sound effect parameters corresponding to the user's preferred sound effect.
In one possible implementation, the electronic device is further configured to modify a current sound effect parameter to the sound effect parameter; or selecting the sound effect parameters from a plurality of preset sound effect parameters, wherein each sound effect parameter corresponds to one sound effect.
In one possible implementation, the sound effect component includes at least one of: the dynamic range control DRC module, the equalizer EQ module and the active noise reduction ANC module, wherein the sound effect parameter of the DRC module is DRC parameter, the sound effect parameter of the EQ module is EQ parameter, and the sound effect parameter of the ANC module is ANC parameter.
In a possible implementation manner, the electronic device in the embodiment of the application may further include a processor and a memory, where the memory is used to store computer executable program code, and the program code includes instructions; the instructions, when executed by a processor, cause the electronic device to perform the method as provided by the first aspect or each possible implementation of the first aspect.
In a third aspect, embodiments of the present application provide an electronic device for playing audio, including a unit, a module or a circuit for performing the method provided by the first aspect or each possible implementation manner of the first aspect. The electronic device for playing audio may be a terminal device, or may be a module applied to the terminal device, for example, may be a chip applied to the terminal device.
In a fourth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect or various possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium having instructions stored therein, which when run on a computer, cause the computer to perform the method of the first aspect or various possible implementations of the first aspect.
The embodiment of the application provides an audio processing method, an electronic device, and a readable storage medium. The method includes: receiving an audio playing request input by a user, where the request is used to request audio playback; acquiring sound effect parameters according to sound effect setting information or the user's historical audio playing information; and playing the audio with the sound effect parameters. According to the embodiment of the application, different sound effect parameters can be used to play audio, so the user can hear audio played with different sound effects, meeting the demand for sound effect diversity and improving user experience.
Drawings
FIG. 1 is a flow diagram of training a sound effect recognition model;
fig. 2 is an interface change schematic diagram of a terminal device provided in an embodiment of the present application;
fig. 3 is a schematic diagram of a setting page provided in an embodiment of the present application;
fig. 4 is a flowchart illustrating an embodiment of an audio processing method according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device provided in an embodiment of the present application;
fig. 6 is another schematic diagram of interface change of the terminal device according to the embodiment of the present application;
fig. 7 is a flowchart of another embodiment of an audio processing method according to an embodiment of the present application;
fig. 8 is a schematic flow chart of acquiring a sound effect parameter set according to an embodiment of the present application;
FIG. 9 is a schematic flow chart of processing audio by the simulation tool according to the embodiment of the present application;
FIG. 10 is a schematic diagram of another flow of processing audio by the simulation tool provided in an embodiment of the present application;
FIG. 11 is a flowchart of another embodiment of obtaining a sound effect parameter set;
fig. 12 is another schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
At present, a terminal device plays audio with a fixed sound effect, but with users' demand for diverse audio sound effects, a fixed sound effect cannot meet their needs. In the embodiment of the present application, the sound effects of audio may include, but are not limited to: "super bass", "clear human voice", "warm and soft", and "clear melody". The "super bass" effect is characterized by a large low-frequency proportion in the audio, giving the user a sense of impact. The "clear human voice" effect highlights the vocal audio and weakens the background audio. The "warm and soft" effect balances highs and lows across the whole audio and is comfortable to listen to. The "clear melody" effect highlights the background audio and weakens the vocal audio. The sound effect of audio is related to the sound effect parameters in the terminal device; because these parameters are currently preset, the terminal device can realize only one sound effect when playing audio with them. The embodiment of the application provides an audio processing method that changes the sound effect of audio by changing the sound effect parameters in the terminal device, so that different sound effects can be provided to users, meeting their demand for sound effect diversity and improving user experience.
It should be appreciated that the sound effect parameters in embodiments of the present application may include, but are not limited to: dynamic range control (DRC) parameters, equalizer (EQ) parameters, active noise cancellation (ANC) parameters, noise cancellation parameters, low-frequency gain of the noise threshold, heavy bass intensity, heavy bass center frequency, 3D intensity, and 3D effect center frequency. The DRC parameters may include: the number of frequency bands of the audio signal, the cut-off frequencies of the bands, the gain of the audio signal, the compression ratio, the amplitude threshold, the compression speed, the gain duration, and the noise floor threshold. The equalizer may be composed of multiple filters, and the equalizer parameters may include the parameters of each filter: the filter type, center frequency, gain, and Q value, where the Q value is related to the frequency of the audio signal. The ANC parameters may include: the filter type, center frequency, full-band gain, Q value, and single-band gain.
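The parameter families listed above can be pictured as one data structure. This is a hedged sketch: the patent names the quantities but not any concrete structure, so the field names and defaults here are chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FilterParams:
    filter_type: str        # e.g. "peaking", "low-shelf" (examples, not from the patent)
    center_freq_hz: float
    gain_db: float
    q: float                # Q value, related to the signal frequency

@dataclass
class SoundEffectParams:
    # DRC: band split and compression settings
    drc_band_count: int
    drc_cutoff_hz: List[float]
    drc_compression_ratio: float
    drc_amplitude_threshold_db: float
    # EQ: the equalizer is composed of several filters
    eq_filters: List[FilterParams] = field(default_factory=list)
    # ANC: only the full-band gain is shown here; the other ANC fields
    # (filter type, center frequency, Q, single-band gain) would reuse FilterParams
    anc_full_band_gain_db: float = 0.0
```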
Terms used in the embodiments of the present application are explained below:
Sound effect recognition model: used to identify the sound effect of audio. The terminal device may input audio into the sound effect recognition model, and the model may output the sound effect of the audio. For example, if audio A is input into the sound effect recognition model and the model outputs the sound effect "super bass", the terminal device may determine that the sound effect of audio A is "super bass". In one embodiment, the sound effect recognition model may be obtained by training on a data set through deep learning (DL), using a long short-term memory (LSTM) network structure, a convolutional neural network (CNN) structure, or a recurrent neural network (RNN) structure. The data set for training the sound effect recognition model may be a large number of audio samples together with a sound effect label for each one. The sound effect label represents the sound effect of the audio and may be "super bass", "clear human voice", and so on. Alternatively, the sound effect labels may be represented by numbers, such as "0" and "1", where "0" means "super bass" and "1" means "clear human voice".
In the following, a brief description is given taking as an example a terminal device that trains the sound effect recognition model on an LSTM network structure. The LSTM network structure includes an input layer, at least one hidden layer, and an output layer. The input layer receives the data set and distributes its data to the neurons of the hidden layers. The hidden-layer neurons perform computations on the data and pass the results to the output layer. The output layer outputs the computation result. FIG. 1 is a flow chart of training the sound effect recognition model. As shown in fig. 1, the method for training the sound effect recognition model in an embodiment of the present application may include:
S101, initializing the weight value of each hidden-layer neuron in the LSTM network structure.
For example, the terminal device may initialize the weight value of each neuron in every hidden layer to a random value drawn from a Gaussian distribution.
S102, dividing the preprocessed data set into N batches.
In the embodiment of the present application, the data set used by the terminal device is a preprocessed data set in which the data have a mean of 0 and a variance of 1. By way of example, the data set may include a plurality of audio samples and a sound effect label for each one, and the audio in the data set may be waveform audio (WAV) files. Taking the terminal device performing the preprocessing as an example, the terminal device may convert each WAV file into a spectrogram, such as a mel spectrogram, to obtain the spectral values of the audio (e.g., mel features), and then normalize those spectral values to obtain data with a mean of 0 and a variance of 1.
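The normalization step can be sketched as follows. Extracting mel features from a WAV file would normally use an audio library, so a random array stands in for the spectrogram here, and all shapes are illustrative assumptions:

```python
import numpy as np

def normalize_features(mel_spectrogram):
    """Standardize spectrogram values to zero mean and unit variance,
    as described for the preprocessed data set."""
    mu = mel_spectrogram.mean()
    sigma = mel_spectrogram.std()
    return (mel_spectrogram - mu) / sigma

# Placeholder for mel features extracted from a WAV file; in practice these
# would come from an audio library's mel-spectrogram routine.
rng = np.random.default_rng(0)
mel = rng.uniform(0.0, 80.0, size=(128, 431))  # 128 mel bands x 431 frames (assumed)
mel_norm = normalize_features(mel)
```

After this step every training example feeds the network values on a comparable scale, which is what the "mean 0, variance 1" requirement in the text achieves.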
The terminal device may divide the preprocessed data set into N batches, so as to iteratively train the LSTM network structure by using the N batches of data. For example, the terminal device may divide the data set into N batches according to the data volume. Wherein N is an integer greater than 1.
S103, inputting the data of the ith batch into the LSTM network structure to obtain the cross entropy loss of the data of the ith batch.
Illustratively, at the beginning of training, i.e., when i is 1, the terminal device inputs the 1st batch of data into the LSTM network structure, which may output the cross entropy loss of the 1st batch. It should be appreciated that the cross entropy loss characterizes how similar the sound effect labels predicted by the LSTM network structure are to the real sound effect labels of the audio. The smaller the cross entropy loss, the more accurate the weight values of the hidden-layer neurons in the LSTM network structure. Here, i is an integer greater than or equal to 1 and less than or equal to N.
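A minimal sketch of how such a cross entropy loss can be computed from predicted class scores; the two-class labels ("super bass" = 0, "clear human voice" = 1) and the logit values are illustrative:

```python
import numpy as np

def cross_entropy_loss(logits, true_labels):
    """Mean softmax cross entropy between predicted sound effect logits
    and the true sound effect labels (integer class indices)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(true_labels)), true_labels].mean()

# Two samples, two sound effect classes ("super bass" = 0, "clear human voice" = 1).
logits = np.array([[4.0, 0.0],    # strongly predicts class 0
                   [0.0, 4.0]])   # strongly predicts class 1
labels = np.array([0, 1])
loss = cross_entropy_loss(logits, labels)
```

Because both predictions match the true labels, the loss is small — illustrating the statement above that a smaller cross entropy loss corresponds to more accurate weight values.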
S104, updating the weight value of each hidden-layer neuron in the LSTM network structure according to the cross entropy loss of the data of the ith batch.
Illustratively, the terminal device may update the initial weight values of the hidden-layer neurons according to the cross entropy loss of the 1st batch. From that loss, the terminal device can determine the error between 100% and the similarity of the predicted sound effect labels to the real sound effect labels, and then update the weight value of each hidden-layer neuron according to this error. For example, the terminal device may update the weight values using the gradient descent method or the stochastic gradient descent method.
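The weight update itself can be sketched in one line; the learning rate and the gradient values below are illustrative placeholders, not values from the patent:

```python
import numpy as np

def sgd_update(weights, gradient, learning_rate=0.01):
    """One (stochastic) gradient descent step on a layer's weight values."""
    return weights - learning_rate * gradient

w = np.array([0.5, -0.3, 0.8])   # current weight values of one layer
g = np.array([0.1, -0.2, 0.4])   # gradient of the cross entropy loss w.r.t. w
w_new = sgd_update(w, g, learning_rate=0.1)
```

Each batch in S103–S104 produces one such step: the loss gradient pushes every weight slightly in the direction that reduces the cross entropy loss.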
S105, judging whether i is smaller than N. If yes, add 1 to i, and execute S103. If not, S106 is performed.
The terminal device may judge whether i is less than N to determine whether training on all N batches of the data set is complete. If i is less than N, the terminal device may add 1 to i and continue to execute S103. For example, if i is 1 and N is 10, the terminal device determines that i is less than N and inputs the 2nd batch of data into the LSTM network structure with the updated weight values, obtaining the cross entropy loss of the 2nd batch. Similarly, the terminal device may update the weight value of each hidden-layer neuron by gradient descent or stochastic gradient descent according to the cross entropy loss of the 2nd batch. This iteration continues until i equals N, at which point the terminal device inputs the Nth batch of data into the LSTM network structure and updates the weight values according to the cross entropy loss of the Nth batch. When i equals N, the terminal device may perform S106 described below.
S106, determining whether training has converged according to the target cross entropy loss and the cross entropy loss of the data of the Nth batch. If training has converged, S107 is executed; if not, the method returns to S102.
The user may preset a target cross entropy loss as the convergence criterion for training. When the terminal device obtains the cross entropy loss of the Nth batch, it can determine whether training has converged by comparing that loss with the target cross entropy loss. If the cross entropy loss of the Nth batch is less than or equal to the target cross entropy loss, the sound effect labels predicted by the LSTM network structure are close to the real sound effect labels, and the terminal device may determine that training has converged. If the cross entropy loss of the Nth batch output by the LSTM network structure is greater than the target cross entropy loss, training has not converged. In that case, the terminal device returns to S102, i.e., divides the preprocessed data set into N batches again and continues training the LSTM network structure with them until training converges.
S107, end.
If the terminal device determines that training has converged, it ends training and obtains the sound effect recognition model. The sound effect recognition model may be the LSTM network structure after the terminal device has updated the weight value of each hidden-layer neuron according to the cross entropy loss of the Nth batch.
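The S101–S107 flow can be sketched end to end with a toy model. A single logistic layer stands in for the LSTM hidden layers (an assumption made purely so the sketch is runnable), and all hyperparameters — batch count N, learning rate, target loss — are illustrative:

```python
import numpy as np

# Toy data standing in for the preprocessed data set (mean 0, variance 1
# features with binary sound effect labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] > 0).astype(int)

W = rng.normal(size=8)            # S101: random Gaussian initialization
target_loss, N, lr = 0.3, 10, 0.5 # illustrative hyperparameters

def batch_loss_and_grad(Wv, Xb, yb):
    """Cross entropy loss and its gradient for one batch (logistic stand-in)."""
    p = 1.0 / (1.0 + np.exp(-Xb @ Wv))
    loss = -np.mean(yb * np.log(p + 1e-9) + (1 - yb) * np.log(1 - p + 1e-9))
    grad = Xb.T @ (p - yb) / len(yb)
    return loss, grad

converged = False
for epoch in range(100):                             # guard against non-convergence
    batches = np.array_split(np.arange(len(X)), N)   # S102: divide into N batches
    for idx in batches:                              # S103-S105: iterate batches
        loss, grad = batch_loss_and_grad(W, X[idx], y[idx])
        W = W - lr * grad                            # S104: gradient descent update
    if loss <= target_loss:                          # S106: compare Nth-batch loss
        converged = True                             # S107: end; model obtained
        break
```

The loop structure — re-splitting into N batches and re-checking the Nth batch's loss against the target on every pass — mirrors the "return to S102 until convergence" behavior in the text.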
Sound effect prediction model: used to predict the user's preference sound effect according to the audio the user has played historically. The terminal device may input the user's historically played audio into the sound effect prediction model, and the model may output the user's preference sound effect. The training procedure may be the same as that of the sound effect recognition model; the data set for training the sound effect prediction model is a large number of audio samples and a sound effect label for each one. The historically played audio may be songs, audio tracks of video files, broadcast audio, recordings, and so on.
Sound effect parameter prediction model: used to predict, according to the audio the user has played historically, the sound effect parameters corresponding to the user's preference sound effect. The terminal device may input the historically played audio into the sound effect parameter prediction model, and the model may output the sound effect parameters corresponding to the user's preference sound effect. The training procedure may be the same as that of the sound effect recognition model described above, except that the data set consists of a large number of audio samples, the sound effect label of each one, and the sound effect parameters corresponding to each sound effect label (or the sound effect label corresponding to each audio sample).
Unlike the structure of the sound effect prediction model, in the embodiment of the present application a mapping layer may be added after the last hidden layer of the LSTM network structure. The mapping layer maps sound effects to sound effect parameters, so that the sound effect parameter prediction model can predict the sound effect parameters corresponding to the user's preference sound effect.
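A mapping layer of this kind can be sketched as one extra linear transform applied to the last hidden layer's output; the dimensions (4 hidden units mapped to 8 sound effect parameters) and all weight values are hypothetical:

```python
import numpy as np

# Hypothetical sketch of the added mapping layer: the last hidden layer's
# output is linearly mapped onto a vector of sound effect parameters.
rng = np.random.default_rng(1)
hidden_out = rng.normal(size=4)   # output of the last LSTM hidden layer (assumed dim)
M = rng.normal(size=(4, 8))       # mapping layer weights: 4 -> 8 parameters
b = np.zeros(8)                   # mapping layer bias

# Sound effect parameters predicted for the user's preference sound effect.
predicted_params = hidden_out @ M + b
```

During training the mapping layer's weights would be fitted alongside the LSTM weights against the parameter targets in the data set, which is why this model's data set additionally contains the parameters for each sound effect label.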
Frequency response: also referred to as a frequency response curve, refers to the gain versus frequency curve.
Sound effect classification scoring model: used to obtain a score for each sound effect from randomly generated sound effect parameters. The terminal device may input randomly generated sound effect parameters into the sound effect classification scoring model, and the model may output, for each sound effect, a score indicating how well the parameters match that sound effect. The higher the score, the closer the result is to the corresponding sound effect when the terminal device plays audio with those parameters. Similarly, the terminal device may input audio whose sound effect is to be determined into the model, and the model may output the score of each sound effect to which that audio may belong. The training procedure may be the same as that of the sound effect recognition model, except that the data set consists of a large number of audio samples, the sound effect label of each one, and the sound effect parameters corresponding to each sound effect label (or the sound effect label corresponding to each audio sample).
In one embodiment, a separate sound effect classification scoring model may be trained for each sound effect, yielding one scoring model per sound effect. The randomly generated sound effect parameters can then be input into each per-sound-effect scoring model to obtain the score of the parameters for that sound effect. For example, if the embodiment includes a "super bass" scoring model and a "clear human voice" scoring model, the randomly generated sound effect parameters may be input into each model in turn to obtain the parameters' score for "super bass" and their score for "clear human voice". Unlike the single scoring model described above, the data set for each per-sound-effect scoring model may consist of audio with that sound effect and the sound effect parameters corresponding to it.
Tag database: used to store the sound effect tags of audio, for example the NSynth database (neural synthesizer dataset). For example, the sound effect tag "super bass" of audio A may be stored in the tag database together with audio A itself or an identifier of audio A, the tag corresponding to that audio or identifier. The identifier of audio A may be its name, its audio features, etc., and uniquely indicates audio A.
In one embodiment, a "sound effect setting" control may be displayed on a settings page of the terminal device, or on the interface of a setting option within that page. Alternatively, a "sound effect setting" control may be displayed on the settings page of an application. When the user wants to set a sound effect, the user can select it through the "sound effect setting" control on the settings interface of the terminal device or of the application, so that the terminal device applies that sound effect when playing audio. Fig. 2 is a schematic diagram of interface changes of a terminal device according to an embodiment of the present application. Fig. 2 illustrates the case where setting option 4 in the settings page, such as the "sound and vibration" option, contains the "sound effect setting" control. Interface 201 is the settings page of the terminal device, which may include a plurality of setting options, such as setting option 1 through setting option 7. When the user wants to set a sound effect, the user may tap setting option 4 ("sound and vibration"), and interface 201 jumps to interface 202. Interface 202 is the "sound and vibration" settings page, on which the "sound effect setting" control is displayed; when the user taps that control, interface 202 jumps to interface 203. Interface 203 is the sound effect setting page, on which selectable sound effects such as "super bass", "clear human voice", "warm and soft", "clear melody", and "vocal gust" are displayed. The user may tap the control of a sound effect to select it; for example, tapping the "clear human voice" control selects the sound effect "clear human voice". Optionally, interface 203 may also display the characteristics of each sound effect.
It should be appreciated that after the user selects a sound effect, the terminal device may record the selected sound effect. Optionally, a "none" control may be displayed on interface 203; if the user taps it, the terminal device cancels the selected sound effect and, correspondingly, records that no sound effect is currently selected. The terminal device may store the sound effect setting information, for example in the user's operation log, which may be kept in the memory of the terminal device. The setting information may include the sound effect tag and the time at which the sound effect was set. For example, the setting information stored in the terminal device may be as shown in Table 1 below. Note that Table 1 is merely one format for storing the setting information; the terminal device may also store it in extensible markup language (XML) format or in a database format.
Table 1

Sound effect tag | Setting time
Clear human voice | January 30, 2020, 8:00
None | February 3, 2020, 10:00
Super bass | May 1, 2020, 21:00
As shown in Table 1, the user set the sound effect to "clear human voice" at 8:00 on January 30, 2020, canceled the set sound effect at 10:00 on February 3, 2020, and then set the sound effect to "super bass" at 21:00 on May 1, 2020. In one possible implementation, the terminal device may store only the setting information of the most recently set sound effect; in the table above, that would be "super bass", whose setting time is 21:00 on May 1, 2020.
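The operation-log storage behind Table 1, including the optional XML format mentioned above, can be sketched as follows; the element and attribute names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical operation-log entries corresponding to Table 1.
settings = [
    {"label": "clear human voice", "time": "2020-01-30 08:00"},
    {"label": "none",              "time": "2020-02-03 10:00"},
    {"label": "super bass",        "time": "2020-05-01 21:00"},
]

# One possible XML serialization of the setting information.
root = ET.Element("sound_effect_settings")
for s in settings:
    ET.SubElement(root, "setting", label=s["label"], time=s["time"])
xml_text = ET.tostring(root, encoding="unicode")

# Under the "store only the latest setting" variant, the current sound
# effect is simply the most recent log entry.
current = settings[-1]["label"]
```

Either representation supports the later lookup in S401: the device only needs the most recent entry to decide whether a sound effect is currently set.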
In one embodiment, when the user selects a sound effect on the sound effect setting page, the user may also select the application to which the sound effect applies. In the following, the application selected by the user is referred to as the first application; the first application can be understood as the application associated with the sound effect selected by the user. The first application may be any application in the terminal device that can play audio, including but not limited to a music playing application, a video application, or a social application. Fig. 3 is a schematic diagram of a setting page provided in an embodiment of the present application. Unlike interface 203 described above, the sound effect setting page shown in fig. 3 may also display identifiers of applications; an identifier may be the icon or the name of the application. As shown in fig. 3, the user chooses to apply the sound effect "super bass" in "application 1" and "application 2", so "application 1" and "application 2" may each be referred to as a first application. In such an embodiment, the stored setting information may further include the first application(s) associated with the sound effect, as shown in Table 2 below. It should be understood that Table 2 is an example of one format for storing the setting information.
Table 2

Sound effect tag | Setting time | First application(s) associated with the sound effect
Clear human voice | January 30, 2020, 8:00 | Application 1
None | February 3, 2020, 10:00 | None
Super bass | May 1, 2020, 21:00 | Application 1 and Application 2
The following embodiments may be combined with each other, and the same or similar concepts or processes will not be described again. Fig. 4 is a flowchart illustrating an embodiment of an audio processing method according to an embodiment of the present application. As shown in fig. 4, the audio processing method provided in the embodiment of the present application may include:
S401, receiving an audio playing request input by a user, and determining whether the user has set a sound effect, where the audio playing request is used to request playing audio.
The user may perform a click or other operation on the interface of the terminal device, interacting with the terminal device to enter an audio play request. Alternatively, the user may interact with the terminal device in voice to input an audio play request to the terminal device. In the embodiment of the application, the mode that the user requests the terminal equipment to play the audio is not limited. Wherein the audio play request is for requesting to play audio.
When the terminal device receives an audio playing request input by the user, it can determine whether the user has set a sound effect. In one possible implementation, the user sets the sound effect on a settings page as shown in fig. 2 above. In this embodiment, the terminal device may optionally determine whether the user has set a sound effect according to the stored setting information, such as Table 1. For example, the terminal device may determine that the user has set the sound effect "super bass".
In one possible implementation, the user sets the sound effect on the settings page shown in fig. 3 above. The setting information then includes the sound effect set by the user and at least one first application associated with it; the sound effect set by the user may be referred to as the set sound effect. The terminal device determines in which application the user requests to play audio, and checks the stored setting information, such as Table 2, to determine whether the user has set a sound effect for that application. Specifically, the terminal device may determine whether the at least one first application in the setting information includes the requesting application: if it does, the terminal device determines that the user has set a sound effect for that application; if it does not, the terminal device determines that the user has not set a sound effect for it. For example, if the user requests to play audio in application 1, the terminal device may determine from Table 2 that the user has set the sound effect "super bass" for application 1.
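The per-application check in this implementation can be sketched as a small lookup; the structure of the setting information and the application names are illustrative:

```python
# Hypothetical sketch of the S401 check: given the stored setting
# information (cf. Table 2) and the app requesting playback, decide
# whether a sound effect is set for that app.
SETTING_INFO = {
    "label": "super bass",
    "apps": ["application 1", "application 2"],  # associated first applications
}

def effect_for_app(requesting_app, setting_info):
    """Return the set sound effect if it is associated with this app, else None."""
    if requesting_app in setting_info["apps"]:
        return setting_info["label"]
    return None
```

With this shape, `effect_for_app("application 1", SETTING_INFO)` yields the set sound effect, while an app absent from the associated list yields no effect — matching the two branches described above.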
S402, if it is determined that the user has set a sound effect, taking the set sound effect as the user's preference sound effect.
If the terminal device determines from the setting information that the user has set a sound effect, it may take that sound effect as the user's preference sound effect, which can be understood as a sound effect the user favors. For example, the terminal device may take "super bass" as the user's preference sound effect.
In one embodiment, the above S401-S402 may be replaced by: when an audio playing request input by a user is received, taking the sound effect set by the user in the sound effect setting information as a preference sound effect of the user according to the sound effect setting information. In this way, when receiving an audio playing request input by a user, the terminal device may query the setting information of the sound effect, and take the sound effect set by the user in the setting information of the sound effect as the preference sound effect of the user.
S403, obtaining sound effect parameters corresponding to the preferred sound effect, and adjusting the current sound effect parameters to the sound effect parameters corresponding to the preferred sound effect.
It should be understood that a sound effect parameter set may be pre-stored in the memory of the terminal device. The sound effect parameter set includes the sound effect parameters corresponding to each of a variety of sound effects. Optionally, the set may store each sound effect tag together with its corresponding sound effect parameters, where the tag indicates the sound effect. For the definition of the sound effect parameters, refer to the description above. The following takes sound effect parameters consisting of DRC parameters, EQ parameters, and ANC parameters as an example. The sound effect parameter set stored in the terminal device may be as shown in Table 3; it should be understood that Table 3 is one example of a storage format for the parameter set.
It should be understood that, in the embodiment of the present application, if the sound effect set by the user is not used as the preference sound effect of the user, S402 and S403 may be replaced by: if the user is determined to set the sound effect, acquiring the sound effect parameter corresponding to the set sound effect, and adjusting the current sound effect parameter to the sound effect parameter corresponding to the set sound effect.
It should be understood that the sound effect parameter corresponding to the user's preference sound effect may also be referred to simply as a "sound effect parameter"; the fuller phrase is used in this embodiment to distinguish it from the sound effect parameters in the sound effect parameter set.
Table 3 (rendered as an image in the original publication: for each sound effect tag, the corresponding DRC, EQ, and ANC parameter values)
Taking the "super bass" sound effect as an example, the values in the brackets of the DRC parameter "[2,2000,2.1,0.8,1000,1.1,10,0.1]" respectively indicate the number of frequency bands of the audio signal, the cut-off frequency of the band, the gain of the audio signal, the compression rate, the amplitude threshold, the compression speed, the gain duration, and the noise floor threshold. The EQ parameters include the parameters of 8 filters, each filter's parameters enclosed in parentheses; taking (2,1000,2.1,3.5) as an example, the values respectively represent the filter type, center frequency, gain, and Q value. The ANC parameters include the parameters of 16 filters, likewise one parenthesized group per filter; taking (4,43.0,7.5,4630,0.0) as an example, the values are respectively the filter type, center frequency, full-band gain, Q value, and single-band gain.
As shown in Table 3 above, if the user's preference sound effect is "super bass", the terminal device may determine, from the sound effect parameter set, the sound effect parameters corresponding to the preference sound effect "super bass". Based on the user's preference sound effect, the terminal device can then adjust its current sound effect parameters to those parameters. The current sound effect parameters may be the parameters corresponding to the sound effect the user set most recently. For example, referring to Table 1 above, if the current time is 20:00 on March 5, 2020, the terminal device may determine that the current sound effect parameters are the terminal device's parameters when no sound effect is set. Or, if the current time is 20:00 on May 6, 2020, the terminal device may determine that the current sound effect parameters are the parameters corresponding to "clear human voice".
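Reading one entry of the parameter set can be sketched as follows; the field order follows the Table 3 description above, while the function and field names are made up for illustration:

```python
# Illustrative parsing of the DRC value list described for Table 3;
# the field order follows the text, the names are hypothetical.
DRC_FIELDS = [
    "band_count", "cutoff_hz", "gain", "compression_rate",
    "amplitude_threshold", "compression_speed", "gain_duration",
    "noise_floor_threshold",
]

def parse_drc(values):
    """Map the raw DRC value list onto named fields."""
    return dict(zip(DRC_FIELDS, values))

drc = parse_drc([2, 2000, 2.1, 0.8, 1000, 1.1, 10, 0.1])
```

The same zip-against-a-field-list pattern would apply to each (type, center frequency, gain, Q) EQ tuple and each five-value ANC tuple.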
In an embodiment, fig. 5 is a schematic structural diagram of a terminal device provided in an embodiment of the present application. As shown in fig. 5, the terminal device may include: a digital-to-analog converter, an analog-to-digital converter, and a sound effect component. The sound effect component may include at least one of the following modules: a DRC module, an EQ module, and an ANC module; in the embodiment of the present application it includes all three. The sound effect component may be connected to the digital-to-analog converter and the analog-to-digital converter; the digital-to-analog converter may be connected to a speaker in the terminal device or to an external device (such as an earphone), and the analog-to-digital converter may be connected to a microphone in the terminal device.
The sound effect component adjusts the audio signal so as to change the corresponding sound effect. The DRC module compresses or expands the audio signal so that sounds in the audio sound softer or louder, i.e., it adjusts the amplitude of the audio signal. The EQ module corrects the amplitude-frequency and phase-frequency characteristics of the audio signal's transmission channel so as to compensate the signal and reduce interference on it. The ANC module generates an inverse sound wave equal to the external noise to cancel that noise out, achieving noise reduction. The digital-to-analog converter converts the digital audio signal into an analog audio signal for output, and the analog-to-digital converter converts an input analog audio signal into a digital audio signal. In other embodiments, the terminal device may include more or fewer components for processing the audio signal than shown; fig. 5 does not limit the structure of the terminal device, and the sound effect component may also contain modules for processing the audio signal other than the DRC, EQ, and ANC modules illustrated.
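As a rough illustration of what the DRC module's amplitude adjustment does, here is a minimal hard-knee compressor; the threshold/ratio parameters and the algorithm itself are a generic textbook sketch, not the patent's implementation:

```python
import numpy as np

def drc_compress(signal, threshold=0.5, ratio=4.0):
    """Minimal hard-knee compressor sketch: samples whose magnitude exceeds
    the amplitude threshold are attenuated by the compression ratio.
    Parameter names and values are illustrative."""
    mag = np.abs(signal)
    over = mag > threshold
    out = signal.copy()
    # Above the threshold, only 1/ratio of the excess amplitude passes through.
    out[over] = np.sign(signal[over]) * (threshold + (mag[over] - threshold) / ratio)
    return out

x = np.array([0.2, 0.9, -1.0, 0.4])  # toy digital audio samples
y = drc_compress(x)
```

Loud peaks (0.9, -1.0) are pulled toward the threshold while quiet samples pass unchanged, which is the "sound softer or louder" amplitude shaping the DRC module performs.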
The sound effect parameter of the DRC module is the DRC parameter, that of the EQ module is the EQ parameter, and that of the ANC module is the ANC parameter; the parameters of each module in the sound effect component affect the sound effect with which the terminal device plays audio. In one possible implementation, the terminal device stores preset code and the current sound effect parameters. The preset code may be code written by developers that enables the terminal device to play audio with the given sound effect parameters, and may be stored in a system installation package in the terminal device. The terminal device can modify the current sound effect parameters into the parameters corresponding to the preference sound effect, thereby achieving the purpose of adjusting the current sound effect parameters to those corresponding to the preference sound effect.
Alternatively, in one possible implementation, multiple groups of sound effect parameters are pre-stored in the terminal device, where each group includes a DRC parameter, an EQ parameter, and an ANC parameter. At least one parameter differs between any two groups, and each group corresponds to one sound effect. After determining the user's preference sound effect, the terminal device can select the target group corresponding to the preference sound effect from the multiple groups, thereby obtaining the corresponding sound effect parameters. Unlike the previous implementation, the terminal device does not modify any parameters; it selects the group corresponding to the preference sound effect. Optionally, each group of sound effect parameters has a corresponding identifier, such as a number or a sound effect tag, characterizing its sound effect. Illustratively, if the current parameters have identifier 1, characterizing "clear human voice", and the preference sound effect is "super bass", the terminal device may determine that the target parameters are the group with identifier 2.
In one embodiment, S401-S403 above may be replaced by: when an audio playing request input by the user is received, obtaining the sound effect parameters corresponding to the user's preferred sound effect according to the setting information of the sound effect. In one possible implementation, when the terminal device stores the setting information of the sound effect in the first table or the second table, the terminal device may add the sound effect parameters corresponding to the sound effect set by the user to the first table or the second table according to the sound effect parameter set; for example, the first table may be replaced by table four:
table four
In this manner, when the terminal device receives an audio playing request input by the user, it queries the setting information of the sound effect, and can then use the sound effect parameters corresponding to the sound effect set by the user as the sound effect parameters corresponding to the user's preferred sound effect.
S404, playing the audio by adopting the sound effect parameters corresponding to the preference sound effect.
After the terminal equipment adjusts the sound effect parameters, the terminal equipment can play the audio by adopting the sound effect parameters corresponding to the preference sound effect.
After the user sets the sound effect, the sound effect set by the user can be displayed on the interface of the application program associated with the sound effect. Alternatively, when the user clicks an application program associated with the sound effect (for example, triggers playing music) in the drop-down status bar of the terminal device, the sound effect set by the user may be displayed in the drop-down status bar. The following description takes as an example the case where, when the user opens the application program associated with the sound effect for the first time after setting the sound effect, the sound effect set by the user is displayed on the interface of the application program. Fig. 6 is another schematic diagram of interface changes of the terminal device according to the embodiment of the present application. The interface 601 is a music playing page of application 1 (e.g., a music playing application), on which a music list 601a and a music play bar 601b are displayed. Music list 601a may include the names of a plurality of songs, and music play bar 601b may include an identification 601c of a song, such as song B, and a play control 601d. Play control 601d is used to trigger the terminal device to play song B. It should be appreciated that song B may be the song the application was playing when the user last exited the application, or the first song in music list 601a.
When the user selects song B in music list 601a, or clicks music play bar 601b, interface 601 may jump to the play page of song B (interface 602), or song B may be played directly. Interface 602 displays song options 602a, information 602b of song B, the user-set sound effect 602c (e.g., "subwoofer"), a play progress bar 602d, a rewind (previous) control 602e, a pause control 602f, and a fast-forward (next) control 602g. The information 602b of song B in interface 602 may include the name of song B, the artist of song B, and the lyrics of song B, which are indicated by numerals in fig. 6. Song options 602a are associated with interface 602; that is, when the user selects a song option in the menu bar, the terminal device jumps to display interface 602. The user can thus see the sound effect of the audio on the play page of song B.
In the embodiment of the present application, one possible implementation manner of playing audio by using the sound effect parameters corresponding to the preference sound effect by the terminal device is as follows: the terminal equipment executes a preset code to enable the terminal equipment to play the audio by adopting the sound effect parameters corresponding to the preference sound effect. Or, another possible implementation manner that the terminal equipment plays the audio by adopting the sound effect parameters corresponding to the preference sound effect is as follows: after the terminal equipment determines the identifier of the sound effect parameter corresponding to the preferred sound effect, a preset code can be executed, so that the terminal equipment can play the audio by adopting the sound effect parameter corresponding to the identifier.
In the embodiment of the application, the user can set the sound effect in advance; the sound effect set by the user is the user's preferred sound effect. The terminal device can adjust the sound effect parameters to the sound effect parameters corresponding to the preferred sound effect, and then play the audio with those parameters, thereby realizing diversified sound effects and improving user experience.
In the above embodiment, the user needs to set the sound effect in advance so that the terminal device plays the audio with that sound effect. In one embodiment, when receiving an audio playing request input by the user, the terminal device in the embodiment of the application can obtain the user's preferred sound effect according to the information of the audio historically played by the user within a preset time period, and then play the audio using the sound effect parameters corresponding to the preferred sound effect, so that the user is spared from setting the sound effect manually, improving user experience. For this process, refer to the related description in S405.
In one embodiment, as shown in fig. 4, after S401, the audio processing method provided in the embodiment of the present application may further include:
S405, if it is determined that the user has not set the sound effect, the user's preferred sound effect is obtained according to the information of the audio historically played by the user within the preset time period.
It should be understood that S402 and S405 are steps that are alternatively performed.
The preset time period may be a period of time before the moment when the user inputs the audio play request, and may be, but is not limited to, one day, one week, or one month. The audio historically played by the user may include, but is not limited to: audio in music, songs, broadcasts, and videos that the user plays on the terminal device. The information of the audio historically played by the user may be the audio itself, or the sound effect tags of that audio. It should be appreciated that while the user plays audio, the terminal device may store that audio. Alternatively, while the user plays audio, the terminal device can collect it and input it into the sound effect identification model to obtain the sound effect of the played audio, and thereby store the sound effect tag of the played audio. In one possible implementation, the terminal device may, according to the current time, delete the information of historically played audio from before the preset time period, so as to save memory space on the terminal device.
In the embodiment of the application, if the terminal device determines according to the setting information of the sound effect that the user has not set a sound effect, the terminal device can obtain the user's preferred sound effect according to the information of the audio historically played by the user. In one possible implementation, when that information is the audio itself, the terminal device may input the historically played audio into the sound effect prediction model to obtain the user's preferred sound effect as predicted by the model. In another possible implementation, when that information is the sound effect tags of the historically played audio, the terminal device may take the sound effect corresponding to the most numerous sound effect tag as the user's preferred sound effect.
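The "most numerous sound effect tag" rule can be sketched in a few lines (a minimal illustration; the tag strings below are hypothetical examples):

```python
from collections import Counter

def preferred_effect_from_tags(history_tags):
    """Take the sound effect corresponding to the most numerous sound effect
    tag in the user's playback history as the preferred sound effect.
    `history_tags` is an illustrative list of stored tags."""
    if not history_tags:
        return None  # no history within the preset time period
    return Counter(history_tags).most_common(1)[0][0]
```

For example, a history of two "subwoofer" tags and one "clear human voice" tag yields "subwoofer" as the preferred sound effect.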
It should be noted that, when the information of the audio historically played by the user is the audio itself, in one possible implementation the terminal device may input the historically played audio into a sound effect parameter prediction model to predict the sound effect parameters corresponding to the user's preferred sound effect. Unlike obtaining the user's preferred sound effect from the sound effect prediction model as described above, the terminal device can here directly obtain the sound effect parameters corresponding to the user's preferred sound effect from the sound effect parameter prediction model. Fig. 7 is a flowchart of another embodiment of an audio processing method according to an embodiment of the present application. In this manner, as shown in fig. 7, the audio processing method provided in the embodiment of the present application may further include:
S701, if it is determined that the user does not set the sound effect, according to the information of the user history playing audio, obtaining the sound effect parameters corresponding to the preference sound effect of the user.
It should be understood that "S402-S403" and S701 are steps that are alternatively performed, the terminal device may perform S701 after performing S401, and may perform S404 after performing S701.
In one possible implementation, when the terminal device receives an audio playing request input by the user, it can obtain the sound effect parameters corresponding to the user's preferred sound effect according to the information of the audio historically played by the user within the preset time period. For this process, refer to the description of S701 above.
According to the embodiment of the application, the terminal device can obtain the user's preferred sound effect, or the sound effect parameters corresponding to it, according to the information of the audio historically played by the user, and then play audio according to the user's preferred sound effect. This achieves the aim of diversified sound effects while sparing the user from setting the sound effect manually. In addition, in the embodiment of the application, because the preferred sound effect is obtained from the information of audio historically played within a preset time period, the sound effect can be adjusted at any time as the user's preference changes, which is more intelligent.
In one embodiment, the terminal device may set a storage duration for the sound effect set by the user, that is, the storage duration corresponds to the user's setting information, where the storage duration is a period of time starting from the moment the user sets the sound effect. While the user-set sound effect is within the storage duration, the terminal device may play audio using the user-set sound effect; for example, the terminal device may perform S401, S402, S403, and S404. However, if the user-set sound effect has exceeded the storage duration, the terminal device may obtain the user's preferred sound effect, or the corresponding sound effect parameters, according to the information of the audio historically played by the user within the preset time period, and then play audio using the sound effect parameters corresponding to the preferred sound effect; for example, the terminal device may perform S401, S405, S403 (or S406), and S404. For example, as shown in the above table, if the setting time of "subwoofer" is 21:00 on May 1, 2020 and the storage duration is 5 days, then from 21:00 on May 1, 2020 to 21:00 on May 6, 2020 the user-set sound effect is within the storage duration, and the terminal device can play audio with the user-set "subwoofer" sound effect. After 21:00 on May 6, 2020, the user-set sound effect is no longer within the storage duration, and the terminal device can obtain the user's preferred sound effect, or the corresponding sound effect parameters, according to the information of the audio historically played within the preset time period before 21:00 on May 6, 2020, and then play audio using the sound effect parameters corresponding to the preferred sound effect.
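Under the illustrative assumption that the setting time is stored as a timestamp and the storage duration as a number of days, the storage duration check described above can be sketched as follows (function and field names are hypothetical):

```python
from datetime import datetime, timedelta

def within_storage_duration(set_time, storage_days, now):
    """True while the user-set sound effect is still within its storage
    duration, counted from the moment the user set the sound effect."""
    return now <= set_time + timedelta(days=storage_days)
```

With a setting time of 21:00 on May 1, 2020 and a 5-day storage duration, the check returns True until 21:00 on May 6, 2020 and False afterwards, matching the example above.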
Consider a scenario in which, for example, the user sets the sound effect "subwoofer" and sets the application associated with the sound effect to application 1. When the user uses other applications, the terminal device may play audio without the "subwoofer" sound effect, and the user's preferred sound effect may change over time. However, if the user forgets to turn off the sound effect set on the setting page, the terminal device would keep playing audio with the "subwoofer" sound effect, which causes trouble for the user when using application 1. In the embodiment of the application, the terminal device sets a storage duration for the user-set sound effect, so that when the user's preferred sound effect changes, the terminal device can adjust the sound effect parameters in time and play audio with the sound effect parameters corresponding to the preferred sound effect. This manner is more intelligent, better fits user needs, and can improve user experience.
In the above embodiment, the terminal device may have the sound effect parameter set shown in the above table three stored in advance, and the sound effect parameter set may be preset in the terminal device. The process of obtaining the sound effect parameter set is described below. Fig. 8 is a flowchart of acquiring an audio parameter set according to an embodiment of the present application. As shown in fig. 8, a method for obtaining a sound effect parameter set provided in an embodiment of the present application may include:
S801, standard audio of the first sound effect and first frequency response of the standard audio of the first sound effect are obtained.
It should be understood that, in this embodiment, the execution body that obtains the sound effect parameter set is described as a server, but it may also be another electronic device with computing capability, such as a computer or a terminal device. The standard audio of the first sound effect stands for the standard audio of each of various sound effects, the various sound effects being the sound effects included in the sound effect parameter set. The standard audio of the first sound effect may be audio set in advance for the first sound effect, and can serve as a basis for identifying whether other audio has the first sound effect.
In one possible implementation, the server may obtain standard audio corresponding to the first sound effect from the tag database. It should be appreciated that a large number of audio may be included in the tag database, as well as sound effect tags for each audio. For example, the server may select, according to the sound effect tags in the tag database, the audio of the sound effect tag of the first sound effect as the standard audio of the first sound effect.
Although the above manner can obtain the standard audio of the first sound effect, because the tag database contains a plurality of audio items belonging to the same sound effect tag, in order to improve the reference accuracy of the standard audio of the first sound effect, in one possible implementation the server may input test audio into the sound effect classification scoring model to obtain the score with which the test audio belongs to the first sound effect. The test audio may be locally stored audio, audio crawled from a network, or audio recorded by a developer. The server may take the highest-scoring test audio as the standard audio of the first sound effect.
After obtaining the standard audio of the first sound effect, the server can send it to the terminal device. Alternatively, a developer can import the standard audio of the first sound effect into the terminal device, and the terminal device can play it to obtain a wav file of the standard audio of the first sound effect. The server may use a simulation tool to obtain the first frequency response of the standard audio of the first sound effect from that wav file. The simulation tool may use a Fourier transform to convert the wav file of the standard audio into a frequency response curve, i.e., the first frequency response of the standard audio, as shown in fig. 9. It should be understood that a frequency response may be represented as a frequency response curve, and the terminal device playing the standard audio may be a device in the testing phase.
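The patent does not disclose the simulation tool's internals; as an illustrative sketch, a frequency response curve can be derived from audio samples with a discrete Fourier transform, e.g. using NumPy (the function name and dB scaling here are assumptions, and a production tool would typically window the signal and average over frames):

```python
import numpy as np

def frequency_response(samples, sample_rate):
    """Approximate a frequency response curve (frequency vs. gain in dB)
    from raw audio samples via a real FFT, analogous to the simulation
    tool's conversion of the wav file into the first frequency response."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    gains_db = 20 * np.log10(spectrum + 1e-12)  # small offset avoids log(0)
    return freqs, gains_db
```

Feeding in a pure 1 kHz tone, for instance, produces a curve whose gain peaks at the 1 kHz frequency bin.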
S802, adjusting the sound effect parameters, and processing the standard audio of the first sound effect according to the adjusted sound effect parameters to obtain the second frequency response of the standard audio of the first sound effect.
The simulation tool includes a simulation module of the sound effect component shown in fig. 5; the simulation module can simulate the DRC parameters of the DRC module, the EQ parameters of the EQ module, and the ANC parameters of the ANC module in the sound effect component. In this embodiment of the present application, the server may continuously adjust the sound effect parameters in the simulation tool, and then process the standard audio of the first sound effect with the adjusted sound effect parameters. Specifically, the server may process the first frequency response with the adjusted sound effect parameters to obtain a second frequency response of the standard audio of the first sound effect, so as to determine whether the second frequency response is close to the first frequency response, as shown in fig. 10.
It should be understood that in the embodiment of the present application, the server may modify the sound effect parameters of each module in the simulation module. Optionally, the server may determine the adjustment order among the sound effect parameters of the modules according to the priority of the parameters. Illustratively, from high to low priority, the parameters are the EQ parameter, the DRC parameter, and the ANC parameter. The server may first adjust the EQ parameter, keeping the DRC parameter and the ANC parameter unchanged. After the EQ parameter has been adjusted within the preset adjustment range, the server may keep the EQ parameter and the ANC parameter unchanged and adjust the DRC parameter. After the DRC parameter has been adjusted within the preset adjustment range, the server may keep the EQ parameter and the DRC parameter unchanged and adjust the ANC parameter. Each time the server adjusts the sound effect parameters in the simulation module, it can process the first frequency response once with the simulation module to obtain one second frequency response of the standard audio of the first sound effect. By continuously adjusting the sound effect parameters, the server can obtain second frequency responses of the standard audio of the first sound effect corresponding to multiple sets of sound effect parameters.
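A minimal sketch of this priority-ordered, module-by-module adjustment, under the assumption that each module's parameter is searched over a preset candidate range (the names and the greedy search strategy are illustrative, not the patent's exact procedure):

```python
def search_parameters(initial, candidates, simulate, difference, target):
    """Adjust the EQ parameter first (DRC and ANC fixed), then DRC, then ANC.

    `candidates`, `simulate`, and `difference` are hypothetical stand-ins for
    the preset adjustment ranges, the simulation module's processing of the
    first frequency response, and the frequency response comparison."""
    best = dict(initial)
    for module in ("eq", "drc", "anc"):   # priority from high to low
        for value in candidates[module]:  # preset adjustment range
            trial = dict(best)
            trial[module] = value
            # keep the trial value if its second frequency response is
            # closer to the target (first) frequency response
            if difference(simulate(trial), target) < difference(simulate(best), target):
                best = trial
    return best
```

One module is varied at a time while the other two stay fixed, which mirrors the adjustment order described above.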
S803, taking the sound effect parameters corresponding to a second frequency response whose difference value from the first frequency response is smaller than a preset difference value as the sound effect parameters of the first sound effect, to obtain the sound effect parameter set.
The server may acquire the second frequency responses obtained by processing the first frequency response with different sound effect parameters, so as to acquire the difference value between each second frequency response and the first frequency response. The difference value between the second frequency response and the first frequency response may represent the similarity between the first sound effect and the sound effect of the standard audio played with the sound effect parameters corresponding to that second frequency response. The smaller the difference value, the closer the sound effect of the standard audio played with those sound effect parameters is to the first sound effect; the larger the difference value, the farther it is from the first sound effect. In this embodiment of the present application, the server may take the sound effect parameters corresponding to a second frequency response whose difference value from the first frequency response is smaller than the preset difference value as the sound effect parameters of the first sound effect. Applying this method to different sound effects yields the sound effect parameters corresponding to each, and thus the sound effect parameter set. Optionally, if there are multiple second frequency responses whose difference value from the first frequency response is smaller than the preset difference value, the sound effect parameters with the smallest difference value may be taken as the sound effect parameters of the first sound effect. It should be noted that the preset difference value may be predefined by a developer.
It will be appreciated that the first frequency response and the second frequency response are both frequency response curves. In this embodiment of the present application, the server may take the average of the absolute values of the differences between the ordinates of the first frequency response and the second frequency response at the same abscissas, and use that average as the difference value between the first frequency response and the second frequency response. The abscissa of a frequency response curve is frequency and the ordinate is gain. Illustratively, if at the same frequencies the ordinates on the first frequency response curve are [1,4,6,7,8] and the ordinates on the second frequency response curve are [3,2,4,5,6], then the difference value between the first frequency response and the second frequency response is the average of the absolute values of the gain differences, i.e., 2.
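The difference value computation described above can be sketched directly (a minimal illustration, assuming both curves are sampled at the same frequencies; NumPy is used for the arithmetic):

```python
import numpy as np

def response_difference(first_gains, second_gains):
    """Difference value between two frequency response curves: the average
    of the absolute differences of their gains (ordinates) sampled at the
    same frequencies (abscissas)."""
    first = np.asarray(first_gains, dtype=float)
    second = np.asarray(second_gains, dtype=float)
    return float(np.mean(np.abs(first - second)))
```

Running it on the worked example above, [1,4,6,7,8] versus [3,2,4,5,6], gives a difference value of 2.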
It should be understood that after the server obtains the sound effect parameter set, the sound effect parameter set may be preset in the terminal device, for example, the sound effect parameter set shown in the table three may be stored in the memory of the terminal device.
Fig. 11 is a schematic flow chart of another embodiment of obtaining an audio parameter set. As shown in fig. 11, the method for obtaining the sound effect parameter set provided in the embodiment of the present application may include:
S1101, randomly generating multiple sets of sound effect parameters, and inputting the multiple sets of sound effect parameters into the sound effect classification scoring model to obtain the score with which each set belongs to the first sound effect.
Each set of sound effect parameters may include a DRC parameter, an EQ parameter, and an ANC parameter, and at least one sound effect parameter differs between different sets. The server may input the multiple sets of sound effect parameters into the sound effect classification scoring model, which may output the score with which each set belongs to the first sound effect. The higher the score, the closer the sound effect of audio played with that set of parameters is to the first sound effect. It should be appreciated that the first sound effect is used to stand for each of the various sound effects.
Illustratively, a set of sound effect parameters randomly generated by the server is: DRC parameters: [2,2000,2.1,0.8,1000,1.1,10,0.1]; EQ parameters: [(2,1000,2.1,3.5), (3,1200,2.4,3.6), (2,1800,2.1,3.5), (1,800,0.1,3.5), (2,500,4.9,1.5), (0,1788,2.3,3.2), (2,3000,-2.8,3.5), (2,5000,2.9,3.5)]; ANC parameters: [(4,43.0,7.5,4630,0.0), (3,0,4,1200,0), (4,22.5,1.5,8540,0.0), (3,0,4,1200,0), (4,-56.0,6.0,8820,0.0), (3,0,4,1200,0), (4,-23.5,3.5,15030,0.0), (3,0,4,1200,0), (2,-42.5,7.0,15700,0.0), (3,0,4,1200,0), (4,11.5,8.0,8890,0.0), (3,0,4,1200,0), (4,-1.5,4.0,15210,0.0), (3,0,4,1200,0), (4,-11.0,6.0,2530,0.0), (3,0,4,1200,0)]. The sound effect classification scoring model may output (0.72,0.05,0.06,0.72,0.02,0.14), where each score characterizes the degree to which the set of sound effect parameters belongs to one sound effect; for example, the score with which the set belongs to the sound effect "subwoofer" is 0.72, the score with which it belongs to the sound effect "clear human voice" is 0.05, and so on. Illustratively, the set's highest score is for "subwoofer", so the sound effect corresponding to this set of parameters is closest to the sound effect "subwoofer".
S1102, taking the set of sound effect parameters with the highest score for the first sound effect as the sound effect parameters of the first sound effect, to obtain the sound effect parameter set.
In this embodiment of the present application, the server may obtain the set of sound effect parameters with the highest score belonging to the first sound effect, and use that highest-scoring set as the sound effect parameters of the first sound effect. For example, if the highest score among the sets of sound effect parameters for the sound effect "subwoofer" is 0.98, the set of sound effect parameters corresponding to that score of 0.98 is used as the sound effect parameters for "subwoofer". Accordingly, the server can obtain the sound effect parameter set.
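Selecting the highest-scoring set of sound effect parameters can be sketched as follows (the pair format and names are illustrative assumptions, not the scoring model's real interface):

```python
def best_parameters(scored_sets, effect_index):
    """Given (parameter_set, score_vector) pairs produced by the sound effect
    classification scoring model, return the parameter set whose score for
    the sound effect at `effect_index` is highest."""
    return max(scored_sets, key=lambda item: item[1][effect_index])[0]
```

Repeating the selection for each index in the score vector yields one winning parameter set per sound effect, i.e. the sound effect parameter set.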
Compared with the manner shown in fig. 8, in the manner of acquiring the sound effect parameter set shown in fig. 11 the server does not need to acquire the standard audio corresponding to the first sound effect in advance. That is, even when the server cannot obtain the standard audio corresponding to the first sound effect, it can still obtain the sound effect parameters corresponding to each sound effect from the randomly generated sound effect parameters, so the method shown in fig. 11 has wider applicability.
In the embodiment of the present application, the execution body executing the audio processing method may be a terminal device, or a chip or processor in the terminal device, or the like. It should be understood that the terminal device in the embodiments of the present application may be referred to as user equipment (UE), a mobile terminal, a terminal, or the like. The terminal device may be a personal digital assistant (PDA), a handheld device with wireless communication functionality, a computing device, an in-vehicle device or a wearable device, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in a smart city, a wireless terminal in a smart home, etc. The form of the terminal device is not specifically limited in the embodiments of the present application.
Fig. 12 is another schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 12, the terminal device 1200 may include: a processor 1210, a memory 1220, a communication module 1230, a display 1240, a sensor 1250, and an audio module 1260. It is to be understood that the structure illustrated in fig. 12 does not constitute a specific limitation on the terminal device 1200. In other embodiments of the present application, terminal device 1200 may include more or fewer components than illustrated, may combine certain components, may split certain components, or may have a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware. The interfacing relationship between the modules illustrated in the embodiment of the present application is only schematic, and does not constitute a structural limitation on the terminal device 1200. In other embodiments of the present application, the terminal device 1200 may also use an interfacing manner different from that in the foregoing embodiments, or a combination of multiple interfacing manners.
Processor 1210 may include one or more processing units such as: processor 1210 may include an application processor (application processor, AP), a digital signal processor (digital signal processor, DSP), a display processing unit (display process unit, DPU), and/or a neural network processor (neural-network processing unit, NPU), among others. Wherein the different processing units may be separate devices or may be integrated in one or more processors. In some embodiments, terminal device 1200 can also include one or more processors 1210. The processor may be a neural hub and a command center of the terminal device 1200. In some embodiments, processor 1210 may include one or more interfaces. The interface may include an integrated circuit (inter-integrated circuit, I2C) interface, and/or a universal serial bus (universal serial bus, USB) interface, etc. Wherein the USB interface is an interface conforming to the USB standard specification, specifically, the interface can be Mini USB interface, micro USB interface, USB Type C interface and the like. The USB interface may be used to connect a charger to charge the terminal device 1200, or may be used to transfer data between the terminal device 1200 and a peripheral device. And can also be used for connecting with a headset, and playing audio through the headset.
Memory 1220 may be used to store one or more computer programs, including instructions. The processor 1210 can cause the terminal apparatus 1200 to execute various functional applications, data processing, and the like by executing the above-described instructions stored in the memory 1220. Memory 1220 may include a stored program area and a stored data area. The storage program area can store an operating system; the storage area may also store one or more applications (e.g., gallery, contacts, etc.), and so forth. In some embodiments, processor 1210 may cause terminal device 1200 to perform various functional applications and data processing by executing instructions stored in memory 1220, and/or instructions stored in memory provided in processor 1210.
The communication module 1230 may provide a communication module including 2G/3G/4G/5G and the like applied to the terminal device 1200, and/or a communication module including wireless local area network (wireless local area networks, WLAN), bluetooth, global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), NFC, infrared technology (IR) and the like applied to the terminal device 1200. The communication module 1230 is used to implement communication between the terminal device 1200 and other devices.
The terminal apparatus 1200 may implement a display function through a graphic processor (graphics processing unit, GPU), a display screen 1240, an application processor, and the like. The GPU may connect the display 1240 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 1210 may include one or more GPUs that execute instructions to generate or change display information.
The display 1240 is used to display images, videos, and the like. Display 1240 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light emitting diode, AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), a Mini LED, a Micro LED, a Micro-OLED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, terminal device 1200 may include 1 or N displays 1240, where N is a positive integer greater than 1.
The sensors 1250 may include a pressure sensor 1250A, a gyroscope sensor 1250B, an acceleration sensor 1250C, a distance sensor 1250D, a fingerprint sensor 1250E, a touch sensor 1250F, and the like.
The terminal device 1200 may implement audio functions, such as music playing and recording, through an audio module 1260, a speaker 1260A, a receiver 1260B, a microphone 1260C, an earphone interface 1260D, an application processor, and the like. The audio module 1260 is used to convert digital audio information into an analog audio signal for output, and also to convert an analog audio input into a digital audio signal. The audio module 1260 may also be used to encode and decode audio signals. In some embodiments, the audio module 1260 may be disposed in the processor 1210, or some functional modules of the audio module 1260 may be disposed in the processor 1210. Speaker 1260A, also referred to as a "horn," is used to convert an audio electrical signal into a sound signal. The terminal device 1200 may play music or conduct a hands-free call through the speaker 1260A. Receiver 1260B, also referred to as an "earpiece," is used to convert an audio electrical signal into a sound signal. When the terminal device 1200 answers a telephone call or a voice message, the voice can be heard by placing the receiver 1260B close to the human ear. Microphone 1260C, also referred to as a "mic" or "mike," is used to convert a sound signal into an electrical signal. When making a call or sending voice information, the user can speak with the mouth close to the microphone 1260C to input a sound signal into the microphone 1260C. The terminal device 1200 may be provided with at least one microphone 1260C. In other embodiments, the terminal device 1200 may be provided with two microphones 1260C, which can implement a noise reduction function in addition to collecting sound signals. In other embodiments, the terminal device 1200 may also be provided with three, four, or more microphones 1260C to implement sound signal collection, noise reduction, sound source identification, directional recording, and the like.
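By way of illustration only (not part of the patent), the digital-to-analog boundary that the audio module 1260 sits on can be sketched as a simple quantization of float samples to 16-bit PCM; the function name and sample format here are assumptions, not taken from the patent:

```python
import array

def float_to_pcm16(samples):
    """Quantize float samples in [-1.0, 1.0] to signed 16-bit PCM.

    A minimal sketch of the digital-audio representation that an audio
    module converts to and from; real modules also resample, dither,
    and handle multiple channels.
    """
    out = array.array("h")  # "h" = signed 16-bit integers
    for s in samples:
        s = max(-1.0, min(1.0, s))        # clamp out-of-range input
        out.append(int(round(s * 32767)))  # full-scale = 32767
    return out
```

For example, silence maps to 0 and full-scale positive input maps to 32767, with out-of-range values clamped rather than wrapped.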
The headphone interface 1260D is used to connect wired headphones. The headphone interface 1260D may be a USB interface, a 3.5 mm open mobile terminal platform (open mobile terminal platform, OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (cellular telecommunications industry association of the USA, CTIA) standard interface.
The term "plurality" in the embodiments of the present application refers to two or more. The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship; in a formula, the character "/" indicates that the associated objects before and after it are in a "division" relationship.
It will be appreciated that the various numerical designations referred to in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application. It should also be understood that, in the embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

Claims (21)

1. An audio processing method, comprising:
receiving an audio playing request input by a user, and acquiring sound effect parameters according to setting information of a sound effect or information of audio historically played by the user, wherein the audio playing request is used for requesting to play audio;
playing the audio using the sound effect parameters;
before the sound effect parameters are obtained according to the setting information of the sound effect or the information of the audio played by the user in history, the method further comprises the following steps:
determining whether the user has set the sound effect according to the setting information of the sound effect;
the obtaining the sound effect parameters according to the setting information of the sound effect or the information of the audio played by the user history comprises the following steps:
if it is determined that the user has set the sound effect, acquiring, according to the setting information of the sound effect, the sound effect parameters corresponding to the set sound effect;
if it is determined that the user has not set the sound effect, acquiring, according to the information of the audio historically played by the user within a preset time period, the sound effect parameters corresponding to the preference sound effect of the user;
the method further comprises:
if the setting information of the sound effect exceeds a storage duration, acquiring, according to the information of the audio historically played by the user within the preset time period, the sound effect parameters corresponding to the preference sound effect of the user.
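By way of illustration only (not part of the claims), the decision flow of claim 1 — use the set sound effect if one exists and has not exceeded the storage duration, otherwise fall back to the preference derived from playback history — can be sketched in Python; every name, the dictionary formats, and the expiry threshold are assumptions for illustration:

```python
from time import time

# Assumed "storage duration" threshold; the patent does not fix a value.
MAX_SETTING_AGE_S = 30 * 24 * 3600

def choose_effect_params(setting, history, param_set, now=None):
    """Select sound effect parameters per the claimed decision flow.

    setting:   {"effect": name, "saved_at": epoch seconds} or None
    history:   effect labels of audio played within the preset window
    param_set: mapping from effect name to its parameter dict
    """
    now = time() if now is None else now
    if setting is not None and now - setting["saved_at"] <= MAX_SETTING_AGE_S:
        # The user has set an effect and the setting has not expired:
        # return the parameters corresponding to the set effect.
        return param_set[setting["effect"]]
    # No (valid) setting: derive the preferred effect from history.
    preferred = max(set(history), key=history.count) if history else "default"
    return param_set.get(preferred, param_set["default"])
```

A caller would pre-filter `history` to the preset time period before passing it in; the expiry branch is what claim 1's final limitation ("exceeds the storage duration") adds on top of the set/unset distinction.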
2. The method of claim 1, wherein the setting information of the sound effects includes the set sound effects and at least one first application associated with the set sound effects, and wherein determining whether the user has set a sound effect based on the setting information of the sound effects comprises:
determining whether the at least one first application includes the application through which the user requests to play audio;
if yes, determining that the user has set the sound effect.
3. The method according to claim 1 or 2, wherein the obtaining, according to the setting information of the sound effect, the sound effect parameter corresponding to the set sound effect includes:
acquiring, according to a sound effect parameter set and the set sound effect, the sound effect parameters corresponding to the set sound effect, wherein the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
4. The method according to claim 1 or 2, wherein the obtaining, according to the information of the user's historical playing audio within the preset period of time, the sound effect parameters corresponding to the preference sound effect of the user includes:
acquiring preference sound effects of the user according to the information of the user history playing audio;
acquiring, according to a sound effect parameter set and the preference sound effect of the user, the sound effect parameters corresponding to the preference sound effect, wherein the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
5. The method of claim 4, wherein the information of the user history playing audio is user history playing audio, and the obtaining the preference sound effect of the user according to the information of the user history playing audio comprises:
inputting the audio historically played by the user into a sound effect prediction model to acquire the preference sound effect of the user.
6. The method of claim 4, wherein the information of the user history playing audio is an audio tag of the user history playing audio, the audio tag being used for characterizing audio, the obtaining the preference audio of the user according to the information of the user history playing audio comprises:
taking the sound effect corresponding to the largest number of sound effect tags as the preference sound effect of the user.
7. The method of claim 6, wherein the method further comprises:
collecting the audio historically played by the user, and inputting the audio historically played by the user into a sound effect identification model to obtain a sound effect tag of the audio historically played by the user.
8. The method according to claim 1 or 2, wherein the information of the audio historically played by the user is the audio historically played by the user, and the acquiring sound effect parameters according to the information of the audio historically played by the user includes:
inputting the audio historically played by the user into a sound effect parameter prediction model to obtain the sound effect parameters corresponding to the preference sound effect of the user.
9. The method of any of claims 1-8, wherein before the playing the audio using the sound effect parameters, the method further comprises:
modifying current sound effect parameters into the sound effect parameters; or,
selecting the sound effect parameters from a plurality of preset sound effect parameters, wherein each sound effect parameter corresponds to one sound effect.
10. The method according to any one of claims 1-9, wherein the sound effect parameters include at least one of: dynamic range control (DRC) parameters, equalizer (EQ) parameters, and active noise reduction (ANC) parameters.
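By way of illustration only (not part of the claims), a sound effect parameter bundle covering the three categories named in claim 10 could be grouped as below; the field names and inner formats are assumptions, since the patent only names the DRC, EQ, and ANC categories:

```python
from dataclasses import dataclass, field

@dataclass
class SoundEffectParams:
    """One entry of a sound effect parameter set (illustrative layout)."""
    drc: dict = field(default_factory=dict)  # dynamic range control, e.g. {"ratio": 2.0}
    eq: list = field(default_factory=list)   # per-band equalizer gains in dB
    anc: dict = field(default_factory=dict)  # active noise reduction settings

# A hypothetical "rock" preset in the sound effect parameter set.
rock = SoundEffectParams(drc={"ratio": 2.0}, eq=[3, 1, 0, 1, 4], anc={"enabled": False})
```

A sound effect parameter set as used in claims 3 and 4 would then simply map effect names to such bundles, e.g. `{"rock": rock, ...}`.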
11. An electronic device for playing audio, the electronic device comprising a sound effect component, wherein:
the electronic device is configured to receive an audio playing request input by a user, and acquire sound effect parameters according to setting information of a sound effect or information of audio historically played by the user, wherein the audio playing request is used for requesting to play audio;
the sound effect component is configured to play the audio using the sound effect parameters;
the electronic device is further configured to determine, according to the setting information of the sound effect, whether the user has set the sound effect;
the electronic device is specifically configured to: if it is determined that the user has set the sound effect, acquire, according to the setting information of the sound effect, the sound effect parameters corresponding to the set sound effect; and if it is determined that the user has not set the sound effect, acquire, according to the information of the audio historically played by the user within a preset time period, the sound effect parameters corresponding to the preference sound effect of the user;
the electronic device is further configured to: if the setting information of the sound effect exceeds a storage duration, acquire, according to the information of the audio historically played by the user within the preset time period, the sound effect parameters corresponding to the preference sound effect of the user.
12. The electronic device of claim 11, wherein the setting information of the sound effect includes the set sound effect and at least one first application associated with the set sound effect;
the electronic device is specifically configured to determine whether the at least one first application includes the application through which the user requests to play audio; and if the at least one first application includes the application through which the user requests to play audio, determine that the user has set the sound effect.
13. An electronic device as claimed in claim 11 or 12, characterized in that,
the electronic device is specifically configured to acquire, according to a sound effect parameter set and the set sound effect, the sound effect parameters corresponding to the set sound effect, wherein the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
14. An electronic device as claimed in claim 11 or 12, characterized in that,
the electronic device is specifically configured to acquire the preference sound effect of the user according to the information of the audio historically played by the user; and acquire, according to a sound effect parameter set and the preference sound effect of the user, the sound effect parameters corresponding to the preference sound effect, wherein the sound effect parameter set includes the sound effect parameters corresponding to each sound effect.
15. The electronic device of claim 14, wherein the information of the user history play audio is the user history play audio;
the electronic device is specifically configured to input the audio historically played by the user into a sound effect prediction model to acquire the preference sound effect of the user.
16. The electronic device of claim 14, wherein the information of the user's historically played audio is an audio tag of the user's historically played audio, the audio tag being used to characterize an audio;
the electronic device is specifically configured to take the sound effect corresponding to the largest number of sound effect tags as the preference sound effect of the user.
17. The electronic device of claim 16, wherein,
the electronic device is further configured to collect the audio historically played by the user, and input the audio historically played by the user into a sound effect identification model to obtain a sound effect tag of the audio historically played by the user.
18. The electronic device of claim 11 or 12, wherein the information of the user history play audio is user history play audio;
the electronic device is further configured to input the audio historically played by the user into a sound effect parameter prediction model to obtain the sound effect parameters corresponding to the preference sound effect of the user.
19. The electronic device of any of claims 11-18, wherein,
the electronic device is further configured to modify current sound effect parameters into the sound effect parameters; or select the sound effect parameters from a plurality of preset sound effect parameters, wherein each sound effect parameter corresponds to one sound effect.
20. The electronic device of any of claims 11-19, wherein the sound effect component comprises at least one of: a dynamic range control (DRC) module, an equalizer (EQ) module, and an active noise reduction (ANC) module, wherein the sound effect parameter of the DRC module is a DRC parameter, the sound effect parameter of the EQ module is an EQ parameter, and the sound effect parameter of the ANC module is an ANC parameter.
21. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions which, when executed by a computer, cause the computer to perform the method of any of claims 1-10.
CN202011331956.1A 2020-11-24 2020-11-24 Audio processing method, electronic device, and readable storage medium Active CN114546325B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011331956.1A CN114546325B (en) 2020-11-24 2020-11-24 Audio processing method, electronic device, and readable storage medium
PCT/CN2021/131621 WO2022111381A1 (en) 2020-11-24 2021-11-19 Audio processing method, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011331956.1A CN114546325B (en) 2020-11-24 2020-11-24 Audio processing method, electronic device, and readable storage medium

Publications (2)

Publication Number Publication Date
CN114546325A CN114546325A (en) 2022-05-27
CN114546325B true CN114546325B (en) 2024-04-16

Family

ID=81660287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011331956.1A Active CN114546325B (en) 2020-11-24 2020-11-24 Audio processing method, electronic device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN114546325B (en)
WO (1) WO2022111381A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116743913B (en) * 2022-09-02 2024-03-19 荣耀终端有限公司 Audio processing method and device
CN116453492A (en) * 2023-06-16 2023-07-18 成都小唱科技有限公司 Method and device for switching jukebox airport scenes, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1976114A1 (en) * 2007-03-13 2008-10-01 Vestel Elektronik Sanayi ve Ticaret A.S. Automatic equalizer adjustment method
CN105959483A (en) * 2016-06-16 2016-09-21 广东欧珀移动通信有限公司 Audio stream processing method and mobile terminal
CN106488311A (en) * 2016-11-09 2017-03-08 微鲸科技有限公司 Audio method of adjustment and user terminal
CN108989871A (en) * 2018-06-27 2018-12-11 广州视源电子科技股份有限公司 Parameter adjusting method, device, readable storage medium storing program for executing and video playback apparatus
CN109271128A (en) * 2018-09-04 2019-01-25 Oppo广东移动通信有限公司 Audio setting method, device, electronic equipment and storage medium
CN111556198A (en) * 2020-04-24 2020-08-18 深圳传音控股股份有限公司 Sound effect control method, terminal equipment and storage medium

Also Published As

Publication number Publication date
WO2022111381A1 (en) 2022-06-02
CN114546325A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN110870201B (en) Audio signal adjusting method, device, storage medium and terminal
CN107509153B (en) Detection method and device of sound playing device, storage medium and terminal
CN104394491B (en) A kind of intelligent earphone, Cloud Server and volume adjusting method and system
CN114546325B (en) Audio processing method, electronic device, and readable storage medium
WO2020224322A1 (en) Method and device for processing music file, terminal and storage medium
CN109918039B (en) Volume adjusting method and mobile terminal
CN102160358A (en) Upstream signal processing for client devices in a small-cell wireless network
CN108668024B (en) Voice processing method and terminal
CN106126165B (en) A kind of audio stream processing method and mobile terminal
CN101271722A (en) Music broadcasting method and device
US11133024B2 (en) Biometric personalized audio processing system
US20220343929A1 (en) Personal audio assistant device and method
CN110062309A (en) Method and apparatus for controlling intelligent sound box
CN110430475A (en) A kind of interactive approach and relevant apparatus
WO2020228226A1 (en) Instrumental music detection method and apparatus, and storage medium
CN114245271A (en) Audio signal processing method and electronic equipment
WO2022267468A1 (en) Sound processing method and apparatus thereof
CN115985309A (en) Voice recognition method and device, electronic equipment and storage medium
CN111918174A (en) Method and device for balancing volume gain, electronic device and vehicle
CN113593602B (en) Audio processing method and device, electronic equipment and storage medium
CN114501297A (en) Audio processing method and electronic equipment
CN112612874A (en) Data processing method and device and electronic equipment
CN111739496A (en) Audio processing method, device and storage medium
KR20220107052A (en) Listening device, how to adjust the listening device
WO2022228174A1 (en) Rendering method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant