CN109961796B - Audio data processing method, device and storage medium - Google Patents

Publication number
CN109961796B
Authority
CN
China
Prior art keywords
data
transition
audio
overflow
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910190569.1A
Other languages
Chinese (zh)
Other versions
CN109961796A (en)
Inventor
赵伟峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Music Entertainment Technology Shenzhen Co Ltd
Original Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority to CN201910190569.1A
Publication of CN109961796A
Application granted
Publication of CN109961796B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used

Abstract

The embodiments of the application disclose an audio data processing method, an audio data processing device and a storage medium. When audio data is processed, overflow data to be processed is obtained from the audio data; transition data and a search interval of the overflow data are determined; similar data with the highest similarity to the transition data is searched for in the search interval; and the overflow data is replaced with the similar value corresponding to the overflow data in the similar data. Because the scheme repairs the overflow data of the audio with the data in the search interval that is most similar to the transition data of the overflow data, the original waveform of the audio data can be restored to the greatest extent and the continuity of the audio data is improved.

Description

Audio data processing method, device and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to an audio data processing method, apparatus, and storage medium.
Background
With the development of the internet, song recording software keeps emerging. Users often produce broken (clipped) sound when recording songs, so that the audio data overflows and is unpleasant to listen to. In the process of implementing the present invention, the inventor found that the prior art has at least the following problem: the current way of handling audio data overflow is to truncate the overflowing audio data directly to the maximum value or the minimum value, but this makes the sound inharmonious and discontinuous, and the sound quality is poor.
Disclosure of Invention
The embodiment of the application provides an audio data processing method, an audio data processing device and a storage medium, which can restore the original waveform of audio data to the greatest extent and improve the continuity of the audio data.
The embodiment of the application provides an audio data processing method, which comprises the following steps:
acquiring overflow data to be processed from the audio data;
respectively setting the length of the first transition data and the length of the second transition data;
determining first transition data before the overflow data according to the length of the first transition data, and determining second transition data after the overflow data according to the length of the second transition data;
determining a search interval for audio data preceding the first transition data;
searching similar data with the highest similarity to the transition data from the search interval;
and replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
Optionally, in some embodiments, in the audio data processing method, the transition data includes first transition data and second transition data, and the step of determining the transition data and the search interval of the overflow data includes:
respectively setting the length of the first transition data and the length of the second transition data;
determining first transition data before the overflow data according to the length of the first transition data, and determining second transition data after the overflow data according to the length of the second transition data;
determining a search interval in the audio data preceding the first transition data.
Optionally, in some embodiments, in the audio data processing method, the step of determining a search interval for the audio data before the first transition data includes:
taking the sum of the length of the first transition data, the length of the overflow data and the length of the second transition data as the data processing length of audio processing;
and acquiring an interval which is longer than the data processing length in the audio data before the first transition data as a search interval.
Optionally, in some embodiments, in the audio data processing method, the step of searching for similar data with the highest similarity to the transition data from the search interval includes:
searching similar data with the highest similarity to the first transition data and the second transition data from the search interval;
the replacing the overflow data with the similar value corresponding to the overflow data in the similar data specifically includes: judging whether the degree of similarity between the similar value of the similar data and the transition data reaches a preset threshold value, and if the degree of similarity reaches the preset threshold value, replacing the overflow data with the similar value corresponding to the overflow data in the similar data.
Optionally, in some embodiments, in the audio data processing method, after the step of replacing the overflow data with the similar value corresponding to the overflow data in the similar data, the method further includes:
and smoothing the transition data by adopting a preset algorithm.
Optionally, in some embodiments, in the audio data processing method, the smoothing processing on the transition data by using a preset algorithm includes:
acquiring a weighted value of the transition data to obtain a first weighted value, and acquiring a weighted value of the similar data to obtain a second weighted value;
summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data;
and replacing the transition data with the replacement value of the transition data.
Optionally, in some embodiments, in the audio data processing method, the step of obtaining overflow data to be processed from the audio data includes:
acquiring an audio file;
preprocessing the audio file to obtain audio data;
and searching audio data with the sound intensity exceeding a preset maximum value or a preset minimum value from the audio data to obtain overflow data to be processed.
Optionally, in some embodiments, in the audio data processing method, the step of preprocessing the audio file to obtain audio data includes:
converting the sound intensity of the audio file to obtain a conversion value;
and compressing the audio file by using the conversion numerical value to obtain audio data.
Correspondingly, an embodiment of the present application further provides an audio data processing apparatus, including:
the acquisition module is used for acquiring overflow data to be processed from the audio data;
the determining module is used for respectively setting the length of the first transition data and the length of the second transition data; determining first transition data before the overflow data according to the length of the first transition data, and determining second transition data after the overflow data according to the length of the second transition data; determining a search interval for audio data preceding the first transition data;
the searching module is used for searching similar data with the highest similarity to the transition data from the searching interval;
and the replacing module is used for replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
Optionally, in some embodiments, in the audio data processing apparatus, the determining module is specifically configured to use a sum of a length of the first transition data, a length of the overflow data, and a length of the second transition data as a data processing length of audio processing; and acquiring an interval which is longer than the data processing length in the audio data before the first transition data as a search interval.
Optionally, in some embodiments, in the audio data processing apparatus, the searching module is specifically configured to search, from the search interval, similar data with a highest similarity to the first transition data and the second transition data;
the replacing module is specifically configured to determine whether a similarity degree between the similar value of the similar data and the transition data reaches a preset threshold, and replace the overflow data with the similar value corresponding to the overflow data in the similar data if the similarity degree reaches the preset threshold.
Optionally, in some embodiments, the audio data processing apparatus further includes a smoothing module, as follows:
and the smoothing processing module is used for smoothing the transition data by adopting a preset algorithm.
Optionally, in some embodiments, in the audio data processing apparatus, the smoothing processing module is specifically configured to obtain a weight value of transition data to obtain a first weight value, and obtain a weight value of similar data to obtain a second weight value; summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data; and replacing the transition data with the replacement value of the transition data.
Optionally, in some embodiments, in the audio data processing apparatus, the obtaining module includes an obtaining sub-module, a preprocessing sub-module, and a searching sub-module, as follows:
the acquisition submodule is used for acquiring an audio file;
the preprocessing submodule is used for preprocessing the audio file to obtain audio data;
the searching submodule is used for searching the audio data with the sound intensity exceeding a preset maximum value or a preset minimum value from the audio data to obtain overflow data to be processed.
Optionally, in some embodiments, in the audio data processing apparatus,
the preprocessing submodule is specifically used for converting the sound intensity of the audio file to obtain a conversion numerical value; and compressing the audio file by using the conversion numerical value to obtain audio data.
In addition, a storage medium is further provided, where the storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to perform the steps in any one of the audio data processing methods provided in the embodiments of the present application.
In the embodiments of the application, when audio data is processed, overflow data to be processed is obtained from the audio data; the length of the first transition data and the length of the second transition data are set respectively; the first transition data before the overflow data is determined according to the length of the first transition data, and the second transition data after the overflow data is determined according to the length of the second transition data; a search interval is determined in the audio data preceding the first transition data; similar data with the highest similarity to the transition data is searched for in the search interval; and the overflow data is replaced with the similar value corresponding to the overflow data in the similar data. Because the scheme repairs the overflow data of the audio with the data in the search interval that is most similar to the transition data of the overflow data, the original waveform of the audio data can be restored to the greatest extent and the continuity of the audio data is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1a is a schematic view of a scene of an audio data processing method provided in an embodiment of the present application;
Fig. 1b is a schematic diagram of a first flow chart of an audio data processing method according to an embodiment of the present application;
Fig. 1c is a schematic diagram of overflow data of an audio data processing method according to an embodiment of the present application;
Fig. 2a is a second flowchart of an audio data processing method provided by an embodiment of the present application;
Fig. 2b is a schematic diagram of transition data of an audio data processing method according to an embodiment of the present application;
Fig. 3a is a schematic diagram of a first structure of an audio data processing apparatus according to an embodiment of the present application;
Fig. 3b is a schematic diagram of a second structure of an audio data processing apparatus according to an embodiment of the present application;
Fig. 3c is a schematic diagram of a third structure of an audio data processing apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a network device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The embodiment of the application provides an audio data processing method, an audio data processing device and a storage medium.
For example, referring to Fig. 1a, when broken sound (clipping) occurs while a user is recording or mixing audio, the network device is triggered to process the audio data. The network device may obtain overflow data to be processed from the audio data, determine transition data and a search interval of the overflow data, search the search interval for similar data with the highest similarity to the transition data, and then replace the overflow data with the similar value corresponding to the overflow data in the similar data.
Optionally, a preset algorithm may then be used to smooth the transition data to obtain a better auditory effect.
Details are given below. The order of the following embodiments is not intended to limit their order of preference.
This embodiment is described from the perspective of an audio data processing apparatus, which may be integrated in a network device such as a terminal or a server; the terminal may include a tablet computer, a notebook computer, a personal computer (PC), and the like.
The embodiment of the application provides an audio data processing method, which comprises the following steps: acquiring overflow data to be processed from the audio data; determining transition data and a search interval of the overflow data, for example, the length of first transition data and the length of second transition data may be set respectively, determining first transition data before the overflow data according to the length of the first transition data, determining second transition data after the overflow data according to the length of the second transition data, and determining the search interval in audio data before the first transition data; searching similar data with the highest similarity to the transition data from the search interval; and replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
As shown in fig. 1b, the specific flow of the audio data processing method may be as follows:
101. Acquiring overflow data to be processed from the audio data.
For example, when the audio data needs to be processed, an offline audio file may be obtained for a discrete processing system, and an audio file input into a buffer may be obtained for a real-time processing system, and the obtained audio file is preprocessed to obtain audio data, and then, the audio data whose sound intensity exceeds a preset maximum value or a preset minimum value is searched from the audio data to obtain overflow data to be processed.
The overflow data refers to audio data whose sound intensity lies outside a preset range, and there are two types: overflow and underflow. For example, as shown in Fig. 1c, overflow means that the sound intensity of the audio data exceeds a preset maximum value, and underflow means that the sound intensity of the audio data falls below a preset minimum value. The preset maximum value and the preset minimum value may be set in various ways; for example, they may be preset in the system or set as needed during processing.
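The detection of overflow data described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `find_overflow_runs`, the default limits of 1.0 and -1.0, and the choice to group consecutive out-of-range samples into (start, end) runs are all assumptions made for the example.

```python
def find_overflow_runs(x, max_value=1.0, min_value=-1.0):
    """Return (start, end) index pairs, end exclusive, for each run of
    consecutive samples above max_value or below min_value.

    The limits are hypothetical defaults; the source only says they are
    preset in the system or set as needed during processing.
    """
    runs = []
    start = None
    for i, sample in enumerate(x):
        out_of_range = sample > max_value or sample < min_value
        if out_of_range and start is None:
            start = i                      # a new run begins here
        elif not out_of_range and start is not None:
            runs.append((start, i))        # the run ended at the previous sample
            start = None
    if start is not None:                  # run reaches the end of the data
        runs.append((start, len(x)))
    return runs
```

Each returned pair marks one segment of overflow data, so later steps can take the samples before `start` and after `end` as transition data.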
The preprocessing can be set according to the requirements of practical application, for example, for convenience of reading and writing, convenience of operation and convenience of estimation, the acquired audio file can be compressed to reduce data, and the like, that is, the step "preprocessing the acquired audio file to obtain audio data" can include:
(1) Converting the sound intensity of the audio file to obtain a conversion value.
For example, d may be set in dB (decibel), and converted by a first preset formula to obtain a converted value s, where the first preset formula is as follows:
s=10.^(d/20)
where d is the decibel value of the sound intensity and s is the converted value.
(2) Compressing the audio file by using the conversion value to obtain audio data.
For example, a second preset formula may be adopted to perform unified processing on the audio data, where the second preset formula is as follows:
x1=x(n)*s
where x(n) is the data of the audio file, with total length N, and x1 is the processed audio data.
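The two preset formulas above can be illustrated with a short sketch. The function names `db_to_gain` and `scale_samples` are hypothetical, and representing the audio data as a plain list of floats is an assumption made for the example.

```python
def db_to_gain(d):
    """First preset formula: s = 10^(d/20), with d in decibels."""
    return 10.0 ** (d / 20.0)


def scale_samples(x, d):
    """Second preset formula: x1 = x(n) * s, applied to every sample."""
    s = db_to_gain(d)
    return [sample * s for sample in x]
```

For a negative d the factor s is less than 1, so every sample is scaled down by the same amount.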
102. Determining transition data and a search interval of the overflow data.
For example, the transition data of the overflow data may be determined. The transition data refers to the data immediately before and after the overflow data and may include first transition data and second transition data. For example, the length of the first transition data and the length of the second transition data may be set respectively; the first transition data before the overflow data is determined according to the length of the first transition data, the second transition data after the overflow data is determined according to the length of the second transition data, and the search interval is determined in the audio data before the first transition data.
The search interval is the interval of the audio data in which the similar data is searched for. There are many ways to determine the search interval; for example, it may be determined according to the data processing length or in other ways. That is, the step of "determining a search interval in the audio data preceding the first transition data" may include: taking the sum of the length of the first transition data, the length of the overflow data and the length of the second transition data as the data processing length of the audio processing; and acquiring, in the audio data preceding the first transition data, an interval longer than the data processing length as the search interval. The length of the search interval is therefore greater than the data processing length.
Correspondingly, after the length of the search interval is determined, it is judged whether the audio data before the first transition data is at least as long as the search interval. If the length of the audio data before the first transition data is greater than or equal to the length of the search interval, step 103 is executed; if the length of the audio data before the first transition data is smaller than the length of the search interval, the system reports an error, does not process the audio data, and ends the audio data processing task.
For example, the length of the first transition data may be set to obtain a first length, and the length of the second transition data may be set to obtain a second length; audio data of the first length is intercepted from the data segment before the overflow data to obtain the first transition data, and audio data of the second length is intercepted from the data segment after the overflow data to obtain the second transition data; and audio data of a preset length is intercepted from the data segment before the first transition data to obtain the search interval.
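One way to picture the interception of the transition data and the search interval is the sketch below. It assumes the search interval is taken immediately before the first transition data, which the text does not state explicitly, and it returns `None` where the text describes the system reporting an error. The name `split_regions` and its parameters are hypothetical.

```python
def split_regions(x, start, end, m, n, seek_len):
    """Cut out the regions around one overflow run x[start:end].

    m, n     : lengths of the first and second transition data
    seek_len : length of the search interval (must exceed m + c + n)
    Returns a dict of slices, or None if the data cannot be processed.
    """
    first_begin = start - m
    second_end = end + n
    l_real = m + (end - start) + n          # data processing length
    if first_begin < 0 or second_end > len(x):
        return None                         # not enough data for the transitions
    if seek_len <= l_real:
        return None                         # search interval must be longer than l_real
    if first_begin < seek_len:
        return None                         # too little audio before the first transition
    return {
        "first": x[first_begin:start],
        "overflow": x[start:end],
        "second": x[end:second_end],
        # Assumption: the interval ends where the first transition begins.
        "search": x[first_begin - seek_len:first_begin],
    }
```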
A maximum amount of overflow data that can be processed at a single time, that is, the maximum number of sampling points that the audio data processing apparatus can process at a single time, may also be set. If the actual overflow data is less than or equal to the maximum overflow data, the actual overflow data is processed; if the actual overflow data is greater than the maximum overflow data, the apparatus does not process it and searches for the next segment of overflow data, and so on.
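The limit on how much overflow data is handled at once can be sketched as a simple filter. The representation of overflow segments as (start, end) pairs with an exclusive end, and the name `select_processable_runs`, are assumptions made for this example.

```python
def select_processable_runs(runs, c_max):
    """Keep only the overflow runs whose length does not exceed c_max.

    runs  : list of (start, end) index pairs, end exclusive
    c_max : maximum number of sampling points processed at a single time
    """
    return [(start, end) for start, end in runs if end - start <= c_max]
```

Runs longer than `c_max` are simply skipped here, mirroring the behaviour in which the apparatus does not process them and moves on to the next segment.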
103. Searching the search interval for similar data with the highest similarity to the transition data.
For example, the search interval may be searched for the data with the highest similarity to the first transition data and the second transition data; that is, the similar data is the data with the highest similarity to both the first transition data and the second transition data.
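The text here does not name a similarity measure, so the following sketch swaps in one plausible choice purely for illustration: the mean squared difference between the candidate window and the two transition segments, where a smaller error means a higher similarity. The function name `find_most_similar` and the sliding-window layout are likewise assumptions.

```python
def find_most_similar(search, first, second, c):
    """Slide a window of length len(first) + c + len(second) over `search`.

    The head of the window is compared with the first transition data and
    the tail with the second transition data; the c samples in between are
    the candidate replacement for the overflow data and are not compared.
    Returns (best_offset, best_error), or (None, inf) if no window fits.
    """
    m, n = len(first), len(second)
    if m + n == 0:
        raise ValueError("at least one transition segment must be non-empty")
    window = m + c + n
    best_offset, best_error = None, float("inf")
    for k in range(len(search) - window + 1):
        cand = search[k:k + window]
        err = sum((a - b) ** 2 for a, b in zip(cand[:m], first))
        err += sum((a - b) ** 2 for a, b in zip(cand[m + c:], second))
        err /= (m + n)                      # mean squared difference (assumed metric)
        if err < best_error:
            best_offset, best_error = k, err
    return best_offset, best_error
```

Other measures, such as a normalized cross-correlation, would fit the same structure; only the computation of `err` would change.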
104. Replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
For example, it may be determined whether the degree of similarity between the similar value of the similar data and the transition data reaches a preset threshold. If it does, the overflow data is replaced with the similar value corresponding to the overflow data in the similar data; if it does not, the overflow data is considered unprocessable, the flow ends directly, and processing may move on to the next segment of overflow data.
The preset threshold may be set in various manners, for example, the preset threshold may be flexibly set according to the requirements of the actual application, or may be preset and stored in the network device. In addition, the preset threshold may be built in the network device, or may be stored in the memory and transmitted to the network device, and so on.
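The threshold check and replacement can be sketched as below. The sketch assumes a similarity score where larger means more similar, and assumes the matching window is laid out as first transition, overflow candidate, second transition; the name `replace_overflow` and all parameters are hypothetical.

```python
def replace_overflow(x, start, end, match_start, m, similarity, threshold):
    """Return a repaired copy of x, or None if the match is not good enough.

    x[start:end] : the overflow data
    match_start  : index in x where the most similar window begins
    m            : length of the first transition data
    similarity   : score of the best match (assumed: larger is more similar)
    threshold    : preset threshold the score must reach
    """
    if similarity < threshold:
        return None                          # treated as unprocessable
    c = end - start
    src = match_start + m                    # samples aligned with the overflow
    repaired = list(x)
    repaired[start:end] = x[src:src + c]
    return repaired
```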
After the overflow data is replaced with the similar value corresponding to the overflow data in the similar data, the flow may end without further processing, or, for a better auditory effect, the transition data may be smoothed. That is, after the step of replacing the overflow data with the similar value corresponding to the overflow data in the similar data, the method may further include: smoothing the transition data by adopting a preset algorithm.
Smoothing the transition data by adopting a preset algorithm may include: obtaining a weight value of the transition data to obtain a first weight value, and obtaining a weight value of the similar data to obtain a second weight value; summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data; and replacing the transition data with the replacement value of the transition data.
For example, the preset algorithm may adopt a third preset formula to smooth the transition data, where the third preset formula is as follows:
T3=r*T1+(1-r)*T2
where T1 is the transition data within the data processing length, T2 is the similar data in the search interval corresponding to the transition data, T3 is the replacement value of the transition data, r is the first weight value, 1-r is the second weight value, and r is a pure decimal (that is, a positive decimal less than 1, whose integer part is zero).
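The third preset formula is a per-sample weighted sum, which can be sketched as follows. The function name `smooth_transition` and the default r = 0.5 are assumptions; the text only requires r to be a positive decimal less than 1.

```python
def smooth_transition(t1, t2, r=0.5):
    """Third preset formula applied per sample: T3 = r*T1 + (1-r)*T2.

    t1 : transition data within the data processing length
    t2 : the corresponding similar data from the search interval
    r  : first weight value, 0 < r < 1 (0.5 is a hypothetical default)
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must be a positive decimal less than 1")
    if len(t1) != len(t2):
        raise ValueError("t1 and t2 must have the same length")
    return [r * a + (1.0 - r) * b for a, b in zip(t1, t2)]
```

A larger r keeps the result closer to the original transition data, while a smaller r pulls it toward the similar data.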
If the system that acquires the audio file is a discrete processing system, the overflow data of the acquired offline audio file may be searched for and processed one segment at a time, with the next segment searched for after the current one has been processed; alternatively, the overflow data of the acquired offline audio file may be processed in parallel. For a real-time processing system, the overflow data of the audio file input into the buffer may be processed each time. That is, the step "obtaining overflow data to be processed from the audio data" may include:
and acquiring overflow data which needs to be processed currently from the audio data.
Then, after the step of "smoothing the transition data using a preset algorithm", the audio data processing method may further include:
and after the transition data is subjected to smoothing processing, returning to the step of acquiring the overflow data needing to be processed currently from the audio data until all the overflow data in the audio data are processed.
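The loop that returns to the next overflow data until all of it has been handled might be organised as in the sketch below. The `is_overflow` and `repair` callbacks are hypothetical stand-ins for the detection step and for the search, replace and smooth steps; `repair` is assumed to return a repaired copy of the same length, or `None` to skip a segment.

```python
def process_all_overflows(x, is_overflow, repair):
    """Visit each run of overflowing samples from left to right.

    is_overflow(sample)   -> bool
    repair(x, start, end) -> repaired copy of x (same length) or None to skip
    Returns (processed_data, number_of_repaired_runs).
    """
    x = list(x)
    i = 0
    repaired_count = 0
    while i < len(x):
        if not is_overflow(x[i]):
            i += 1
            continue
        start = i
        while i < len(x) and is_overflow(x[i]):
            i += 1                          # advance to the end of this run
        result = repair(x, start, i)
        if result is not None:
            x = result
            repaired_count += 1
    return x, repaired_count
```

As a usage illustration, `process_all_overflows(samples, lambda s: abs(s) > 1.0, my_repair)` would walk through `samples` once, where `my_repair` is whatever repair routine the caller supplies.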
As can be seen from the above, in the embodiment, when the audio data is processed, overflow data to be processed can be obtained from the audio data; determining transition data and a search interval of the overflow data; searching similar data with the highest similarity to the transition data from the search interval; replacing the overflow data with a similar value corresponding to the overflow data in the similar data; according to the scheme, the similar data with the highest similarity to the transition data of the overflow data is searched in the search interval to repair the overflow data of the audio, the original waveform of the audio data can be restored to the maximum extent, the transition data is subjected to smooth processing based on the continuity of music, the continuity of the audio data can be improved, the audio data can better accord with the hearing of a user, and better user experience can be obtained.
The following will describe the method according to the foregoing embodiment in further detail by way of example, in which the audio data processing apparatus is specifically integrated in a network device.
As shown in fig. 2a, a specific flow of an audio data processing method may be as follows:
201. The network device obtains the audio file.
For example, when audio data needs to be processed, an offline audio file may be obtained for a discrete processing system, and an audio file input into an incoming buffer may be obtained for a real-time processing system. In the present embodiment, description will be made mainly from the viewpoint of a real-time processing system.
202. The network device preprocesses the acquired audio file to obtain audio data.
The preprocessing can be set according to the requirements of practical application, for example, for convenience of reading and writing, operation and estimation, the acquired audio file can be compressed to reduce data, for example:
(1) Converting the sound intensity of the audio file to obtain a conversion value.
For example, d may be set to a value in the interval [-200, 0), in dB (decibels). Taking d = -1 and converting it with the first preset formula gives a conversion value s of approximately 0.89. The first preset formula is as follows:
s=10.^(d/20)
where d is the decibel value of the sound intensity and s is the converted value.
(2) Compressing the audio file by using the conversion value to obtain audio data.
For example, a second preset formula may be adopted to perform unified processing on the audio data, where the second preset formula is as follows:
x1=x(n)*s
where x(n) is the data of the audio file, with total length N, and x1 is the processed audio data.
203. The network device searches the audio data for audio data whose sound intensity exceeds a preset maximum value or falls below a preset minimum value to obtain the overflow data that currently needs to be processed.
The overflow data refers to audio data whose sound intensity lies outside a preset range and can be denoted Creal. There are generally two types: overflow, which means that the sound intensity of the audio data exceeds a preset maximum value, and underflow, which means that the sound intensity of the audio data falls below a preset minimum value. The preset maximum value and the preset minimum value may be set in various ways; for example, they may be preset in the system or set as needed during processing. For example, the actual continuous overflow data Creal being processed may be 5 sampling points.
204. The network device determines transition data and a search interval of the overflow data.
For example, transition data of the overflow data may be specifically determined, the transition data refers to data before and after the overflow data, the transition data may include first transition data and second transition data, for example, as shown in fig. 2b, the length M of the first transition data may be set to 10, the length N of the second transition data may be set to 12, 10 sampling points before the overflow data may be determined as the first transition data M1 according to the length of the first transition data, and 12 sampling points after the overflow data may be determined as the second transition data N1 according to the length of the second transition data, and the audio data before the first transition data may determine the search interval Lseek.
The data lengths of the first transition data M1, the overflow data Creal, and the second transition data N1 may together be used as the data processing length Lreal of the audio processing, i.e., Lreal = M1 + Creal + N1 = 10 + 5 + 12 = 27. An interval longer than the data processing length is then acquired from the audio data before the first transition data as the search interval Lseek, that is, Lseek > Lreal.
Correspondingly, after the length of the search interval is determined, it is judged whether the length of the audio data before the first transition data is greater than or equal to the length of the search interval. If the length of the audio data before the first transition data is greater than or equal to the length of the search interval, step 205 is executed; if the length of the audio data before the first transition data is smaller than the length of the search interval, the system reports an error, does not process the audio data, and ends the audio data processing task.
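The length bookkeeping in this step can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the search interval length of 60 used in the comments and checks is an assumed value, since the specification only requires that the search interval be longer than the data processing length.

```python
def data_processing_length(m, c_real, n):
    """Data processing length: first transition + overflow + second transition,
    all measured in sampling points."""
    return m + c_real + n


def can_search(overflow_start, m, l_seek, l_real):
    """Check the two conditions from the text: the search interval is longer
    than the data processing length, and at least l_seek samples exist
    before the first transition data."""
    history = overflow_start - m   # samples preceding the first transition data
    return l_seek > l_real and history >= l_seek


# Values from the example: M = 10, Creal = 5, N = 12.
l_real = data_processing_length(10, 5, 12)
```

With an overflow starting at sample 100, `can_search(100, 10, 60, l_real)` holds because 90 samples precede the first transition data; with an overflow starting at sample 50 only 40 samples are available, which corresponds to the error branch described above.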
A maximum amount of overflow data Cmax that can be processed at a single time may also be set. For example, Cmax may be 7 sampling points, that is, the audio data processing apparatus can process at most 7 consecutive overflowing sampling points at a time. If the actual overflow data is less than or equal to the maximum overflow data, the actual overflow data is processed; if the actual overflow data is greater than the maximum overflow data, the apparatus does not process it and searches for the next segment of overflow data, and so on.
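The Cmax rule can be expressed as a simple filter over detected overflow segments. In the sketch below, each segment is represented as a hypothetical (start, end) index pair with an exclusive end; this representation and the function name `select_processable` are assumptions made for illustration.

```python
def select_processable(runs, c_max):
    """Keep only overflow segments whose length does not exceed c_max.
    Each segment is a (start, end) pair with end exclusive."""
    return [(start, end) for (start, end) in runs if end - start <= c_max]


# Three segments of 5, 9 and 7 sampling points, with Cmax = 7.
runs = [(40, 45), (90, 99), (150, 157)]
kept = select_processable(runs, 7)
```

The 9-point segment is skipped, while the 5-point and 7-point segments remain eligible for repair.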
205. The network device searches the search interval for the similar data with the highest similarity to the transition data.
For example, the similar data Lsim with the highest similarity to the first transition data and the second transition data may be searched for in the search interval, where the similar data refers to the data whose similarity to both the first transition data and the second transition data is the highest.
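This passage does not fix a particular similarity measure, so the following Python sketch assumes cosine similarity computed over the two transition parts only. It slides a window of the data processing length over the search interval and keeps the best-scoring position. The function names, the choice of cosine similarity, and the test signal are all assumptions, not the patented method itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length sample vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def find_most_similar(samples, c_start, c_end, m, n, l_seek):
    """Return (best_start, best_score) for the window in the search interval
    whose leading m and trailing n samples best match the transition data."""
    c_len = c_end - c_start
    l_real = m + c_len + n
    seek_end = c_start - m            # the search interval ends where M1 begins
    seek_start = seek_end - l_seek
    if seek_start < 0 or l_seek <= l_real or c_end + n > len(samples):
        raise ValueError("not enough audio data around the overflow")
    target = samples[c_start - m:c_start] + samples[c_end:c_end + n]
    best_start, best_score = None, -2.0
    for k in range(seek_start, seek_end - l_real + 1):
        cand = samples[k:k + m] + samples[k + m + c_len:k + l_real]
        score = cosine_similarity(cand, target)
        if score > best_score:
            best_start, best_score = k, score
    return best_start, best_score


# Periodic test signal (period 8) with three clipped samples at 50..52.
pattern = [0, 3, 5, 3, 0, -3, -5, -3]
audio = pattern * 8
audio[50:53] = [9, 9, 9]
best_start, best_score = find_most_similar(audio, 50, 53, m=4, n=4, l_seek=20)
```

In this example the best window starts two periods before the first transition data, which is where the waveform repeats. The samples of the overflow itself are deliberately excluded from the score because they are the damaged part.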
206. The network device replaces the overflow data with a similar value in the similar data corresponding to the overflow data.
For example, it may be determined whether the degree of similarity between the similar values of the similar data and the transition data reaches a preset threshold. If the degree of similarity reaches the preset threshold, the overflow data is replaced with the similar value corresponding to the overflow data in the similar data, where the similar value corresponding to the first transition data is M2 and the similar value corresponding to the second transition data is N2. If the degree of similarity does not reach the preset threshold, the overflow data is considered unprocessable, the process ends directly, and the next segment of overflow data may be processed.
The preset threshold may be set in various manners, for example, the preset threshold may be flexibly set according to the requirements of the actual application, or may be preset and stored in the network device. In addition, the preset threshold may be built in the network device, or may be stored in the memory and transmitted to the network device, and so on. For example, the preset threshold may be set to 70%.
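The threshold-gated replacement can be sketched as follows. The code assumes that a similarity score between 0 and 1 and the start index of the similar data are already known; the function name `replace_overflow`, its parameters, and the test signal are hypothetical.

```python
def replace_overflow(samples, c_start, c_end, m, best_start, score, threshold=0.7):
    """Copy the part of the similar data aligned with the overflow over the
    overflow samples if score reaches the threshold. Returns True when the
    replacement was made and False when the segment is left untouched."""
    if score < threshold:
        return False               # treated as unprocessable; the caller moves on
    c_len = c_end - c_start
    src = best_start + m           # similar values aligned with the overflow
    samples[c_start:c_end] = samples[src:src + c_len]
    return True


# Periodic signal (period 8) with three clipped samples at 50..52; the similar
# data is assumed to start at index 30 with a similarity of 1.0.
pattern = [0, 3, 5, 3, 0, -3, -5, -3]
audio = pattern * 8
audio[50:53] = [9, 9, 9]
done = replace_overflow(audio, 50, 53, 4, best_start=30, score=1.0)
```

Because the example signal is exactly periodic, the replacement restores the original waveform; with a score below the 70% threshold the function leaves the samples unchanged.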
207. The network device smooths the transition data by using a preset algorithm.
Smoothing the transition data by using a preset algorithm may include: acquiring a weight value of the transition data to obtain a first weight value, and acquiring a weight value of the similar data to obtain a second weight value; summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data; and replacing the transition data with the replacement value of the transition data.
The preset algorithm may adopt a third preset formula to perform smoothing processing on the transition data, where the third preset formula is as follows:
T3=r*T1+(1-r)*T2
where T1 is the transition data within the data processing length, T2 is the similar data corresponding to the transition data in the search interval, T3 is the replacement value of the transition data, r is the first weight value, and 1-r is the second weight value. Here r is a pure decimal (i.e., a positive decimal less than 1, whose integer part is zero); for example, r may be chosen as 0.5.
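The third preset formula is a sample-by-sample weighted average of the transition data and the matching similar data. A minimal Python sketch follows; the function name `smooth_transition` and the sample values are assumptions chosen for illustration.

```python
def smooth_transition(transition, similar, r=0.5):
    """Compute T3 = r*T1 + (1-r)*T2 sample by sample."""
    if not 0 < r < 1:
        raise ValueError("r must be a positive fraction smaller than 1")
    if len(transition) != len(similar):
        raise ValueError("T1 and T2 must have the same length")
    return [r * t1 + (1 - r) * t2 for t1, t2 in zip(transition, similar)]


# Three transition samples blended with their similar counterparts, r = 0.5.
t3 = smooth_transition([1.0, 0.5, -0.25], [0.5, 0.25, 0.25], r=0.5)
```

With r = 0.5 each replacement value is simply the midpoint of the two inputs; a larger r keeps the result closer to the original transition data.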
208. The network device returns to the step of acquiring the overflow data that currently needs to be processed from the audio data, until all the overflow data in the audio data has been processed.
For a real-time processing system, the overflow data of the audio file input into the buffer may be processed each time; that is, after the network device finishes smoothing the transition data, it returns to the step of acquiring the overflow data that currently needs to be processed from the audio data, until all the overflow data in the audio data has been processed.
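The "return and repeat" control flow of step 208 can be sketched as a loop that alternates between finding the next overflow segment and repairing it. In the Python sketch below, the detection and repair routines are passed in as callbacks; `repair_all`, `next_run`, and `zero_fill` are hypothetical names, and `zero_fill` is only a stand-in that overwrites the segment with zeros so that the loop can run. It is not the similarity-based repair described in this embodiment.

```python
def repair_all(samples, find_next, repair_one):
    """Repeat 'find the next overflow segment, repair it' until none is left.
    find_next(samples, pos) returns (start, end) or None;
    repair_one(samples, start, end) modifies samples in place."""
    pos = 0
    handled = 0
    while True:
        run = find_next(samples, pos)
        if run is None:
            return handled
        start, end = run
        repair_one(samples, start, end)
        handled += 1
        pos = end      # continue after this segment


def next_run(samples, pos, limit=1.0):
    """Find the next run of samples whose magnitude exceeds limit."""
    i = pos
    while i < len(samples) and abs(samples[i]) <= limit:
        i += 1
    if i == len(samples):
        return None
    j = i
    while j < len(samples) and abs(samples[j]) > limit:
        j += 1
    return (i, j)


def zero_fill(samples, start, end):
    """Stand-in repair used only to make the loop runnable."""
    samples[start:end] = [0.0] * (end - start)


audio = [0.2, 1.4, 1.6, 0.3, -1.2, 0.1]
count = repair_all(audio, next_run, zero_fill)
```

Advancing `pos` past each segment ensures the loop terminates even when a segment cannot be repaired and is skipped.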
In a discrete processing system, the processing may be performed as in the above embodiment, or all the overflow data may be processed in parallel at one time.
As can be seen from the above, when processing audio data, the network device of this embodiment may obtain overflow data to be processed from the audio data; determine the transition data and the search interval of the overflow data; search the search interval for the similar data with the highest similarity to the transition data; and replace the overflow data with the similar value corresponding to the overflow data in the similar data. Because this scheme repairs the overflow data of the audio by searching the search interval for the similar data most similar to the transition data of the overflow data, the original waveform of the audio data can be restored to the greatest extent and the continuity of the audio data is improved, so that the audio data better matches the user's auditory perception and a better user experience can be obtained.
In order to better implement the audio data processing method provided by the embodiment of the present application, an embodiment of the present application further provides an audio data processing apparatus, which may be specifically integrated in a network device such as a mobile phone, a tablet computer, or a palmtop computer. The terms used here have the same meanings as in the above audio data processing method, and for implementation details reference may be made to the description in the method embodiment.
For example, as shown in fig. 3a, the audio data processing apparatus may comprise an obtaining module 301, a determining module 302, a finding module 303, and a replacing module 304, as follows:
(1) an obtaining module 301;
an obtaining module 301, configured to obtain overflow data to be processed from the audio data.
The overflow data refers to audio data whose sound intensity exceeds a preset value, and there are two types: overflow, in which the sound intensity of the audio data exceeds a preset maximum value, and underflow, in which the sound intensity of the audio data falls below a preset minimum value. The preset maximum value and the preset minimum value may be set in various ways; for example, they may be preset in the system or set as needed during processing.
For example, as shown in fig. 3b, the obtaining module 301 may obtain the overflow data to be processed from the audio data in many ways; in some embodiments, the obtaining module 301 may include an obtaining sub-module 3011, a preprocessing sub-module 3012, and a searching sub-module 3013:
the obtaining sub-module 3011 may be configured to obtain an audio file.
The preprocessing submodule 3012 may be configured to perform preprocessing on the audio file to obtain audio data.
The searching sub-module 3013 may be configured to search, from the audio data, audio data with a sound intensity exceeding a preset maximum value or a preset minimum value, and obtain overflow data to be processed.
For example, when the audio data needs to be processed, an offline audio file may be obtained for a discrete processing system, an audio file input into a buffer may be obtained for a real-time processing system, the obtained audio file is preprocessed to obtain audio data, and then, the audio data whose sound intensity exceeds a preset maximum value or a preset minimum value is searched from the audio data to obtain overflow data to be processed.
(2) A determination module 302;
a determining module 302, configured to determine the transition data and the search interval of the overflow data. For example, the determining module may be configured to set the length of the first transition data and the length of the second transition data respectively; determine the first transition data before the overflow data according to the length of the first transition data, and determine the second transition data after the overflow data according to the length of the second transition data; and determine a search interval in the audio data preceding the first transition data.
For example, the determining module 302 may be configured to determine transition data of the overflow data, where the transition data refers to data before and after the overflow data, and the transition data may include first transition data and second transition data. The determining module 302 may be used to determine the search interval in various ways, for example, the search interval may be determined according to the length of data processing, or the search interval may be determined according to other ways.
When the search interval is determined in the audio data before the first transition data, the sum of the length of the first transition data, the length of the overflow data, and the length of the second transition data may be used as the data processing length of the audio processing; an interval longer than the data processing length is then acquired from the audio data preceding the first transition data as the search interval.
After the length of the search interval is determined, it is judged whether the length of the audio data before the first transition data is greater than or equal to the length of the search interval. If the length of the audio data before the first transition data is greater than or equal to the length of the search interval, the searching module 303 is triggered to execute the search operation; if the length of the audio data before the first transition data is smaller than the length of the search interval, the determining module 302 reports an error, does not process the audio data, and ends the audio data processing task.
A maximum amount of overflow data that can be processed at a single time, that is, the maximum number of consecutive sampling points that the audio data processing apparatus can process at a single time, may also be set. If the actual overflow data is less than or equal to the maximum overflow data, the actual overflow data is processed; if the actual overflow data is greater than the maximum overflow data, the apparatus does not process it and searches for the next segment of overflow data, and so on.
(3) A search module 303;
the searching module 303 is configured to search for similar data with the highest similarity to the transition data from the search interval.
For example, the searching module 303 may be specifically configured to search the search interval for the similar data with the highest similarity to the first transition data and the second transition data, where the similar data refers to the data whose similarity to both the first transition data and the second transition data is the highest.
(4) A replacement module 304;
a replacing module 304, configured to replace the overflow data with a similar value corresponding to the overflow data in the similar data.
For example, the replacing module 304 may be configured to determine whether a similarity degree between the similar value of the similar data and the transition data reaches a preset threshold, and if the similarity degree reaches the preset threshold, replace the overflow data with the similar value corresponding to the overflow data in the similar data; if the similarity degree of the similar value of the similar data and the transition data does not reach the preset threshold value, the processing is considered to be impossible, the process is directly finished, and the processing of the next section of overflow data can be carried out.
The preset threshold may be set in various manners, for example, the preset threshold may be flexibly set according to the requirements of the actual application, or may be preset and stored in the network device. In addition, the preset threshold may be built in the network device, or may be stored in the memory and transmitted to the network device, and so on.
For example, as shown in fig. 3c, for better hearing effect, the transition data may be smoothed, i.e. the audio processing apparatus may further comprise a smoothing module 305, as follows:
the smoothing module 305 may be configured to smooth the transition data by using a preset algorithm.
The preset algorithm may be set according to the requirements of the practical application, for example, a third preset formula may be adopted to perform smoothing processing on the transition data, where the third preset formula is as follows:
T3=r*T1+(1-r)*T2
wherein, T1 is transition data in the data processing length, T2 is similar data corresponding to the transition data in the search interval, and r is a pure decimal.
If the system for acquiring the audio file is a discrete processing system, the overflow data of the acquired offline audio file can be searched first, the searched overflow data is processed, the next overflow data is searched after the processing is finished, and the overflow data of the acquired offline audio file can be processed in parallel; for a real-time processing system, overflow data of an audio file input into a buffer each time can be processed, that is, the obtaining module 301 can be used to obtain overflow data that needs to be processed currently from the audio data. After the smoothing module 305 finishes smoothing the transition data, the obtaining module 301 may be triggered to perform an operation of obtaining overflow data that needs to be processed currently from the audio data until all overflow data in the audio data are processed.
It will be appreciated by a person skilled in the art that the audio data processing device shown in fig. 3a does not constitute a limitation of the device, which may comprise more or fewer components than those shown, may combine some components, or may use a different arrangement of components. In addition, it should be noted that the specific implementation of each unit may refer to the foregoing method embodiment, and is not described herein again.
As can be seen from the above, the obtaining module 301 of the audio data processing apparatus of this embodiment can obtain overflow data to be processed from the audio data; the determining module 302 then determines the transition data and the search interval of the overflow data; next, the searching module 303 searches the search interval for the similar data with the highest similarity to the transition data; and the replacing module 304 then replaces the overflow data with the similar value corresponding to the overflow data in the similar data. Because this scheme repairs the overflow data of the audio by searching the search interval for the similar data most similar to the transition data of the overflow data, the original waveform of the audio data can be restored to the greatest extent and the continuity of the audio data is improved, so that the audio data better matches the user's auditory perception and a better user experience can be obtained.
Correspondingly, the embodiment of the invention also provides a network device, which can be a server or a terminal and the like, and integrates any audio data processing device provided by the embodiment of the invention. Fig. 4 is a schematic diagram illustrating a network device according to an embodiment of the present invention, specifically:
the network device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the network device architecture shown in fig. 4 does not constitute a limitation of network devices and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the network device, connects various parts of the entire network device by using various interfaces and lines, and performs various functions of the network device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the network device. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by operating the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the network device, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The network device further includes a power supply 403 for supplying power to each component, and preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that functions of managing charging, discharging, and power consumption are implemented through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The network device may also include an input unit 404, where the input unit 404 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the network device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the network device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, thereby implementing various functions as follows:
when the audio data are processed, overflow data to be processed can be obtained from the audio data; determining transition data and a search interval of the overflow data; searching similar data with the highest similarity to the transition data from the search interval; and replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
For example, taking transition data including first transition data and second transition data as an example, a length of the first transition data and a length of the second transition data may be specifically set, and then, on one hand, first transition data before the overflow data may be determined according to the length of the first transition data, and second transition data after the overflow data may be determined according to the length of the second transition data, and on the other hand, a search interval may be determined in audio data before the first transition data, for example, a sum of the length of the first transition data, the length of the overflow data, and the length of the second transition data may be used as a data processing length of audio processing, and then, an interval larger than the data processing length may be obtained in the audio data before the first transition data as a search interval, and so on.
Optionally, searching for similar data with the highest similarity to the transition data from the search interval may specifically include: and searching similar data with the highest similarity to the first transition data and the second transition data from the search interval.
In this case, "replacing the overflow data with a similar value corresponding to the overflow data in the similar data" may specifically be: judging whether the degree of similarity between the similar value of the similar data and the transition data reaches a preset threshold, and if the degree of similarity reaches the preset threshold, replacing the overflow data with the similar value corresponding to the overflow data in the similar data. If the degree of similarity does not reach the preset threshold, the overflow data may be considered unprocessable, the process ends directly, and the next segment of overflow data may be processed.
Optionally, after the overflow data is replaced with the similar value corresponding to the overflow data in the similar data, a preset algorithm may further be used to smooth the transition data. That is, the processor 401 may also run an application program stored in the memory 402, thereby implementing the following functions:
acquiring a weighted value of the transition data to obtain a first weighted value, and acquiring a weighted value of the similar data to obtain a second weighted value; summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data; the transition data is replaced with the replacement value for the transition data.
Optionally, the obtaining of overflow data to be processed from the audio data may be: the method comprises the steps of obtaining an audio file, preprocessing the audio file to obtain audio data, searching the audio data with the sound intensity exceeding a preset maximum value or a preset minimum value from the audio data, and obtaining overflow data to be processed.
For example, the sound intensity of the audio file may be converted to obtain a conversion value, and then the audio file is compressed by using the conversion value to obtain audio data.
The above operations can be referred to the previous embodiments specifically, and are not described herein again.
As can be seen from the above, when processing audio data, the network device of this embodiment may obtain overflow data to be processed from the audio data; determine the transition data and the search interval of the overflow data; search the search interval for the similar data with the highest similarity to the transition data; and replace the overflow data with the similar value corresponding to the overflow data in the similar data. Because this scheme repairs the overflow data of the audio by searching the search interval for the similar data most similar to the transition data of the overflow data, the original waveform of the audio data can be restored to the greatest extent and the continuity of the audio data is improved, so that the audio data better matches the user's auditory perception and a better user experience can be obtained.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, the present application provides a storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any one of the audio data processing methods provided by the embodiments of the present application. For example, the instructions may perform the steps of:
when the audio data are processed, overflow data to be processed can be obtained from the audio data; determining transition data and a search interval of the overflow data; searching similar data with the highest similarity to the transition data from the search interval; and replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
The storage medium may include: a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
Since the instructions stored in the storage medium can execute the steps in any audio data processing method provided in the embodiments of the present application, the beneficial effects achievable by any audio data processing method provided in the embodiments of the present application can also be achieved.
The foregoing detailed description has provided a method, an apparatus, and a storage medium for processing audio data according to embodiments of the present application, and specific examples have been applied in the present application to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (15)

1. A method of audio data processing, comprising:
acquiring overflow data to be processed from the audio data, wherein the overflow data refers to the audio data with the sound intensity exceeding a preset value;
respectively setting the length of the first transition data and the length of the second transition data;
determining first transition data before the overflow data according to the length of the first transition data, and determining second transition data after the overflow data according to the length of the second transition data;
determining a search interval for audio data preceding the first transition data;
searching similar data with the highest similarity to the transition data from the search interval;
and replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
2. The audio data processing method of claim 1, wherein the determining the search interval for the audio data preceding the first transition data comprises:
taking the sum of the length of the first transition data, the length of the overflow data and the length of the second transition data as the data processing length of audio processing;
and acquiring an interval which is longer than the data processing length in the audio data before the first transition data as a search interval.
3. The audio data processing method of claim 1, wherein the searching for similar data with the highest similarity to the transition data from the search interval comprises:
searching similar data with the highest similarity to the first transition data and the second transition data from the search interval;
the replacing the overflow data with the similar value corresponding to the overflow data in the similar data specifically includes: judging whether the similarity degree of the similar value of the similar data and the transition data reaches a preset threshold value, and if the similarity degree reaches the preset threshold value, replacing the overflow data with the similar value corresponding to the overflow data in the similar data.
4. The audio data processing method according to any one of claims 1 to 3, wherein, after replacing the overflow data with a similar value corresponding to the overflow data in the similar data, the method further comprises:
and smoothing the transition data by adopting a preset algorithm.
5. The audio data processing method according to claim 4, wherein the smoothing the transition data by using a preset algorithm comprises:
acquiring a weighted value of the transition data to obtain a first weighted value, and acquiring a weighted value of the similar data to obtain a second weighted value;
summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data;
and replacing the transition data with the replacement value of the transition data.
6. The audio data processing method of claim 1, wherein the obtaining overflow data to be processed from the audio data comprises:
acquiring an audio file;
preprocessing the audio file to obtain audio data;
and searching audio data with the sound intensity exceeding a preset maximum value or a preset minimum value from the audio data to obtain overflow data to be processed.
7. The audio data processing method of claim 6, wherein the pre-processing the audio file to obtain audio data comprises:
converting the sound intensity of the audio file to obtain a conversion value;
and compressing the audio file by using the conversion numerical value to obtain audio data.
8. An audio data processing apparatus, comprising:
the obtaining module is used for acquiring overflow data to be processed from the audio data, wherein the overflow data refers to the audio data with the sound intensity exceeding a preset value;
the determining module is used for respectively setting the length of the first transition data and the length of the second transition data; determining first transition data before the overflow data according to the length of the first transition data, and determining second transition data after the overflow data according to the length of the second transition data; determining a search interval for audio data preceding the first transition data;
the searching module is used for searching similar data with the highest similarity to the transition data from the searching interval;
and the replacing module is used for replacing the overflow data with a similar value corresponding to the overflow data in the similar data.
9. The audio data processing apparatus of claim 8,
the determining module is specifically configured to use a sum of a length of the first transition data, a length of the overflow data, and a length of the second transition data as a data processing length of the audio processing; and acquiring an interval which is longer than the data processing length in the audio data before the first transition data as a search interval.
10. The audio data processing apparatus of claim 8,
the search module is specifically configured to search, from the search interval, similar data with the highest similarity to the first transition data and the second transition data;
the replacing module is specifically configured to determine whether the similarity of the similar data reaches a preset threshold, and to replace the overflow data with the similar value corresponding to the overflow data in the similar data if the similarity reaches the preset threshold.
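The threshold check of claim 10 can be sketched as a guard around the replacement. The claim does not say what happens when the similarity falls below the threshold; leaving the data unchanged, as done here, is an assumption, and the function name and default threshold are hypothetical.

```python
from typing import List, Tuple


def replace_if_similar(
    samples: List[float],
    overflow_start: int,
    overflow_end: int,  # exclusive
    replacement: List[float],
    score: float,
    threshold: float = 0.9,  # hypothetical preset threshold
) -> Tuple[List[float], bool]:
    """Replace the overflow run only when the similarity reaches the threshold.

    Returns (samples_out, replaced). Below the threshold the data is returned
    unchanged (an assumption; the claim leaves this case open).
    """
    if score < threshold:
        return list(samples), False
    if len(replacement) != overflow_end - overflow_start:
        raise ValueError("replacement must have the same length as the overflow run")
    repaired = list(samples)
    repaired[overflow_start:overflow_end] = replacement
    return repaired, True
```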
11. The audio data processing device according to any one of claims 8 to 10, further comprising:
and the smoothing processing module is used for smoothing the transition data by adopting a preset algorithm.
12. The audio data processing apparatus of claim 11,
the smoothing processing module is specifically used for acquiring a weighted value of the transition data to obtain a first weighted value, and acquiring a weighted value of the similar data to obtain a second weighted value; summing the product of the transition data and the first weight value and the product of the similar data and the second weight value to obtain a replacement value of the transition data; and replacing the transition data with the replacement value of the transition data.
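The weighted sum of claim 12 can be sketched as a crossfade. The claim only requires that each replacement value be the transition data times a first weight plus the similar data times a second weight. Choosing weights that sum to 1 and vary linearly across the segment is an assumption made here for illustration, as are the names used.

```python
from typing import List


def smooth_transition(
    transition: List[float],
    similar: List[float],
    fade_in_similar: bool = True,
) -> List[float]:
    """Blend transition data with the corresponding similar data.

    With fade_in_similar=True the second weight (for the similar data) rises
    across the segment, which suits the transition before the overflow data;
    with False it falls, which suits the transition after it.
    """
    n = len(transition)
    if len(similar) != n:
        raise ValueError("transition and similar data must have equal length")
    blended = []
    for i in range(n):
        ramp = (i + 1) / (n + 1)  # strictly between 0 and 1
        second_weight = ramp if fade_in_similar else 1.0 - ramp
        first_weight = 1.0 - second_weight
        blended.append(transition[i] * first_weight + similar[i] * second_weight)
    return blended
```

With this choice the blended signal approaches the replaced values near the overflow data and the original values far from it, which avoids a step at either boundary.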
13. The audio data processing apparatus of claim 8, wherein the obtaining module comprises:
the acquisition submodule is used for acquiring an audio file;
the preprocessing submodule is used for preprocessing the audio file to obtain audio data;
and the searching submodule is used for searching the audio data for audio data whose sound intensity exceeds a preset maximum value or falls below a preset minimum value, to obtain the overflow data to be processed.
14. The audio data processing apparatus of claim 13,
the preprocessing submodule is specifically used for converting the sound intensity of the audio file to obtain a conversion numerical value; and compressing the audio file by using the conversion numerical value to obtain audio data.
15. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the audio data processing method according to any one of claims 1 to 7.
CN201910190569.1A 2019-03-13 2019-03-13 Audio data processing method, device and storage medium Active CN109961796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910190569.1A CN109961796B (en) 2019-03-13 2019-03-13 Audio data processing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910190569.1A CN109961796B (en) 2019-03-13 2019-03-13 Audio data processing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN109961796A CN109961796A (en) 2019-07-02
CN109961796B true CN109961796B (en) 2020-12-01

Family

ID=67024338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910190569.1A Active CN109961796B (en) 2019-03-13 2019-03-13 Audio data processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN109961796B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1236153A (en) * 1998-05-14 1999-11-24 索尼公司 Audio signal processing apparatus and audio signal reproducing apparatus
CN102610235A (en) * 2011-12-22 2012-07-25 深圳市万兴软件有限公司 Sound mixing processing method, device and intelligent equipment
CN104488283A (en) * 2013-03-08 2015-04-01 尼尔森(美国)有限公司 Methods and systems for reducing spillover by detecting signal distortion
GB2545519A (en) * 2015-12-18 2017-06-21 Cirrus Logic Int Semiconductor Ltd Systems and methods for restoring microelectromechanical system transducer operation following plosive event
CN107086039A (en) * 2017-05-25 2017-08-22 北京小鱼在家科技有限公司 Acoustic signal processing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105719653B (en) * 2016-01-28 2020-04-24 腾讯科技(深圳)有限公司 Mixed sound processing method and device

Also Published As

Publication number Publication date
CN109961796A (en) 2019-07-02

Similar Documents

Publication Publication Date Title
US9940929B2 (en) Extending the period of voice recognition
CN109542512B (en) Data processing method, device and storage medium
CN110265064B (en) Audio frequency crackle detection method, device and storage medium
CN107885545B (en) Application management method and device, storage medium and electronic equipment
CN111177453B (en) Method, apparatus, device and computer readable storage medium for controlling audio playing
CN110111811A (en) Audio signal detection method, device and storage medium
CN110688518A (en) Rhythm point determining method, device, equipment and storage medium
US11194378B2 (en) Information processing method and electronic device
US8868419B2 (en) Generalizing text content summary from speech content
CN107861602B (en) Terminal CPU performance control method, terminal and computer readable storage medium
CN109961796B (en) Audio data processing method, device and storage medium
US20150220362A1 (en) Multi-core processor system, electrical power control method, and computer product for migrating process from one core to another
CN113053362A (en) Method, device, equipment and computer readable medium for speech recognition
CN114363704B (en) Video playing method, device, equipment and storage medium
CN114490432A (en) Memory processing method and device, electronic equipment and computer readable storage medium
CN111381953B (en) Process management method and device, storage medium and electronic equipment
CN112002352B (en) Random music playing method and device, computer equipment and storage medium
CN112489644B (en) Voice recognition method and device for electronic equipment
US11431349B2 (en) Method, electronic device and computer program product for processing data
CN110337046A (en) Method, apparatus, computer device and computer-readable storage medium for quickly locating video on a timeline via pictures
CN116721662B (en) Audio processing method and device, storage medium and electronic equipment
CN115098456B (en) File processing method and device, storage medium and electronic equipment
CN114489445B (en) Screen adjustment method and device, intelligent equipment and storage medium
CN114090229A (en) Dynamic expansion and contraction method and device in elastic expansion group
CN115511620A (en) Fault processing method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant