CN104599681B - Method and apparatus for audio processing
Publication number: CN104599681B
Application number: CN201410855770.4A
Authority: CN (China)
Prior art keywords: data point, waveform, audio file, difference, minimum duration
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a method and apparatus for audio processing, relating to the computer field, that can automatically extract the climax part of an audio file. The method includes: automatically determining the start point and end point of the climax part of an audio file according to the waveform of the audio file; and outputting the audio between the start point and the end point. The invention is used for producing audio.
Description
Technical field
The present invention relates to the computer field, and in particular to a method and apparatus for audio processing.
Background
With the popularization and continued development of the Internet, more and more people download their favorite ringtones from the Internet.
At present, ringtones on the Internet are mainly produced by hand, by manually cutting out the climax part of an audio file. However, this way of producing ringtones cannot extract the climax part of the audio accurately and incurs substantial labor cost.
Summary of the invention
Embodiments of the present invention provide a method and apparatus for audio processing that can automatically extract the climax part of an audio file and save labor cost.
In a first aspect, a method for audio processing is provided, the method including:
automatically determining the start point and end point of the climax part of an audio file according to the waveform of the audio file;
outputting the audio between the start point and the end point.
Optionally, automatically determining the start point and end point of the climax part of the audio file according to the waveform of the audio file may include:
determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, where consecutive data points form a cluster of data points;
determining the area corresponding to each cluster of data points in the waveform, and selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file.
Optionally, in one embodiment of the present invention, the method further includes:
presetting a minimum duration for the climax part of the audio file;
selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file then includes:
determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file.
Further, in another embodiment of the present invention, the waveform is a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;
the method may further include:
if the difference is less than the minimum duration, increasing the waveform compression ratio and performing the following steps:
a. determining again, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the amplitude threshold, where consecutive data points form a cluster of data points;
b. determining the area corresponding to each cluster of data points in the waveform, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continuing to increase the waveform compression ratio and repeating steps a-c until the difference is greater than or equal to the minimum duration.
Further, in another embodiment of the present invention, the method may also include:
presetting a maximum waveform compression ratio;
if the waveform compression ratio in use is greater than the maximum waveform compression ratio, lowering the amplitude threshold and performing the following steps:
e. determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the adjusted amplitude threshold, where consecutive data points form a cluster of data points;
f. determining the area corresponding to each cluster of data points in the waveform, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continuing to lower the amplitude threshold and performing steps e-g until the difference is greater than or equal to the minimum duration.
In one embodiment of the present invention, before automatically determining the start point and end point of the climax part of the audio file according to the waveform of the audio file, the method may also include:
determining the average amplitude over the whole waveform;
determining the amplitude threshold according to the average amplitude.
In a second aspect, an apparatus for audio processing is provided, the apparatus including:
a processing module, configured to automatically determine the start point and end point of the climax part of an audio file according to the waveform of the audio file;
an output module, configured to output the audio between the start point and the end point.
Optionally, the processing module is specifically configured to:
determine, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, where consecutive data points form a cluster of data points;
determine the area corresponding to each cluster of data points in the waveform, and select the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file.
Optionally, in one embodiment of the present invention, the apparatus further includes:
a setting module, configured to preset a minimum duration for the climax part of the audio file;
when selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file, the processing module is specifically configured to:
determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration set by the setting module, determine that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file.
Further, in another embodiment of the present invention, the waveform is a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;
the processing module is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:
a. determine again, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the amplitude threshold, where consecutive data points form a cluster of data points;
b. determine the area corresponding to each cluster of data points in the waveform, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continue to increase the waveform compression ratio and repeat steps a-c until the difference is greater than or equal to the minimum duration.
Further, in another embodiment of the present invention, the setting module is further configured to preset a maximum waveform compression ratio;
the processing module is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, lower the amplitude threshold and perform the following steps:
e. determine, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the adjusted amplitude threshold, where consecutive data points form a cluster of data points;
f. determine the area corresponding to each cluster of data points in the waveform, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continue to lower the amplitude threshold and perform steps e-g until the difference is greater than or equal to the minimum duration.
Optionally, in one embodiment of the present invention, the processing module is further configured to:
determine the average amplitude over the whole waveform;
determine the amplitude threshold according to the average amplitude.
The method and apparatus for audio processing provided by the embodiments of the present invention determine the start point and end point of the climax part of an audio file according to the waveform of the audio file. Compared with the related art, in which the start point and end point of the climax part are determined manually, this improves accuracy, extracts the climax part of the audio automatically, and saves labor cost.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for audio processing provided by an embodiment of the present invention;
Fig. 2A is a flowchart of another method for audio processing provided by an embodiment of the present invention;
Fig. 2B is an exemplary waveform provided by an embodiment of the present invention;
Fig. 3 is a flowchart of yet another method for audio processing provided by an embodiment of the present invention;
Fig. 4A is a structural schematic diagram of an apparatus for audio processing provided by an embodiment of the present invention;
Fig. 4B is a structural schematic diagram of an apparatus for audio processing provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for audio processing provided by an embodiment of the present invention. Referring to Fig. 1, the method for audio processing provided by this embodiment may include:
11. automatically determining the start point and end point of the climax part of an audio file according to the waveform of the audio file;
12. outputting the audio between the start point and the end point.
The method for audio processing provided by this embodiment determines the start point and end point of the climax part of an audio file according to the waveform of the audio file. Compared with the related art, in which the start point and end point of the climax part are determined manually, this improves accuracy, extracts the climax part of the audio automatically, and saves labor cost.
It should be noted that, in the embodiments of the present invention, the audio file first undergoes analog-to-digital conversion before step 11, that is, the audio of the audio file is converted from an analog signal into a digital signal. The digital signal is then converted into an audio waveform, and the waveform is idealized (smoothed and scaled) with the boxcar smoothing algorithm to obtain the final waveform of the audio file.
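As a rough illustration of turning a digitized audio file into amplitude data points, the following Python sketch reads 16-bit mono PCM from a WAV file with the standard `wave` module and scales the absolute sample values into the range 0 to 1. The function name `wav_to_amplitudes`, the choice of the WAV container, and the use of absolute values are assumptions made for this example; the text above does not specify a file format or a scaling rule.

```python
import io
import struct
import wave


def wav_to_amplitudes(wav_file):
    """Read 16-bit mono PCM and return amplitudes scaled to the range 0..1."""
    with wave.open(wav_file, "rb") as reader:
        if reader.getsampwidth() != 2 or reader.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        frames = reader.readframes(reader.getnframes())
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    # Assumption: amplitude is the absolute sample value divided by full scale.
    return [abs(s) / 32768 for s in samples]


# Build a tiny WAV file in memory so the example needs no external file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(struct.pack("<4h", 0, 16384, -32768, 8192))
buffer.seek(0)
amplitudes = wav_to_amplitudes(buffer)
```

The resulting list of amplitudes is the raw material that the smoothing and thresholding steps operate on.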
Boxcar smoothing, that is, neighborhood averaging, is a smoothing method that performs a template convolution on the plot using a sliding-window template. A sliding-window template is a template in which all coefficients have the same value; 3-point, 5-point and 7-point templates are commonly used. The principle of neighborhood averaging is to average a data point with the data points in its neighborhood, which removes abrupt data points and thereby filters out a certain amount of noise. In the boxcar algorithm, the smoothed value of the i-th data point is the average over that point and the M data points on each side of it, so 2M+1 data points take part in the average in each template. Here, the waveform compression ratio refers to the value of M.
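The neighborhood averaging described above can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: the function name `boxcar_smooth` is hypothetical, and shrinking the window at the two ends of the waveform (where fewer than M neighbors exist) is an assumption, since the text does not say how edges are handled.

```python
def boxcar_smooth(samples, m):
    """Replace each sample by the mean of itself and up to m neighbours per side."""
    n = len(samples)
    smoothed = []
    for i in range(n):
        lo = max(0, i - m)        # assumption: the window is truncated at the edges
        hi = min(n, i + m + 1)
        window = samples[lo:hi]   # at most 2*m + 1 data points
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single abrupt spike is spread out and lowered by a 3-point template (m = 1).
spike = [0.0, 0.0, 0.9, 0.0, 0.0]
smoothed_spike = boxcar_smooth(spike, 1)
```

A larger `m` averages over more neighbors, which gives a smoother, more continuous curve at the cost of detail.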
Fig. 2A is a flowchart of another method for audio processing provided by an embodiment of the present invention. Referring to Fig. 2A, the method for audio processing provided by this embodiment may include:
21. determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, where consecutive data points form a cluster of data points;
22. determining the area corresponding to each cluster of data points in the waveform, and selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file;
23. outputting the audio between the start point and the end point.
Here, the climax part of an audio file is the part of the audio file whose overall amplitude is large over a period of time. In the embodiments of the present invention, the part of the waveform that exceeds the amplitude threshold can be regarded as the climax part of the audio file.
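Steps 21 and 22 can be illustrated with the following Python sketch. The function names `find_clusters` and `largest_cluster` are hypothetical, positions are sample indices rather than times, and the area of a cluster is taken as the sum of its amplitudes with a unit time step; these are assumptions of the example, not details fixed by the text.

```python
def find_clusters(samples, threshold):
    """Return (start, end) index pairs of consecutive samples above the threshold."""
    clusters = []
    start = None
    for i, y in enumerate(samples):
        if y > threshold and start is None:
            start = i                        # a new cluster begins here
        elif y <= threshold and start is not None:
            clusters.append((start, i - 1))  # the cluster ended at the previous sample
            start = None
    if start is not None:                    # a cluster that runs to the end of the data
        clusters.append((start, len(samples) - 1))
    return clusters


def largest_cluster(samples, threshold):
    """Return the cluster with the largest amplitude sum, or None if there is none."""
    clusters = find_clusters(samples, threshold)
    if not clusters:
        return None
    return max(clusters, key=lambda c: sum(samples[c[0]:c[1] + 1]))


wave_points = [0.1, 0.6, 0.7, 0.2, 0.8, 0.9, 0.8, 0.1]
clusters = find_clusters(wave_points, 0.5)
climax = largest_cluster(wave_points, 0.5)
```

Multiplying the two returned indices by the sampling interval would give the start time and end time of the climax part.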
In the embodiments of the present invention, the amplitude threshold can be determined as follows: determine the average amplitude over the whole waveform, then determine the amplitude threshold according to the average amplitude. Because the amplitude threshold is chosen on the basis of the average amplitude, it can generally be kept above the average amplitude, which makes the choice of threshold more reasonable.
Specifically, if the average of the whole waveform is Avg and the amplitude threshold is P, then P = Avg + 0.5*Avg, P = Avg + 0.4*Avg, P = Avg + 0.3*Avg, P = Avg + 0.2*Avg and so on can be chosen. A larger amplitude threshold can be chosen first, that is, P = Avg + 0.5*Avg can be tried first. These ways of determining the amplitude threshold are only examples. In the embodiments of the present invention, the amplitude threshold can also be chosen independently of the average amplitude, for example by directly setting it to 0.9, 0.8 or 0.7.
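The mean-based choice of threshold can be sketched as follows. The function name `threshold_candidates` and its `factors` parameter are hypothetical, and the sketch assumes the amplitudes are non-negative so that a plain arithmetic mean is meaningful.

```python
def threshold_candidates(samples, factors=(0.5, 0.4, 0.3, 0.2)):
    """Return candidate thresholds P = Avg + f*Avg, largest first."""
    avg = sum(samples) / len(samples)   # average amplitude of the whole waveform
    return [avg + f * avg for f in factors]


candidates = threshold_candidates([0.25, 0.5, 0.75])
```

Trying the candidates in the returned order corresponds to preferring the larger threshold and falling back to smaller ones only when needed.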
Fig. 2B is an exemplary waveform provided by an embodiment of the present invention. For a better understanding of the above, the way the start point and end point of the climax part of an audio file are determined in the embodiments of the present invention is further explained with reference to Fig. 2B.
Referring to Fig. 2B, for emphasis only the important time points t1, t2, t3 and t4 are shown, and other time points are omitted. The amplitude P corresponding to the horizontal dashed line represents the amplitude threshold. The part of the waveform above the horizontal dashed line consists of the data points whose amplitude is greater than the amplitude threshold, that is, the climax part of the audio that needs to be found. Consecutive data points above the amplitude threshold form a cluster of data points, and Fig. 2B shows two such clusters. The area S1 of one dashed region is the area corresponding to the first cluster in the waveform, and the area S2 of the other dashed region is the area corresponding to the second cluster. As the figure shows, S2 is greater than S1, so the times t3 and t4 corresponding to the data points at the two ends of the cluster with area S2 are chosen as the start point and end point of the climax part of the audio file. In other words, the audio in the period from t3 to t4 is the required climax part.
It should be pointed out that, although Fig. 2B shows the areas of only two clusters of data points, more clusters and more dashed regions may appear depending on the audio file and the amplitude threshold; that is, besides S1 and S2 there may also be S3, S4, and so on. In that case, the cluster with the largest area is selected from all of the areas S1, S2, S3, S4, ..., and the times corresponding to the data points at its two ends are taken as the start point and end point of the climax part of the audio file.
The area S2 of the dashed region can be calculated with the following formula: $S = \sum_i (Y_i \cdot \Delta t) = \sum_i Y_i$, where $\Delta t$ represents one unit of the time coordinate and $Y_i$ is the ordinate (that is, the amplitude) of each data point in the period from t3 to t4. The areas of the other dashed regions can be calculated in the same way.
The method for audio processing provided by this embodiment determines the start point and end point of the climax part of an audio file from the area corresponding to the points in the waveform that exceed the amplitude threshold, which can further improve accuracy, extract the climax part of the audio automatically, and save labor cost.
In practical applications, to fully meet users' listening needs, it is often necessary to ensure that the climax part of the audio is not too short. In the embodiments of the present invention, a minimum duration for the climax part of the audio file can be preset, so that when the start point and end point of the climax part are determined, the duration between them is at least the preset minimum duration. Specifically, in one embodiment of the present invention, selecting in step 22 the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file may include:
determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file.
In the embodiments of the present invention, the waveform can be a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed. To handle the case where the difference is less than the minimum duration, the method for audio processing provided in another embodiment of the present invention may optionally also include:
if the difference is less than the minimum duration, increasing the waveform compression ratio and performing the following steps:
a. determining again, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the amplitude threshold, where consecutive data points form a cluster of data points;
b. determining the area corresponding to each cluster of data points in the waveform, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continuing to increase the waveform compression ratio and repeating steps a-c until the difference is greater than or equal to the minimum duration.
With the method for audio processing provided by this embodiment, when the difference is less than the minimum duration, gradually increasing the waveform compression ratio can ensure that the difference becomes greater than or equal to the minimum duration.
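Steps a-d can be sketched as a loop that re-smooths the waveform with a growing M until the largest cluster is long enough. All names here (`smooth`, `best_span`, `widen_by_smoothing`) are hypothetical. The sketch assumes that M grows by 1 per round, that the minimum duration is expressed in sample indices, and that the search stops at an upper bound `m_max` (defaulting to one third of the number of samples); none of these details is fixed by the text.

```python
def smooth(samples, m):
    """Boxcar smoothing over up to m neighbours per side (window truncated at the ends)."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - m):i + m + 1]
        out.append(sum(window) / len(window))
    return out


def best_span(samples, p):
    """(start, end) of the above-threshold run with the largest amplitude sum, or None."""
    best, best_area, start = None, 0.0, None
    for i, y in enumerate(list(samples) + [float("-inf")]):  # sentinel closes a trailing run
        if y > p:
            if start is None:
                start = i
        elif start is not None:
            area = sum(samples[start:i])
            if best is None or area > best_area:
                best, best_area = (start, i - 1), area
            start = None
    return best


def widen_by_smoothing(samples, threshold, min_len, m_start=2, m_max=None):
    """Raise the compression ratio M until the largest cluster spans at least min_len."""
    if m_max is None:
        m_max = len(samples) // 3
    m = m_start
    while m <= m_max:
        span = best_span(smooth(samples, m), threshold)       # steps a and b
        if span is not None and span[1] - span[0] >= min_len:  # step c
            return span, m
        m += 1                                                 # step d (assumed step of 1)
    return None, m


# Two loud bursts separated by a short quiet gap.
raw = [0] * 5 + [1] * 4 + [0] * 3 + [1] * 4 + [0] * 5
span, used_m = widen_by_smoothing(raw, 0.5, min_len=6)
```

In this example M = 2 still leaves the two bursts as separate short clusters, while M = 3 lifts the gap above the threshold and merges them into a single span.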
The initial value of the waveform compression ratio may be 2. Optionally, in the embodiments of the present invention, a maximum waveform compression ratio can also be preset, where the maximum waveform compression ratio may be one third of the total duration of the audio file.
If the waveform compression ratio has been increased up to the maximum waveform compression ratio and the difference still cannot be made greater than or equal to the minimum duration, then in another embodiment of the present invention the amplitude threshold can be lowered appropriately to guarantee the duration of the climax part of the audio.
On the basis of the previous embodiment, the method for audio processing provided by another embodiment of the present invention may also include:
if the waveform compression ratio in use is greater than the maximum waveform compression ratio, lowering the amplitude threshold and performing the following steps:
e. determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the adjusted amplitude threshold, where consecutive data points form a cluster of data points;
f. determining the area corresponding to each cluster of data points in the waveform, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continuing to lower the amplitude threshold and performing steps e-g until the difference is greater than or equal to the minimum duration.
With the method for audio processing provided by this embodiment, lowering the amplitude threshold through the above steps can ensure that the difference becomes greater than or equal to the minimum duration, after which the times corresponding to the two end data points are determined to be respectively the start point and end point of the climax part of the audio file.
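Steps e-h amount to walking down a list of thresholds until the largest cluster is long enough. The following Python sketch shows that loop in isolation; the names `spans_above` and `lower_threshold_until` are hypothetical, the list of thresholds is supplied by the caller, and the minimum duration is again measured in sample indices as an assumption of the example.

```python
from itertools import groupby


def spans_above(samples, p):
    """(start, end) index pairs of consecutive samples whose amplitude exceeds p."""
    spans, i = [], 0
    for above, run in groupby(samples, key=lambda y: y > p):
        n = len(list(run))
        if above:
            spans.append((i, i + n - 1))
        i += n
    return spans


def lower_threshold_until(samples, thresholds, min_len):
    """Try thresholds from high to low; return the first span that is long enough."""
    for p in thresholds:
        spans = spans_above(samples, p)                            # step e
        if spans:
            # step f: pick the cluster with the largest amplitude sum
            start, end = max(spans, key=lambda s: sum(samples[s[0]:s[1] + 1]))
            if end - start >= min_len:                             # step g
                return (start, end), p
        # step h: fall through to the next, lower threshold
    return None, None


ramp = [0.1, 0.3, 0.5, 0.7, 0.9, 0.7, 0.5, 0.3, 0.1]
span, used_p = lower_threshold_until(ramp, [0.8, 0.6, 0.4, 0.2], min_len=4)
```

Each lower threshold admits more data points into the cluster, so the selected span can only stay the same or grow on this single-peak example.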
For a more complete understanding of the technical solution, the audio file processing method of the embodiments of the present invention is further explained below using a specific embodiment.
Fig. 3 is a schematic flowchart of a method for audio file processing provided by an embodiment of the present invention. Referring to Fig. 3, taking a song as an example, the audio file processing method provided by this embodiment includes:
31. Initially, set the minimum duration MinΔt of the climax part of the song and set the maximum waveform compression ratio.
The length of the extracted song segment must be greater than or equal to this minimum duration; an extraction shorter than this is regarded as a failure. The higher the waveform compression ratio, the better the continuity and smoothness of the waveform, but the worse the accuracy, so an upper limit on the compression ratio needs to be set.
32. Determine the amplitude threshold P.
Because of differences between audio formats and between audio production software, the waveform amplitudes of the resulting audio files differ to some degree, and the waveform is not necessarily continuous; it also contains some discrete data points that are not wanted. A sufficient number of data points therefore has to take part in the calculation, which makes sampling very important. The maximum waveform value is 1, so the amplitude threshold P can be defined as an arithmetic progression: for example, the initial value of P is 0.9, and each time the value of P fails to meet the requirement, it is reduced by 0.1. However, for some waveforms a suitable amplitude threshold P is only about 0.1. To adapt dynamically to such cases, the amplitude threshold P can instead be determined as follows: first calculate the average Avg(Y) of the whole waveform and take P = Avg(Y) + 0.5*Avg(Y); when this value of P does not meet the requirement, take P = Avg(Y) + 0.4*Avg(Y); when that value does not meet the requirement either, take P = Avg(Y) + 0.3*Avg(Y); and so on.
33. According to the waveform of the song, determine the data points in the waveform whose amplitude is greater than the amplitude threshold P. Consecutive such data points in the waveform form a cluster of data points. Then save each cluster of data points whose amplitude is greater than the amplitude threshold P into an array.
34. From the array of data points, calculate the area corresponding to each cluster of data points in the waveform, and determine the difference ΔX between the times corresponding to the data points at the two ends of the cluster with the largest area. The area of a cluster can be calculated as $S = \sum_i Y_i$ and the time difference as $\Delta X = X_{max} - X_{min}$, where $X_{max}$ and $X_{min}$ are the times corresponding to the two data points at the ends of the cluster with the largest area, and $Y_i$ denotes the amplitudes of all data points in that cluster.
35. According to the result of step 34, judge whether the time difference ΔX satisfies the condition of being greater than or equal to MinΔt.
36. When the duration calculated from the time difference ΔX and the compression ratio satisfies this condition, return Xmin and Xmax, which give the start point and end point of the climax of the song.
37. When the duration calculated from the time difference ΔX and the compression ratio does not satisfy this condition, increase the compression ratio of the waveform to make the waveform curve more continuous.
38. Judge whether the increased compression ratio exceeds the maximum compression ratio set in step 31. If it does, no suitable climax interval can be obtained within this sampling range, so the value of the amplitude threshold P needs to be reduced to let more data points take part in the calculation, and the flow returns to step 32. If it does not, the flow jumps to step 33 and recalculates the data points in the waveform whose amplitude is greater than the amplitude threshold P.
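Steps 31 to 38 can be combined into one search with an outer loop over thresholds and an inner loop over compression ratios. The Python sketch below is one possible reading of that flow, not the patented implementation. The names `smooth`, `best_span` and `find_climax` are hypothetical; the sketch assumes a non-empty list of non-negative amplitudes, measures the minimum duration in sample indices, computes the average from the unsmoothed data, increases M by 1 per round, and restarts M at its initial value for each new threshold. The text does not settle any of these points.

```python
def smooth(samples, m):
    """Boxcar smoothing over up to m neighbours per side (window truncated at the ends)."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - m):i + m + 1]
        out.append(sum(window) / len(window))
    return out


def best_span(samples, p):
    """(start, end) of the above-threshold run with the largest amplitude sum, or None."""
    best, best_area, start = None, 0.0, None
    for i, y in enumerate(list(samples) + [float("-inf")]):  # sentinel closes a trailing run
        if y > p:
            if start is None:
                start = i
        elif start is not None:
            area = sum(samples[start:i])
            if best is None or area > best_area:
                best, best_area = (start, i - 1), area
            start = None
    return best


def find_climax(samples, min_len, m_start=2, m_max=None, factors=(0.5, 0.4, 0.3, 0.2)):
    """Return (start, end) sample indices of the climax, or None if no span qualifies."""
    if m_max is None:
        m_max = len(samples) // 3               # step 31: maximum compression ratio
    avg = sum(samples) / len(samples)
    for f in factors:                           # step 32: next, lower threshold P
        p = avg + f * avg
        m = m_start                             # assumption: M restarts for every new P
        while m <= m_max:                       # step 38: give up on P once M is too large
            span = best_span(smooth(samples, m), p)                # steps 33 and 34
            if span is not None and span[1] - span[0] >= min_len:  # step 35
                return span                     # step 36
            m += 1                              # step 37: raise the compression ratio
    return None


song = [0] * 8 + [1] * 4 + [0] * 3 + [1] * 4 + [0] * 8
climax = find_climax(song, min_len=6)
```

Returning `None` when every threshold and compression ratio has been tried corresponds to treating the extraction as a failure.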
The method for audio processing provided by this embodiment determines the climax of the audio on the basis of the waveform, so the climax part of the audio can be determined both automatically and accurately.
It should be pointed out that the method for audio processing provided by the embodiments of the present invention can extract the climax part of the audio automatically. Applied on the server side, it frees ringtone production from manual work and saves substantial labor and time cost; applied on the client side, it lets the user perform the operation with one click, improving convenience and ease of use.
Correspondingly, an embodiment of the present invention also provides an apparatus for audio processing. Referring to Fig. 4A, the apparatus 40 for audio processing provided by this embodiment includes a processing module 41 and an output module 42, where:
the processing module 41 is configured to automatically determine the start point and end point of the climax part of an audio file according to the waveform of the audio file;
the output module 42 is configured to output the audio between the start point and the end point.
The processing module 41 can be specifically configured to:
determine, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, where consecutive data points in the waveform form a cluster of data points;
determine the area corresponding to each cluster of data points in the waveform, and select the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file.
Optionally, the processing module 41 can also be configured to: determine the average amplitude over the whole waveform; and determine the amplitude threshold according to the average amplitude.
Optionally, referring to Fig. 4B, in another embodiment of the present invention, the apparatus 40 further includes:
a setting module 43, configured to preset a minimum duration for the climax part of the audio file;
when selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and end point of the climax part of the audio file, the processing module 41 is specifically configured to:
determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration set by the setting module, determine that the times corresponding to the two end data points are respectively the start point and end point of the climax part of the audio file.
Optionally, in another embodiment of the present invention, the waveform diagram is a waveform diagram obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;
the processing module 41 is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:
a. determine again, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the amplitude threshold, where consecutive such data points form a cluster of data points;
b. determine the area corresponding to each cluster of data points in the waveform diagram, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continue to increase the waveform compression ratio and repeat steps a to c until the difference is greater than or equal to the minimum duration.
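Steps a to d can be sketched as a loop that re-smooths the waveform with a growing compression ratio. The following is a hedged illustration only: smoothing is assumed to be a moving average over `ratio` neighboring points, the increment `step` and the upper bound `max_ratio` are hypothetical parameters that keep the loop finite, and "area" is assumed to be the sum of amplitudes in a cluster.

```python
from typing import List, Optional, Tuple


def smooth(amplitudes: List[float], ratio: int) -> List[float]:
    """Moving average over `ratio` neighboring points (window truncated at the edges)."""
    n = len(amplitudes)
    out = []
    for i in range(n):
        lo = max(0, i - (ratio - 1) // 2)
        hi = min(n, i + ratio // 2 + 1)
        window = amplitudes[lo:hi]
        out.append(sum(window) / len(window))
    return out


def largest_span(amplitudes: List[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Index span (first, last) of the above-threshold cluster with the largest area."""
    best, best_area, start = None, 0.0, None
    # A trailing sentinel closes a cluster that runs to the end of the list.
    for i, value in enumerate(list(amplitudes) + [float("-inf")]):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            area = sum(amplitudes[start:i])  # assumed definition of "area"
            if best is None or area > best_area:
                best, best_area = (start, i - 1), area
            start = None
    return best


def find_climax(amplitudes, threshold, min_duration, seconds_per_point,
                ratio=1, step=2, max_ratio=None):
    """Raise the compression ratio until the largest cluster lasts long enough.
    Returns (start_time, end_time), or None once max_ratio is exceeded."""
    if max_ratio is None:
        max_ratio = len(amplitudes)  # assumed bound so the loop always ends
    while ratio <= max_ratio:
        span = largest_span(smooth(amplitudes, ratio), threshold)  # steps a and b
        if span is not None:
            first, last = span
            if (last - first) * seconds_per_point >= min_duration:  # step c
                return first * seconds_per_point, last * seconds_per_point
        ratio += step  # step d: increase the ratio and try again
    return None


signal = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0]
print(find_climax(signal, threshold=0.3, min_duration=5.0, seconds_per_point=1.0))
```

In the example the unsmoothed signal only yields isolated one-point clusters; after smoothing with a ratio of 3 the peaks merge into a single cluster that satisfies the minimum duration. Returning `None` when the ratio bound is exceeded leaves room for a fallback such as lowering the amplitude threshold.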
Optionally, in another embodiment of the present invention, the setting module 43 is further configured to preset a maximum waveform compression ratio;
the processing module 41 is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, lower the amplitude threshold and perform the following steps:
e. determine, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the adjusted amplitude threshold, where consecutive such data points form a cluster of data points;
f. determine the area corresponding to each cluster of data points in the waveform diagram, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continue to lower the amplitude threshold and perform steps e to g until the difference is greater than or equal to the minimum duration.
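Steps e to h can be sketched in the same style. Again this is only an illustration under assumptions: the size of each reduction (`decrement`) is not given in the text and is hypothetical, the loop is assumed to give up once the threshold drops below zero so that it always terminates, and "area" is assumed to be the sum of amplitudes in a cluster.

```python
from typing import List, Optional, Tuple


def largest_span(amplitudes: List[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Index span (first, last) of the above-threshold cluster with the largest area."""
    best, best_area, start = None, 0.0, None
    # A trailing sentinel closes a cluster that runs to the end of the list.
    for i, value in enumerate(list(amplitudes) + [float("-inf")]):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            area = sum(amplitudes[start:i])  # assumed definition of "area"
            if best is None or area > best_area:
                best, best_area = (start, i - 1), area
            start = None
    return best


def climax_by_lowering_threshold(amplitudes, threshold, min_duration,
                                 seconds_per_point, decrement):
    """Lower the threshold step by step until the largest cluster is long enough.
    Returns (start_time, end_time, threshold_used), or None if no threshold works."""
    if decrement <= 0:
        raise ValueError("decrement must be positive")
    while True:
        threshold -= decrement  # lower the threshold before each attempt
        if threshold < 0:       # assumed stop condition, not from the source
            return None
        span = largest_span(amplitudes, threshold)  # steps e and f
        if span is not None:
            first, last = span
            if (last - first) * seconds_per_point >= min_duration:  # step g
                return (first * seconds_per_point,
                        last * seconds_per_point,
                        threshold)
        # step h: otherwise loop and lower the threshold again


waveform = [0.1, 0.6, 0.9, 0.6, 0.6, 0.1, 0.9, 0.1]
print(climax_by_lowering_threshold(waveform, threshold=1.0, min_duration=3.0,
                                   seconds_per_point=1.0, decrement=0.25))
```

In the example, a threshold of 0.75 leaves only isolated peaks, while 0.5 yields a cluster spanning points 1 to 4, which meets the three-second minimum.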
The audio processing apparatus provided by the embodiments of the present invention determines the start point and the end point of the climax part of an audio file according to the areas corresponding to those points in the waveform diagram of the audio file that satisfy the amplitude threshold. It can therefore not only extract the climax part of the audio automatically and save labor costs, but also ensure that the determined climax part is relatively accurate.
The audio processing apparatus provided by the embodiments of the present invention may be any device that processes audio, and may be either a server or user equipment.
Correspondingly, an embodiment of the present invention further provides a computer program product, which includes instructions for performing the operations in the above method embodiments.
Correspondingly, an embodiment of the present invention further provides a storage medium, which is used to store the above computer program product.
It should be noted that the division of the audio processing apparatus provided by the above embodiment into the functional modules described above is only an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the audio processing apparatus provided by the above embodiment and the method embodiment of audio processing belong to the same concept; for its specific implementation process, refer to the method embodiment, which is not repeated here.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar among the embodiments, reference may be made to one another. The apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding description of the method embodiments.
It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, device, article, or equipment that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to such a process, device, article, or equipment. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article, or equipment that includes the element.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the principle of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A method of audio processing, characterized in that the method comprises:
automatically determining, according to a waveform diagram of an audio file, a start point and an end point of a climax part of the audio file;
outputting the audio between the start point and the end point;
wherein automatically determining, according to the waveform diagram of the audio file, the start point and the end point of the climax part of the audio file comprises:
determining, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than an amplitude threshold, wherein consecutive such data points in the waveform diagram form a cluster of data points;
determining the area corresponding to each cluster of data points in the waveform diagram, and selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file.
2. The method according to claim 1, characterized in that the method further comprises:
presetting a minimum duration of the climax part of the audio file;
wherein selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file comprises:
determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file.
3. The method according to claim 2, characterized in that the waveform diagram is a waveform diagram obtained after smoothing with a waveform compression ratio, wherein the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;
the method further comprises:
if the difference is less than the minimum duration, increasing the waveform compression ratio and performing the following steps:
a. determining again, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the amplitude threshold, wherein consecutive such data points form a cluster of data points;
b. determining the area corresponding to each cluster of data points in the waveform diagram, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continuing to increase the waveform compression ratio and repeating steps a to c until the difference is greater than or equal to the minimum duration.
4. The method according to claim 3, characterized in that the method further comprises:
presetting a maximum waveform compression ratio;
if the waveform compression ratio in use is greater than the maximum waveform compression ratio, lowering the amplitude threshold and performing the following steps:
e. determining, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the adjusted amplitude threshold, wherein consecutive such data points form a cluster of data points;
f. determining the area corresponding to each cluster of data points in the waveform diagram, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continuing to lower the amplitude threshold and performing steps e to g until the difference is greater than or equal to the minimum duration.
5. The method according to any one of claims 1 to 4, characterized in that, before automatically determining, according to the waveform diagram of the audio file, the start point and the end point of the climax part of the audio file, the method further comprises:
determining the average amplitude over the whole waveform diagram;
determining the amplitude threshold according to the average amplitude.
6. An apparatus for audio processing, characterized in that the apparatus comprises:
a processing module, configured to automatically determine, according to a waveform diagram of an audio file, a start point and an end point of a climax part of the audio file;
an output module, configured to output the audio between the start point and the end point;
wherein the processing module is specifically configured to:
determine, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than an amplitude threshold, wherein consecutive such data points in the waveform diagram form a cluster of data points;
determine the area corresponding to each cluster of data points in the waveform diagram, and select the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file.
7. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a setting module, configured to preset a minimum duration of the climax part of the audio file;
when selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file, the processing module is specifically configured to:
determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
if the difference is greater than or equal to the minimum duration set by the setting module, determine that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file.
8. The apparatus according to claim 7, characterized in that the waveform diagram is a waveform diagram obtained after smoothing with a waveform compression ratio, wherein the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;
the processing module is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:
a. determine again, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the amplitude threshold, wherein consecutive such data points form a cluster of data points;
b. determine the area corresponding to each cluster of data points in the waveform diagram, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
c. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
d. if the difference is still less than the minimum duration, continue to increase the waveform compression ratio and repeat steps a to c until the difference is greater than or equal to the minimum duration.
9. The apparatus according to claim 8, characterized in that:
the setting module is further configured to preset a maximum waveform compression ratio;
the processing module is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, lower the amplitude threshold and perform the following steps:
e. determine, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the adjusted amplitude threshold, wherein consecutive such data points form a cluster of data points;
f. determine the area corresponding to each cluster of data points in the waveform diagram, and determine the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;
g. if the difference is greater than or equal to the minimum duration, determine that the times corresponding to the two end data points are, respectively, the start point and the end point of the climax part of the audio file;
h. if the difference is less than the minimum duration, continue to lower the amplitude threshold and perform steps e to g until the difference is greater than or equal to the minimum duration.
10. The apparatus according to any one of claims 6 to 9, characterized in that the processing module is further configured to:
determine the average amplitude over the whole waveform diagram;
determine the amplitude threshold according to the average amplitude.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410855770.4A CN104599681B (en) | 2014-12-31 | 2014-12-31 | The method and apparatus of audio frequency process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104599681A CN104599681A (en) | 2015-05-06 |
CN104599681B true CN104599681B (en) | 2017-08-01 |
Family
ID=53125413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410855770.4A Active CN104599681B (en) | 2014-12-31 | 2014-12-31 | The method and apparatus of audio frequency process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104599681B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105404654A (en) * | 2015-10-30 | 2016-03-16 | 魅族科技(中国)有限公司 | Audio file playing method and device |
CN105657171B (en) * | 2016-01-13 | 2019-02-22 | Oppo广东移动通信有限公司 | The setting method and system of incoming ring tone |
CN107331411B (en) * | 2017-06-30 | 2019-10-29 | 广州酷狗计算机科技有限公司 | Extracting method, device and the computer readable storage medium of music climax cut off |
CN109634701A (en) * | 2018-11-27 | 2019-04-16 | 浙江万朋教育科技股份有限公司 | A method of it is shown based on audio frequency of mobile terminal |
CN111354378B (en) * | 2020-02-12 | 2020-11-24 | 北京声智科技有限公司 | Voice endpoint detection method, device, equipment and computer storage medium |
CN112118481B (en) * | 2020-09-18 | 2021-11-23 | 珠海格力电器股份有限公司 | Audio clip generation method and device, player and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1598923A (en) * | 2004-09-10 | 2005-03-23 | 清华大学 | Popular song key segment pick-up method for music listening |
CN102754159A (en) * | 2009-10-19 | 2012-10-24 | 杜比国际公司 | Metadata time marking information for indicating a section of an audio object |
CN102903357A (en) * | 2011-07-29 | 2013-01-30 | 华为技术有限公司 | Method, device and system for extracting chorus of song |
CN103824555A (en) * | 2012-11-19 | 2014-05-28 | 腾讯科技(深圳)有限公司 | Audio band extraction method and extraction device |
CN104091595A (en) * | 2013-10-15 | 2014-10-08 | 腾讯科技(深圳)有限公司 | Audio processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN104599681A (en) | 2015-05-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP02 | Change in the address of a patent holder | Address after: 510660 Guangzhou City, Guangzhou, Guangdong, Whampoa Avenue, No. 315, self - made 1-17; Patentee after: Guangzhou KuGou Networks Co., Ltd.; Address before: 510000 B1, building, No. 16, rhyme Road, Guangzhou, Guangdong, China 13F; Patentee before: Guangzhou KuGou Networks Co., Ltd. |