CN104599681B

CN104599681B - Method and apparatus for audio processing

Info

Publication number: CN104599681B
Application number: CN201410855770.4A
Authority: CN
Inventors: 夏伟涛
Original assignee: Guangzhou Kugou Computer Technology Co Ltd
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Priority date: 2014-12-31
Filing date: 2014-12-31
Publication date: 2017-08-01
Anticipated expiration: 2034-12-31
Also published as: CN104599681A

Abstract

The invention discloses a method and apparatus for audio processing, relating to the computer field, that can automatically extract the climax of an audio file. The method includes: automatically determining the start point and end point of the climax of an audio file according to the waveform of the audio file; and outputting the audio between the start point and the end point. The invention is used for producing audio.

Description

Method and apparatus for audio processing

Technical field

The present invention relates to the computer field, and in particular to a method and apparatus for audio processing.

Background

With the spread and continued development of the Internet, more and more people download their favorite ringtones over the Internet.

At present, ringtones on the Internet are mainly produced by hand, by manually cutting out the climax of an audio file. However, this way of producing ringtones cannot extract the climax accurately, and it incurs a substantial labor cost.

Summary of the invention

Embodiments of the present invention provide a method and apparatus for audio processing that can automatically extract the climax of an audio file and reduce labor cost.

In a first aspect, a method of audio processing is provided, the method including:

automatically determining the start point and end point of the climax of an audio file according to the waveform of the audio file; and

outputting the audio between the start point and the end point.

Optionally, automatically determining the start point and end point of the climax of the audio file according to the waveform of the audio file may include:

determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, wherein consecutive data points form a cluster of data points; and

determining the area that each cluster of data points corresponds to in the waveform, and taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file.

Optionally, in one embodiment of the present invention, the method further includes:

presetting a minimum duration for the climax of the audio file.

Taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file then includes:

determining the difference between the times corresponding to the two end data points of the cluster with the largest area; and

if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax of the audio file.

In another embodiment of the present invention, the waveform is a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed.

The method may further include:

if the difference is less than the minimum duration, increasing the waveform compression ratio and performing the following steps:

a. determining again, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the amplitude threshold, wherein consecutive data points form a cluster of data points;

b. determining the area that each cluster of data points corresponds to in the waveform, and determining the difference between the times corresponding to the two end data points of the cluster with the largest area;

c. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax of the audio file;

d. if the difference is still less than the minimum duration, continuing to increase the waveform compression ratio and repeating steps a to c until the difference is greater than or equal to the minimum duration.

In another embodiment of the present invention, the method may further include:

presetting a maximum waveform compression ratio; and

if the waveform compression ratio in use is greater than the maximum waveform compression ratio, lowering the amplitude threshold and performing the following steps:

e. determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than the adjusted amplitude threshold, wherein consecutive data points form a cluster of data points;

f. determining the area that each cluster of data points corresponds to in the waveform, and determining the difference between the times corresponding to the two end data points of the cluster with the largest area;

g. if the difference is greater than or equal to the minimum duration, determining that the times corresponding to the two end data points are respectively the start point and end point of the climax of the audio file;

h. if the difference is less than the minimum duration, continuing to lower the amplitude threshold and performing steps e to g until the difference is greater than or equal to the minimum duration.

In one embodiment of the present invention, before the start point and end point of the climax of the audio file are automatically determined according to the waveform of the audio file, the method may further include:

determining the average amplitude over the whole waveform; and

determining the amplitude threshold according to the average amplitude.

In a second aspect, an apparatus for audio processing is provided, the apparatus including:

a processing module, configured to automatically determine the start point and end point of the climax of an audio file according to the waveform of the audio file; and

an output module, configured to output the audio between the start point and the end point.

Optionally, the processing module is specifically configured to:

Optionally, in one embodiment of the present invention, the apparatus further includes:

a setting module, configured to preset a minimum duration for the climax of the audio file.

When taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file, the processing module is specifically configured to:

if the difference is greater than or equal to the minimum duration set by the setting module, determine that the times corresponding to the two end data points are respectively the start point and end point of the climax of the audio file.

In another embodiment of the present invention, the waveform is a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed.

The processing module is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:

In another embodiment of the present invention, the setting module is further configured to preset a maximum waveform compression ratio.

The processing module is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, lower the amplitude threshold and perform the following steps:

Optionally, in one embodiment of the present invention, the processing module is further configured to:

determine the average amplitude over the whole waveform; and

determine the amplitude threshold according to the average amplitude.

The method and apparatus for audio processing provided by the embodiments of the present invention determine the start point and end point of the climax of an audio file according to the waveform of the audio file. Compared with the related art, in which the start and end points of the climax are determined manually, this improves accuracy, extracts the climax automatically, and reduces labor cost.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a method of audio processing according to an embodiment of the present invention;

Fig. 2A is a flowchart of another method of audio processing according to an embodiment of the present invention;

Fig. 2B is an exemplary waveform according to an embodiment of the present invention;

Fig. 3 is a flowchart of yet another method of audio processing according to an embodiment of the present invention;

Fig. 4A is a schematic structural diagram of an apparatus for audio processing according to an embodiment of the present invention;

Fig. 4B is a schematic structural diagram of an apparatus for audio processing according to another embodiment of the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart of a method of audio processing according to an embodiment of the present invention. Referring to Fig. 1, the method may include:

11. Automatically determine the start point and end point of the climax of an audio file according to the waveform of the audio file.

12. Output the audio between the start point and the end point.
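As a hedged illustration of step 12, the sketch below copies the audio between a start time and an end time into a new file. It assumes an uncompressed WAV input and uses Python's standard `wave` module. The function name `export_segment` and its parameters are hypothetical; the patent does not name a file format or an API.

```python
import wave


def export_segment(src_path, dst_path, start_s, end_s):
    """Copy the frames between start_s and end_s (seconds) into a new WAV file.

    Returns the number of frames written. Hypothetical helper, not from the patent.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both positions to the valid frame range.
        first = max(0, min(total, int(start_s * rate)))
        last = max(first, min(total, int(end_s * rate)))
        src.setpos(first)
        frames = src.readframes(last - first)
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and frame rate; the frame count
        # in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return last - first
```

Here `start_s` and `end_s` would be the start point and end point found in step 11, expressed in seconds.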

The method of audio processing provided by this embodiment determines the start point and end point of the climax of an audio file according to the waveform of the audio file. Compared with the related art, in which the start and end points of the climax are determined manually, this improves accuracy, extracts the climax automatically, and reduces labor cost.

It should be noted that, in the embodiments of the present invention, analog-to-digital conversion must first be performed on the audio file before step 11; that is, the audio of the audio file is converted from an analog signal into a digital signal. An audio waveform is then derived from the digital signal, and the waveform is idealized (smoothed and scaled) with a boxcar smoothing algorithm to obtain the final waveform of the audio file.

Boxcar smoothing, also called neighborhood averaging, is a smoothing method that performs a template convolution on the spectrum with a sliding-window template. A sliding-window template is a template whose coefficients all have the same value; 3-point, 5-point, and 7-point templates are common. Neighborhood averaging replaces each data point with the average of that point and the data points in its neighborhood, which removes abrupt data points and thereby filters out some noise. In the boxcar algorithm, the smoothed value of the i-th data point is the average of that point and the M data points on each side of it, so 2M+1 data points take part in the average for each template. Here, the waveform compression ratio refers to the value of M.
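The neighborhood averaging described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `boxcar_smooth` is hypothetical, and truncating the window at the two ends of the waveform is an assumption, since the text does not say how edge points are handled.

```python
def boxcar_smooth(samples, m):
    """Replace each sample with the mean of itself and up to m neighbours per side.

    m plays the role of the waveform compression ratio (the value M in the text),
    so an interior point is averaged over 2*m + 1 samples.
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    n = len(samples)
    smoothed = []
    for i in range(n):
        # Assumption: near the edges the window is simply cut short.
        lo = max(0, i - m)
        hi = min(n, i + m + 1)
        window = samples[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

With `m = 1` this is the 3-point template, `m = 2` the 5-point template, and `m = 3` the 7-point template. A single spike such as `[0.0, 0.0, 3.0, 0.0, 0.0]` is spread into `[0.0, 1.0, 1.0, 1.0, 0.0]` for `m = 1`.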

Fig. 2A is a flowchart of another method of audio processing according to an embodiment of the present invention. Referring to Fig. 2A, the method may include:

21. Determine, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, wherein consecutive data points form a cluster of data points.

22. Determine the area that each cluster of data points corresponds to in the waveform, and take the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file.

23. Output the audio between the start point and the end point.

Here, the climax of an audio file is the part in which the overall amplitude of the audio is relatively large over a period of time. In the embodiments of the present invention, the part of the waveform above the amplitude threshold may be regarded as the climax of the audio file.

In the embodiments of the present invention, the amplitude threshold may be determined as follows: determine the average amplitude over the whole waveform, then determine the amplitude threshold according to the average amplitude. Because the amplitude threshold is chosen on the basis of the average amplitude, the chosen threshold can generally be guaranteed to be greater than the average amplitude, which makes the choice of amplitude threshold more principled.

Specifically, if the average of the whole waveform is Avg and the amplitude threshold is P, then P may be chosen as P = Avg + 0.5*Avg, P = Avg + 0.4*Avg, P = Avg + 0.3*Avg, P = Avg + 0.2*Avg, and so on. A larger amplitude threshold may be preferred; that is, P = Avg + 0.5*Avg may be chosen first. This way of determining the amplitude threshold is only an example. In the embodiments of the present invention, the amplitude threshold may also be chosen independently of the average amplitude, for example by directly setting it to 0.9, 0.8, 0.7, and so on.
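The mean-based choice of P can be written as a one-line formula. The sketch below assumes the waveform is a list of non-negative amplitude values; the function name `amplitude_threshold` and the `factor` parameter are hypothetical names for the 0.5, 0.4, 0.3, 0.2 multipliers given in the text.

```python
def amplitude_threshold(waveform, factor=0.5):
    """Return P = Avg + factor * Avg, where Avg is the mean amplitude of the waveform."""
    if not waveform:
        raise ValueError("waveform must not be empty")
    avg = sum(waveform) / len(waveform)
    return avg + factor * avg
```

For a waveform whose mean amplitude is 0.5, the default factor gives P = 0.75, and a factor of 0.2 gives P = 0.6. For positive amplitudes and a positive factor, P is always above the mean.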

Fig. 2B is an exemplary waveform according to an embodiment of the present invention. To aid understanding, the way the start point and end point of the climax of the audio file are determined is explained further with reference to Fig. 2B. For emphasis, only a few important time points t1, t2, t3, and t4 are shown in the figure, and other time points are omitted. The amplitude P marked by the horizontal dashed line is the amplitude threshold. The parts of the waveform above the horizontal dashed line are the data points whose amplitude is greater than the amplitude threshold, that is, the candidate climax parts to be found. Consecutive data points above the amplitude threshold form a cluster of data points; Fig. 2B shows two clusters. The area S1 of one dashed region is the area that the first cluster corresponds to in the waveform, and the area S2 of the other dashed region is the area that the second cluster corresponds to. As the figure shows, S2 is greater than S1, so the times t3 and t4 corresponding to the two end data points of the cluster with area S2 are taken as the start point and end point of the climax of the audio file. In other words, the audio in the period from t3 to t4 is the required climax.

It should be pointed out that, although Fig. 2B shows the areas of only two clusters of data points, more clusters and more dashed regions may appear depending on the audio file and the amplitude threshold. That is, in addition to S1 and S2 there may be S3, S4, and so on. In that case, the cluster with the largest area is selected from among all the areas S1, S2, S3, S4, ..., and the times corresponding to its two end data points are taken as the start point and end point of the climax of the audio file.

The area S2 of the dashed region can be calculated with the formula $S = \sum_i (Y_i \cdot \Delta t) = \sum_i Y_i$, where $\Delta t$ is one unit on the time axis and $Y_i$ is the ordinate (that is, the amplitude) of each data point in the interval from t3 to t4. The areas of the other dashed regions are calculated in the same way.
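The cluster search and area comparison described for Fig. 2B can be sketched as below. The function names are hypothetical, the waveform is assumed to be a list of amplitudes sampled at unit time steps (so the area is the plain sum of amplitudes, as in the formula above), and returning the first cluster on a tie is an assumption.

```python
def find_clusters(waveform, threshold):
    """Return inclusive (start_index, end_index) pairs for runs of samples above threshold."""
    clusters = []
    start = None
    for i, y in enumerate(waveform):
        if y > threshold:
            if start is None:
                start = i          # a new cluster begins here
        elif start is not None:
            clusters.append((start, i - 1))
            start = None
    if start is not None:          # the waveform ended inside a cluster
        clusters.append((start, len(waveform) - 1))
    return clusters


def cluster_area(waveform, cluster):
    """Area S = sum of Y_i over the cluster, with one time unit per sample."""
    start, end = cluster
    return sum(waveform[start:end + 1])


def largest_cluster(waveform, threshold):
    """Return the cluster with the largest area, or None if nothing exceeds the threshold."""
    clusters = find_clusters(waveform, threshold)
    if not clusters:
        return None
    return max(clusters, key=lambda c: cluster_area(waveform, c))
```

For `[0.1, 0.6, 0.7, 0.1, 0.8, 0.9, 0.8, 0.2]` with a threshold of 0.5, the two clusters are `(1, 2)` and `(4, 6)`; the second has the larger area, so its end indices play the role of t3 and t4.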

The method of audio processing provided by this embodiment determines the start point and end point of the climax of the audio file from the areas corresponding to the points in the waveform that exceed the amplitude threshold. This can further improve accuracy, extracts the climax automatically, and reduces labor cost.

In practice, to satisfy users' listening needs, it is often necessary to ensure that the climax is not too short. In the embodiments of the present invention, a minimum duration for the climax of the audio file may be preset, and when the start point and end point of the climax are determined, the duration between them is required to be at least the preset minimum duration. Specifically, in one embodiment of the present invention, taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax in step 22 may include: determining the difference between the times corresponding to the two end data points of the cluster with the largest area; and, if the difference is greater than or equal to the minimum duration, determining that those times are respectively the start point and end point of the climax of the audio file.

In the embodiments of the present invention, the waveform may be a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed. To handle the case where the difference is less than the minimum duration, the method provided in another embodiment of the present invention may optionally further include increasing the waveform compression ratio and performing steps a to d set out in the summary above.

With the method of audio processing provided by this embodiment, when the difference is less than the minimum duration, gradually increasing the waveform compression ratio makes it possible to bring the difference up to the minimum duration or more.

The initial value of the waveform compression ratio may be 2. In the embodiments of the present invention, a maximum waveform compression ratio may optionally also be preset; the maximum waveform compression ratio may be one third of the total duration of the audio file.

If the waveform compression ratio has been increased all the way to the maximum waveform compression ratio and the difference still cannot be made greater than or equal to the minimum duration, then in another embodiment of the present invention the amplitude threshold may be lowered appropriately to guarantee the duration of the climax.

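On the basis of the previous embodiment, the method of audio processing provided by another embodiment of the present invention may further include presetting a maximum waveform compression ratio and, if the waveform compression ratio in use is greater than that maximum, lowering the amplitude threshold and performing steps e to h set out in the summary above.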

With the method of audio processing provided by this embodiment, lowering the amplitude threshold after the above steps makes it possible to bring the difference up to the minimum duration or more, and the times corresponding to the two end data points are then determined to be the start point and end point of the climax of the audio file.

For a fuller understanding of the technical solution, the audio processing method of the embodiments of the present invention is explained further below through a specific embodiment.

Fig. 3 is a schematic flowchart of a method of audio processing according to an embodiment of the present invention. Referring to Fig. 3, and taking a song as an example, the method provided by this embodiment includes:

31. Initially, set the minimum duration MinΔt of the song's climax and set the maximum waveform compression ratio.

The length of the extracted song segment must be greater than or equal to this minimum duration; an extraction shorter than this is regarded as a failure. The higher the waveform compression ratio, the better the continuity and smoothness of the waveform, but the lower the accuracy, so an upper limit on the compression ratio must be set.

32. Determine the amplitude threshold P.

Because audio formats and audio production software differ, the waveform amplitudes of the resulting audio files differ to some extent. The waveform is also not necessarily continuous, and some discrete data points are not needed, so a sufficient number of data points must take part in the calculation; sampling is therefore very important. The maximum waveform value is 1, and the amplitude threshold P can be defined as an arithmetic progression: for example, P starts at 0.9 and is reduced by 0.1 each time its value fails to meet the requirement. However, for some waveforms a suitable amplitude threshold P is only about 0.1. To adapt dynamically to such cases, the amplitude threshold P may be determined as follows: first calculate the average Avg(Y) of the whole waveform and take P = Avg(Y) + 0.5*Avg(Y); when this value of P does not meet the requirement, take P = Avg(Y) + 0.4*Avg(Y); when that value does not meet the requirement either, take P = Avg(Y) + 0.3*Avg(Y); and so on.
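The two ways of stepping the amplitude threshold down can be compared with a short sketch. All function names are hypothetical, and "meets the requirement" is reduced here to "at least one sample exceeds the threshold", which is a simplifying assumption made only for this illustration.

```python
def fixed_step_thresholds(start=0.9, step=0.1, floor=0.1):
    """Arithmetic progression 0.9, 0.8, ... down to floor (rounded to avoid float drift)."""
    count = int(round((start - floor) / step)) + 1
    return [round(start - k * step, 10) for k in range(count)]


def mean_based_thresholds(waveform, factors=(0.5, 0.4, 0.3)):
    """Thresholds P = Avg + f * Avg for each factor f, in decreasing order."""
    avg = sum(waveform) / len(waveform)
    return [avg + f * avg for f in factors]


def first_usable(waveform, thresholds):
    """Return the first threshold that at least one sample exceeds, or None."""
    for p in thresholds:
        if any(y > p for y in waveform):
            return p
    return None
```

For a quiet waveform such as `[0.02, 0.05, 0.12, 0.15, 0.11, 0.03]`, the fixed progression only becomes usable at its ninth and last value, 0.1, whereas the mean-based schedule is usable at its first value (about 0.12). This is the kind of case the text gives as the reason for adapting the threshold to the waveform's mean.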

33. Determine, according to the waveform of the song, the data points in the waveform whose amplitude is greater than the amplitude threshold P. Consecutive data points of this kind form a cluster of data points in the waveform. Each cluster of data points whose amplitude is greater than the amplitude threshold P is then saved in an array.

34. From the data-point array, calculate the area that each cluster of data points corresponds to in the waveform, and determine the difference ΔX between the times corresponding to the two end data points of the cluster with the largest area. The area of a cluster can be calculated as $S = \sum_i Y_i$, and the time difference as $\Delta X = X_{\max} - X_{\min}$, where $X_{\max}$ and $X_{\min}$ are the times corresponding to the two data points at the ends of the cluster with the largest area, and $Y_i$ denotes the amplitudes of all data points in that cluster.

35. According to the result of step 34, determine whether the time difference ΔX is greater than or equal to MinΔt.

36. When the duration calculated from the time difference ΔX and the compression ratio is greater than or equal to MinΔt, return Xmin and Xmax, which give the start point and end point of the song's climax.

37. When the duration calculated from the time difference ΔX and the compression ratio is not greater than or equal to MinΔt, increase the compression ratio of the waveform so that the waveform curve becomes more continuous.

38. Determine whether the increased compression ratio exceeds the maximum compression ratio set in step 31. If it does, no suitable climax interval can be obtained within this sampling range, so the value of the amplitude threshold P must be reduced to let more data points take part in the calculation, and the flow returns to step 32. If it does not, the flow jumps to step 33 and the data points whose amplitude is greater than the amplitude threshold P are recalculated.
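Steps 31 to 38 can be put together in one compact sketch. It is only an illustration under several assumptions that the text leaves open: durations are measured in sample indices, the mean is taken over the unsmoothed waveform, the compression ratio is reset to its initial value each time the threshold is lowered, and the ratio grows by 1 per iteration. All names are hypothetical.

```python
def smooth(wave, m):
    """Boxcar smoothing with m neighbours per side (window cut short at the edges)."""
    n = len(wave)
    return [sum(wave[max(0, i - m):min(n, i + m + 1)]) /
            (min(n, i + m + 1) - max(0, i - m)) for i in range(n)]


def largest_cluster(wave, threshold):
    """Return inclusive (start, end) of the above-threshold run with the largest area."""
    best, best_area, start = None, 0.0, None
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, y in enumerate(list(wave) + [float("-inf")]):
        if y > threshold:
            if start is None:
                start = i
        elif start is not None:
            area = sum(wave[start:i])
            if best is None or area > best_area:
                best, best_area = (start, i - 1), area
            start = None
    return best


def find_climax(wave, min_len, max_ratio, factors=(0.5, 0.4, 0.3, 0.2), first_ratio=2):
    """Return (x_min, x_max) sample indices of the climax, or None if none qualifies."""
    avg = sum(wave) / len(wave)
    for factor in factors:                  # step 32: pick or lower the threshold P
        threshold = avg + factor * avg
        ratio = first_ratio                 # assumption: ratio restarts per threshold
        while ratio <= max_ratio:           # step 38: stop at the maximum ratio
            cluster = largest_cluster(smooth(wave, ratio), threshold)  # steps 33-34
            if cluster is not None and cluster[1] - cluster[0] >= min_len:  # step 35
                return cluster              # step 36: start and end found
            ratio += 1                      # step 37: smooth more and retry
    return None
```

For example, a waveform with a short loud burst followed by a longer loud passage, such as `[0.1]*5 + [0.9]*3 + [0.1]*5 + [0.8]*10 + [0.1]*5`, yields the longer passage when called with `min_len=5` and `max_ratio=4`: the short burst is flattened below the threshold by the smoothing.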

The method of audio processing provided by this embodiment determines the climax of the audio on the basis of the waveform, so the climax can be determined not only automatically but also accurately.

It should be pointed out that the method of audio processing provided by the embodiments of the present invention can extract the climax of an audio file automatically. Applied on a server, it frees ringtone production from manual work and saves a great deal of labor and time. Applied on a client, it lets the user complete the operation with one click, which improves convenience and ease of use.

Correspondingly, an embodiment of the present invention also provides an apparatus for audio processing. Referring to Fig. 4A, the apparatus 40 for audio processing includes a processing module 41 and an output module 42, where:

the processing module 41 is configured to automatically determine the start point and end point of the climax of an audio file according to the waveform of the audio file; and

the output module 42 is configured to output the audio between the start point and the end point.

The processing module 41 may be specifically configured to:

determine, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, wherein consecutive data points in the waveform form a cluster of data points;

Optionally, the processing module 41 may be further configured to: determine the average amplitude over the whole waveform; and determine the amplitude threshold according to the average amplitude.

Optionally, referring to Fig. 4B, in another embodiment of the present invention the apparatus 40 further includes:

a setting module 43, configured to preset a minimum duration for the climax of the audio file.

When taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file, the processing module 41 is specifically configured to:

Optionally, in another embodiment of the present invention, the waveform is a waveform obtained after smoothing with a waveform compression ratio, where the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed.

The processing module 41 is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:

Optionally, in another embodiment of the present invention, the setting module 43 is further configured to preset a maximum waveform compression ratio.

The processing module 41 is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, lower the amplitude threshold and perform the following steps:

The apparatus for audio processing provided by this embodiment determines the start point and end point of the climax of the audio file from the areas corresponding to the points in the waveform that exceed the amplitude threshold. It can therefore extract the climax automatically and reduce labor cost, while also ensuring that the climax it determines is accurate.

The apparatus for audio processing provided by the embodiments of the present invention may be any device that processes audio; it may be a server or a user device.

Correspondingly, an embodiment of the present invention also provides a computer program product that includes instructions for performing the operations in the above method embodiments.

Correspondingly, an embodiment of the present invention also provides a storage medium for storing the above computer program product.

It should be noted that the apparatus for audio processing provided by the above embodiment is described using the above division into functional modules only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the apparatus for audio processing and the method of audio processing provided by the above embodiments belong to the same concept; for the specific implementation of the apparatus, see the method embodiments, which are not repeated here.

It should be noted that the embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be understood by reference to one another. Because the apparatus embodiments are substantially similar to the method embodiments, they are described briefly; for the relevant parts, see the description of the method embodiments.

It should be noted that, in this document, the terms "include" and "comprise" and any variants of them are intended to be non-exclusive, so that a process, apparatus, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, apparatus, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, apparatus, article, or device that includes the element.

A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented in hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the principles of the present invention shall fall within its scope of protection.

Claims

1. A method of audio processing, characterized in that the method includes:

outputting the audio between the start point and the end point;

wherein automatically determining the start point and end point of the climax of the audio file according to the waveform of the audio file includes:

determining, according to the waveform of the audio file, the data points in the waveform whose amplitude is greater than an amplitude threshold, wherein consecutive data points in the waveform form a cluster of data points; and

determining the area that each cluster of data points corresponds to in the waveform, and taking the times corresponding to the two end data points of the cluster with the largest area as the start point and end point of the climax of the audio file.

2. The method according to claim 1, characterized in that the method further comprises:

Presetting a minimum duration of the climax part of the audio file;

Wherein selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file comprises:

If the difference is greater than or equal to the minimum duration, determining the times corresponding to the data points at the two ends to be, respectively, the start point and the end point of the climax part of the audio file.

3. The method according to claim 2, characterized in that the waveform diagram is a waveform diagram obtained after smoothing with a waveform compression ratio, wherein the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;

The method further comprises:

a. Determining again, according to the waveform diagram of the audio file, the data points in the waveform diagram whose amplitude is greater than the amplitude threshold, wherein consecutive data points form a cluster of data points;

b. Determining the area corresponding to each cluster of data points in the waveform diagram, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;

c. If the difference is greater than or equal to the minimum duration, determining the times corresponding to the data points at the two ends to be, respectively, the start point and the end point of the climax part of the audio file;

d. If the difference is still less than the minimum duration, continuing to increase the waveform compression ratio and repeating steps a to c until the difference is greater than or equal to the minimum duration.
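
The loop in steps a to d might look like the following Python sketch. It makes several assumptions that are not in the claim: the smoothing is a simple moving average over `ratio` points around each data point, the area of a cluster is the sum of its amplitudes, the ratio grows by a fixed `ratio_step`, and a `ratio_limit` parameter is added so the loop always terminates. All function and parameter names are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple


def smooth(amplitudes: List[float], ratio: int) -> List[float]:
    """Assumed smoothing: average over a window of `ratio` points around
    each data point (the window is truncated at both ends of the waveform)."""
    n = len(amplitudes)
    smoothed = []
    for i in range(n):
        lo = max(0, i - (ratio - 1) // 2)
        hi = min(n, i + ratio // 2 + 1)
        window = amplitudes[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def largest_cluster(amplitudes: List[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first_index, last_index) of the above-threshold run with the
    largest summed amplitude, or None if no point exceeds the threshold."""
    best, best_area, index = None, 0.0, 0
    for above, run in groupby(amplitudes, key=lambda a: a > threshold):
        values = list(run)
        if above:
            area = sum(values)
            if best is None or area > best_area:
                best, best_area = (index, index + len(values) - 1), area
        index += len(values)
    return best


def find_climax_smoothed(
    amplitudes: List[float],
    seconds_per_point: float,
    threshold: float,
    min_duration: float,
    ratio: int = 1,
    ratio_step: int = 2,      # hypothetical increment
    ratio_limit: int = 101,   # hypothetical safety bound, not part of claim 3
) -> Optional[Tuple[float, float, int]]:
    """Increase the compression ratio until the largest cluster lasts at
    least min_duration; return (start_time, end_time, ratio_used) or None."""
    while ratio <= ratio_limit:
        cluster = largest_cluster(smooth(amplitudes, ratio), threshold)
        if cluster is not None:
            first, last = cluster
            if (last - first) * seconds_per_point >= min_duration:
                return first * seconds_per_point, last * seconds_per_point, ratio
        ratio += ratio_step
    return None
```

In this sketch a larger window bridges short dips below the threshold, so neighboring clusters merge into one longer cluster as the ratio grows.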

4. The method according to claim 3, characterized in that the method further comprises:

Presetting a maximum waveform compression ratio;

If the waveform compression ratio in use is greater than the maximum waveform compression ratio, decreasing the amplitude threshold and performing the following steps:

f. Determining the area corresponding to each cluster of data points in the waveform diagram, and determining the difference between the times corresponding to the data points at the two ends of the cluster with the largest area;

g. If the difference is greater than or equal to the minimum duration, determining the times corresponding to the data points at the two ends to be, respectively, the start point and the end point of the climax part of the audio file;

h. If the difference is less than the minimum duration, continuing to decrease the amplitude threshold and performing steps e to g until the difference is greater than or equal to the minimum duration.
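
The threshold fallback of claim 4 might be sketched in Python as follows, for the case where the waveform compression ratio has already exceeded its maximum. The sketch assumes that the clusters are recomputed after every decrease of the threshold, that the threshold drops by a fixed `threshold_step`, and that the search stops at a `threshold_floor`; none of these details is stated in the claim, and all names are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple


def largest_cluster(amplitudes: List[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first_index, last_index) of the above-threshold run with the
    largest summed amplitude (assumed area measure), or None."""
    best, best_area, index = None, 0.0, 0
    for above, run in groupby(amplitudes, key=lambda a: a > threshold):
        values = list(run)
        if above:
            area = sum(values)
            if best is None or area > best_area:
                best, best_area = (index, index + len(values) - 1), area
        index += len(values)
    return best


def find_climax_lowering_threshold(
    amplitudes: List[float],
    seconds_per_point: float,
    threshold: float,
    min_duration: float,
    threshold_step: float = 0.05,   # hypothetical decrement
    threshold_floor: float = 0.0,   # hypothetical lower bound
) -> Optional[Tuple[float, float, float]]:
    """Decrease the amplitude threshold until the largest cluster lasts at
    least min_duration; return (start_time, end_time, threshold_used) or None."""
    while threshold >= threshold_floor:
        cluster = largest_cluster(amplitudes, threshold)
        if cluster is not None:
            first, last = cluster
            if (last - first) * seconds_per_point >= min_duration:
                return first * seconds_per_point, last * seconds_per_point, threshold
        threshold -= threshold_step
    return None
```

A lower threshold admits quieter data points into each cluster, so the largest cluster can only stay the same length or grow as the threshold falls.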

5. The method according to any one of claims 1 to 4, characterized in that, before automatically determining, according to the waveform diagram of the audio file, the start point and the end point of the climax part of the audio file, the method further comprises:

Determining the mean amplitude over the entire waveform diagram;

Determining the amplitude threshold according to the mean amplitude.
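
Claim 5 does not say how the amplitude threshold is derived from the mean amplitude. One possible reading, shown here purely as an assumption, is to scale the mean of the absolute amplitudes by a constant; the function name `amplitude_threshold` and the `factor` parameter are hypothetical.

```python
from typing import List


def amplitude_threshold(amplitudes: List[float], factor: float = 1.5) -> float:
    """Return factor times the mean absolute amplitude of the whole waveform.
    The scaling rule and the default factor are assumptions, not taken from
    the claim."""
    if not amplitudes:
        raise ValueError("the waveform contains no data points")
    mean_amplitude = sum(abs(a) for a in amplitudes) / len(amplitudes)
    return factor * mean_amplitude
```

Deriving the threshold from the file's own mean means that quiet and loud recordings are judged relative to their own overall level rather than against a fixed value.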

6. An apparatus for audio processing, characterized in that the apparatus comprises:

A processing module, configured to automatically determine, according to a waveform diagram of an audio file, the start point and the end point of the climax part of the audio file;

An output module, configured to output the audio between the start point and the end point;

Wherein the processing module is specifically configured to:

7. The apparatus according to claim 6, characterized in that the apparatus further comprises:

When selecting the times corresponding to the data points at the two ends of the cluster with the largest area as the start point and the end point of the climax part of the audio file, the processing module is specifically configured to:

If the difference is greater than or equal to the minimum duration set by the setting module, determine the times corresponding to the data points at the two ends to be, respectively, the start point and the end point of the climax part of the audio file.

8. The apparatus according to claim 7, characterized in that the waveform diagram is a waveform diagram obtained after smoothing with a waveform compression ratio, wherein the waveform compression ratio is the number of data points selected in the neighborhood of each data point when the waveform of the audio file is smoothed;

The processing module is further configured to, if the difference is less than the minimum duration, increase the waveform compression ratio and perform the following steps:

9. The apparatus according to claim 8, characterized in that:

The setting module is further configured to preset a maximum waveform compression ratio;

The processing module is further configured to, if the waveform compression ratio in use is greater than the maximum waveform compression ratio set by the setting module, decrease the amplitude threshold and perform the following steps:

10. The apparatus according to any one of claims 6 to 9, characterized in that the processing module is further configured to:

Determine the mean amplitude over the entire waveform diagram;

Determine the amplitude threshold according to the mean amplitude.