CN104978380A - Audio frequency processing method and device - Google Patents

Audio frequency processing method and device

Info

Publication number
CN104978380A
CN104978380A
Authority
CN
China
Prior art keywords
note
index
refrain
characteristic value
audio file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410568013.9A
Other languages
Chinese (zh)
Other versions
CN104978380B (en)
Inventor
赵伟峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201410568013.9A priority Critical patent/CN104978380B/en
Publication of CN104978380A publication Critical patent/CN104978380A/en
Application granted granted Critical
Publication of CN104978380B publication Critical patent/CN104978380B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Electrophonic Musical Instruments (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The embodiments of the invention provide an audio processing method and device. The method comprises the following steps: obtaining a MIDI (Musical Instrument Digital Interface) file corresponding to an audio file; parsing the MIDI file to obtain at least one note and the characteristic value and time attribute of the at least one note; constructing a reference sequence of the audio file by using the at least one note and the characteristic value and time attribute of the at least one note; constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note; and analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file. The refrain of an audio file can thus be located on the basis of the MIDI file corresponding to the audio file, which improves both the accuracy and the intelligence of audio processing.

Description

Audio processing method and device
Technical field
The present invention relates to the field of Internet technologies, and in particular to the field of audio technologies, and specifically to an audio processing method and device.
Background
The refrain usually refers to the climax part of an audio file (such as a song or a piece of music). Taking a song as an example, a song usually adopts an AA'BA' form (music structure), where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". Locating the refrain plays an important role in the processing and analysis of audio files. The prior art locates the refrain mainly by analyzing the audio file itself, but such analysis involves a large amount of computation and has low accuracy, which reduces the intelligence of audio processing.
Summary of the invention
The embodiments of the present invention provide an audio processing method and device, which can locate the refrain of an audio file based on the MIDI (Musical Instrument Digital Interface) file corresponding to the audio file, thereby improving the accuracy and the intelligence of audio processing.
A first aspect of the embodiments of the present invention provides an audio processing method, which may comprise:
obtaining a MIDI file corresponding to an audio file;
parsing the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note;
analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file.
A second aspect of the embodiments of the present invention provides an audio processing device, which may comprise:
a file obtaining unit, configured to obtain a MIDI file corresponding to an audio file;
a parsing unit, configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
a reference sequence construction unit, configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
a characteristic sequence construction unit, configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note;
a positioning unit, configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
Implementing the embodiments of the present invention has the following beneficial effects:
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Brief description of the drawings
In order to describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a flowchart of an audio processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of an embodiment of step S103 shown in Fig. 1;
Fig. 3 is a flowchart of an embodiment of step S104 shown in Fig. 1;
Fig. 4 is a flowchart of an embodiment of step S105 shown in Fig. 1;
Fig. 5 is a schematic structural diagram of an audio processing device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a reference sequence construction unit according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a characteristic sequence construction unit according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a positioning unit according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a refrain positioning unit according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In the embodiments of the present invention, an audio file may include but is not limited to: a song, a song clip, a piece of music, a music clip, an instrumental performance, an instrumental performance clip, a hummed song, a hummed song clip, and other such files. MIDI is an industry-standard electronic communication protocol; it transmits non-acoustic instruction signals such as notes and control parameters, and is used to indicate the operation of MIDI devices and the interaction between MIDI devices. A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note.
The audio processing solution of the embodiments of the present invention quickly locates the refrain of an audio file, mainly based on the MIDI file corresponding to the audio file. The solution can be applied in many Internet scenarios, for example: scenarios in which the audio files in an Internet audio library are analyzed, including key (keynote) search scenarios where the refrain needs to be quickly located and accurately extracted, humming-based search scenarios, and melody recognition scenarios; or scenarios in which the audio files in an Internet audio library are auditioned, for example providing the refrain for online playback or audition before a piece of music is downloaded; or ringback tone (CRBT) download or audition scenarios, in which the refrain is quickly located, accurately extracted and provided to the user as a ringback tone for download or audition, and so on.
The audio processing method provided by the embodiments of the present invention is described in detail below with reference to Fig. 1 to Fig. 4. It should be noted that the audio processing method shown in Fig. 1 to Fig. 4 may be performed by the audio processing device provided by the embodiments of the present invention, and the audio processing device may run in a terminal device or a server, where the terminal device may include but is not limited to devices such as a PC (Personal Computer), a PAD (tablet computer), a mobile phone, a smartphone, and a notebook computer.
Referring to Fig. 1, which is a flowchart of an audio processing method according to an embodiment of the present invention, the method may comprise the following steps S101 to S105.
S101: obtain a MIDI file corresponding to an audio file.
An audio file usually corresponds to one MIDI file. The MIDI file may be produced by an audio producer such as the composer of the audio file, or may be generated from the audio file by a device with a MIDI production function. The MIDI file corresponding to an audio file may serve as the pitch reference file of that audio file: when a user re-performs (covers) the audio file, the corresponding MIDI file may be used to compare the pitch of the re-performed content and to score it. In this step, the MIDI file corresponding to the audio file to be processed may be obtained from an Internet audio library.
S102: parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note.
A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note. In this step, the MIDI file is parsed according to the MIDI file format standard, so that at least one note, and the characteristic value and time attribute of the at least one note, can be obtained.
A note is a symbol used to record sounds of different lengths. The characteristic value of a note represents the tonal feature of the note; usually the characteristic value of a note ranges over [21, 108], and a larger characteristic value indicates a higher pitch while a smaller characteristic value indicates a lower pitch. The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note.
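The embodiment does not prescribe a particular parser; as a rough illustration only, the following C sketch shows how note-on/note-off instructions could be paired into notes carrying a start time, an end time and a characteristic value. The MidiEvent structure and the helper midi_next_event() are hypothetical stand-ins for whatever MIDI reading routine is actually used.
#include <stdio.h>

typedef struct {
    int is_note_on;  /* 1 = note-on instruction, 0 = note-off instruction */
    int pitch;       /* characteristic value of the note, normally within [21, 108] */
    int time_ms;     /* time of the instruction, in milliseconds */
} MidiEvent;

/* hypothetical reader: returns 1 while instructions remain in the .mid file */
int midi_next_event(FILE *f, MidiEvent *ev);

/* Pair note-on and note-off instructions into notes; returns the number N of notes. */
int collect_notes(FILE *f, int start_ms[], int end_ms[], int value[], int max_notes)
{
    int onset[128] = {0};    /* pending note-on time per pitch (MIDI pitches are 0-127) */
    int pending[128] = {0};  /* 1 if a note-on is still waiting for its note-off */
    int n = 0;
    MidiEvent ev;

    while (n < max_notes && midi_next_event(f, &ev)) {
        if (ev.is_note_on) {
            onset[ev.pitch] = ev.time_ms;
            pending[ev.pitch] = 1;
        } else if (pending[ev.pitch]) {
            start_ms[n] = onset[ev.pitch];  /* start time of the note */
            end_ms[n] = ev.time_ms;         /* end time of the note */
            value[n] = ev.pitch;            /* characteristic value of the note */
            pending[ev.pitch] = 0;
            n++;
        }
    }
    return n;
}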
S103: construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note.
The reference sequence comprises at least one reference element and the index of the at least one reference element, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note. In this step, a note, the characteristic value of the note and the time attribute of the note may be taken as the three elements of one reference element; accordingly, the at least one note, and the characteristic value and time attribute of the at least one note, serve as the three elements of the at least one reference element. The at least one reference element is arranged in order to generate the reference sequence of the audio file.
S104: construct a characteristic sequence of the audio file by using the characteristic value of the at least one note.
The characteristic sequence comprises at least one characteristic element and the index of the at least one characteristic element, where a characteristic element comprises the characteristic value of one note. In this step, the characteristic value of a note may be taken as the element of one characteristic element; accordingly, the characteristic values of the at least one note serve as the elements of the at least one characteristic element. The at least one characteristic element is arranged in order to generate the characteristic sequence of the audio file.
S105: analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
The refrain usually refers to the climax part of an audio file. Taking a song as an example, a song usually adopts an AA'BA' form, where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". In this step, at least one refrain section of the audio file can be located by analyzing the reference sequence and the characteristic sequence.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 2, which is a flowchart of an embodiment of step S103 shown in Fig. 1, step S103 may comprise the following steps s2001 to s2003.
s2001: determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note.
Suppose the quantity of the at least one note is N, where N is a positive integer; in this step, the quantity of the at least one reference element is also N. The N reference elements may be denoted a_1, ..., a_N, and each reference element comprises three elements: a note, the characteristic value of the note and the time attribute of the note. For example, reference element a_1 comprises note 1, the characteristic value of note 1 and the time attribute of note 1; by analogy, reference element a_N comprises note N, the characteristic value of note N and the time attribute of note N.
s2002: determine the index of each reference element according to the time attribute of the note comprised in the at least one reference element.
The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note. In this step, the index of each reference element may be determined according to the chronological order of the start times of the notes comprised in the reference elements. For example, suppose that, among note 1 to note N, note 1 has the earliest start time, note 2 the second earliest, and so on, with note N having the latest start time; then the index of reference element a_1 is determined to be 1, the index of reference element a_2 is 2, and by analogy, the index of reference element a_N is N.
s2003: arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
Following the example of this embodiment, the reference sequence of the audio file may be denoted note(i), whose length is N, where i represents the index of each reference element in the reference sequence note(i), and i is a positive integer with 0 < i ≤ N.
In practice, a structure may be used to store the reference sequence note(i); this structure may be expressed as follows:
typedef struct tag_note {
    int start_ms;    /* start time of the note, in milliseconds */
    int end_ms;      /* end time of the note, in milliseconds */
    int note_value;  /* characteristic value (tonal feature) of the note */
} Tnote;
Tnote note[N];       /* reference sequence note(i): one Tnote per reference element, N as obtained in step S102 */
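As a minimal sketch of steps s2001 to s2003 (one possible arrangement, not mandated by the embodiment), the N reference elements can be ordered by the start time of the note they contain with the standard qsort routine; the function names below are chosen only for this illustration.
#include <stdlib.h>

static int by_start_time(const void *x, const void *y)
{
    const Tnote *a = (const Tnote *)x;
    const Tnote *b = (const Tnote *)y;
    return (a->start_ms > b->start_ms) - (a->start_ms < b->start_ms);
}

/* Arrange the N reference elements so that index i follows the chronological
 * order of the note start times (steps s2002 and s2003). */
void build_reference_sequence(Tnote note[], int N)
{
    qsort(note, (size_t)N, sizeof(Tnote), by_start_time);
}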
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 3, which is a flowchart of an embodiment of step S104 shown in Fig. 1, step S104 may comprise the following steps s3001 to s3003.
s3001: determine at least one characteristic element according to the characteristic value of the at least one note, where a characteristic element comprises the characteristic value of one note.
Following the example of the embodiment shown in Fig. 2, in this step the quantity of the at least one characteristic element is also N. The N characteristic elements may be denoted b_1, ..., b_N, and each characteristic element comprises the characteristic value of one note. For example, characteristic element b_1 comprises the characteristic value of note 1; by analogy, characteristic element b_N comprises the characteristic value of note N.
s3002: determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element.
A reference element and a characteristic element correspond to each other through the characteristic value of the note they comprise. For example, reference element a_1 comprises the characteristic value of note 1, and characteristic element b_1 also comprises the characteristic value of note 1, so reference element a_1 corresponds to characteristic element b_1; by analogy, reference element a_N comprises the characteristic value of note N, and characteristic element b_N also comprises the characteristic value of note N, so reference element a_N corresponds to characteristic element b_N. In this step, the index of each characteristic element may be determined according to the index of the corresponding reference element. For example, supposing the index of reference element a_1 is 1, the index of the corresponding characteristic element b_1 is also 1; by analogy, supposing the index of reference element a_N is N, the index of the corresponding characteristic element b_N is also N.
s3003: arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
Following the example of this embodiment, the characteristic sequence of the audio file may be denoted note_value(i), whose length is N, where i represents the index of each characteristic element in the characteristic sequence note_value(i), and i is a positive integer with 0 < i ≤ N.
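A short sketch of steps s3001 to s3003, reusing the Tnote structure above and assuming the reference sequence has already been built: the characteristic sequence simply copies, index by index, the characteristic value held by each reference element, so both sequences share the same indices.
/* Build the characteristic sequence note_value(i) from the reference sequence
 * note(i): the characteristic element with index i keeps the characteristic
 * value of the note comprised in reference element i. */
void build_characteristic_sequence(const Tnote note[], int note_value[], int N)
{
    for (int i = 0; i < N; i++)
        note_value[i] = note[i].note_value;
}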
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 4, which is a flowchart of an embodiment of step S105 shown in Fig. 1, step S105 may comprise the following steps s4001 to s4005.
s4001: perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value.
In this step, the following formula (1) may be used to perform the maximum value calculation on the characteristic sequence note_value(i); formula (1) may be expressed as follows:
[ind, dval] = max(note_value(i))    (1)
In formula (1), max() denotes the operation of taking the maximum; dval denotes the maximum value; ind denotes the index of the target characteristic element corresponding to the maximum value, that is, the value note_value(ind) of the target characteristic element whose index is ind is the maximum value dval.
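Formula (1) is written in a MATLAB-like notation; an equivalent C sketch (with 0-based arrays, so the document's index i corresponds to array position i - 1) could look as follows.
/* Equivalent of formula (1): scan the characteristic sequence for the maximum
 * value dval and the position ind of the target characteristic element. */
void find_maximum(const int note_value[], int N, int *ind, int *dval)
{
    *ind = 0;
    *dval = note_value[0];
    for (int i = 1; i < N; i++) {
        if (note_value[i] > *dval) {
            *dval = note_value[i];
            *ind = i;
        }
    }
}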
s4002: determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element.
The index of the target characteristic element is ind, that is, the index of the target characteristic element in the characteristic sequence note_value(i) is ind; in this step, the index of the corresponding target reference element is also determined to be ind, that is, the index of the target reference element in the reference sequence note(i) is also ind.
s4003: obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element.
In this step, the target reference element a_ind whose index is ind may first be located in the reference sequence note(i), and then the time attribute of the note comprised in the target reference element a_ind is obtained.
s4004: determine the position information of the refrain by using the time attribute of the note comprised in the target reference element.
In this step, the following formula (2) may be used to take the start time of the note comprised in the target reference element a_ind as the position information of the refrain; formula (2) may be expressed as follows:
Pos = note(ind).start_ms    (2)
In formula (2), Pos denotes the position information of the refrain in the audio file.
s4005: locate the refrain in the audio file according to the position information of the refrain.
Since formula (2) yields the position information of the refrain in the audio file, in this step the refrain can be found, i.e. located, in the audio file according to that position information.
Step s4005 may specifically comprise the following steps ss451 to ss452:
ss451: normalize the position information of the refrain.
In step ss451, time parameters for the normalization may be set according to actual needs. For example, according to the characteristics of songs, m_1 and m_2 may be randomly selected within the interval [1 s, 20 s] and set as the time parameters for the normalization, where the values of m_1 and m_2 may or may not be equal. In step ss451, normalizing the position information of the refrain may comprise: normalizing the position information Pos of the refrain calculated by formula (2) into the interval [Pos - m_1, Pos + m_2].
ss452: locate the refrain in the audio file according to the normalized position information of the refrain.
In this step, the normalized position information [Pos - m_1, Pos + m_2] may be taken as the duration interval of this refrain section in the audio file, and this refrain section is located in the audio file accordingly.
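Putting formula (2) and steps ss451 to ss452 together, a small sketch could look like the following. The values m1 = 5000 ms and m2 = 15000 ms in the usage comment are only illustrative picks from the [1 s, 20 s] range, and the clamp at 0 ms is a practical safeguard not spelled out in the embodiment.
/* Formula (2) plus normalization: take the start time of the note in the
 * target reference element as Pos, then expand it to [Pos - m1, Pos + m2]. */
void locate_refrain(const Tnote note[], int ind, int m1_ms, int m2_ms,
                    int *refrain_begin_ms, int *refrain_end_ms)
{
    int pos = note[ind].start_ms;     /* Pos = note(ind).start_ms */
    *refrain_begin_ms = pos - m1_ms;
    *refrain_end_ms = pos + m2_ms;
    if (*refrain_begin_ms < 0)
        *refrain_begin_ms = 0;        /* keep the interval inside the audio file */
}
/* Example usage: locate_refrain(note, ind, 5000, 15000, &begin_ms, &end_ms); */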
It should be noted that the embodiment shown in Fig. 4 describes the process of locating one refrain section in the audio file. In practice, if the audio file contains several refrain sections, step s4001 may obtain at least one maximum value and the index of the target characteristic element corresponding to each maximum value; in the embodiments of the present invention, for each maximum value and the index of its corresponding target characteristic element, each refrain section can be located in the audio file separately according to the process described in the embodiment shown in Fig. 4.
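For the multi-refrain case, one possible reading (an assumption of this sketch rather than a step prescribed by the embodiment) is to collect every index whose characteristic value equals the maximum and then run the single-refrain flow once per collected index.
/* Collect the indices of all target characteristic elements that attain the
 * maximum characteristic value; each index yields one candidate refrain. */
int find_all_maxima(const int note_value[], int N, int ind_out[], int max_out)
{
    int dval = note_value[0];
    for (int i = 1; i < N; i++)
        if (note_value[i] > dval)
            dval = note_value[i];     /* maximum characteristic value */

    int k = 0;
    for (int i = 0; i < N && k < max_out; i++)
        if (note_value[i] == dval)
            ind_out[k++] = i;         /* index of one target characteristic element */
    return k;                          /* number of candidate refrain sections */
}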
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
The audio processing device provided by the embodiments of the present invention is described in detail below with reference to Fig. 5 to Fig. 9. It should be noted that the audio processing device shown in Fig. 5 to Fig. 9 may run in a terminal device or a server and is configured to perform the audio processing method shown in Fig. 1 to Fig. 4, where the terminal device may include but is not limited to devices such as a PC, a PAD, a mobile phone, a smartphone, and a notebook computer.
Referring to Fig. 5, which is a schematic structural diagram of an audio processing device according to an embodiment of the present invention, the device may comprise: a file obtaining unit 101, a parsing unit 102, a reference sequence construction unit 103, a characteristic sequence construction unit 104 and a positioning unit 105.
The file obtaining unit 101 is configured to obtain a MIDI file corresponding to an audio file.
An audio file usually corresponds to one MIDI file. The MIDI file may be produced by an audio producer such as the composer of the audio file, or may be generated from the audio file by a device with a MIDI production function. The MIDI file corresponding to an audio file may serve as the pitch reference file of that audio file: when a user re-performs (covers) the audio file, the corresponding MIDI file may be used to compare the pitch of the re-performed content and to score it. The file obtaining unit 101 may obtain the MIDI file corresponding to the audio file to be processed from an Internet audio library.
The parsing unit 102 is configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note.
A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note. The parsing unit 102 parses the MIDI file according to the MIDI file format standard, so that at least one note, and the characteristic value and time attribute of the at least one note, can be obtained.
A note is a symbol used to record sounds of different lengths. The characteristic value of a note represents the tonal feature of the note; usually the characteristic value of a note ranges over [21, 108], and a larger characteristic value indicates a higher pitch while a smaller characteristic value indicates a lower pitch. The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note.
The reference sequence construction unit 103 is configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note.
The reference sequence comprises at least one reference element and the index of the at least one reference element, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note. A note, the characteristic value of the note and the time attribute of the note may be taken as the three elements of one reference element; accordingly, the at least one note, and the characteristic value and time attribute of the at least one note, serve as the three elements of the at least one reference element. The reference sequence construction unit 103 arranges the at least one reference element in order to generate the reference sequence of the audio file.
The characteristic sequence construction unit 104 is configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note.
The characteristic sequence comprises at least one characteristic element and the index of the at least one characteristic element, where a characteristic element comprises the characteristic value of one note. The characteristic value of a note may be taken as the element of one characteristic element; accordingly, the characteristic values of the at least one note serve as the elements of the at least one characteristic element. The characteristic sequence construction unit 104 arranges the at least one characteristic element in order to generate the characteristic sequence of the audio file.
The positioning unit 105 is configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
The refrain usually refers to the climax part of an audio file. Taking a song as an example, a song usually adopts an AA'BA' form, where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". The positioning unit 105 can locate at least one refrain section of the audio file by analyzing the reference sequence and the characteristic sequence.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 6, which is a schematic structural diagram of a reference sequence construction unit according to an embodiment of the present invention, the reference sequence construction unit 103 may comprise: a reference element determining unit 1301, a first index determining unit 1302 and a first construction unit 1303.
The reference element determining unit 1301 is configured to determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note.
Suppose the quantity of the at least one note is N, where N is a positive integer; the reference element determining unit 1301 may determine that the quantity of the at least one reference element is also N. The N reference elements may be denoted a_1, ..., a_N, and each reference element comprises three elements: a note, the characteristic value of the note and the time attribute of the note. For example, reference element a_1 comprises note 1, the characteristic value of note 1 and the time attribute of note 1; by analogy, reference element a_N comprises note N, the characteristic value of note N and the time attribute of note N.
The first index determining unit 1302 is configured to determine the index of each reference element according to the time attribute of the note comprised in the at least one reference element.
The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note. The first index determining unit 1302 may determine the index of each reference element according to the chronological order of the start times of the notes comprised in the reference elements. For example, suppose that, among note 1 to note N, note 1 has the earliest start time, note 2 the second earliest, and so on, with note N having the latest start time; then the index of reference element a_1 is determined to be 1, the index of reference element a_2 is 2, and by analogy, the index of reference element a_N is N.
The first construction unit 1303 is configured to arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
Following the example of this embodiment, the reference sequence of the audio file may be denoted note(i), whose length is N, where i represents the index of each reference element in the reference sequence note(i), and i is a positive integer with 0 < i ≤ N.
In practice, a structure may be used to store the reference sequence note(i); this structure may be expressed as follows:
typedef struct tag_note {
    int start_ms;    /* start time of the note, in milliseconds */
    int end_ms;      /* end time of the note, in milliseconds */
    int note_value;  /* characteristic value (tonal feature) of the note */
} Tnote;
Tnote note[N];       /* reference sequence note(i): one Tnote per reference element, N as obtained by the parsing unit 102 */
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 7, which is a schematic structural diagram of a characteristic sequence construction unit according to an embodiment of the present invention, the characteristic sequence construction unit 104 may comprise: a characteristic element determining unit 1401, a second index determining unit 1402 and a second construction unit 1403.
The characteristic element determining unit 1401 is configured to determine at least one characteristic element according to the characteristic value of the at least one note, where a characteristic element comprises the characteristic value of one note.
Following the example of the embodiment shown in Fig. 6, the characteristic element determining unit 1401 may determine that the quantity of the at least one characteristic element is also N. The N characteristic elements may be denoted b_1, ..., b_N, and each characteristic element comprises the characteristic value of one note. For example, characteristic element b_1 comprises the characteristic value of note 1; by analogy, characteristic element b_N comprises the characteristic value of note N.
The second index determining unit 1402 is configured to determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element.
A reference element and a characteristic element correspond to each other through the characteristic value of the note they comprise. For example, reference element a_1 comprises the characteristic value of note 1, and characteristic element b_1 also comprises the characteristic value of note 1, so reference element a_1 corresponds to characteristic element b_1; by analogy, reference element a_N corresponds to characteristic element b_N. The second index determining unit 1402 may determine the index of each characteristic element according to the index of the corresponding reference element. For example, supposing the index of reference element a_1 is 1, the index of the corresponding characteristic element b_1 is also 1; by analogy, supposing the index of reference element a_N is N, the index of the corresponding characteristic element b_N is also N.
The second construction unit 1403 is configured to arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
Following the example of this embodiment, the characteristic sequence of the audio file may be denoted note_value(i), whose length is N, where i represents the index of each characteristic element in the characteristic sequence note_value(i), and i is a positive integer with 0 < i ≤ N.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 8, which is a schematic structural diagram of a positioning unit according to an embodiment of the present invention, the positioning unit 105 may comprise: a maximum value calculation unit 1501, a target index determining unit 1502, a time attribute determining unit 1503, a position information determining unit 1504 and a refrain positioning unit 1505.
The maximum value calculation unit 1501 is configured to perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value.
The maximum value calculation unit 1501 may use formula (1) in the embodiment shown in Fig. 4 to perform the maximum value calculation on the characteristic sequence note_value(i).
The target index determining unit 1502 is configured to determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element.
The index of the target characteristic element is ind, that is, the index of the target characteristic element in the characteristic sequence note_value(i) is ind; the target index determining unit 1502 may determine that the index of the corresponding target reference element is also ind, that is, the index of the target reference element in the reference sequence note(i) is also ind.
The time attribute determining unit 1503 is configured to obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element.
The time attribute determining unit 1503 may first locate, in the reference sequence note(i), the target reference element a_ind whose index is ind, and then obtain the time attribute of the note comprised in the target reference element a_ind.
The position information determining unit 1504 is configured to determine the position information of the refrain by using the time attribute of the note comprised in the target reference element.
The position information determining unit 1504 may use formula (2) in the embodiment shown in Fig. 4 to take the start time of the note comprised in the target reference element a_ind as the position information of the refrain.
The refrain positioning unit 1505 is configured to locate the refrain in the audio file according to the position information of the refrain.
Since formula (2) yields the position information of the refrain in the audio file, the refrain positioning unit 1505 can find, i.e. locate, the refrain in the audio file according to that position information.
Referring also to Fig. 9, which is a schematic structural diagram of a refrain positioning unit according to an embodiment of the present invention, the refrain positioning unit 1505 may comprise: a normalization subunit 1551 and a refrain locating subunit 1552.
The normalization subunit 1551 is configured to normalize the position information of the refrain.
The normalization subunit 1551 may set time parameters for the normalization according to actual needs. For example, according to the characteristics of songs, m_1 and m_2 may be randomly selected within the interval [1 s, 20 s] and set as the time parameters for the normalization, where the values of m_1 and m_2 may or may not be equal. The normalization subunit 1551 normalizing the position information of the refrain may comprise: normalizing the position information Pos of the refrain calculated by formula (2) into the interval [Pos - m_1, Pos + m_2].
The refrain locating subunit 1552 is configured to locate the refrain in the audio file according to the normalized position information of the refrain.
The refrain locating subunit 1552 may take the normalized position information [Pos - m_1, Pos + m_2] as the duration interval of this refrain section in the audio file, and locate this refrain section in the audio file accordingly.
It should be noted that if the audio file contains several refrain sections, the maximum value calculation unit 1501 in the embodiment shown in Fig. 8 may obtain at least one maximum value and the index of the target characteristic element corresponding to each maximum value; in the embodiments of the present invention, for each maximum value and the index of its corresponding target characteristic element, each refrain section can be located in the audio file separately by the functional units of the positioning unit 105 described in the embodiment shown in Fig. 8.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
A person of ordinary skill in the art may understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when the program is executed, the processes of the embodiments of the above methods may be included. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely preferred embodiments of the present invention, which certainly cannot be used to limit the scope of the rights of the present invention; therefore, equivalent variations made according to the claims of the present invention still fall within the scope covered by the present invention.

Claims (12)

1. An audio processing method, characterized by comprising:
obtaining a Musical Instrument Digital Interface (MIDI) file corresponding to an audio file;
parsing the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note;
analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file.
2. The method according to claim 1, characterized in that the reference sequence comprises: at least one reference element and the index of the at least one reference element;
wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
the characteristic sequence comprises: at least one characteristic element and the index of the at least one characteristic element;
wherein a characteristic element comprises the characteristic value of one note.
3. The method according to claim 2, characterized in that the analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file comprises:
performing maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value;
determining, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element;
obtaining, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element;
determining the position information of the refrain by using the time attribute of the note comprised in the target reference element;
locating the refrain in the audio file according to the position information of the refrain.
4. The method according to claim 3, characterized in that the locating the refrain in the audio file according to the position information of the refrain comprises:
normalizing the position information of the refrain;
locating the refrain in the audio file according to the normalized position information of the refrain.
5. The method according to any one of claims 1 to 4, characterized in that the constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note, comprises:
determining at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
determining the index of the at least one reference element according to the time attribute of the note comprised in the at least one reference element;
arranging the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
6. The method according to claim 5, characterized in that the constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note comprises:
determining at least one characteristic element according to the characteristic value of the at least one note, wherein a characteristic element comprises the characteristic value of one note;
determining the index of the corresponding at least one characteristic element according to the index of the at least one reference element;
arranging the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
7. An audio processing device, characterized by comprising:
a file obtaining unit, configured to obtain a MIDI file corresponding to an audio file;
a parsing unit, configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
a reference sequence construction unit, configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
a characteristic sequence construction unit, configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note;
a positioning unit, configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
8. The device according to claim 7, characterized in that the reference sequence comprises: at least one reference element and the index of the at least one reference element;
wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
the characteristic sequence comprises: at least one characteristic element and the index of the at least one characteristic element;
wherein a characteristic element comprises the characteristic value of one note.
9. The device according to claim 8, characterized in that the positioning unit comprises:
a maximum value calculation unit, configured to perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value;
a target index determining unit, configured to determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element;
a time attribute determining unit, configured to obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element;
a position information determining unit, configured to determine the position information of the refrain by using the time attribute of the note comprised in the target reference element;
a refrain positioning unit, configured to locate the refrain in the audio file according to the position information of the refrain.
10. The device according to claim 9, characterized in that the refrain positioning unit comprises:
a normalization subunit, configured to normalize the position information of the refrain;
a refrain locating subunit, configured to locate the refrain in the audio file according to the normalized position information of the refrain.
11. The device according to any one of claims 7 to 10, characterized in that the reference sequence construction unit comprises:
a reference element determining unit, configured to determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
a first index determining unit, configured to determine the index of the at least one reference element according to the time attribute of the note comprised in the at least one reference element;
a first construction unit, configured to arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
12. The device according to claim 11, characterized in that the characteristic sequence construction unit comprises:
a characteristic element determining unit, configured to determine at least one characteristic element according to the characteristic value of the at least one note, wherein a characteristic element comprises the characteristic value of one note;
a second index determining unit, configured to determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element;
a second construction unit, configured to arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
CN201410568013.9A 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device Active CN104978380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410568013.9A CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410568013.9A CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Publications (2)

Publication Number Publication Date
CN104978380A true CN104978380A (en) 2015-10-14
CN104978380B CN104978380B (en) 2019-09-27

Family

ID=54274892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410568013.9A Active CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Country Status (1)

Country Link
CN (1) CN104978380B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105390128A (en) * 2015-11-09 2016-03-09 清华大学 Automatic playing mechanical device and automatic playing system of percussion
CN106448630A (en) * 2016-09-09 2017-02-22 腾讯科技(深圳)有限公司 Method and device for generating digital music file of song
CN106652986A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and device
CN113797541A (en) * 2021-09-06 2021-12-17 武汉指娱互动信息技术有限公司 Music game level generating method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541965A (en) * 2010-12-30 2012-07-04 国际商业机器公司 Method and system for automatically acquiring feature fragments from music file
CN102903357A (en) * 2011-07-29 2013-01-30 华为技术有限公司 Method, device and system for extracting chorus of song
CN103853836A (en) * 2014-03-14 2014-06-11 广州酷狗计算机科技有限公司 Music retrieval method and system based on music fingerprint characteristic

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541965A (en) * 2010-12-30 2012-07-04 国际商业机器公司 Method and system for automatically acquiring feature fragments from music file
CN102903357A (en) * 2011-07-29 2013-01-30 华为技术有限公司 Method, device and system for extracting chorus of song
CN103853836A (en) * 2014-03-14 2014-06-11 广州酷狗计算机科技有限公司 Music retrieval method and system based on music fingerprint characteristic

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105390128A (en) * 2015-11-09 2016-03-09 清华大学 Automatic playing mechanical device and automatic playing system of percussion
CN105390128B (en) * 2015-11-09 2019-10-11 清华大学 Automatic Playing mechanical device and percussion instrument automatic playing system
CN106448630A (en) * 2016-09-09 2017-02-22 腾讯科技(深圳)有限公司 Method and device for generating digital music file of song
WO2018045988A1 (en) * 2016-09-09 2018-03-15 腾讯科技(深圳)有限公司 Method and device for generating digital music score file of song, and storage medium
CN106448630B (en) * 2016-09-09 2020-08-04 腾讯科技(深圳)有限公司 Method and device for generating digital music score file of song
US10923089B2 (en) 2016-09-09 2021-02-16 Tencent Technology (Shenzhen) Company Limited Method and apparatus for generating digital score file of song, and storage medium
CN106652986A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and device
CN106652986B (en) * 2016-12-08 2020-03-20 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and equipment
CN113797541A (en) * 2021-09-06 2021-12-17 武汉指娱互动信息技术有限公司 Music game level generating method, device, equipment and storage medium
CN113797541B (en) * 2021-09-06 2024-04-09 武汉指娱互动信息技术有限公司 Music game level generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN104978380B (en) 2019-09-27

Similar Documents

Publication Publication Date Title
US10210884B2 (en) Systems and methods facilitating selective removal of content from a mixed audio recording
US9317561B2 (en) Scene change detection around a set of seed points in media data
EP2791935B1 (en) Low complexity repetition detection in media data
CN102522083B (en) Method for searching hummed song by using mobile terminal and mobile terminal thereof
CN104978380A (en) Audio frequency processing method and device
CN105825850B (en) Audio processing method and device
CN104978974A (en) Audio processing method and device
CN104282322A (en) Mobile terminal and method and device for identifying chorus part of song thereof
US20140135964A1 (en) Music information searching method and apparatus thereof
CN111326171B (en) Method and system for extracting vocal melody based on numbered musical notation recognition and fundamental frequency extraction
WO2020199384A1 (en) Audio recognition method, apparatus and device, and storage medium
WO2016189307A1 (en) Audio identification method
CN104750839A (en) Data recommendation method, terminal and server
Sonnleitner et al. Quad-Based Audio Fingerprinting Robust to Time and Frequency Scaling.
CN106782601A (en) A kind of multimedia data processing method and its device
KR101648931B1 (en) Apparatus and method for producing a rhythm game, and computer program for executing the method
CN105047203A (en) Audio processing method, device and terminal
US20210241402A1 (en) Systems, devices, and methods for musical catalog amplification services
CN104091610B (en) A kind of management method of audio file and device
CN104978961A (en) Audio processing method, device and terminal
EP3644306B1 (en) Methods for analyzing musical compositions, computer-based system and machine readable storage medium
WO2023005193A1 (en) Subtitle display method and device
Soriano et al. Visualization of music collections based on structural content similarity
US20230197114A1 (en) Storage apparatus, playback apparatus, storage method, playback method, and medium
CN116259292B (en) Method, device, computer equipment and storage medium for identifying basic harmonic musical scale

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20161207

Address after: 510000 Guangzhou, Tianhe District branch Yun Yun Road, No. 16, self built room 2, building 1301

Applicant after: Guangzhou KuGou Networks Co., Ltd.

Address before: Shenzhen Futian District City, Guangdong province 518000 Zhenxing Road, SEG Science Park 2 East Room 403

Applicant before: Tencent Technology (Shenzhen) Co., Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 510000 Guangzhou City, Guangzhou, Guangdong, Whampoa Avenue, No. 315, self - made 1-17

Applicant after: Guangzhou KuGou Networks Co., Ltd.

Address before: 510000 Guangzhou, Tianhe District branch Yun Yun Road, No. 16, self built room 2, building 1301

Applicant before: Guangzhou KuGou Networks Co., Ltd.

GR01 Patent grant
GR01 Patent grant