CN104978380A - Audio frequency processing method and device - Google Patents

Audio frequency processing method and device

Info

Publication number
CN104978380A
CN104978380A
Authority
CN
China
Prior art keywords
note
index
refrain
characteristic value
audio file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410568013.9A
Other languages
Chinese (zh)
Other versions
CN104978380B (en)
Inventor
赵伟峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201410568013.9A priority Critical patent/CN104978380B/en
Publication of CN104978380A publication Critical patent/CN104978380A/en
Application granted granted Critical
Publication of CN104978380B publication Critical patent/CN104978380B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Electrophonic Musical Instruments (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The embodiments of the invention provide an audio processing method and device. The method comprises the following steps: obtaining a MIDI (Musical Instrument Digital Interface) file corresponding to an audio file; parsing the MIDI file to obtain at least one note and the characteristic value and time attribute of the at least one note; constructing a reference sequence of the audio file by using the at least one note and the characteristic value and time attribute of the at least one note; constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note; and analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file. The refrain of an audio file can thus be located on the basis of the MIDI file corresponding to the audio file, which improves both the accuracy and the intelligence of audio processing.

Description

Audio processing method and device
Technical field
The present invention relates to the field of Internet technologies, and in particular to the field of audio technologies, and specifically to an audio processing method and device.
Background
The refrain usually refers to the climax part of an audio file (such as a song or a piece of music). Taking a song as an example, a song usually adopts an AA'BA' form (music structure), where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". Locating the refrain plays an important role in the processing and analysis of audio files. The prior art locates the refrain mainly by analyzing the audio file itself, but such analysis involves a large amount of computation and has low accuracy, which reduces the intelligence of audio processing.
Summary of the invention
The embodiments of the present invention provide an audio processing method and device, which can locate the refrain of an audio file based on the MIDI (Musical Instrument Digital Interface) file corresponding to the audio file, thereby improving the accuracy and the intelligence of audio processing.
A first aspect of the embodiments of the present invention provides an audio processing method, which may comprise:
obtaining a MIDI file corresponding to an audio file;
parsing the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note;
analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file.
A second aspect of the embodiments of the present invention provides an audio processing device, which may comprise:
a file obtaining unit, configured to obtain a MIDI file corresponding to an audio file;
a parsing unit, configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
a reference sequence construction unit, configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
a characteristic sequence construction unit, configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note;
a positioning unit, configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
Implementing the embodiments of the present invention has the following beneficial effects:
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Brief description of the drawings
In order to describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a flowchart of an audio processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of an embodiment of step S103 shown in Fig. 1;
Fig. 3 is a flowchart of an embodiment of step S104 shown in Fig. 1;
Fig. 4 is a flowchart of an embodiment of step S105 shown in Fig. 1;
Fig. 5 is a schematic structural diagram of an audio processing device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a reference sequence construction unit according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a characteristic sequence construction unit according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a positioning unit according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a refrain positioning unit according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In the embodiments of the present invention, an audio file may include but is not limited to: a song, a song clip, a piece of music, a music clip, an instrumental performance, an instrumental performance clip, a hummed song, a hummed song clip, and other such files. MIDI is an industry-standard electronic communication protocol; it transmits non-acoustic instruction signals such as notes and control parameters, and is used to indicate the operation of MIDI devices and the interaction between MIDI devices. A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note.
The audio processing solution of the embodiments of the present invention quickly locates the refrain of an audio file, mainly based on the MIDI file corresponding to the audio file. The solution can be applied in many Internet scenarios, for example: scenarios in which the audio files in an Internet audio library are analyzed, including key (keynote) search scenarios where the refrain needs to be quickly located and accurately extracted, humming-based search scenarios, and melody recognition scenarios; or scenarios in which the audio files in an Internet audio library are auditioned, for example providing the refrain for online playback or audition before a piece of music is downloaded; or ringback tone (CRBT) download or audition scenarios, in which the refrain is quickly located, accurately extracted and provided to the user as a ringback tone for download or audition, and so on.
The audio processing method provided by the embodiments of the present invention is described in detail below with reference to Fig. 1 to Fig. 4. It should be noted that the audio processing method shown in Fig. 1 to Fig. 4 may be performed by the audio processing device provided by the embodiments of the present invention, and the audio processing device may run in a terminal device or a server, where the terminal device may include but is not limited to devices such as a PC (Personal Computer), a PAD (tablet computer), a mobile phone, a smartphone, and a notebook computer.
Referring to Fig. 1, which is a flowchart of an audio processing method according to an embodiment of the present invention, the method may comprise the following steps S101 to S105.
S101: obtain a MIDI file corresponding to an audio file.
An audio file usually corresponds to one MIDI file. The MIDI file may be produced by an audio producer such as the composer of the audio file, or may be generated from the audio file by a device with a MIDI production function. The MIDI file corresponding to an audio file may serve as the pitch reference file of that audio file: when a user re-performs (covers) the audio file, the corresponding MIDI file may be used to compare the pitch of the re-performed content and to score it. In this step, the MIDI file corresponding to the audio file to be processed may be obtained from an Internet audio library.
S102: parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note.
A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note. In this step, the MIDI file is parsed according to the MIDI file format standard, so that at least one note, and the characteristic value and time attribute of the at least one note, can be obtained.
A note is a symbol used to record sounds of different lengths. The characteristic value of a note represents the tonal feature of the note; usually the characteristic value of a note ranges over [21, 108], and a larger characteristic value indicates a higher pitch while a smaller characteristic value indicates a lower pitch. The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note.
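The embodiment does not prescribe a particular parser; as a rough illustration only, the following C sketch shows how note-on/note-off instructions could be paired into notes carrying a start time, an end time and a characteristic value. The MidiEvent structure and the helper midi_next_event() are hypothetical stand-ins for whatever MIDI reading routine is actually used.
#include <stdio.h>

typedef struct {
    int is_note_on;  /* 1 = note-on instruction, 0 = note-off instruction */
    int pitch;       /* characteristic value of the note, normally within [21, 108] */
    int time_ms;     /* time of the instruction, in milliseconds */
} MidiEvent;

/* hypothetical reader: returns 1 while instructions remain in the .mid file */
int midi_next_event(FILE *f, MidiEvent *ev);

/* Pair note-on and note-off instructions into notes; returns the number N of notes. */
int collect_notes(FILE *f, int start_ms[], int end_ms[], int value[], int max_notes)
{
    int onset[128] = {0};    /* pending note-on time per pitch (MIDI pitches are 0-127) */
    int pending[128] = {0};  /* 1 if a note-on is still waiting for its note-off */
    int n = 0;
    MidiEvent ev;

    while (n < max_notes && midi_next_event(f, &ev)) {
        if (ev.is_note_on) {
            onset[ev.pitch] = ev.time_ms;
            pending[ev.pitch] = 1;
        } else if (pending[ev.pitch]) {
            start_ms[n] = onset[ev.pitch];  /* start time of the note */
            end_ms[n] = ev.time_ms;         /* end time of the note */
            value[n] = ev.pitch;            /* characteristic value of the note */
            pending[ev.pitch] = 0;
            n++;
        }
    }
    return n;
}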
S103: construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note.
The reference sequence comprises at least one reference element and the index of the at least one reference element, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note. In this step, a note, the characteristic value of the note and the time attribute of the note may be taken as the three elements of one reference element; accordingly, the at least one note, and the characteristic value and time attribute of the at least one note, serve as the three elements of the at least one reference element. The at least one reference element is arranged in order to generate the reference sequence of the audio file.
S104: construct a characteristic sequence of the audio file by using the characteristic value of the at least one note.
The characteristic sequence comprises at least one characteristic element and the index of the at least one characteristic element, where a characteristic element comprises the characteristic value of one note. In this step, the characteristic value of a note may be taken as the element of one characteristic element; accordingly, the characteristic values of the at least one note serve as the elements of the at least one characteristic element. The at least one characteristic element is arranged in order to generate the characteristic sequence of the audio file.
S105: analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
The refrain usually refers to the climax part of an audio file. Taking a song as an example, a song usually adopts an AA'BA' form, where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". In this step, at least one refrain section of the audio file can be located by analyzing the reference sequence and the characteristic sequence.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 2, which is a flowchart of an embodiment of step S103 shown in Fig. 1, step S103 may comprise the following steps s2001 to s2003.
s2001: determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note.
Suppose the quantity of the at least one note is N, where N is a positive integer; in this step, the quantity of the at least one reference element is also N. The N reference elements may be denoted a_1, ..., a_N, and each reference element comprises three elements: a note, the characteristic value of the note and the time attribute of the note. For example, reference element a_1 comprises note 1, the characteristic value of note 1 and the time attribute of note 1; by analogy, reference element a_N comprises note N, the characteristic value of note N and the time attribute of note N.
s2002: determine the index of each reference element according to the time attribute of the note comprised in the at least one reference element.
The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note. In this step, the index of each reference element may be determined according to the chronological order of the start times of the notes comprised in the reference elements. For example, suppose that, among note 1 to note N, note 1 has the earliest start time, note 2 the second earliest, and so on, with note N having the latest start time; then the index of reference element a_1 is determined to be 1, the index of reference element a_2 is 2, and by analogy, the index of reference element a_N is N.
s2003: arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
Following the example of this embodiment, the reference sequence of the audio file may be denoted note(i), whose length is N, where i represents the index of each reference element in the reference sequence note(i), and i is a positive integer with 0 < i ≤ N.
In practice, a structure may be used to store the reference sequence note(i); this structure may be expressed as follows:
typedef struct tag_note {
    int start_ms;    /* start time of the note, in milliseconds */
    int end_ms;      /* end time of the note, in milliseconds */
    int note_value;  /* characteristic value (tonal feature) of the note */
} Tnote;
Tnote note[N];       /* reference sequence note(i): one Tnote per reference element, N as obtained in step S102 */
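As a minimal sketch of steps s2001 to s2003 (one possible arrangement, not mandated by the embodiment), the N reference elements can be ordered by the start time of the note they contain with the standard qsort routine; the function names below are chosen only for this illustration.
#include <stdlib.h>

static int by_start_time(const void *x, const void *y)
{
    const Tnote *a = (const Tnote *)x;
    const Tnote *b = (const Tnote *)y;
    return (a->start_ms > b->start_ms) - (a->start_ms < b->start_ms);
}

/* Arrange the N reference elements so that index i follows the chronological
 * order of the note start times (steps s2002 and s2003). */
void build_reference_sequence(Tnote note[], int N)
{
    qsort(note, (size_t)N, sizeof(Tnote), by_start_time);
}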
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 3, which is a flowchart of an embodiment of step S104 shown in Fig. 1, step S104 may comprise the following steps s3001 to s3003.
s3001: determine at least one characteristic element according to the characteristic value of the at least one note, where a characteristic element comprises the characteristic value of one note.
Following the example of the embodiment shown in Fig. 2, in this step the quantity of the at least one characteristic element is also N. The N characteristic elements may be denoted b_1, ..., b_N, and each characteristic element comprises the characteristic value of one note. For example, characteristic element b_1 comprises the characteristic value of note 1; by analogy, characteristic element b_N comprises the characteristic value of note N.
s3002: determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element.
A reference element and a characteristic element correspond to each other through the characteristic value of the note they comprise. For example, reference element a_1 comprises the characteristic value of note 1, and characteristic element b_1 also comprises the characteristic value of note 1, so reference element a_1 corresponds to characteristic element b_1; by analogy, reference element a_N comprises the characteristic value of note N, and characteristic element b_N also comprises the characteristic value of note N, so reference element a_N corresponds to characteristic element b_N. In this step, the index of each characteristic element may be determined according to the index of the corresponding reference element. For example, supposing the index of reference element a_1 is 1, the index of the corresponding characteristic element b_1 is also 1; by analogy, supposing the index of reference element a_N is N, the index of the corresponding characteristic element b_N is also N.
s3003: arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
Following the example of this embodiment, the characteristic sequence of the audio file may be denoted note_value(i), whose length is N, where i represents the index of each characteristic element in the characteristic sequence note_value(i), and i is a positive integer with 0 < i ≤ N.
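A short sketch of steps s3001 to s3003, reusing the Tnote structure above and assuming the reference sequence has already been built: the characteristic sequence simply copies, index by index, the characteristic value held by each reference element, so both sequences share the same indices.
/* Build the characteristic sequence note_value(i) from the reference sequence
 * note(i): the characteristic element with index i keeps the characteristic
 * value of the note comprised in reference element i. */
void build_characteristic_sequence(const Tnote note[], int note_value[], int N)
{
    for (int i = 0; i < N; i++)
        note_value[i] = note[i].note_value;
}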
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 4, which is a flowchart of an embodiment of step S105 shown in Fig. 1, step S105 may comprise the following steps s4001 to s4005.
s4001: perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value.
In this step, the following formula (1) may be used to perform the maximum value calculation on the characteristic sequence note_value(i); formula (1) may be expressed as follows:
[ind, dval] = max(note_value(i))    (1)
In formula (1), max() denotes the operation of taking the maximum; dval denotes the maximum value; ind denotes the index of the target characteristic element corresponding to the maximum value, that is, the value note_value(ind) of the target characteristic element whose index is ind is the maximum value dval.
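Formula (1) is written in a MATLAB-like notation; an equivalent C sketch (with 0-based arrays, so the document's index i corresponds to array position i - 1) could look as follows.
/* Equivalent of formula (1): scan the characteristic sequence for the maximum
 * value dval and the position ind of the target characteristic element. */
void find_maximum(const int note_value[], int N, int *ind, int *dval)
{
    *ind = 0;
    *dval = note_value[0];
    for (int i = 1; i < N; i++) {
        if (note_value[i] > *dval) {
            *dval = note_value[i];
            *ind = i;
        }
    }
}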
s4002: determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element.
The index of the target characteristic element is ind, that is, the index of the target characteristic element in the characteristic sequence note_value(i) is ind; in this step, the index of the corresponding target reference element is also determined to be ind, that is, the index of the target reference element in the reference sequence note(i) is also ind.
s4003: obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element.
In this step, the target reference element a_ind whose index is ind may first be located in the reference sequence note(i), and then the time attribute of the note comprised in the target reference element a_ind is obtained.
s4004: determine the position information of the refrain by using the time attribute of the note comprised in the target reference element.
In this step, the following formula (2) may be used to take the start time of the note comprised in the target reference element a_ind as the position information of the refrain; formula (2) may be expressed as follows:
Pos = note(ind).start_ms    (2)
In formula (2), Pos denotes the position information of the refrain in the audio file.
s4005: locate the refrain in the audio file according to the position information of the refrain.
Since formula (2) yields the position information of the refrain in the audio file, in this step the refrain can be found, i.e. located, in the audio file according to that position information.
Step s4005 may specifically comprise the following steps ss451 to ss452:
ss451: normalize the position information of the refrain.
In step ss451, time parameters for the normalization may be set according to actual needs. For example, according to the characteristics of songs, m_1 and m_2 may be randomly selected within the interval [1 s, 20 s] and set as the time parameters for the normalization, where the values of m_1 and m_2 may or may not be equal. In step ss451, normalizing the position information of the refrain may comprise: normalizing the position information Pos of the refrain calculated by formula (2) into the interval [Pos - m_1, Pos + m_2].
ss452: locate the refrain in the audio file according to the normalized position information of the refrain.
In this step, the normalized position information [Pos - m_1, Pos + m_2] may be taken as the duration interval of this refrain section in the audio file, and this refrain section is located in the audio file accordingly.
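Putting formula (2) and steps ss451 to ss452 together, a small sketch could look like the following. The values m1 = 5000 ms and m2 = 15000 ms in the usage comment are only illustrative picks from the [1 s, 20 s] range, and the clamp at 0 ms is a practical safeguard not spelled out in the embodiment.
/* Formula (2) plus normalization: take the start time of the note in the
 * target reference element as Pos, then expand it to [Pos - m1, Pos + m2]. */
void locate_refrain(const Tnote note[], int ind, int m1_ms, int m2_ms,
                    int *refrain_begin_ms, int *refrain_end_ms)
{
    int pos = note[ind].start_ms;     /* Pos = note(ind).start_ms */
    *refrain_begin_ms = pos - m1_ms;
    *refrain_end_ms = pos + m2_ms;
    if (*refrain_begin_ms < 0)
        *refrain_begin_ms = 0;        /* keep the interval inside the audio file */
}
/* Example usage: locate_refrain(note, ind, 5000, 15000, &begin_ms, &end_ms); */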
It should be noted that the embodiment shown in Fig. 4 describes the process of locating one refrain section in the audio file. In practice, if the audio file contains several refrain sections, step s4001 may obtain at least one maximum value and the index of the target characteristic element corresponding to each maximum value; in the embodiments of the present invention, for each maximum value and the index of its corresponding target characteristic element, each refrain section can be located in the audio file separately according to the process described in the embodiment shown in Fig. 4.
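For the multi-refrain case, one possible reading (an assumption of this sketch rather than a step prescribed by the embodiment) is to collect every index whose characteristic value equals the maximum and then run the single-refrain flow once per collected index.
/* Collect the indices of all target characteristic elements that attain the
 * maximum characteristic value; each index yields one candidate refrain. */
int find_all_maxima(const int note_value[], int N, int ind_out[], int max_out)
{
    int dval = note_value[0];
    for (int i = 1; i < N; i++)
        if (note_value[i] > dval)
            dval = note_value[i];     /* maximum characteristic value */

    int k = 0;
    for (int i = 0; i < N && k < max_out; i++)
        if (note_value[i] == dval)
            ind_out[k++] = i;         /* index of one target characteristic element */
    return k;                          /* number of candidate refrain sections */
}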
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
The audio processing device provided by the embodiments of the present invention is described in detail below with reference to Fig. 5 to Fig. 9. It should be noted that the audio processing device shown in Fig. 5 to Fig. 9 may run in a terminal device or a server and is configured to perform the audio processing method shown in Fig. 1 to Fig. 4, where the terminal device may include but is not limited to devices such as a PC, a PAD, a mobile phone, a smartphone, and a notebook computer.
Referring to Fig. 5, which is a schematic structural diagram of an audio processing device according to an embodiment of the present invention, the device may comprise: a file obtaining unit 101, a parsing unit 102, a reference sequence construction unit 103, a characteristic sequence construction unit 104 and a positioning unit 105.
The file obtaining unit 101 is configured to obtain a MIDI file corresponding to an audio file.
An audio file usually corresponds to one MIDI file. The MIDI file may be produced by an audio producer such as the composer of the audio file, or may be generated from the audio file by a device with a MIDI production function. The MIDI file corresponding to an audio file may serve as the pitch reference file of that audio file: when a user re-performs (covers) the audio file, the corresponding MIDI file may be used to compare the pitch of the re-performed content and to score it. The file obtaining unit 101 may obtain the MIDI file corresponding to the audio file to be processed from an Internet audio library.
The parsing unit 102 is configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note.
A MIDI file is usually an instruction file ending with .mid, and these instructions may include information such as the start time and end time of a note and the characteristic value representing the tonal feature of the note. The parsing unit 102 parses the MIDI file according to the MIDI file format standard, so that at least one note, and the characteristic value and time attribute of the at least one note, can be obtained.
A note is a symbol used to record sounds of different lengths. The characteristic value of a note represents the tonal feature of the note; usually the characteristic value of a note ranges over [21, 108], and a larger characteristic value indicates a higher pitch while a smaller characteristic value indicates a lower pitch. The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note.
The reference sequence construction unit 103 is configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note.
The reference sequence comprises at least one reference element and the index of the at least one reference element, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note. A note, the characteristic value of the note and the time attribute of the note may be taken as the three elements of one reference element; accordingly, the at least one note, and the characteristic value and time attribute of the at least one note, serve as the three elements of the at least one reference element. The reference sequence construction unit 103 arranges the at least one reference element in order to generate the reference sequence of the audio file.
The characteristic sequence construction unit 104 is configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note.
The characteristic sequence comprises at least one characteristic element and the index of the at least one characteristic element, where a characteristic element comprises the characteristic value of one note. The characteristic value of a note may be taken as the element of one characteristic element; accordingly, the characteristic values of the at least one note serve as the elements of the at least one characteristic element. The characteristic sequence construction unit 104 arranges the at least one characteristic element in order to generate the characteristic sequence of the audio file.
The positioning unit 105 is configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
The refrain usually refers to the climax part of an audio file. Taking a song as an example, a song usually adopts an AA'BA' form, where A represents the verse and B represents the refrain; that is, a song is usually formed by connecting, in order, "prelude + two verses + one refrain + interlude + one refrain + one verse + ending music". The positioning unit 105 can locate at least one refrain section of the audio file by analyzing the reference sequence and the characteristic sequence.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 6, which is a schematic structural diagram of a reference sequence construction unit according to an embodiment of the present invention, the reference sequence construction unit 103 may comprise: a reference element determining unit 1301, a first index determining unit 1302 and a first construction unit 1303.
The reference element determining unit 1301 is configured to determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, where a reference element comprises a note, the characteristic value of the note and the time attribute of the note.
Suppose the quantity of the at least one note is N, where N is a positive integer; the reference element determining unit 1301 may determine that the quantity of the at least one reference element is also N. The N reference elements may be denoted a_1, ..., a_N, and each reference element comprises three elements: a note, the characteristic value of the note and the time attribute of the note. For example, reference element a_1 comprises note 1, the characteristic value of note 1 and the time attribute of note 1; by analogy, reference element a_N comprises note N, the characteristic value of note N and the time attribute of note N.
The first index determining unit 1302 is configured to determine the index of each reference element according to the time attribute of the note comprised in the at least one reference element.
The time attribute of a note describes the duration of the note and may comprise the start time of the note and the end time of the note. The first index determining unit 1302 may determine the index of each reference element according to the chronological order of the start times of the notes comprised in the reference elements. For example, suppose that, among note 1 to note N, note 1 has the earliest start time, note 2 the second earliest, and so on, with note N having the latest start time; then the index of reference element a_1 is determined to be 1, the index of reference element a_2 is 2, and by analogy, the index of reference element a_N is N.
The first construction unit 1303 is configured to arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
Following the example of this embodiment, the reference sequence of the audio file may be denoted note(i), whose length is N, where i represents the index of each reference element in the reference sequence note(i), and i is a positive integer with 0 < i ≤ N.
In practice, a structure may be used to store the reference sequence note(i); this structure may be expressed as follows:
typedef struct tag_note {
    int start_ms;    /* start time of the note, in milliseconds */
    int end_ms;      /* end time of the note, in milliseconds */
    int note_value;  /* characteristic value (tonal feature) of the note */
} Tnote;
Tnote note[N];       /* reference sequence note(i): one Tnote per reference element, N as obtained by the parsing unit 102 */
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 7, which is a schematic structural diagram of a characteristic sequence construction unit according to an embodiment of the present invention, the characteristic sequence construction unit 104 may comprise: a characteristic element determining unit 1401, a second index determining unit 1402 and a second construction unit 1403.
The characteristic element determining unit 1401 is configured to determine at least one characteristic element according to the characteristic value of the at least one note, where a characteristic element comprises the characteristic value of one note.
Following the example of the embodiment shown in Fig. 6, the characteristic element determining unit 1401 may determine that the quantity of the at least one characteristic element is also N. The N characteristic elements may be denoted b_1, ..., b_N, and each characteristic element comprises the characteristic value of one note. For example, characteristic element b_1 comprises the characteristic value of note 1; by analogy, characteristic element b_N comprises the characteristic value of note N.
The second index determining unit 1402 is configured to determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element.
A reference element and a characteristic element correspond to each other through the characteristic value of the note they comprise. For example, reference element a_1 comprises the characteristic value of note 1, and characteristic element b_1 also comprises the characteristic value of note 1, so reference element a_1 corresponds to characteristic element b_1; by analogy, reference element a_N corresponds to characteristic element b_N. The second index determining unit 1402 may determine the index of each characteristic element according to the index of the corresponding reference element. For example, supposing the index of reference element a_1 is 1, the index of the corresponding characteristic element b_1 is also 1; by analogy, supposing the index of reference element a_N is N, the index of the corresponding characteristic element b_N is also N.
The second construction unit 1403 is configured to arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
Following the example of this embodiment, the characteristic sequence of the audio file may be denoted note_value(i), whose length is N, where i represents the index of each characteristic element in the characteristic sequence note_value(i), and i is a positive integer with 0 < i ≤ N.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
Referring to Fig. 8, which is a schematic structural diagram of a positioning unit according to an embodiment of the present invention, the positioning unit 105 may comprise: a maximum value calculation unit 1501, a target index determining unit 1502, a time attribute determining unit 1503, a position information determining unit 1504 and a refrain positioning unit 1505.
The maximum value calculation unit 1501 is configured to perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value.
The maximum value calculation unit 1501 may use formula (1) in the embodiment shown in Fig. 4 to perform the maximum value calculation on the characteristic sequence note_value(i).
The target index determining unit 1502 is configured to determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element.
The index of the target characteristic element is ind, that is, the index of the target characteristic element in the characteristic sequence note_value(i) is ind; the target index determining unit 1502 may determine that the index of the corresponding target reference element is also ind, that is, the index of the target reference element in the reference sequence note(i) is also ind.
The time attribute determining unit 1503 is configured to obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element.
The time attribute determining unit 1503 may first locate, in the reference sequence note(i), the target reference element a_ind whose index is ind, and then obtain the time attribute of the note comprised in the target reference element a_ind.
The position information determining unit 1504 is configured to determine the position information of the refrain by using the time attribute of the note comprised in the target reference element.
The position information determining unit 1504 may use formula (2) in the embodiment shown in Fig. 4 to take the start time of the note comprised in the target reference element a_ind as the position information of the refrain.
The refrain positioning unit 1505 is configured to locate the refrain in the audio file according to the position information of the refrain.
Since formula (2) yields the position information of the refrain in the audio file, the refrain positioning unit 1505 can find, i.e. locate, the refrain in the audio file according to that position information.
Referring also to Fig. 9, which is a schematic structural diagram of a refrain positioning unit according to an embodiment of the present invention, the refrain positioning unit 1505 may comprise: a normalization subunit 1551 and a refrain locating subunit 1552.
The normalization subunit 1551 is configured to normalize the position information of the refrain.
The normalization subunit 1551 may set time parameters for the normalization according to actual needs. For example, according to the characteristics of songs, m_1 and m_2 may be randomly selected within the interval [1 s, 20 s] and set as the time parameters for the normalization, where the values of m_1 and m_2 may or may not be equal. The normalization subunit 1551 normalizing the position information of the refrain may comprise: normalizing the position information Pos of the refrain calculated by formula (2) into the interval [Pos - m_1, Pos + m_2].
The refrain locating subunit 1552 is configured to locate the refrain in the audio file according to the normalized position information of the refrain.
The refrain locating subunit 1552 may take the normalized position information [Pos - m_1, Pos + m_2] as the duration interval of this refrain section in the audio file, and locate this refrain section in the audio file accordingly.
It should be noted that if the audio file contains several refrain sections, the maximum value calculation unit 1501 in the embodiment shown in Fig. 8 may obtain at least one maximum value and the index of the target characteristic element corresponding to each maximum value; in the embodiments of the present invention, for each maximum value and the index of its corresponding target characteristic element, each refrain section can be located in the audio file separately by the functional units of the positioning unit 105 described in the embodiment shown in Fig. 8.
In the embodiments of the present invention, a reference sequence and a characteristic sequence of an audio file can be constructed based on the MIDI file corresponding to the audio file, and the refrain of the audio file can be located by analyzing the reference sequence and the characteristic sequence. Because the data volume of a MIDI file is small, locating the refrain based on the MIDI file reduces the computational load and improves both the accuracy and the intelligence of audio processing.
A person of ordinary skill in the art may understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when the program is executed, the processes of the embodiments of the above methods may be included. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely preferred embodiments of the present invention, which certainly cannot be used to limit the scope of the rights of the present invention; therefore, equivalent variations made according to the claims of the present invention still fall within the scope covered by the present invention.

Claims (12)

1. An audio processing method, characterized by comprising:
obtaining a Musical Instrument Digital Interface (MIDI) file corresponding to an audio file;
parsing the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note;
analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file.
2. The method according to claim 1, characterized in that the reference sequence comprises: at least one reference element and the index of the at least one reference element;
wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
the characteristic sequence comprises: at least one characteristic element and the index of the at least one characteristic element;
wherein a characteristic element comprises the characteristic value of one note.
3. The method according to claim 2, characterized in that the analyzing the reference sequence and the characteristic sequence to locate the refrain of the audio file comprises:
performing maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value;
determining, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element;
obtaining, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element;
determining the position information of the refrain by using the time attribute of the note comprised in the target reference element;
locating the refrain in the audio file according to the position information of the refrain.
4. The method according to claim 3, characterized in that the locating the refrain in the audio file according to the position information of the refrain comprises:
normalizing the position information of the refrain;
locating the refrain in the audio file according to the normalized position information of the refrain.
5. The method according to any one of claims 1 to 4, characterized in that the constructing a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note, comprises:
determining at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
determining the index of the at least one reference element according to the time attribute of the note comprised in the at least one reference element;
arranging the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
6. The method according to claim 5, characterized in that the constructing a characteristic sequence of the audio file by using the characteristic value of the at least one note comprises:
determining at least one characteristic element according to the characteristic value of the at least one note, wherein a characteristic element comprises the characteristic value of one note;
determining the index of the corresponding at least one characteristic element according to the index of the at least one reference element;
arranging the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
7. An audio processing device, characterized by comprising:
a file obtaining unit, configured to obtain a MIDI file corresponding to an audio file;
a parsing unit, configured to parse the MIDI file to obtain at least one note, and the characteristic value and time attribute of the at least one note;
a reference sequence construction unit, configured to construct a reference sequence of the audio file by using the at least one note, and the characteristic value and time attribute of the at least one note;
a characteristic sequence construction unit, configured to construct a characteristic sequence of the audio file by using the characteristic value of the at least one note;
a positioning unit, configured to analyze the reference sequence and the characteristic sequence to locate the refrain of the audio file.
8. The device according to claim 7, characterized in that the reference sequence comprises: at least one reference element and the index of the at least one reference element;
wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
the characteristic sequence comprises: at least one characteristic element and the index of the at least one characteristic element;
wherein a characteristic element comprises the characteristic value of one note.
9. The device according to claim 8, characterized in that the positioning unit comprises:
a maximum value calculation unit, configured to perform maximum value calculation on the characteristic sequence to obtain the maximum value of the characteristic sequence and the index of the target characteristic element corresponding to the maximum value;
a target index determining unit, configured to determine, according to the index of the target characteristic element, the index of the target reference element corresponding to the index of the target characteristic element;
a time attribute determining unit, configured to obtain, from the reference sequence and according to the index of the target reference element, the time attribute of the note comprised in the target reference element;
a position information determining unit, configured to determine the position information of the refrain by using the time attribute of the note comprised in the target reference element;
a refrain positioning unit, configured to locate the refrain in the audio file according to the position information of the refrain.
10. The device according to claim 9, characterized in that the refrain positioning unit comprises:
a normalization subunit, configured to normalize the position information of the refrain;
a refrain locating subunit, configured to locate the refrain in the audio file according to the normalized position information of the refrain.
11. The device according to any one of claims 7 to 10, characterized in that the reference sequence construction unit comprises:
a reference element determining unit, configured to determine at least one reference element according to the at least one note, and the characteristic value and time attribute of the at least one note, wherein a reference element comprises a note, the characteristic value of the note and the time attribute of the note;
a first index determining unit, configured to determine the index of the at least one reference element according to the time attribute of the note comprised in the at least one reference element;
a first construction unit, configured to arrange the at least one reference element in order according to the index of the at least one reference element, to obtain the reference sequence of the audio file.
12. The device according to claim 11, characterized in that the characteristic sequence construction unit comprises:
a characteristic element determining unit, configured to determine at least one characteristic element according to the characteristic value of the at least one note, wherein a characteristic element comprises the characteristic value of one note;
a second index determining unit, configured to determine the index of the corresponding at least one characteristic element according to the index of the at least one reference element;
a second construction unit, configured to arrange the at least one characteristic element in order according to the index of the at least one characteristic element, to obtain the characteristic sequence of the audio file.
CN201410568013.9A 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device Active CN104978380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410568013.9A CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410568013.9A CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Publications (2)

Publication Number Publication Date
CN104978380A true CN104978380A (en) 2015-10-14
CN104978380B CN104978380B (en) 2019-09-27

Family

ID=54274892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410568013.9A Active CN104978380B (en) 2014-10-22 2014-10-22 A kind of audio-frequency processing method and device

Country Status (1)

Country Link
CN (1) CN104978380B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105390128A (en) * 2015-11-09 2016-03-09 清华大学 Automatic playing mechanical device and automatic playing system of percussion
CN106448630A (en) * 2016-09-09 2017-02-22 腾讯科技(深圳)有限公司 Method and device for generating digital music file of song
CN106652986A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and device
CN113797541A (en) * 2021-09-06 2021-12-17 武汉指娱互动信息技术有限公司 Music game level generating method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541965A (en) * 2010-12-30 2012-07-04 国际商业机器公司 Method and system for automatically acquiring feature fragments from music file
CN102903357A (en) * 2011-07-29 2013-01-30 华为技术有限公司 Method, device and system for extracting chorus of song
CN103853836A (en) * 2014-03-14 2014-06-11 广州酷狗计算机科技有限公司 Music retrieval method and system based on music fingerprint characteristic

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541965A (en) * 2010-12-30 2012-07-04 国际商业机器公司 Method and system for automatically acquiring feature fragments from music file
CN102903357A (en) * 2011-07-29 2013-01-30 华为技术有限公司 Method, device and system for extracting chorus of song
CN103853836A (en) * 2014-03-14 2014-06-11 广州酷狗计算机科技有限公司 Music retrieval method and system based on music fingerprint characteristic

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105390128A (en) * 2015-11-09 2016-03-09 清华大学 Automatic playing mechanical device and automatic playing system of percussion
CN105390128B (en) * 2015-11-09 2019-10-11 清华大学 Automatic Playing mechanical device and percussion instrument automatic playing system
CN106448630A (en) * 2016-09-09 2017-02-22 腾讯科技(深圳)有限公司 Method and device for generating digital music file of song
WO2018045988A1 (en) * 2016-09-09 2018-03-15 腾讯科技(深圳)有限公司 Method and device for generating digital music score file of song, and storage medium
CN106448630B (en) * 2016-09-09 2020-08-04 腾讯科技(深圳)有限公司 Method and device for generating digital music score file of song
US10923089B2 (en) 2016-09-09 2021-02-16 Tencent Technology (Shenzhen) Company Limited Method and apparatus for generating digital score file of song, and storage medium
CN106652986A (en) * 2016-12-08 2017-05-10 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and device
CN106652986B (en) * 2016-12-08 2020-03-20 腾讯音乐娱乐(深圳)有限公司 Song audio splicing method and equipment
CN113797541A (en) * 2021-09-06 2021-12-17 武汉指娱互动信息技术有限公司 Music game level generating method, device, equipment and storage medium
CN113797541B (en) * 2021-09-06 2024-04-09 武汉指娱互动信息技术有限公司 Music game level generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN104978380B (en) 2019-09-27

Similar Documents

Publication Publication Date Title
US10210884B2 (en) Systems and methods facilitating selective removal of content from a mixed audio recording
US9317561B2 (en) Scene change detection around a set of seed points in media data
EP2791935B1 (en) Low complexity repetition detection in media data
CN102522083B (en) Method for searching hummed song by using mobile terminal and mobile terminal thereof
CN104978380A (en) Audio frequency processing method and device
CN105825850B (en) Audio processing method and device
CN104978974A (en) Audio processing method and device
CN104282322A (en) Mobile terminal and method and device for identifying chorus part of song thereof
US20140135964A1 (en) Music information searching method and apparatus thereof
CN111326171B (en) Method and system for extracting vocal melody based on numbered musical notation recognition and fundamental frequency extraction
WO2020199384A1 (en) Audio recognition method, apparatus and device, and storage medium
WO2016189307A1 (en) Audio identification method
CN104750839A (en) Data recommendation method, terminal and server
Sonnleitner et al. Quad-Based Audio Fingerprinting Robust to Time and Frequency Scaling.
CN106782601A (en) A kind of multimedia data processing method and its device
KR101648931B1 (en) Apparatus and method for producing a rhythm game, and computer program for executing the method
CN105047203A (en) Audio processing method, device and terminal
US20210241402A1 (en) Systems, devices, and methods for musical catalog amplification services
CN104091610B (en) A kind of management method of audio file and device
CN104978961A (en) Audio processing method, device and terminal
EP3644306B1 (en) Methods for analyzing musical compositions, computer-based system and machine readable storage medium
WO2023005193A1 (en) Subtitle display method and device
Soriano et al. Visualization of music collections based on structural content similarity
US20230197114A1 (en) Storage apparatus, playback apparatus, storage method, playback method, and medium
CN116259292B (en) Method, device, computer equipment and storage medium for identifying basic harmonic musical scale

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20161207

Address after: 510000 Guangzhou, Tianhe District branch Yun Yun Road, No. 16, self built room 2, building 1301

Applicant after: Guangzhou KuGou Networks Co., Ltd.

Address before: Shenzhen Futian District City, Guangdong province 518000 Zhenxing Road, SEG Science Park 2 East Room 403

Applicant before: Tencent Technology (Shenzhen) Co., Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 510000 Guangzhou City, Guangzhou, Guangdong, Whampoa Avenue, No. 315, self - made 1-17

Applicant after: Guangzhou KuGou Networks Co., Ltd.

Address before: 510000 Guangzhou, Tianhe District branch Yun Yun Road, No. 16, self built room 2, building 1301

Applicant before: Guangzhou KuGou Networks Co., Ltd.

GR01 Patent grant
GR01 Patent grant