Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In the embodiments of the present invention, an audio file may include, but is not limited to, a song or a song segment. A source audio file is a file that can serve as the reference data for audio evaluation, for example an original singer's song or a song segment cut from it. A target audio file is a file on which audio evaluation is to be performed, for example a user's cover of the original singer's song or a song segment cut from that cover.
In the embodiments of the present invention, an audio file may be formed by at least one audio sentence arranged in order; the set of these audio sentences describes the parts of the audio file that need to be sung. Taking song A as an example, the description of song A may be expressed as follows:
[661,860]aaaaaaaa
[1541,320]bbbbbbbb
[1871,245]cccccccc
……
In the above description of song A, "aaaaaaaa", "bbbbbbbb" and "cccccccc" each represent one audio sentence, and the "[]" before each audio sentence describes the time attribute of that sentence, usually in milliseconds (ms). For example, [661,860] describes the time attribute of the audio sentence "aaaaaaaa": "661" is its start time and "860" is its duration. Supposing song A lasts 5 minutes, the audio sentence "aaaaaaaa" starts being sung at 661 ms and finishes 860 ms later, that is, at 1521 ms. The order of the audio sentences in an audio file can be determined from their start times. For example, according to the above description of song A, "aaaaaaaa" is the first audio sentence and its order in song A is 1; "bbbbbbbb" is the second audio sentence and its order in song A is 2; and so on. It can be understood that an audio file may also contain parts without singing before an audio sentence starts or after it ends. For example, the period from 0 ms to 661 ms of song A is a part without singing, which may contain the prelude.
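The description format above can be illustrated with a short parsing sketch. This is only a minimal illustration, assuming each line has the form `[start,duration]text` with times in milliseconds; the names `AudioSentence`, `LINE_RE` and `parse_description` are hypothetical and do not come from the embodiment.

```python
import re
from typing import List, NamedTuple


class AudioSentence(NamedTuple):
    order: int        # 1-based order of the sentence in the audio file
    start_ms: int     # start time in milliseconds
    duration_ms: int  # duration in milliseconds
    text: str         # content of the audio sentence


# Assumed line format: "[start,duration]text"
LINE_RE = re.compile(r"^\[(\d+),\s*(\d+)\](.*)$")


def parse_description(description: str) -> List[AudioSentence]:
    """Parse a song description and order its sentences by start time."""
    entries = []
    for line in description.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not match the assumed format are skipped
            entries.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    entries.sort(key=lambda entry: entry[0])  # order follows start time
    return [AudioSentence(i + 1, s, d, t) for i, (s, d, t) in enumerate(entries)]


SONG_A = """[661,860]aaaaaaaa
[1541,320]bbbbbbbb
[1871,245]cccccccc"""

sentences = parse_description(SONG_A)
```

Here `sentences[0]` corresponds to "aaaaaaaa" with order 1, and `sentences[2]` to "cccccccc" with order 3.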
In the embodiments of the present invention, the source audio file is formed by at least one audio sentence arranged in order, and such an audio sentence may be called a reference sentence. The target audio file is formed by at least one audio sentence arranged in order, and such an audio sentence may be called a sung sentence.
The audio evaluation method provided by the embodiments of the present invention is described in detail below with reference to Fig. 1 to Fig. 7.
Fig. 1 is a flowchart of an audio evaluation method provided by an embodiment of the present invention. The method may include the following steps S101 to S103.
S101: obtain the evaluation score of at least one sung sentence in the target audio file.
The higher the evaluation score of a sung sentence in the target audio file, the better the singing of that sentence and the closer it is to the singing of the corresponding reference sentence in the source audio file. Conversely, the lower the evaluation score, the poorer the singing of that sentence and the further it departs from the singing of the corresponding reference sentence. The target audio file may contain at least one sung sentence, and this step needs to obtain the evaluation scores of all the sung sentences contained in the target audio file.
S102: construct the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence.
In this step, the evaluation scores of the sung sentences are arranged in order to form the sentence score sequence of the target audio file.
S103: perform a total score computation on the sentence score sequence to obtain the total evaluation score of the target audio file.
The total evaluation score of the target audio file is computed from the evaluation scores of the sung sentences in the target audio file, and can be used to reflect the overall singing level of the target audio file. The higher the total evaluation score, the higher the singing level of the target audio file and the closer it is to the singing of the source audio file. Conversely, the lower the total evaluation score, the lower the singing level and the further it departs from the singing of the source audio file.
In the embodiments of the present invention, the sentence score sequence of the target audio file is constructed from the evaluation score of at least one sung sentence of the target audio file, and a total score computation is performed on the sentence score sequence, thereby producing a total evaluation score for the target audio file. This both meets users' practical needs when using audio files and makes the use of audio files more intelligent.
Each step of the audio evaluation method shown in Fig. 1 is described in detail below with reference to Fig. 2 to Fig. 7.
Fig. 2 is a flowchart of one embodiment of step S101 in the embodiment shown in Fig. 1. Step S101 may include the following steps s1001 to s1003.
s1001: determine the order of the current sung sentence in the target audio file.
In this step, the order of the current sung sentence in the target audio file may be determined from its time attribute. The current sung sentence is the sung sentence in the target audio file that corresponds to the current playback time. Suppose the target audio file contains Q sung sentences (Q is a positive integer). If the current playback time corresponds to the k-th sung sentence (k is a positive integer and 1≤k≤Q), then the current sung sentence is the k-th sung sentence and its order in the target audio file is k. Take song A with the above description as the target audio file: suppose song A lasts 5 minutes and the current playback time is 1895 ms. According to the description of song A, 1895 ms falls within the period given by the time attribute of the audio sentence "cccccccc" (1871 ms to 2116 ms), so "cccccccc" is the current sung sentence and its order in the target audio file is 3.
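The lookup described in step s1001 can be sketched as follows. This is a minimal sketch under stated assumptions: each time attribute is a `(start_ms, duration_ms)` pair, a sentence is treated as covering the half-open interval from its start to its start plus duration, and the function name `current_sentence_order` is hypothetical.

```python
from typing import List, Optional, Tuple


def current_sentence_order(time_attrs: List[Tuple[int, int]],
                           playback_ms: int) -> Optional[int]:
    """Return the 1-based order of the sentence being sung at playback_ms.

    Returns None when the playback time falls in a part without singing.
    """
    # Sorting by start time gives the order of the sentences in the file.
    for order, (start, duration) in enumerate(sorted(time_attrs), start=1):
        # Assumption: the sentence covers [start, start + duration).
        if start <= playback_ms < start + duration:
            return order
    return None


# Time attributes of song A from the description above.
SONG_A_TIMES = [(661, 860), (1541, 320), (1871, 245)]
k = current_sentence_order(SONG_A_TIMES, 1895)
```

With the time attributes of song A, a playback time of 1895 ms yields order 3, matching the example in the text, while a time inside the prelude (for example 300 ms) yields no sentence.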
s1002: obtain the evaluation score of the current sung sentence.
It should be noted that this step is preferably performed after the current sung sentence has finished. Following the example in step s1001, the current sung sentence of song A is the audio sentence "cccccccc" with time attribute [1871,245], so this step may obtain the evaluation score of the current sung sentence at time 1871 ms + 245 ms = 2116 ms.
In a specific implementation, refer also to Fig. 3, which is a flowchart of one embodiment of step s1002 in the embodiment shown in Fig. 2. Step s1002 may include the following steps s2001 to s2004.
s2001: obtain the feature sequence under test of the current sung sentence.
A note is a symbol used to record sounds of different lengths, and may be a whole note, a half note, a quarter note, an eighth note, and so on. An audio sentence can be represented as a frame sequence of multiple audio frames, each of which carries a note; the notes, taken in the time order of their audio frames within the audio sentence, form the melody of that sentence. Pitch refers to how high or low a sound is. Each audio frame of an audio sentence likewise carries a pitch, and the pitches, taken in the time order of their audio frames, also form the melody of that sentence. In summary, both the note sequence and the pitch sequence of an audio sentence can reflect its melodic features.
In this step, the feature sequence under test of the current sung sentence is obtained; the feature sequence under test is the note sequence or the pitch sequence of the current sung sentence.
s2002: according to the order of the current sung sentence in the target audio file, locate a reference sentence in the source audio file and obtain the reference feature sequence of that reference sentence.
In this embodiment, unless otherwise stated, "the reference sentence" refers specifically to the reference sentence located in the source audio file. In this step, the order of the located reference sentence in the source audio file is the same as the order of the current sung sentence in the target audio file. Following the above example, if song A is the target audio file, song B (the original singer's released version of song A) is the source audio file, and the order of the current sung sentence is 3, then the order of the reference sentence located in song B is also 3; that is, the third reference sentence of song B is chosen as the evaluation benchmark for the current sung sentence.
In one feasible implementation of the embodiments of the present invention, the feature sequence under test is the note sequence of the current sung sentence and the reference feature sequence is the note sequence of the reference sentence. In another feasible implementation, the feature sequence under test is the pitch sequence of the current sung sentence and the reference feature sequence is the pitch sequence of the reference sentence.
s2003: perform a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence.
Because the reference feature sequence characterizes the audio features of the reference sentence located in the source audio file, and the feature sequence under test characterizes the audio features of the current sung sentence in the target audio file, this step performs a correlation operation between the two sequences to obtain the correlation coefficient sequence.
s2004: determine the evaluation score of the current sung sentence according to the correlation coefficient sequence.
In this step, the higher the evaluation score of the current sung sentence, the better its singing and the closer it is to the singing of the located reference sentence. Conversely, the lower the evaluation score of the current sung sentence, the poorer its singing and the further it departs from the singing of the located reference sentence.
s1003: according to the order of the current sung sentence in the target audio file, obtain the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and set to zero the evaluation scores of all sung sentences that follow it.
Suppose the target audio file contains Q sung sentences and the k-th sung sentence is currently being played, so that the current sung sentence is the k-th sung sentence. The sung sentences that precede the current sung sentence are then the 1st to the (k-1)-th sung sentences, and the sung sentences that follow it are the (k+1)-th to the Q-th sung sentences. This step obtains the evaluation scores of the 1st to the (k-1)-th sung sentences and sets the evaluation scores of the (k+1)-th to the Q-th sung sentences to zero. It should be noted that the evaluation scores of the 1st to the (k-1)-th sung sentences are obtained in the same way as that of the current sung sentence, which is not repeated here. It can be understood that, because the sung sentences that follow the current one have not yet been played for the user to sing, their evaluation scores can be set to zero.
Fig. 4 is a flowchart of one embodiment of step s2003 in the embodiment shown in Fig. 3. Step s2003 may include the following steps s3001 to s3004.
s3001: calculate the mean of the reference feature sequence and the mean of the feature sequence under test.
Suppose the located reference sentence contains N audio frames; the reference feature sequence may be denoted p(i), where i is an integer and 0≤i≤N-1. Specifically, if the reference feature sequence is the note sequence of the reference sentence, p(0) is the note of the first audio frame in the located reference sentence, p(1) is the note of the second audio frame, and so on, with p(N-1) being the note of the N-th audio frame. If the reference feature sequence is the pitch sequence of the reference sentence, p(0) is the pitch of the first audio frame in the located reference sentence, p(1) is the pitch of the second audio frame, and so on, with p(N-1) being the pitch of the N-th audio frame.
Suppose the current sung sentence contains N audio frames; the feature sequence under test may be denoted s(i), where i is an integer and 0≤i≤N-1. Specifically, if the feature sequence under test is the note sequence of the current sung sentence, s(0) is the note of the first audio frame in the current sung sentence, s(1) is the note of the second audio frame, and so on, with s(N-1) being the note of the N-th audio frame. If the feature sequence under test is the pitch sequence of the current sung sentence, s(0) is the pitch of the first audio frame in the current sung sentence, s(1) is the pitch of the second audio frame, and so on, with s(N-1) being the pitch of the N-th audio frame.
In this step, the following formula (1) may be used to calculate the mean of the reference feature sequence p(i) and the mean of the feature sequence under test s(i):
MP=mean(p(i))
MS=mean(s(i)) (1)
In formula (1), MP denotes the mean of the reference feature sequence p(i), MS denotes the mean of the feature sequence under test s(i), and mean() is the averaging operation.
s3002: normalize the reference feature sequence using its mean, and normalize the feature sequence under test using its mean.
The purpose of the normalization is to bring the reference feature sequence and the feature sequence under test to the same baseline, so as to eliminate the calculation bias that would otherwise arise from the two sequences being measured against different baselines.
In this step, formula (2) may be used to normalize the reference feature sequence:
p2(i)=p(i)-MP (2)
In formula (2), p2(i) denotes the reference feature sequence obtained after normalization.
In this step, formula (3) may be used to normalize the feature sequence under test:
s2(i)=s(i)-MS (3)
In formula (3), s2(i) denotes the feature sequence under test obtained after normalization.
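Formulas (1) to (3) amount to subtracting each sequence's own mean from its elements. The sketch below illustrates this under the assumption that the feature values are plain numbers (for example, pitch values per frame); the function name `remove_mean` and the sample values are hypothetical.

```python
from typing import List, Sequence


def remove_mean(seq: Sequence[float]) -> List[float]:
    """Subtract the mean of the sequence from each element (formulas (1)-(3))."""
    mean = sum(seq) / len(seq)        # MP or MS in formula (1)
    return [x - mean for x in seq]    # p2(i) = p(i) - MP, s2(i) = s(i) - MS


# Hypothetical per-frame pitch values: the user sings the same contour
# as the reference, but transposed lower.
p = [60, 62, 64, 62]   # reference feature sequence p(i)
s = [48, 50, 52, 50]   # feature sequence under test s(i)

p2 = remove_mean(p)
s2 = remove_mean(s)
```

In this example both sequences become `[-2.0, 0.0, 2.0, 0.0]` after mean removal, which shows how the step brings a transposed rendition onto the same baseline as the reference.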
s3003: using a preset clipping threshold, convert the normalized reference feature sequence into a reference value sequence, and convert the normalized feature sequence under test into a value sequence under test.
The preset clipping threshold may be set as required. Preferably, the clipping threshold may be set using formula (4):
Th_xue=max(max(abs(p2(i))),max(abs(s2(i)))) (4)
In formula (4), Th_xue denotes the preset clipping threshold, max() is the maximum operation, and abs() is the absolute value operation.
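Formula (4) takes the largest absolute value found in either normalized sequence. A minimal sketch, with the hypothetical function name `clipping_threshold` and made-up input values:

```python
from typing import Sequence


def clipping_threshold(p2: Sequence[float], s2: Sequence[float]) -> float:
    """Th_xue = max(max(abs(p2(i))), max(abs(s2(i)))), as in formula (4)."""
    return max(max(abs(x) for x in p2), max(abs(x) for x in s2))


# Hypothetical normalized sequences.
th_xue = clipping_threshold([-2.0, 0.0, 2.0, 0.0], [-3.0, 1.0, 1.0, 1.0])
```

For these inputs the threshold is 3.0, the magnitude of the largest excursion in the sequence under test.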
In this step, formula (5) may be used to convert the normalized reference feature sequence p2(i) into the reference value sequence, denoted p3(i).
In this step, formula (6) may be used to convert the normalized feature sequence under test s2(i) into the value sequence under test, denoted s3(i).
s3004: use a cross-correlation function to perform a correlation operation on the reference value sequence and the value sequence under test, obtaining the correlation coefficient sequence.
In one feasible implementation of this step, formula (7) may be used to perform the correlation operation on the reference value sequence and the value sequence under test. In formula (7), R(n) denotes the correlation coefficient sequence; "·" denotes multiplication; and s3(i-n) denotes the sequence obtained by cyclically shifting s3(i) by n positions, where 0≤n≤N-1.
In another feasible implementation of this step, formula (8) may be used to perform the correlation operation on the reference value sequence and the value sequence under test. In formula (8), R(n) denotes the correlation coefficient sequence; "·" denotes multiplication; and p3(i-n) denotes the sequence obtained by cyclically shifting p3(i) by n positions, where 0≤n≤N-1.
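The cyclic-shift correlation described for step s3004 can be sketched as below. Formulas (7) and (8) themselves are not reproduced in this text, so the sketch assumes the simplest form consistent with the description: an unnormalized sum over i of p3(i)·s3(i-n), with the index i-n taken modulo N. Any normalization factor the actual formulas may contain is not modeled, and the function name is hypothetical.

```python
from typing import List, Sequence


def circular_cross_correlation(p3: Sequence[float],
                               s3: Sequence[float]) -> List[float]:
    """Assumed form: R(n) = sum_i p3(i) * s3((i - n) mod N), 0 <= n <= N-1."""
    n_frames = len(p3)
    if len(s3) != n_frames:
        raise ValueError("both sequences must contain N values")
    return [
        sum(p3[i] * s3[(i - n) % n_frames] for i in range(n_frames))
        for n in range(n_frames)
    ]


# Hypothetical value sequences: s3 is p3 cyclically shifted by one frame.
p3 = [1, 0, -1, 0]
s3 = [0, -1, 0, 1]
R = circular_cross_correlation(p3, s3)
```

Here the correlation peaks at n = 1, the shift that realigns the two sequences. Swapping the two arguments gives the variant in which p3 is shifted instead of s3.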
Fig. 5 is a flowchart of one embodiment of step s2004 in the embodiment shown in Fig. 3. Step s2004 may include the following steps s4001 to s4003.
s4001: calculate the maximum value of the correlation coefficient sequence.
In this step, the following formula (9) may be used to calculate the maximum value of the correlation coefficient sequence:
RMAX=max(R(n)) (9)
In formula (9), R(n) denotes the correlation coefficient sequence, max() is the maximum operation, and RMAX denotes the maximum value of the correlation coefficient sequence.
s4002: map the maximum value of the correlation coefficient sequence into a preset score interval to obtain the mapped value of that maximum.
The preset score interval may be set as required, for example [0,10] or [0,100]. In this step, the preset score interval may be denoted [score_min, score_max], and the maximum value RMAX of the correlation coefficient sequence is mapped into this interval by a linear or nonlinear method. The resulting mapped value may be denoted score_{k-1}, and it lies within the preset score interval [score_min, score_max].
s4003: determine the mapped value as the evaluation score of the current sung sentence.
In this step, the mapped value score_{k-1} is determined as the evaluation score of the current sung sentence; that is, the evaluation score of the current sung sentence takes the value score_{k-1}.
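Steps s4001 to s4003 can be illustrated with a linear mapping, which is one of the options the text allows. The text does not state the range of RMAX, so the bounds `r_min` and `r_max` below are hypothetical parameters the caller would have to supply, and values outside them are clamped; this is a sketch under those assumptions, not the embodiment's mapping.

```python
def map_to_score(rmax: float, r_min: float, r_max: float,
                 score_min: float = 0.0, score_max: float = 100.0) -> float:
    """Linearly map RMAX from [r_min, r_max] into [score_min, score_max]."""
    if r_max <= r_min:
        raise ValueError("r_max must be greater than r_min")
    clamped = min(max(rmax, r_min), r_max)        # keep RMAX inside its assumed range
    ratio = (clamped - r_min) / (r_max - r_min)   # position within [0, 1]
    return score_min + ratio * (score_max - score_min)


# Hypothetical example: RMAX = 2.0 with an assumed correlation range of [0, 4].
score = map_to_score(2.0, 0.0, 4.0)
```

With the default interval [0,100], an RMAX halfway through its assumed range maps to 50.0; passing `score_max=10.0` would use the interval [0,10] instead.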
Fig. 6 is a flowchart of one embodiment of step S102 in the embodiment shown in Fig. 1. Step S102 may include the following steps s5001 and s5002.
s5001: obtain the order in the target audio file of each sung sentence among the at least one sung sentence.
In this step, the order of each sung sentence in the target audio file may be determined from the time attribute of that sung sentence.
s5002: arrange the evaluation scores of the sung sentences according to the order of the sung sentences in the target audio file, forming the sentence score sequence of the target audio file.
Suppose the target audio file contains Q sung sentences. The sentence score sequence of the target audio file may be denoted d(j), where j is an integer and 0≤j≤Q-1. The sequence d(j) consists of the evaluation score of each sung sentence together with a corresponding index, the index being the order of the sung sentence to which the score belongs. Specifically, d(0) is the evaluation score of the first sung sentence in the target audio file and its index is 1; d(1) is the evaluation score of the second sung sentence and its index is 2; and so on, with d(Q-1) being the evaluation score of the Q-th sung sentence and its index being Q. Following the example in this embodiment, the evaluation score of the current sung sentence is d(k-1), whose value is score_{k-1} and whose index is k. Accordingly, in the sentence score sequence d(j), the value of d(0) is score_0 and its index is 1; the value of d(1) is score_1 and its index is 2; and so on, with the value of d(k-2) being score_{k-2} and its index being k-1. The values of d(k) to d(Q-1) are 0, because those sentences have not yet been sung; the index of d(k) is k+1, and so on, with the index of d(Q-1) being Q.
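The structure of the sentence score sequence can be sketched as follows: scores of the sentences sung so far, zero for the sentences not yet sung, each paired with its 1-based index. The function name `build_score_sequence` and the sample scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_score_sequence(scores_so_far: Sequence[float],
                         total_sentences: int) -> List[Tuple[int, float]]:
    """Return [(index, d(j))] for j = 0..Q-1, where index = j + 1.

    scores_so_far holds the scores of the 1st to the k-th sung sentences;
    the remaining Q - k sentences have not been sung and are scored 0.
    """
    k = len(scores_so_far)
    if k > total_sentences:
        raise ValueError("more scores than sung sentences")
    d = list(scores_so_far) + [0.0] * (total_sentences - k)
    return [(j + 1, d[j]) for j in range(total_sentences)]


# Hypothetical example: Q = 5 sentences, the user has just finished the 3rd (k = 3).
sequence = build_score_sequence([80.0, 65.0, 90.0], 5)
```

The result pairs indices 1 to 3 with the obtained scores and indices 4 and 5 with 0.0.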
Fig. 7 is a flowchart of one embodiment of step S103 in the embodiment shown in Fig. 1. Step S103 may include the following steps s6001 to s6003.
s6001: calculate the mean and the maximum value of the sentence score sequence.
In this step, the following formula (10) may be used to calculate the mean of the sentence score sequence:
E=mean(d(j)) (10)
In formula (10), E denotes the mean of the sentence score sequence d(j), and mean() is the averaging operation.
In this step, the following formula (11) may be used to calculate the maximum value of the sentence score sequence:
[dmax,ind]=max(d(j)) (11)
In formula (11), max() is the maximum operation, dmax denotes the maximum value in d(j), and ind denotes the index at which d(j) attains its maximum.
s6002: obtain the index of the maximum value of the sentence score sequence within that sequence. The index obtained in this step is ind in formula (11).
s6003: perform a total score computation on the mean of the sentence score sequence, the maximum value of the sentence score sequence, and the index corresponding to that maximum value, obtaining the total evaluation score of the target audio file.
In this step, the total score computation may follow formula (12):
s=max{E+dmax*exp[(ind-(k+1))/(k+1)],E} (12)
In formula (12), max{} is the maximum operation; exp denotes the exponential function with base e; E, dmax and ind are as defined in formulas (10) and (11); k denotes the order of the current sung sentence in the target audio file; and s denotes the total evaluation score of the target audio file. As the user continues to sing the target audio file, k keeps changing and d(j) keeps changing, so the total score s obtained in real time changes accordingly.
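Formulas (10) to (12) can be combined into one short function. The sketch below takes the sentence score sequence d(j) as a plain list and uses the 1-based index convention described for d(j); taking the first occurrence when several sentences share the maximum score is an assumption, and the function name and sample values are hypothetical.

```python
import math
from typing import Sequence


def total_score(d: Sequence[float], k: int) -> float:
    """s = max{E + dmax * exp[(ind - (k + 1)) / (k + 1)], E}, formula (12)."""
    e = sum(d) / len(d)              # E, formula (10)
    dmax = max(d)                    # dmax, formula (11)
    ind = list(d).index(dmax) + 1    # 1-based index of the maximum (first on ties)
    return max(e + dmax * math.exp((ind - (k + 1)) / (k + 1)), e)


# Hypothetical example: Q = 5 sentences, the user has finished the 3rd (k = 3).
d = [80.0, 65.0, 90.0, 0.0, 0.0]
s = total_score(d, k=3)
```

For this input E is 47.0, dmax is 90.0 and ind is 3, so s equals 47 + 90·exp(-0.25). Note that, as written, formula (12) does not itself confine s to the preset score interval used for the individual sentence scores.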
An audio evaluation apparatus provided by the embodiments of the present invention is described in detail below with reference to Fig. 8 to Fig. 14. It should be noted that the audio evaluation apparatus shown in Fig. 8 to Fig. 14 can be used to perform the method shown in Fig. 1 to Fig. 7. In practice, the audio evaluation apparatus may run on a server, or on a terminal such as a notebook computer, a mobile phone, a tablet computer (PAD), or a smart wearable device.
Fig. 8 is a schematic structural diagram of an audio evaluation apparatus provided by an embodiment of the present invention. The apparatus may include a score acquisition module 101, a construction module 102, and a total score evaluation module 103.
The score acquisition module 101 is configured to obtain the evaluation score of at least one sung sentence in the target audio file.
The higher the evaluation score of a sung sentence in the target audio file, the better that sentence was sung and the closer it is to the corresponding reference sentence in the source audio file. Conversely, the lower the evaluation score of a sung sentence in the target audio file, the worse that sentence was sung and the further it deviates from the corresponding reference sentence in the source audio file. The target audio file may comprise at least one sung sentence, and the score acquisition module 101 needs to obtain the evaluation scores of all sung sentences that the target audio file comprises.
The construction module 102 is configured to build the sentence score sequence of the target audio file according to the evaluation scores of the at least one sung sentence.
The construction module 102 arranges the evaluation scores of the individual sung sentences in order, thereby forming the sentence score sequence of the target audio file.
The total score evaluation module 103 is configured to perform a total-score operation on the sentence score sequence to obtain the total evaluation score of the target audio file.
The total evaluation score of the target audio file is computed from the evaluation scores of the individual sung sentences in the target audio file, and can be used to reflect the overall singing level of the target audio file. The higher the total evaluation score, the higher the singing level of the target audio file and the closer it is to the source audio file. Conversely, the lower the total evaluation score, the lower the singing level of the target audio file and the further it deviates from the source audio file.
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 9, which is a schematic structural diagram of an embodiment of the score acquisition module shown in Fig. 8. The score acquisition module 101 may comprise: an order determining unit 1101, a first score acquiring unit 1102, and a second score acquiring unit 1103.
The order determining unit 1101 is configured to determine the order of the current sung sentence in the target audio file.
The order determining unit 1101 may determine the order of the current sung sentence in the target audio file according to the time attribute of the current sung sentence. Here, the current sung sentence is the sung sentence in the target audio file that corresponds to the current playback time. Suppose the target audio file comprises Q sung sentences (Q is a positive integer). If the current playback time corresponds to the k-th sung sentence (k is a positive integer, and 1≤k≤Q), then the current sung sentence is the k-th sung sentence, and its order in the target audio file is k. Take song A as the target audio file, with the description of song A given above, as an example: suppose song A lasts 5 minutes in total and the current playback time is 1895 ms. According to the description of song A, 1895 ms falls within the time period described by the time attribute of the audio sentence "cccccccc", so the audio sentence "cccccccc" is the current sung sentence, and its order in the target audio file is 3.
The first score acquiring unit 1102 is configured to obtain the evaluation score of the current sung sentence.
It should be noted that the first score acquiring unit 1102 preferably performs the acquisition after the current sung sentence has finished being sung. In the example of this embodiment, for song A the current sung sentence is the audio sentence "cccccccc", whose time attribute is [1871,245]; the first score acquiring unit 1102 may therefore obtain the evaluation score of the current sung sentence at the moment 1871 ms + 245 ms = 2116 ms.
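The order determination and the acquisition moment described above can be sketched in Python as follows. This is a minimal illustration, not the apparatus itself: the names `SONG_A`, `current_order` and `sentence_end_ms` are hypothetical, and treating the end of a sentence's time period as exclusive is an assumption.

```python
from typing import Optional, Sequence, Tuple

# (start_ms, duration_ms) time attributes from the description of song A.
SONG_A: Sequence[Tuple[int, int]] = [(661, 860), (1541, 320), (1871, 245)]


def current_order(time_attrs: Sequence[Tuple[int, int]], now_ms: int) -> Optional[int]:
    """Return the 1-based order of the sentence being sung at now_ms.

    Returns None when now_ms falls in a part without singing (e.g. the prelude).
    """
    ordered = sorted(time_attrs)  # order sentences by start time
    for order, (start, duration) in enumerate(ordered, start=1):
        # Assumption: the period is [start, start + duration).
        if start <= now_ms < start + duration:
            return order
    return None


def sentence_end_ms(time_attrs: Sequence[Tuple[int, int]], order: int) -> int:
    """Moment at which the sentence with the given order finishes being sung."""
    start, duration = sorted(time_attrs)[order - 1]
    return start + duration


if __name__ == "__main__":
    k = current_order(SONG_A, 1895)
    print(k, sentence_end_ms(SONG_A, k))
```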
The second score acquiring unit 1103 is configured to, according to the order of the current sung sentence in the target audio file, obtain the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and set the evaluation scores of all sung sentences that follow the current sung sentence in the target audio file to zero.
Suppose the target audio file comprises Q sung sentences. If the k-th sung sentence is currently being played, the current sung sentence is the k-th sung sentence; the sung sentences that precede it in the target audio file are the 1st to the (k-1)-th sung sentences, and the sung sentences that follow it are the (k+1)-th to the Q-th sung sentences. The second score acquiring unit 1103 needs to obtain the evaluation scores of the 1st to the (k-1)-th sung sentences, and to set the evaluation scores of the (k+1)-th to the Q-th sung sentences to zero. It should be noted that the evaluation scores of the 1st to the (k-1)-th sung sentences can be obtained in the same way as the evaluation score of the current sung sentence. It can be understood that, because the sung sentences that follow the current sung sentence have not yet been played for the user to sing, the second score acquiring unit 1103 can set their evaluation scores to zero.
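The zero-filling rule above can be illustrated with a short Python sketch. The function name `scores_so_far` is hypothetical, and it is assumed that the scores of the 1st to the k-th sung sentences are already available in singing order.

```python
from typing import List, Sequence


def scores_so_far(sung_scores: Sequence[float], total_sentences: int) -> List[float]:
    """Return Q scores: the k scores obtained so far, then zeros.

    sung_scores     -- scores of the 1st to the k-th sung sentences, in order
    total_sentences -- Q, the number of sung sentences in the target audio file
    """
    k = len(sung_scores)
    if k > total_sentences:
        raise ValueError("more scores than sung sentences in the file")
    # Sentences k+1 .. Q have not been sung yet, so their scores are zero.
    return list(sung_scores) + [0.0] * (total_sentences - k)


if __name__ == "__main__":
    # Made-up scores for a file with Q = 5 where 3 sentences have been sung.
    print(scores_so_far([7.5, 8.0, 6.0], 5))
```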
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 10, which is a schematic structural diagram of an embodiment of the first score acquiring unit shown in Fig. 9. The first score acquiring unit 1102 may comprise: a sequence-under-test acquiring unit 1111, a reference sequence acquiring unit 1112, a correlation operation unit 1113, and a score determining unit 1114.
The sequence-under-test acquiring unit 1111 is configured to obtain the feature sequence under test of the current sung sentence.
A note, also called a musical note, is a symbol used to record sounds of different durations; note types include the whole note, half note, quarter note, eighth note, and so on. An audio sentence can be represented as a frame sequence made up of multiple audio frames, each of which carries a note; arranged in the temporal order of the audio frames within the audio sentence, the notes form the melody of the audio sentence. Pitch refers to how high or low a sound is. An audio sentence can likewise be represented as a frame sequence made up of multiple audio frames, each of which carries a pitch; arranged in the temporal order of the audio frames within the audio sentence, the pitches form the melody of the audio sentence. In summary, either the note sequence or the pitch sequence of an audio sentence can reflect the melodic features of that audio sentence.
The sequence-under-test acquiring unit 1111 may obtain the feature sequence under test of the current sung sentence; the feature sequence under test is the note sequence or the pitch sequence of the current sung sentence.
The reference sequence acquiring unit 1112 is configured to locate a reference sentence in the source audio file according to the order of the current sung sentence in the target audio file, and to obtain the reference feature sequence of the reference sentence.
In this embodiment, unless otherwise specified, "the reference sentence" refers to the reference sentence located in the source audio file. The order of the located reference sentence in the source audio file is the same as the order of the current sung sentence in the target audio file. For example, if song A is the target audio file, song B, the original recording of song A as published, is the source audio file, and the order of the current sung sentence is 3, then the order of the reference sentence located in song B is also 3; the reference sequence acquiring unit 1112 selects the 3rd reference sentence from song B as the evaluation benchmark for the current sung sentence.
In one feasible implementation of this embodiment, the feature sequence under test is the note sequence of the current sung sentence, and the reference feature sequence is the note sequence of the reference sentence. In another feasible implementation of this embodiment, the feature sequence under test is the pitch sequence of the current sung sentence, and the reference feature sequence is the pitch sequence of the reference sentence.
The correlation operation unit 1113 is configured to perform a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence.
Because the reference feature sequence characterizes the audio features of the reference sentence located in the source audio file, and the feature sequence under test characterizes the audio features of the current sung sentence in the target audio file, the correlation operation unit 1113 can obtain the correlation coefficient sequence by performing a correlation operation between the reference feature sequence and the feature sequence under test.
The score determining unit 1114 is configured to determine the evaluation score of the current sung sentence according to the correlation coefficient sequence.
The higher the evaluation score of the current sung sentence, the better the current sung sentence was sung and the closer it is to the located reference sentence. Conversely, the lower the evaluation score of the current sung sentence, the worse it was sung and the further it deviates from the located reference sentence.
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 11, which is a schematic structural diagram of an embodiment of the correlation operation unit shown in Fig. 10. The correlation operation unit 1113 may comprise: a mean calculation subunit 1311, a normalization subunit 1312, a sequence conversion subunit 1313, and a correlation operation subunit 1314.
The mean calculation subunit 1311 is configured to calculate the mean of the reference feature sequence and the mean of the feature sequence under test, respectively.
Suppose the located reference sentence comprises N audio frames. The reference feature sequence can be denoted p(i), where i is an integer and 0≤i≤N-1. Specifically, if the reference feature sequence is the note sequence of the reference sentence, p(0) is the note of the first audio frame in the located reference sentence, p(1) is the note of the second audio frame, and so on, until p(N-1), which is the note of the N-th audio frame in the located reference sentence. If the reference feature sequence is the pitch sequence of the reference sentence, p(0) is the pitch of the first audio frame in the located reference sentence, p(1) is the pitch of the second audio frame, and so on, until p(N-1), which is the pitch of the N-th audio frame in the located reference sentence.
Suppose the current sung sentence comprises N audio frames. The feature sequence under test can be denoted s(i), where i is an integer and 0≤i≤N-1. Specifically, if the feature sequence under test is the note sequence of the current sung sentence, s(0) is the note of the first audio frame in the current sung sentence, s(1) is the note of the second audio frame, and so on, until s(N-1), which is the note of the N-th audio frame in the current sung sentence. If the feature sequence under test is the pitch sequence of the current sung sentence, s(0) is the pitch of the first audio frame in the current sung sentence, s(1) is the pitch of the second audio frame, and so on, until s(N-1), which is the pitch of the N-th audio frame in the current sung sentence.
The mean calculation subunit 1311 may use formula (1) of the embodiment shown in Fig. 4 to calculate the mean of the reference feature sequence p(i) and the mean of the feature sequence under test s(i), respectively.
The normalization subunit 1312 is configured to normalize the reference feature sequence using the mean of the reference feature sequence, and to normalize the feature sequence under test using the mean of the feature sequence under test.
The purpose of normalization is to bring the reference feature sequence and the feature sequence under test to the same baseline, so as to eliminate the computational deviation that would otherwise arise because the means of the two sequences are taken against inconsistent standards. The normalization subunit 1312 may use formula (2) of the embodiment shown in Fig. 4 to normalize the reference feature sequence, obtaining the normalized reference feature sequence p2(i), and may use formula (3) of the embodiment shown in Fig. 4 to normalize the feature sequence under test, obtaining the normalized feature sequence under test s2(i).
The sequence conversion subunit 1313 is configured to use a preset clipping threshold to convert the normalized reference feature sequence into a reference numeric sequence, and to convert the normalized feature sequence under test into a numeric sequence under test.
The preset clipping threshold can be set according to actual needs; preferably, it may be set using formula (4) of the embodiment shown in Fig. 4. The sequence conversion subunit 1313 may use formula (5) of the embodiment shown in Fig. 4 to convert the normalized reference feature sequence into the reference numeric sequence p3(i), and may use formula (6) of the embodiment shown in Fig. 4 to convert the normalized feature sequence under test into the numeric sequence under test s3(i).
The correlation operation subunit 1314 is configured to perform a correlation operation on the reference numeric sequence and the numeric sequence under test using a cross-correlation function, to obtain the correlation coefficient sequence.
In one feasible implementation of this embodiment, the correlation operation subunit 1314 may use formula (7) of the embodiment shown in Fig. 4 to perform the correlation operation on the reference numeric sequence p3(i) and the numeric sequence under test s3(i), obtaining the correlation coefficient sequence R(n). In another feasible implementation of this embodiment, the correlation operation subunit 1314 may use formula (8) of the embodiment shown in Fig. 4 to perform the correlation operation on the reference numeric sequence p3(i) and the numeric sequence under test s3(i), obtaining the correlation coefficient sequence R(n).
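The chain of subunits 1311 to 1314 can be illustrated with the Python sketch below. Formulas (1) to (8) are not reproduced in this part of the description, so every concrete form here is an assumption made for illustration only: normalization is taken to be subtraction of the mean, the clipping conversion is taken to be a three-level quantization around the threshold, and the cross-correlation is scaled by N. All function names are hypothetical.

```python
from typing import List, Sequence


def mean(seq: Sequence[float]) -> float:
    return sum(seq) / len(seq)


def normalize(seq: Sequence[float]) -> List[float]:
    # Assumption: normalization subtracts the sequence's own mean.
    m = mean(seq)
    return [x - m for x in seq]


def to_numeric(seq: Sequence[float], threshold: float) -> List[int]:
    # Assumption: three-level clipping, +1 above the threshold,
    # -1 below the negative threshold, 0 in between.
    return [1 if x > threshold else -1 if x < -threshold else 0 for x in seq]


def cross_correlation(p3: Sequence[int], s3: Sequence[int]) -> List[float]:
    # R(n) for lags n = -(N-1) .. N-1, each divided by N (assumed scaling).
    n_frames = len(p3)
    if len(s3) != n_frames:
        raise ValueError("both sequences must have N frames")
    result = []
    for lag in range(-(n_frames - 1), n_frames):
        total = sum(
            p3[i] * s3[i + lag]
            for i in range(n_frames)
            if 0 <= i + lag < n_frames
        )
        result.append(total / n_frames)
    return result


def correlation_sequence(p: Sequence[float], s: Sequence[float],
                         threshold: float) -> List[float]:
    p3 = to_numeric(normalize(p), threshold)
    s3 = to_numeric(normalize(s), threshold)
    return cross_correlation(p3, s3)


if __name__ == "__main__":
    reference = [60, 62, 64, 62, 60]   # made-up pitch values
    sung = [72, 74, 76, 74, 72]        # same contour, shifted upward
    print(max(correlation_sequence(reference, sung, threshold=1.0)))
```

Under these assumptions, a sung sentence with the same melodic contour as the reference but a constant pitch offset still correlates strongly, because subtracting the mean discards the offset.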
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 12, which is a schematic structural diagram of an embodiment of the score determining unit shown in Fig. 10. The score determining unit 1114 may comprise: a maximum calculation subunit 1411, a mapping subunit 1412, and a score determining subunit 1413.
The maximum calculation subunit 1411 is configured to calculate the maximum of the correlation coefficient sequence.
The maximum calculation subunit 1411 may use formula (9) of the embodiment shown in Fig. 5 to calculate the maximum RMAX of the correlation coefficient sequence.
The mapping subunit 1412 is configured to map the maximum of the correlation coefficient sequence onto a preset score interval, to obtain the mapping value of that maximum.
The preset score interval can be set according to actual needs; for example, it may be set to [0,10] or to [0,100]. The mapping subunit 1412 may denote the preset score interval as [score_min, score_max] and map the maximum RMAX of the correlation coefficient sequence onto the preset score interval by a linear or nonlinear method. The resulting mapping value may be denoted score_{k-1}, and score_{k-1} lies within the preset score interval [score_min, score_max].
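A linear mapping of RMAX onto the preset score interval might look like the following Python sketch. The function name `map_to_interval` is hypothetical, and the sketch assumes that RMAX is expected to lie in [0, 1] and is clamped to that range; the embodiment itself leaves the mapping method open (linear or nonlinear).

```python
def map_to_interval(r_max: float, score_min: float = 0.0,
                    score_max: float = 100.0) -> float:
    """Linearly map RMAX onto [score_min, score_max].

    Assumption: r_max is meaningful in [0, 1]; values outside are clamped.
    """
    if score_max < score_min:
        raise ValueError("score_max must not be smaller than score_min")
    clamped = min(max(r_max, 0.0), 1.0)
    return score_min + clamped * (score_max - score_min)


if __name__ == "__main__":
    print(map_to_interval(0.6))             # interval [0, 100]
    print(map_to_interval(0.6, 0.0, 10.0))  # interval [0, 10]
```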
The score determining subunit 1413 is configured to determine the mapping value as the evaluation score of the current sung sentence.
The score determining subunit 1413 may determine the mapping value score_{k-1} as the evaluation score of the current sung sentence; that is, the evaluation score of the current sung sentence is the value of score_{k-1}.
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 13, which is a schematic structural diagram of an embodiment of the construction module shown in Fig. 8. The construction module 102 may comprise: an order acquiring unit 1201 and a construction unit 1202.
The order acquiring unit 1201 is configured to obtain the order, in the target audio file, of each sung sentence of the at least one sung sentence.
The order acquiring unit 1201 may determine the order of each sung sentence in the target audio file according to the time attribute of that sung sentence.
The construction unit 1202 is configured to arrange the evaluation scores of the sung sentences according to the order of the sung sentences in the target audio file, to form the sentence score sequence of the target audio file.
Suppose the target audio file comprises Q sung sentences. The sentence score sequence of the target audio file may be denoted d(j), where j is an integer and 0≤j≤Q-1. The sentence score sequence d(j) consists of the evaluation score of each sung sentence and its corresponding index, where the index corresponding to an evaluation score is the order of the sung sentence that obtained that score. Specifically, d(0) is the evaluation score of the first sung sentence in the target audio file, and its index is 1; d(1) is the evaluation score of the second sung sentence, and its index is 2; and so on, until d(Q-1), which is the evaluation score of the Q-th sung sentence, with index Q. Following the example in this embodiment, the evaluation score of the current sung sentence can be expressed as d(k-1), whose value is score_{k-1} and whose index is k. Accordingly, in the sentence score sequence d(j), the value of d(0) is score_0 and its index is 1; the value of d(1) is score_1 and its index is 2; and so on, up to d(k-2), whose value is score_{k-2} and whose index is k-1. The values of d(k) through d(Q-1) are 0, because the corresponding sung sentences follow the current sung sentence; the index of d(k) is k+1, and so on, up to d(Q-1), whose index is Q.
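The construction of the sentence score sequence can be sketched in Python as follows. The function name `build_score_sequence` and the input layout of (start time, score) pairs are hypothetical; position j of the returned list plays the role of d(j), and each entry carries its 1-based index.

```python
from typing import Iterable, List, Tuple


def build_score_sequence(
    sentences: Iterable[Tuple[int, float]]
) -> List[Tuple[int, float]]:
    """Arrange scores by sentence order and attach 1-based indices.

    sentences -- (start_ms, score) pairs, in any order
    Returns a list whose element j is (index, d(j)) with index = j + 1.
    """
    ordered = sorted(sentences, key=lambda item: item[0])  # order by start time
    return [(j + 1, score) for j, (_, score) in enumerate(ordered)]


if __name__ == "__main__":
    # Made-up scores attached to the start times of song A; the third
    # sentence has not been sung yet, so its score is zero.
    print(build_score_sequence([(1541, 8.0), (661, 7.5), (1871, 0.0)]))
```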
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
Refer to Fig. 14, which is a schematic structural diagram of an embodiment of the total score evaluation module shown in Fig. 8. The total score evaluation module 103 may comprise: a calculation unit 1301, an index acquiring unit 1302, and a total score evaluation unit 1303.
The calculation unit 1301 is configured to calculate the mean and the maximum of the sentence score sequence.
The calculation unit 1301 may use formula (10) of the embodiment shown in Fig. 7 to calculate the mean E of the sentence score sequence, and may use formula (11) of the embodiment shown in Fig. 7 to calculate the maximum d_max of the sentence score sequence.
The index acquiring unit 1302 is configured to obtain the index that corresponds to the maximum of the sentence score sequence within the sentence score sequence. The index obtained by the index acquiring unit 1302 may be the ind in formula (11) of the embodiment shown in Fig. 7.
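The quantities produced by the calculation unit 1301 and the index acquiring unit 1302 can be illustrated with the Python sketch below. Formulas (10) and (11) are not reproduced in this part of the description, so the sketch rests on assumptions: the mean is taken over all Q entries of the sequence (including zeros), and ind is the 1-based index of the first occurrence of the maximum. The function name `sequence_stats` is hypothetical.

```python
from typing import Sequence, Tuple


def sequence_stats(d: Sequence[float]) -> Tuple[float, float, int]:
    """Return (E, d_max, ind) for a sentence score sequence d(j).

    Assumptions: E averages all Q entries; ind is the 1-based index of
    the first maximum, matching the convention that d(j) has index j + 1.
    """
    if not d:
        raise ValueError("the sentence score sequence must not be empty")
    mean_e = sum(d) / len(d)
    d_max = max(d)
    ind = list(d).index(d_max) + 1
    return mean_e, d_max, ind


if __name__ == "__main__":
    # Made-up scores; the last sentence has not been sung yet.
    print(sequence_stats([7.5, 8.0, 6.0, 0.0]))
```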
The total score evaluation unit 1303 is configured to perform a total-score operation on the mean of the sentence score sequence, the maximum of the sentence score sequence, and the index corresponding to that maximum, to obtain the total evaluation score of the target audio file.
The total-score operation performed by the total score evaluation unit 1303 may follow formula (12) of the embodiment shown in Fig. 7, yielding the total evaluation score s of the target audio file. As the user continues singing the target audio file, the s obtained in real time changes accordingly.
In this embodiment of the present invention, a sentence score sequence of the target audio file is built from the evaluation scores of at least one sung sentence of the target audio file, and a total-score operation is performed on that sequence. This yields a total evaluation score for the target audio file, which both meets users' practical needs when using audio files and makes audio-file applications more intelligent.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention; equivalent variations made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.