CN104143340A

CN104143340A - Audio evaluation method and device

Info

Publication number: CN104143340A
Application number: CN201410364103.6A
Authority: CN
Inventors: 赵伟峰
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Priority date: 2014-07-28
Filing date: 2014-07-28
Publication date: 2014-11-12
Anticipated expiration: 2034-07-28
Also published as: CN104143340B

Abstract

An embodiment of the invention provides an audio evaluation method and device. The method comprises: obtaining the evaluation score of at least one sung sentence in a target audio file; constructing a sentence score sequence for the target audio file from the evaluation score of the at least one sung sentence; and performing a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file. The method and device make it possible to evaluate a target audio file with a total score, meet application demands on audio files, and make applications of audio files more intelligent.

Description

Audio evaluation method and device

Technical field

The present invention relates to the field of Internet technology, specifically to the field of audio processing, and in particular to an audio evaluation method and device.

Background

With the development of Internet technology, Internet audio libraries now contain a large number of audio files such as songs and song clips, and applications built on Internet audio are increasingly common, for example KTV (karaoke television) systems and karaoke singing systems. When using an audio file, a user usually wants to know how well he or she performed it. For example, a user singing a song wants a total evaluation score for the performance in order to gauge his or her own singing level. How to evaluate audio files such as songs with a total score has therefore become a technical problem in urgent need of a solution.

Summary of the invention

Embodiments of the present invention provide an audio evaluation method and device that can evaluate a target audio file with a total score, meet application demands on audio files, and make applications of audio files more intelligent.

A first aspect of the embodiments of the present invention provides an audio evaluation method, which may comprise:

obtaining the evaluation score of at least one sung sentence in a target audio file;

constructing the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence; and

performing a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file.

A second aspect of the embodiments of the present invention provides an audio evaluation device, which may comprise:

a score acquisition module, configured to obtain the evaluation score of at least one sung sentence in a target audio file;

a construction module, configured to construct the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence; and

a total score evaluation module, configured to perform a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file.

Implementing the embodiments of the present invention has the following beneficial effects:

In the embodiments of the present invention, the sentence score sequence of a target audio file is constructed from the evaluation score of at least one sung sentence of the target audio file, and a total score operation is performed on the sentence score sequence. This achieves a total score evaluation of the target audio file, which both meets users' practical needs when using audio files and makes applications of audio files more intelligent.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the accompanying drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of an audio evaluation method provided by an embodiment of the present invention;

Fig. 2 is a flowchart of an embodiment of step S101 in the embodiment shown in Fig. 1;

Fig. 3 is a flowchart of an embodiment of step s1002 in the embodiment shown in Fig. 2;

Fig. 4 is a flowchart of an embodiment of step s2003 in the embodiment shown in Fig. 3;

Fig. 5 is a flowchart of an embodiment of step s2004 in the embodiment shown in Fig. 3;

Fig. 6 is a flowchart of an embodiment of step S102 in the embodiment shown in Fig. 1;

Fig. 7 is a flowchart of an embodiment of step S103 in the embodiment shown in Fig. 1;

Fig. 8 is a schematic structural diagram of an audio evaluation device provided by an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of an embodiment of the score acquisition module shown in Fig. 8;

Fig. 10 is a schematic structural diagram of an embodiment of the first score acquisition unit shown in Fig. 9;

Fig. 11 is a schematic structural diagram of an embodiment of the correlation operation unit shown in Fig. 10;

Fig. 12 is a schematic structural diagram of an embodiment of the score determination unit shown in Fig. 10;

Fig. 13 is a schematic structural diagram of an embodiment of the construction module shown in Fig. 8;

Fig. 14 is a schematic structural diagram of an embodiment of the total score evaluation module shown in Fig. 8.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

In the embodiments of the present invention, an audio file may include, but is not limited to, a song or a song clip. A source audio file is a file that can serve as the reference for audio evaluation, for example the original singer's version of a song, or a clip cut from that version. A target audio file is a file to be evaluated, for example a user's own rendition of the original song, or a clip cut from that rendition.

In the embodiments of the present invention, an audio file may consist of at least one audio sentence arranged in order; together, these audio sentences describe the parts of the audio file that need to be sung. Taking song A as an example, the description of song A can be expressed as follows:

[661，860]aaaaaaaa

[1541，320]bbbbbbbb

[1871，245]cccccccc

……

In the above description of song A, "aaaaaaaa", "bbbbbbbb" and "cccccccc" each represent one audio sentence. The "[]" before each audio sentence describes its time attribute, usually in milliseconds (ms). For example, [661, 860] describes the time attribute of the audio sentence "aaaaaaaa": "661" is its start time and "860" is its duration. Supposing song A lasts 5 minutes in total, the audio sentence "aaaaaaaa" starts being sung at 661 ms and ends after lasting 860 ms. The order of the audio sentences in an audio file can be determined from their start times. For example, according to the above description of song A, "aaaaaaaa" is the first audio sentence and its order in song A is 1; "bbbbbbbb" is the second audio sentence and its order in song A is 2; and so on. It can be understood that an audio file may also contain parts without singing before an audio sentence starts or after it ends. For example, the 0-661 ms period of song A is a part without singing, which may contain prelude information.
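The description format above can be read mechanically. The following is a minimal Python sketch, assuming each line has the form `[start,duration]text` in milliseconds; the names `Sentence` and `parse_description` are hypothetical and are not part of the original disclosure.

```python
import re
from typing import List, NamedTuple


class Sentence(NamedTuple):
    order: int        # 1-based order of the sentence within the file
    start_ms: int     # start time in milliseconds
    duration_ms: int  # duration in milliseconds
    text: str


# Accept both the ASCII comma and the full-width comma used in the listing.
_LINE = re.compile(r"\[(\d+)\s*[,，]\s*(\d+)\](.*)")


def parse_description(description: str) -> List[Sentence]:
    """Parse lines of the form [start,duration]text and order them by start time."""
    entries = []
    for line in description.splitlines():
        match = _LINE.match(line.strip())
        if match:  # lines that do not carry a time attribute are skipped
            start, duration, text = match.groups()
            entries.append((int(start), int(duration), text.strip()))
    entries.sort(key=lambda entry: entry[0])
    return [Sentence(i + 1, s, d, t) for i, (s, d, t) in enumerate(entries)]


song_a = parse_description("\n".join([
    "[661，860]aaaaaaaa",
    "[1541，320]bbbbbbbb",
    "[1871，245]cccccccc",
]))
for sentence in song_a:
    print(sentence.order, sentence.start_ms, sentence.duration_ms, sentence.text)
```

Sorting by start time mirrors the statement that the order of each sentence follows from its start time.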

In the embodiments of the present invention, the source audio file consists of at least one audio sentence arranged in order, and each such audio sentence may be called a reference sentence. The target audio file consists of at least one audio sentence arranged in order, and each such audio sentence may be called a sung sentence.

The audio evaluation method provided by the embodiments of the present invention is described in detail below with reference to Fig. 1 to Fig. 7.

Referring to Fig. 1, which is a flowchart of an audio evaluation method provided by an embodiment of the present invention, the method may comprise the following steps S101 to S103.

S101, obtain the evaluation score of at least one sung sentence in the target audio file.

The higher the evaluation score of a sung sentence in the target audio file, the better that sentence was sung and the closer it is to the corresponding reference sentence in the source audio file. Conversely, the lower the evaluation score, the worse the sentence was sung and the further it departs from the corresponding reference sentence. The target audio file may comprise at least one sung sentence, and this step obtains the evaluation scores of all sung sentences that the target audio file comprises.

S102, construct the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence.

In this step, the evaluation scores of the sung sentences are arranged in order to form the sentence score sequence of the target audio file.

S103, perform a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file.

The total evaluation score of the target audio file is computed from the evaluation scores of the sung sentences in the target audio file and reflects the overall singing level of the target audio file. The higher the total evaluation score, the higher the singing level of the target audio file and the closer it is to the source audio file. Conversely, the lower the total evaluation score, the lower the singing level of the target audio file and the further it departs from the source audio file.

Each step of the audio evaluation method shown in Fig. 1 is described in detail below with reference to Fig. 2 to Fig. 7.

Referring to Fig. 2, which is a flowchart of an embodiment of step S101 in the embodiment shown in Fig. 1, step S101 may comprise the following steps s1001 to s1003.

S1001, determine the order of the current sung sentence in the target audio file.

In this step, the order of the current sung sentence in the target audio file can be determined from the time attribute of the current sung sentence. The current sung sentence is the sung sentence in the target audio file that corresponds to the current playback time. Suppose the target audio file comprises Q sung sentences (Q is a positive integer). If the current playback time corresponds to the k-th sung sentence (k is a positive integer and 1≤k≤Q), the current sung sentence is the k-th sung sentence and its order in the target audio file is k. Taking song A described above as the target audio file: suppose song A lasts 5 minutes in total and the current playback time is 1895 ms. According to the description of song A, 1895 ms falls within the time period described by the time attribute of the audio sentence "cccccccc", so "cccccccc" is the current sung sentence and its order in the target audio file is 3.
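The lookup in step s1001 can be sketched as follows. This is a minimal illustration, assuming each time attribute is a (start, duration) pair in milliseconds and that a sentence covers the half-open interval from its start to its start plus duration; the function name `current_order` is hypothetical.

```python
from typing import List, Optional, Tuple


def current_order(time_attributes: List[Tuple[int, int]], now_ms: int) -> Optional[int]:
    """Return the 1-based order k of the sentence being played at now_ms, or None."""
    ordered = sorted(time_attributes)  # order sentences by start time
    for order, (start, duration) in enumerate(ordered, start=1):
        # Assumption: a sentence covers [start, start + duration).
        if start <= now_ms < start + duration:
            return order
    return None  # now_ms falls in a part without singing


song_a = [(661, 860), (1541, 320), (1871, 245)]
print(current_order(song_a, 1895))  # third sentence, as in the example
print(current_order(song_a, 100))   # prelude, no sentence is being sung
```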

S1002, obtain the evaluation score of the current sung sentence.

It should be noted that this step is preferably performed after the current sung sentence has finished being sung. In the example of step s1001, the current sung sentence of song A is the audio sentence "cccccccc", whose time attribute is [1871,245], so this step can obtain the evaluation score of the current sung sentence at time 1871 ms + 245 ms = 2116 ms.

In a specific implementation, referring also to Fig. 3, which is a flowchart of an embodiment of step s1002 in the embodiment shown in Fig. 2, step s1002 may comprise the following steps s2001 to s2004.

S2001, obtain the feature sequence under test of the current sung sentence.

A note is a symbol used to record sounds of different durations; note types include the whole note, half note (minim), quarter note (crotchet), eighth note (quaver), and so on. An audio sentence can be represented as a frame sequence made up of multiple audio frames. Each audio frame carries a note, and the notes, taken in the time order of their audio frames within the audio sentence, form the melody of the audio sentence. Pitch is how high or low a sound is. Each audio frame likewise carries a pitch, and the pitches, taken in the time order of their audio frames within the audio sentence, form the melody of the audio sentence. In summary, both the note sequence and the pitch sequence of an audio sentence reflect its melodic features.

In this step, the feature sequence under test of the current sung sentence can be obtained; it is the note sequence or the pitch sequence of the current sung sentence.

S2002, locate a reference sentence in the source audio file according to the order of the current sung sentence in the target audio file, and obtain the reference feature sequence of the reference sentence.

In this embodiment, unless otherwise specified, "the reference sentence" refers specifically to the reference sentence located in the source audio file. In this step, the order of the located reference sentence in the source audio file is the same as the order of the current sung sentence in the target audio file. Continuing the above example, if song A is the target audio file, the originally released version of the song, song B, is the source audio file, and the order of the current sung sentence is 3, then the order of the reference sentence located in song B is also 3: the third reference sentence of song B is chosen as the evaluation benchmark for the current sung sentence.

In one feasible implementation of the embodiments of the present invention, the feature sequence under test is the note sequence of the current sung sentence, and the reference feature sequence is the note sequence of the reference sentence. In another feasible implementation, the feature sequence under test is the pitch sequence of the current sung sentence, and the reference feature sequence is the pitch sequence of the reference sentence.

S2003, perform a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence.

Because the reference feature sequence characterizes the audio features of the reference sentence located in the source audio file, and the feature sequence under test characterizes the audio features of the current sung sentence in the target audio file, this step can perform a correlation operation between the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence.

S2004, determine the evaluation score of the current sung sentence according to the correlation coefficient sequence.

In this step, the higher the evaluation score of the current sung sentence, the better the current sung sentence was sung and the closer it is to the located reference sentence. Conversely, the lower the evaluation score of the current sung sentence, the worse it was sung and the further it departs from the located reference sentence.

S1003, according to the order of the current sung sentence in the target audio file, obtain the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and set to zero the evaluation scores of all sung sentences that follow the current sung sentence in the target audio file.

Suppose the target audio file comprises Q sung sentences. If the k-th sung sentence is currently being played, the current sung sentence is the k-th sung sentence; the sung sentences that precede it are the 1st to the (k-1)-th sung sentences, and the sung sentences that follow it are the (k+1)-th to the Q-th sung sentences. This step obtains the evaluation scores of the 1st to the (k-1)-th sung sentences and sets the evaluation scores of the (k+1)-th to the Q-th sung sentences to zero. It should be noted that the evaluation scores of the 1st to the (k-1)-th sung sentences are obtained in the same way as the evaluation score of the current sung sentence, which is not repeated here. It can be understood that the sung sentences that follow the current sung sentence have not yet been played for the user to sing, so this step can set their evaluation scores to zero.

Referring to Fig. 4, which is a flowchart of an embodiment of step s2003 in the embodiment shown in Fig. 3, step s2003 may comprise the following steps s3001 to s3004.

S3001, calculate the mean of the reference feature sequence and the mean of the feature sequence under test.

Suppose the located reference sentence comprises N audio frames. The reference feature sequence can be expressed as p(i), where i is an integer and 0≤i≤N-1. Specifically, if the reference feature sequence is the note sequence of the reference sentence, p(0) is the note of the first audio frame in the located reference sentence, p(1) is the note of the second audio frame, and so on, and p(N-1) is the note of the N-th audio frame. If the reference feature sequence is the pitch sequence of the reference sentence, p(0) is the pitch of the first audio frame in the located reference sentence, p(1) is the pitch of the second audio frame, and so on, and p(N-1) is the pitch of the N-th audio frame.

Suppose the current sung sentence comprises N audio frames. The feature sequence under test can be expressed as s(i), where i is an integer and 0≤i≤N-1. Specifically, if the feature sequence under test is the note sequence of the current sung sentence, s(0) is the note of the first audio frame in the current sung sentence, s(1) is the note of the second audio frame, and so on, and s(N-1) is the note of the N-th audio frame. If the feature sequence under test is the pitch sequence of the current sung sentence, s(0) is the pitch of the first audio frame in the current sung sentence, s(1) is the pitch of the second audio frame, and so on, and s(N-1) is the pitch of the N-th audio frame.

In this step, the following formula (1) can be used to calculate the mean of the reference feature sequence p(i) and the mean of the feature sequence under test s(i):

MP = mean(p(i))

MS = mean(s(i)) (1)

In formula (1), MP is the mean of the reference feature sequence p(i), MS is the mean of the feature sequence under test s(i), and mean() is the averaging operation.

S3002, normalize the reference feature sequence using the mean of the reference feature sequence, and normalize the feature sequence under test using the mean of the feature sequence under test.

The purpose of normalization is to bring the reference feature sequence and the feature sequence under test to the same baseline, eliminating the calculation deviation that would otherwise arise because the two sequences have different means.

In this step, formula (2) can be used to normalize the reference feature sequence:

p2(i) = p(i)-MP (2)

In formula (2), p2(i) is the reference feature sequence obtained after normalization.

In this step, formula (3) can be used to normalize the feature sequence under test:

s2(i) = s(i)-MS (3)

In formula (3), s2(i) is the feature sequence under test obtained after normalization.
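Formulas (1) to (3) amount to subtracting each sequence's own mean. The sketch below is a minimal illustration in Python; the helper names and the sample values, which stand for per-frame pitches, are invented for the example.

```python
from typing import List, Sequence


def mean(values: Sequence[float]) -> float:
    """Formula (1): arithmetic mean of a feature sequence."""
    return sum(values) / len(values)


def normalize(values: Sequence[float]) -> List[float]:
    """Formulas (2) and (3): subtract the sequence's own mean from every frame."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical per-frame pitch values: the singer follows the same contour
# as the reference but in a lower register.
p = [60.0, 62.0, 64.0, 62.0]  # reference feature sequence p(i)
s = [48.0, 50.0, 52.0, 50.0]  # feature sequence under test s(i)

p2 = normalize(p)
s2 = normalize(s)
print(p2)
print(s2)
```

Because each sequence is shifted by its own mean, a performance that follows the same contour at a different overall height yields the same normalized sequence as the reference.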

S3003, using a preset clipping threshold, convert the normalized reference feature sequence into a reference value sequence, and convert the normalized feature sequence under test into a value sequence under test.

The preset clipping threshold can be set according to actual needs. Preferably, the clipping threshold can be set using formula (4):

Th_xue = max(max(abs(p2(i))), max(abs(s2(i)))) (4)

In formula (4), Th_xue is the preset clipping threshold, max() is the maximum operation, and abs() is the absolute value operation.

In this step, formula (5) can be used to convert the normalized reference feature sequence into the reference value sequence:

$$p3(i) = \begin{cases} 1, & p2(i) > Th\_xue \\ -1, & p2(i) < Th\_xue \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

In formula (5), p3(i) is the reference value sequence.

In this step, formula (6) can be used to convert the normalized feature sequence under test into the value sequence under test:

$$s3(i) = \begin{cases} 1, & s2(i) > Th\_xue \\ -1, & s2(i) < Th\_xue \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

In formula (6), s3(i) is the value sequence under test.
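The conversion in formulas (5) and (6) can be sketched as a three-level quantizer. The Python sketch below codes the conditions exactly as the formulas state them and takes the threshold as a parameter, since the text allows it to be set according to actual needs; the function names and the sample threshold of 0.5 are assumptions made for illustration.

```python
from typing import List, Sequence


def threshold_from_sequences(p2: Sequence[float], s2: Sequence[float]) -> float:
    """Formula (4): the largest absolute value found in either normalized sequence."""
    return max(max(abs(v) for v in p2), max(abs(v) for v in s2))


def clip(values: Sequence[float], threshold: float) -> List[int]:
    """Formulas (5) and (6), coded as written: 1 above, -1 below, 0 otherwise."""
    result = []
    for v in values:
        if v > threshold:
            result.append(1)
        elif v < threshold:
            result.append(-1)
        else:
            result.append(0)
    return result


# Hypothetical normalized values and a hypothetical threshold chosen by hand.
print(clip([-2.0, 0.5, 3.0], threshold=0.5))
print(threshold_from_sequences([-2.0, 0.0, 2.0, 0.0], [-1.0, 3.0, 0.0, -2.0]))
```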

S3004, perform a correlation operation on the reference value sequence and the value sequence under test using a cross-correlation function to obtain a correlation coefficient sequence.

In one feasible implementation of this step, formula (7) can be used to perform the correlation operation on the reference value sequence and the value sequence under test:

$$R(n) = \frac{1}{N} \sum_{i=0}^{N-1} p3(i) \cdot s3(i-n) \qquad (7)$$

In formula (7), R(n) is the correlation coefficient sequence; "·" denotes multiplication; s3(i-n) is the sequence formed by cyclically shifting s3(i) by n positions, where 0≤n≤N-1.

In another feasible implementation of this step, formula (8) can be used to perform the correlation operation on the reference value sequence and the value sequence under test:

$$R(n) = \frac{1}{N} \sum_{i=0}^{N-1} p3(i-n) \cdot s3(i) \qquad (8)$$

In formula (8), R(n) is the correlation coefficient sequence; "·" denotes multiplication; p3(i-n) is the sequence formed by cyclically shifting p3(i) by n positions, where 0≤n≤N-1.
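A minimal Python sketch of the correlation in formula (7) follows, assuming the cyclic shift is realized by taking the index modulo N and that the sum runs over all N frames. The function name `circular_correlation` and the sample sequences are hypothetical.

```python
from typing import List, Sequence


def circular_correlation(p3: Sequence[int], s3: Sequence[int]) -> List[float]:
    """Formula (7): R(n) = (1/N) * sum over i of p3(i) * s3(i - n), cyclic in i - n."""
    n_frames = len(p3)
    if len(s3) != n_frames:
        raise ValueError("both sequences must contain the same number of frames")
    return [
        sum(p3[i] * s3[(i - n) % n_frames] for i in range(n_frames)) / n_frames
        for n in range(n_frames)
    ]


# Hypothetical clipped sequences: s3 is p3 delayed by one frame.
p3 = [1, 0, -1, 0]
s3 = [0, 1, 0, -1]
r = circular_correlation(p3, s3)
print(r)
print(max(r))
```

In this example the largest coefficient appears at the shift that realigns the two sequences, which is why the later steps work with the maximum of R(n) rather than with R(0) alone.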

Referring to Fig. 5, which is a flowchart of an embodiment of step s2004 in the embodiment shown in Fig. 3, step s2004 may comprise the following steps s4001 to s4003.

S4001, calculate the maximum of the correlation coefficient sequence.

In this step, the following formula (9) can be used to calculate the maximum of the correlation coefficient sequence:

RMAX = max(R(n)) (9)

In formula (9), R(n) is the correlation coefficient sequence, max() is the maximum operation, and RMAX is the maximum of the correlation coefficient sequence.

S4002, map the maximum of the correlation coefficient sequence to a preset score interval to obtain the mapped value of the maximum.

The preset score interval can be set according to actual needs; for example, it can be set to [0,10] or to [0,100]. In this step, the preset score interval can be written [score_min, score_max]. The maximum RMAX of the correlation coefficient sequence is mapped to the preset score interval by a linear or nonlinear method, and the resulting mapped value can be written score_{k-1}, which lies within the preset score interval [score_min, score_max].

S4003, take the mapped value as the evaluation score of the current sung sentence.

This step takes the mapped value score_{k-1} as the evaluation score of the current sung sentence; that is, the evaluation score of the current sung sentence is the value of score_{k-1}.
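Steps s4001 to s4003 can be sketched with a linear mapping. The text leaves the mapping open (linear or nonlinear), so the version below is only one possible choice: it assumes that RMAX is first clamped to [0, 1] and then scaled onto [score_min, score_max]. The function name and the clamping are assumptions, not part of the original disclosure.

```python
from typing import Sequence


def sentence_score(r: Sequence[float], score_min: float = 0.0, score_max: float = 100.0) -> float:
    """Map the maximum correlation coefficient onto [score_min, score_max]."""
    rmax = max(r)  # formula (9)
    # Assumption: coefficients below 0 score the minimum, above 1 the maximum.
    clamped = min(max(rmax, 0.0), 1.0)
    return score_min + clamped * (score_max - score_min)


example = sentence_score([0.0, -0.5, 0.0, 0.5])
print(example)
print(sentence_score([-0.5, -0.25], score_min=0.0, score_max=10.0))
```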

Referring to Fig. 6, which is a flowchart of an embodiment of step S102 in the embodiment shown in Fig. 1, step S102 may comprise the following steps s5001 to s5002.

S5001, obtain the order of each of the at least one sung sentence in the target audio file.

In this step, the order of each sung sentence in the target audio file can be determined from the time attribute of that sung sentence.

S5002, arrange the evaluation scores of the sung sentences according to their order in the target audio file to form the sentence score sequence of the target audio file.

Suppose the target audio file comprises Q sung sentences. The sentence score sequence of the target audio file can be denoted d(j), where j is an integer and 0≤j≤Q-1. The sentence score sequence d(j) is made up of the evaluation score of each sung sentence and its corresponding index, where the index corresponding to an evaluation score is the order of the sung sentence that obtained that score. Specifically, d(0) is the evaluation score of the first sung sentence in the target audio file and its index is 1; d(1) is the evaluation score of the second sung sentence and its index is 2; and so on, up to d(Q-1), which is the evaluation score of the Q-th sung sentence and whose index is Q. Following the example in the embodiments of the present invention, the evaluation score of the current sung sentence can be expressed as d(k-1); the value of d(k-1) is score_{k-1} and its index is k. Within the sentence score sequence d(j), the value of d(0) is score_0 and its index is 1; the value of d(1) is score_1 and its index is 2; and so on, the value of d(k-2) is score_{k-2} and its index is k-1. In line with step s1003, the values of d(k) to d(Q-1) are 0; the index of d(k) is k+1, and so on, up to d(Q-1), whose index is Q.
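The construction of d(j) can be sketched as padding the scores obtained so far with zeros, as step s1003 prescribes for sentences that have not yet been sung. This is a minimal Python illustration; the function name and the sample scores are hypothetical.

```python
from typing import List, Sequence


def build_score_sequence(scores_so_far: Sequence[float], total_sentences: int) -> List[float]:
    """Return d(j) for j = 0..Q-1: known scores first, then zeros for later sentences."""
    k = len(scores_so_far)  # sentences 1..k have been scored
    if k > total_sentences:
        raise ValueError("more scores than sentences in the file")
    return list(scores_so_far) + [0.0] * (total_sentences - k)


# Hypothetical case: Q = 5 sentences, the third one has just been scored (k = 3).
d = build_score_sequence([80.0, 65.0, 90.0], total_sentences=5)
print(d)
```

With a 0-based Python list, the element at position j corresponds to the sentence whose order, and therefore whose index in the sense used above, is j + 1.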

Refer to Fig. 7, which is a flowchart of an embodiment of step S103 in the embodiment shown in Fig. 1. Step S103 can comprise the following steps S6001 to S6003.

S6001: calculate the mean and the maximum of the sentence score sequence.

In this step, the mean of the sentence score sequence can be calculated with the following formula (10):

$$E = \operatorname{mean}(d(j)) \tag{10}$$

In formula (10), mean() is the averaging operation.

In this step, the maximum of the sentence score sequence can be calculated with the following formula (11):

$$[d_{\max}, \mathrm{ind}] = \max(d(j)) \tag{11}$$

In formula (11), max() is the maximum operation, $d_{\max}$ is the maximum value in d(j), and ind is the index corresponding to d(j) when it takes that maximum value.

S6002: obtain the index, in the sentence score sequence, that corresponds to the maximum of the sentence score sequence. The index obtained in this step is ind in formula (11) above.

S6003: perform a total score operation on the mean of the sentence score sequence, the maximum of the sentence score sequence, and the index corresponding to that maximum, to obtain the total evaluation score of the target audio file.

In this step, the total score operation can follow formula (12):

$$s = \max\left\{E + d_{\max}\cdot\exp\left[\frac{\mathrm{ind}-(k+1)}{k+1}\right],\ E\right\} \tag{12}$$

In formula (12), max{} is the maximum operation; exp is the exponential function with base e; k is the order of the current sung sentence in the target audio file; and s is the total evaluation score of the target audio file. As the user continues to sing the target audio file, k keeps changing and d(j) keeps changing, so the s obtained in real time changes accordingly.
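Formulas (10) to (12) can be sketched in Python as follows. This is only an illustration under stated assumptions: the mean is taken over all Q entries of d(j), including the zero entries of sentences not yet sung, and when the maximum occurs more than once the first occurrence supplies ind. The function name `total_score` is hypothetical.

```python
import math
from typing import Sequence


def total_score(d: Sequence[float], k: int) -> float:
    """Evaluate formula (12) for score sequence d and current order k."""
    if not d:
        raise ValueError("the sentence score sequence must not be empty")
    e = sum(d) / len(d)           # formula (10): mean of d(j)
    d_max = max(d)                # formula (11): maximum of d(j)
    ind = d.index(d_max) + 1      # formula (11): 1-based index of the maximum
    # (assumption: the first occurrence is used when the maximum is tied)
    boosted = e + d_max * math.exp((ind - (k + 1)) / (k + 1))
    return max(boosted, e)        # formula (12)


# Four sentences in total, the second one is current (k = 2).
s = total_score([80.0, 65.0, 0.0, 0.0], k=2)
```

For this input E = 36.25, the maximum is 80 with ind = 1, and s equals 36.25 + 80·exp(-2/3). Because of the outer maximum, s is never smaller than E.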

An audio evaluation device provided by the embodiments of the present invention is described in detail below with reference to Fig. 8 to Fig. 14. It should be noted that the audio evaluation device shown in Fig. 8 to Fig. 14 can be used to carry out the methods shown in Fig. 1 to Fig. 7 above. In practice, the audio evaluation device can run on a server, or on a terminal such as a notebook computer, a mobile phone, a PAD (tablet computer), or a smart wearable device.

Refer to Fig. 8, which is a structural diagram of an audio evaluation device provided by an embodiment of the present invention. The device can comprise a score acquisition module 101, a construction module 102, and a total score evaluation module 103.

The score acquisition module 101 is configured to obtain the evaluation score of at least one sung sentence in a target audio file.

The higher the evaluation score of a sung sentence in the target audio file, the better the singing of that sentence and the closer it is to the singing of the corresponding reference sentence in the source audio file. Conversely, the lower the evaluation score, the worse the singing of that sentence and the further it departs from the singing of the corresponding reference sentence in the source audio file. The target audio file can contain at least one sung sentence, and the score acquisition module 101 needs to obtain the evaluation scores of all sung sentences contained in the target audio file.

The construction module 102 is configured to build the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence.

The construction module 102 arranges, in order, the evaluation scores of the sung sentences in the at least one sung sentence, thereby building the sentence score sequence of the target audio file.

The total score evaluation module 103 is configured to perform a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file.

Refer to Fig. 9, which is a structural diagram of an embodiment of the score acquisition module shown in Fig. 8. The score acquisition module 101 can comprise an order determining unit 1101, a first score acquisition unit 1102, and a second score acquisition unit 1103.

The order determining unit 1101 is configured to determine the order of the current sung sentence in the target audio file.

The order determining unit 1101 can determine the order of the current sung sentence in the target audio file according to the time attribute of the current sung sentence. Here, the current sung sentence is the sung sentence in the target audio file that corresponds to the current playback time. Assume the target audio file contains Q sung sentences (Q is a positive integer). If the current playback time corresponds to the k-th sung sentence (k is a positive integer and 1≤k≤Q), then the current sung sentence is the k-th sung sentence, and its order in the target audio file is k. Take the song A described above as the target audio file: suppose song A lasts 5 minutes in total and the current playback time is 1895 ms. According to the description of song A, 1895 ms falls within the time period described by the time attribute of the audio sentence "cccccccc", so the audio sentence "cccccccc" is the current sung sentence, and its order in the target audio file is 3.
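The lookup performed by the order determining unit can be sketched as follows, assuming each time attribute is a pair [start time, duration] in milliseconds. Only the attribute [1871, 245] of the audio sentence "cccccccc" comes from the example of this embodiment; the attributes of the first two sentences below and the function name `current_sentence_order` are hypothetical.

```python
from typing import List, Optional, Tuple


def current_sentence_order(
    time_attrs: List[Tuple[int, int]], now_ms: int
) -> Optional[int]:
    """Return the 1-based order k of the sentence being played at now_ms.

    Returns None when now_ms falls between sentences or outside the song.
    """
    ordered = sorted(time_attrs)  # sort by start time
    for order, (start, duration) in enumerate(ordered, start=1):
        if start <= now_ms < start + duration:
            return order
    return None


# Hypothetical attributes for sentences 1 and 2; sentence 3 is "cccccccc".
attrs = [(0, 900), (950, 800), (1871, 245)]
k = current_sentence_order(attrs, 1895)
```

With these assumed attributes, a playback time of 1895 ms falls inside the third sentence, so `k` is 3.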

The first score acquisition unit 1102 is configured to obtain the evaluation score of the current sung sentence.

It should be noted that the first score acquisition unit 1102 preferably performs this acquisition after the current sung sentence has been sung to the end. In the example of this embodiment, the current sung sentence of song A is the audio sentence "cccccccc", whose time attribute is [1871, 245], so the first score acquisition unit 1102 can obtain the evaluation score of the current sung sentence at the moment 1871 ms + 245 ms = 2116 ms.

The second score acquisition unit 1103 is configured to, according to the order of the current sung sentence in the target audio file, obtain the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and set to zero the evaluation scores of all sung sentences that follow the current sung sentence in the target audio file.

Assume the target audio file contains Q sung sentences. If the k-th sung sentence is currently being played, the current sung sentence is the k-th sung sentence; the sung sentences that precede it in the target audio file are the 1st to the (k-1)-th sung sentences, and the sung sentences that follow it are the (k+1)-th to the Q-th sung sentences. The second score acquisition unit 1103 needs to obtain the evaluation scores of the 1st to the (k-1)-th sung sentences and to set the evaluation scores of the (k+1)-th to the Q-th sung sentences to zero. It should be noted that the evaluation scores of the 1st to the (k-1)-th sung sentences can be obtained in the same way as the evaluation score of the current sung sentence. It can be understood that, because the sung sentences that follow the current sung sentence have not yet been played for the user to sing, the second score acquisition unit 1103 can set their evaluation scores to zero.
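The combined result of the first and second score acquisition units can be sketched as follows: the scores of sentences 1 to k-1, the score of the current sentence k, and zeros for sentences k+1 to Q. The function name `assemble_scores` and its parameters are hypothetical, and the sketch assumes the earlier scores have already been computed.

```python
from typing import List, Sequence


def assemble_scores(
    q: int, k: int, earlier_scores: Sequence[float], current_score: float
) -> List[float]:
    """Return Q scores: sentences 1..k-1, sentence k, then zeros for k+1..Q."""
    if not 1 <= k <= q:
        raise ValueError("k must satisfy 1 <= k <= Q")
    if len(earlier_scores) != k - 1:
        raise ValueError("exactly k-1 earlier scores are required")
    # Sentences after the current one have not been sung yet: score zero.
    return list(earlier_scores) + [current_score] + [0.0] * (q - k)


# Q = 5 sentences, the third one (k = 3) has just been sung.
scores = assemble_scores(q=5, k=3, earlier_scores=[80.0, 65.0], current_score=72.0)
```

The returned list always has length Q, so it can be used directly as the input from which the sentence score sequence is built.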

Refer to Fig. 10, which is a structural diagram of an embodiment of the first score acquisition unit shown in Fig. 9. The first score acquisition unit 1102 can comprise a sequence-under-test acquisition unit 1111, a reference sequence acquisition unit 1112, a correlation operation unit 1113, and a score determining unit 1114.

The sequence-under-test acquisition unit 1111 is configured to obtain the feature sequence under test of the current sung sentence.

The sequence-under-test acquisition unit 1111 can obtain the feature sequence under test of the current sung sentence; the feature sequence under test is the note sequence or the pitch sequence of the current sung sentence.

The reference sequence acquisition unit 1112 is configured to locate a reference sentence in a source audio file according to the order of the current sung sentence in the target audio file, and to obtain the reference feature sequence of the reference sentence.

In this embodiment, unless otherwise specified, "the reference sentence" refers specifically to the reference sentence located in the source audio file. The order of the reference sentence located in the source audio file is the same as the order of the current sung sentence in the target audio file. For example, if song A is the target audio file, the original recording of song A as released, song B, is the source audio file, and the order of the current sung sentence is 3, then the order of the reference sentence located in song B is also 3; the reference sequence acquisition unit 1112 selects the 3rd reference sentence from song B as the evaluation benchmark for the current sung sentence.

The correlation operation unit 1113 is configured to perform a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence.

Because the reference feature sequence characterizes the audio features of the reference sentence located in the source audio file, and the feature sequence under test characterizes the audio features of the current sung sentence in the target audio file, the correlation operation unit 1113 can perform a correlation operation between the two sequences to obtain the correlation coefficient sequence.

The score determining unit 1114 is configured to determine the evaluation score of the current sung sentence according to the correlation coefficient sequence.

The higher the evaluation score of the current sung sentence, the better its singing and the closer it is to the singing of the located reference sentence. Conversely, the lower the evaluation score, the worse its singing and the further it departs from the singing of the located reference sentence.

Refer to Fig. 11, which is a structural diagram of an embodiment of the correlation operation unit shown in Fig. 10. The correlation operation unit 1113 can comprise a mean calculation subunit 1311, a normalization subunit 1312, a sequence conversion subunit 1313, and a correlation operation subunit 1314.

The mean calculation subunit 1311 is configured to calculate the mean of the reference feature sequence and the mean of the feature sequence under test.

The mean calculation subunit 1311 can use formula (1) of the embodiment shown in Fig. 4 to calculate the mean of the reference feature sequence p(i) and the mean of the feature sequence under test s(i).

The normalization subunit 1312 is configured to normalize the reference feature sequence using the mean of the reference feature sequence, and to normalize the feature sequence under test using the mean of the feature sequence under test.

The purpose of normalization is to bring the reference feature sequence and the feature sequence under test onto the same baseline, so as to eliminate the calculation deviation caused by the two sequences not sharing the same mean baseline. The normalization subunit 1312 can use formula (2) of the embodiment shown in Fig. 4 to normalize the reference feature sequence, obtaining the normalized reference feature sequence p2(i), and can use formula (3) of the embodiment shown in Fig. 4 to normalize the feature sequence under test, obtaining the normalized feature sequence under test s2(i).

The sequence conversion subunit 1313 is configured to use a preset clipping threshold to convert the normalized reference feature sequence into a reference numerical sequence, and to convert the normalized feature sequence under test into a numerical sequence under test.

The preset clipping threshold can be set according to actual needs; preferably, it can be set using formula (4) of the embodiment shown in Fig. 4. The sequence conversion subunit 1313 can use formula (5) of the embodiment shown in Fig. 4 to convert the normalized reference feature sequence into the reference numerical sequence p3(i), and can use formula (6) of the embodiment shown in Fig. 4 to convert the normalized feature sequence under test into the numerical sequence under test s3(i).

The correlation operation subunit 1314 is configured to perform a correlation operation on the reference numerical sequence and the numerical sequence under test using a cross-correlation function, to obtain the correlation coefficient sequence.

In one feasible implementation of this embodiment, the correlation operation subunit 1314 can use formula (7) of the embodiment shown in Fig. 4 to perform the correlation operation on p3(i) and s3(i), obtaining the correlation coefficient sequence R(n). In another feasible implementation of this embodiment, the correlation operation subunit 1314 can use formula (8) of the embodiment shown in Fig. 4 to perform the correlation operation on p3(i) and s3(i), obtaining the correlation coefficient sequence R(n).
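The four subunits can be pictured with the following Python sketch. Formulas (1) to (8) are not reproduced in this passage, so every concrete choice here is an assumption made for illustration only: normalization is taken to be subtraction of the mean, the conversion is taken to be a three-level clipping to -1, 0 and +1, the threshold value is arbitrary, and the cross-correlation is the plain unnormalized sum over all lags. The function names are hypothetical.

```python
from typing import List, Sequence


def normalize(seq: Sequence[float]) -> List[float]:
    """Assumed normalization: subtract the sequence mean."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]


def clip_to_levels(seq: Sequence[float], threshold: float) -> List[int]:
    """Assumed conversion: three-level clipping with a preset threshold."""
    return [1 if x > threshold else -1 if x < -threshold else 0 for x in seq]


def cross_correlation(p3: Sequence[int], s3: Sequence[int]) -> List[int]:
    """R(n) = sum_i p3[i] * s3[i - n] for lags n = -(len(s3)-1) .. len(p3)-1."""
    result = []
    for lag in range(-(len(s3) - 1), len(p3)):
        total = 0
        for i, p_val in enumerate(p3):
            j = i - lag
            if 0 <= j < len(s3):
                total += p_val * s3[j]
        result.append(total)
    return result


# Hypothetical note sequences: the same contour sung 12 semitones lower.
p3 = clip_to_levels(normalize([60, 62, 64, 62]), threshold=1.0)
s3 = clip_to_levels(normalize([48, 50, 52, 50]), threshold=1.0)
r = cross_correlation(p3, s3)
```

Because both sequences are normalized by their own means, the transposed melody produces the same clipped sequence as the reference, and the largest coefficient in `r` occurs at lag 0, which is list position `len(s3) - 1`.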

Refer to Fig. 12, which is a structural diagram of an embodiment of the score determining unit shown in Fig. 10. The score determining unit 1114 can comprise a maximum calculation subunit 1411, a mapping subunit 1412, and a score determining subunit 1413.

The maximum calculation subunit 1411 is configured to calculate the maximum of the correlation coefficient sequence.

The maximum calculation subunit 1411 can use formula (9) of the embodiment shown in Fig. 5 to calculate the maximum RMAX of the correlation coefficient sequence.

The mapping subunit 1412 is configured to map the maximum of the correlation coefficient sequence to a preset score interval, obtaining a mapping value of that maximum.

The preset score interval can be set according to actual needs; for example, it can be set to [0, 10] or to [0, 100]. The mapping subunit 1412 can denote the preset score interval by [score_min, score_max] and map the maximum RMAX of the correlation coefficient sequence into that interval by a linear or nonlinear method. The mapping value obtained can be denoted score_{k-1}, and it lies within the preset score interval [score_min, score_max].
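A linear version of the mapping can be sketched as follows. The embodiment does not state the value range of RMAX in this passage, so the sketch assumes the caller knows a lower bound `r_lo` and an upper bound `r_hi` for it; these two parameters and the function name `map_to_score` are hypothetical. Values outside the assumed range are clamped so that the result stays within [score_min, score_max].

```python
def map_to_score(
    r_max: float,
    r_lo: float,
    r_hi: float,
    score_min: float = 0.0,
    score_max: float = 100.0,
) -> float:
    """Linearly map r_max from [r_lo, r_hi] to [score_min, score_max]."""
    if r_hi <= r_lo:
        raise ValueError("r_hi must be greater than r_lo")
    ratio = (r_max - r_lo) / (r_hi - r_lo)
    ratio = min(1.0, max(0.0, ratio))  # clamp out-of-range inputs
    return score_min + ratio * (score_max - score_min)


# A maximum of 2 in an assumed range [0, 4], mapped to [0, 100].
score = map_to_score(2.0, r_lo=0.0, r_hi=4.0)
```

A nonlinear mapping, which the embodiment also allows, could replace the `ratio` line with any monotonic function of the same ratio.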

The score determining subunit 1413 is configured to determine the mapping value as the evaluation score of the current sung sentence.

The score determining subunit 1413 can determine the mapping value score_{k-1} as the evaluation score of the current sung sentence; that is, the evaluation score of the current sung sentence is the value of score_{k-1}.

Refer to Fig. 13, which is a structural diagram of an embodiment of the construction module shown in Fig. 8. The construction module 102 can comprise an order acquisition unit 1201 and a construction unit 1202.

The order acquisition unit 1201 is configured to obtain the order, in the target audio file, of each sung sentence in the at least one sung sentence.

The order acquisition unit 1201 can determine the order of each sung sentence in the target audio file according to the time attribute of each sung sentence.

The construction unit 1202 is configured to arrange the evaluation scores of the sung sentences according to the order of the sung sentences in the target audio file, to form the sentence score sequence of the target audio file.

Refer to Fig. 14, which is a structural diagram of an embodiment of the total score evaluation module shown in Fig. 8. The total score evaluation module 103 can comprise a calculation unit 1301, an index acquisition unit 1302, and a total score evaluation unit 1303.

The calculation unit 1301 is configured to calculate the mean and the maximum of the sentence score sequence.

The calculation unit 1301 can use formula (10) of the embodiment shown in Fig. 7 to calculate the mean E of the sentence score sequence, and can use formula (11) of the embodiment shown in Fig. 7 to calculate the maximum dmax of the sentence score sequence.

The index acquisition unit 1302 is configured to obtain the index, in the sentence score sequence, that corresponds to the maximum of the sentence score sequence. The index obtained by the index acquisition unit 1302 can be ind in formula (11) of the embodiment shown in Fig. 7.

The total score evaluation unit 1303 is configured to perform a total score operation on the mean of the sentence score sequence, the maximum of the sentence score sequence, and the index corresponding to that maximum, to obtain the total evaluation score of the target audio file.

The total score operation performed by the total score evaluation unit 1303 can follow formula (12) of the embodiment shown in Fig. 7, yielding the total evaluation score s of the target audio file. As the user continues to sing the target audio file, the s obtained in real time changes accordingly.

A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention; equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims

1. An audio evaluation method, characterized by comprising:

2. The method of claim 1, characterized in that obtaining the evaluation score of at least one sung sentence in the target audio file comprises:

determining the order of a current sung sentence in the target audio file;

obtaining the evaluation score of the current sung sentence;

according to the order of the current sung sentence in the target audio file, obtaining the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and setting to zero the evaluation scores of all sung sentences that follow the current sung sentence in the target audio file.

3. The method of claim 2, wherein obtaining the evaluation score of the current sung sentence comprises:

obtaining a feature sequence under test of the current sung sentence;

locating a reference sentence in a source audio file according to the order of the current sung sentence in the target audio file, and obtaining a reference feature sequence of the reference sentence;

performing a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence;

determining the evaluation score of the current sung sentence according to the correlation coefficient sequence.

4. The method of claim 3, characterized in that the feature sequence under test is the note sequence of the current sung sentence, and the reference feature sequence is the note sequence of the reference sentence; or,

the feature sequence under test is the pitch sequence of the current sung sentence, and the reference feature sequence is the pitch sequence of the reference sentence.

5. The method of claim 3, characterized in that performing a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence comprises:

calculating the mean of the reference feature sequence and the mean of the feature sequence under test;

normalizing the reference feature sequence using the mean of the reference feature sequence, and normalizing the feature sequence under test using the mean of the feature sequence under test;

using a preset clipping threshold, converting the normalized reference feature sequence into a reference numerical sequence, and converting the normalized feature sequence under test into a numerical sequence under test;

performing a correlation operation on the reference numerical sequence and the numerical sequence under test using a cross-correlation function, to obtain the correlation coefficient sequence.

6. The method of claim 3, characterized in that determining the evaluation score of the current sung sentence according to the correlation coefficient sequence comprises:

calculating the maximum of the correlation coefficient sequence;

mapping the maximum of the correlation coefficient sequence to a preset score interval, to obtain a mapping value of the maximum of the correlation coefficient sequence;

determining the mapping value as the evaluation score of the current sung sentence.

7. The method of any one of claims 1 to 6, characterized in that building the sentence score sequence of the target audio file according to the evaluation score of the at least one sung sentence comprises:

obtaining the order, in the target audio file, of each sung sentence in the at least one sung sentence;

arranging the evaluation scores of the sung sentences according to the order of the sung sentences in the target audio file, to form the sentence score sequence of the target audio file.

8. The method of claim 7, characterized in that performing a total score operation on the sentence score sequence to obtain the total evaluation score of the target audio file comprises:

calculating the mean and the maximum of the sentence score sequence;

obtaining the index, in the sentence score sequence, that corresponds to the maximum of the sentence score sequence;

performing a total score operation on the mean of the sentence score sequence, the maximum of the sentence score sequence, and the index corresponding to that maximum, to obtain the total evaluation score of the target audio file.

9. An audio evaluation device, characterized by comprising:

10. The device of claim 9, characterized in that the score acquisition module comprises:

an order determining unit, configured to determine the order of a current sung sentence in the target audio file;

a first score acquisition unit, configured to obtain the evaluation score of the current sung sentence;

a second score acquisition unit, configured to, according to the order of the current sung sentence in the target audio file, obtain the evaluation scores of all sung sentences that precede the current sung sentence in the target audio file, and set to zero the evaluation scores of all sung sentences that follow the current sung sentence in the target audio file.

11. The device of claim 10, characterized in that the first score acquisition unit comprises:

a sequence-under-test acquisition unit, configured to obtain a feature sequence under test of the current sung sentence;

a reference sequence acquisition unit, configured to locate a reference sentence in a source audio file according to the order of the current sung sentence in the target audio file, and to obtain a reference feature sequence of the reference sentence;

a correlation operation unit, configured to perform a correlation operation on the reference feature sequence and the feature sequence under test to obtain a correlation coefficient sequence;

a score determining unit, configured to determine the evaluation score of the current sung sentence according to the correlation coefficient sequence.

12. The device of claim 11, characterized in that the feature sequence under test is the note sequence of the current sung sentence, and the reference feature sequence is the note sequence of the reference sentence; or,

13. The device of claim 11, characterized in that the correlation operation unit comprises:

a mean calculation subunit, configured to calculate the mean of the reference feature sequence and the mean of the feature sequence under test;

a normalization subunit, configured to normalize the reference feature sequence using the mean of the reference feature sequence, and to normalize the feature sequence under test using the mean of the feature sequence under test;

a sequence conversion subunit, configured to use a preset clipping threshold to convert the normalized reference feature sequence into a reference numerical sequence, and to convert the normalized feature sequence under test into a numerical sequence under test;

a correlation operation subunit, configured to perform a correlation operation on the reference numerical sequence and the numerical sequence under test using a cross-correlation function, to obtain the correlation coefficient sequence.

14. The device of claim 11, characterized in that the score determining unit comprises:

a maximum calculation subunit, configured to calculate the maximum of the correlation coefficient sequence;

a mapping subunit, configured to map the maximum of the correlation coefficient sequence to a preset score interval, to obtain a mapping value of the maximum of the correlation coefficient sequence;

a score determining subunit, configured to determine the mapping value as the evaluation score of the current sung sentence.

15. The device of any one of claims 9 to 14, characterized in that the construction module comprises:

an order acquisition unit, configured to obtain the order, in the target audio file, of each sung sentence in the at least one sung sentence;

a construction unit, configured to arrange the evaluation scores of the sung sentences according to the order of the sung sentences in the target audio file, to form the sentence score sequence of the target audio file.

16. The device of claim 15, characterized in that the total score evaluation module comprises:

a calculation unit, configured to calculate the mean and the maximum of the sentence score sequence;

an index acquisition unit, configured to obtain the index, in the sentence score sequence, that corresponds to the maximum of the sentence score sequence;

a total score evaluation unit, configured to perform a total score operation on the mean of the sentence score sequence, the maximum of the sentence score sequence, and the index corresponding to that maximum, to obtain the total evaluation score of the target audio file.