CN102034471A - Music extraction device and music recording device - Google Patents

Info

Publication number
CN102034471A
Authority
CN
China
Prior art keywords
melody
sound power
time
variable quantity
differential signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010102943740A
Other languages
Chinese (zh)
Inventor
古贺达雄
大前寿敏
岛冈秀人
山本友二
松本悟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanyo Electric Co Ltd
Original Assignee
Sanyo Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP2010195431A external-priority patent/JP2011090290A/en
Application filed by Sanyo Electric Co Ltd filed Critical Sanyo Electric Co Ltd
Publication of CN102034471A publication Critical patent/CN102034471A/en
Pending legal-status Critical Current

Landscapes

  • Electrophonic Musical Instruments (AREA)

Abstract

The invention provides a music extraction device and a music recording device. The music extraction device includes: a sound power calculation part that calculates sound power from a voice signal; and a determination part that determines, based on the sound power, whether a part is a music part or a non-music part.

Description

Melody extraction device and melody recording device
This application is based on Japanese Patent Application No. 2009-223066, filed on September 28, 2009, and Japanese Patent Application No. 2010-195431, filed on September 1, 2010.
Technical field
The present invention relates to a melody extraction device that extracts only the melody part of a radio broadcast, and to a melody recording device that records the melody.
Background technology
There are digital regeneration (playback) devices that automatically extract and store the musical parts of a received radio broadcast. For example, one such device judges whether the left-channel and right-channel data of the broadcast data are stereo or monaural, treats the stereo parts as melody and the monaural parts as non-melody, and thereby extracts the melody part.
However, in this digital regeneration device, the separation between the left-channel and right-channel data decreases when the received electric field strength of the radio broadcast is low, so a voice signal that is actually stereo may be judged to be non-stereo, and the melody part cannot be extracted correctly. Moreover, this device cannot extract the melody part unless the broadcast transmits at least left-channel and right-channel data (for example, FM (frequency modulation) broadcasting). Specifically, it cannot extract the melody part from, for example, AM (amplitude modulation) broadcasting, which transmits only monaural data.
Summary of the invention
The melody extraction device of the present invention comprises:
a sound power calculating part that calculates sound power from a voice signal; and
a detection unit that judges the melody part or non-melody part based on the state of the sound power.
The melody recording device of the present invention comprises:
the above melody extraction device; and
a recording part that records the voice signal of the intervals that the melody extraction device has judged to be melody.
Description of drawings
Fig. 1 is a hardware structure diagram of the record regeneration device 100 of the first embodiment.
Fig. 2 is a flowchart of the recording process of the record regeneration device 100 of the first embodiment.
Fig. 3 is an illustration of the waveform of a voice signal, the sound power, and the variation of the sound power.
Fig. 4 is an illustration of the LR difference.
Fig. 5 shows the LR differential signal and the sound power when the electric field strength is high and when it is low.
Fig. 6 is a flowchart of playlist (melody position information) generation in the record regeneration device 100 of the first embodiment.
Fig. 7 is a flowchart of regeneration (playback) in the record regeneration device 100 of the first embodiment.
Fig. 8 is a hardware structure diagram of the record regeneration device 100a of the second embodiment.
Fig. 9 is a functional block diagram of the main parts of the record regeneration device 100a of the second embodiment.
Fig. 10 is an illustration of the waveform of a voice signal and the frequency of second change points.
Fig. 11 is a flowchart of the recording process of the record regeneration device 100a of the second embodiment.
Fig. 12 is an illustration of the first time and the second time.
Fig. 13 is a functional block diagram of the main parts of the record regeneration device 100a of the second embodiment (another example).
Embodiment
The meaning and effects of the present invention will become clearer from the description of the embodiments below. However, the embodiments below are merely examples of embodiments of the present invention, and the meanings of the terms of the present invention and of each constituent element are not limited to those described in the following embodiments.
<First embodiment>
First, the record regeneration device 100, which is the first embodiment of the present invention, is described in detail with reference to the drawings.
Fig. 1 shows the hardware structure of the record regeneration device 100 of the first embodiment, which is one embodiment of the present invention. The record regeneration device 100 of this embodiment comprises: FM tuner 1, A/D portion 2, DSP 3, D/A portion 4, CPU 5, memory 6, and recording medium 7.
FM tuner 1 demodulates the FM broadcast wave and outputs an analog sound signal. A/D portion 2 converts the analog sound signal into a digital audio signal. DSP 3 comprises a melody extraction unit (which extracts only the melody part from the voice signal and outputs it) and a sound codec portion (an encoder that encodes an uncompressed digital audio signal into compressed sound data, and a decoder that decodes compressed sound data into an uncompressed digital audio signal). D/A portion 4 converts the digital audio signal into an analog sound signal and outputs it. When the voice signal is a stereo signal, the left and right channels are output separately. CPU 5 is an arithmetic processing unit. Memory 6 is the working memory of CPU 5. Recording medium 7 records the compressed sound data (the recorded music data) and the setting information attached to it.
Fig. 2 shows a flowchart of the recording process of the record regeneration device 100 of the first embodiment.
First, FM tuner 1 and the encoder in DSP 3 are started, and the voice signal is encoded while being recorded in a recording file on recording medium 7 (for example, an HDD) (S1, S2). Based on the encoded sound waveform, calculation of the sound power value, of the variation of the sound power value, and of the differential signal between the left and right channels (the LR difference) is started (S3, S4, S5).
Fig. 3 illustrates the waveform of the voice signal, the sound power, and the variation of the sound power. The upper-left chart shows one channel of the voice signal (for example, Lch). The middle-left chart shows the sound power calculated from the voice signal. The lower-left chart shows the variation of the sound power.
Fig. 4 illustrates the LR difference. The upper-left chart shows the waveform of the left-channel voice signal of a stereo sound. The middle-left chart shows the waveform of the right-channel voice signal. The lower-left chart shows the waveform of the difference (LR difference) signal between the voice signals of the left and right channels. The right-hand chart shows the mean value of the LR difference over a fixed time.
When a change point at which the variation of the sound power is at or above a set value (shown, for example, as a broken line in the lower-left chart of Fig. 3) is detected ("Yes" in S6), the mean value of the sound power over a fixed time before and after the change point (for example, the right-hand chart of Fig. 3) and the mean value of the LR difference (the right-hand chart of Fig. 4) are calculated (S7, S8). If the mean value of the sound power exceeds its threshold (shown, for example, as a broken line in the right-hand chart of Fig. 3), or if the mean value of the LR difference exceeds its threshold (shown as a broken line in the middle-right chart of Fig. 4) ("Yes" in S9), the change point is judged to be a melody part and processing returns to S6. The judgment of S7 to S9 is then carried out in the same way for the next change point.
On the other hand, when neither the mean value of the sound power nor the mean value of the LR difference exceeds its threshold, the position of the change point (the time relative to the start of recording) is recorded as a non-melody point TA(i) (S10). This is repeated until a recording stop instruction is given (S11, S12).
When a recording stop instruction is given ("Yes" in S12), encoding is stopped, the non-melody points TA(i) are saved, and the recording file is closed (S13). The non-melody points TA(i) may be saved in the recording file, kept distinct from the compressed sound data, or saved as a file separate from the recording file.
Only non-melody points are recorded above, and melody points are not, because the record regeneration device 100 of this embodiment judges an interval to be a melody interval when it satisfies both of the following conditions (described later with reference to the flowchart of Fig. 6): (1) it is an interval between a non-melody point and the next non-melody point, and (2) its length is at least a stipulated time (for example, 90 seconds or more). The applicant's experiments found that change points occur considerably more often in non-melody parts such as talk than in melody parts. Therefore, regarding the interval between one non-melody point and the next as a melody interval, as described above, poses no practical problem.
In the above, a change point is treated as a non-melody point when neither the mean value of the sound power nor the mean value of the LR difference exceeds its threshold, and as a melody point when either of them does. This is based on two observations: (1) the mean value of the sound power tends to be higher in melody parts than in non-melody parts, and (2) the mean value of the sound power does not fall much even when the electric field strength falls. This is explained with reference to Fig. 5.
The upper chart of Fig. 5 is a schematic of the LR differential signal when the electric field strength is high. In that case, the LR difference value of the melody parts is large (it exceeds the threshold shown as a broken line in the figure) and the LR difference value of the talk parts (non-melody parts) is small (it does not exceed the threshold), so the melody parts can be extracted correctly.
The middle chart of Fig. 5 is a schematic of the LR differential signal when the electric field strength is low. When the electric field strength is low, the gap between the LR difference values of the melody parts and the non-melody parts narrows. In this example, the LR difference values of the melody parts of the first and third songs do not exceed the threshold, so these parts would be misjudged as non-melody parts.
The lower chart of Fig. 5 is a schematic that overlays the LR differential signal and the power value when the electric field strength is low. Although the LR difference values of the melody parts of the first and third songs are low, the power values of those parts do not fall as much. The power value is thus hard to affect even when the electric field strength falls. The power value of the talk parts is also low. However, the power value of the melody part of the second song is not very large, so judging by power value alone could also produce a misjudgment. For these reasons, using both the LR differential signal and the power value improves the extraction accuracy for melody parts when the electric field strength is low.
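The judgment of steps S6 to S10 can be illustrated with a minimal Python sketch. The function names, the use of the absolute difference between successive power values as the variation, and the symmetric averaging window are assumptions made for illustration; the description does not fix these details.

```python
def find_change_points(power, min_change):
    """Indices where the power changes by at least min_change (assumed measure)."""
    return [i for i in range(1, len(power))
            if abs(power[i] - power[i - 1]) >= min_change]


def window_mean(values, center, half_width):
    """Mean of values in a window around center, clipped to the sequence."""
    lo = max(0, center - half_width)
    hi = min(len(values), center + half_width + 1)
    window = values[lo:hi]
    return sum(window) / len(window)


def non_melody_points(power, lr_diff, min_change,
                      power_thr, diff_thr, half_width):
    """Return change points where neither mean exceeds its threshold."""
    points = []
    for i in find_change_points(power, min_change):
        mean_power = window_mean(power, i, half_width)
        mean_diff = window_mean(lr_diff, i, half_width)
        # Melody if either mean exceeds its threshold (the OR rule of S9).
        is_melody = mean_power > power_thr or mean_diff > diff_thr
        if not is_melody:
            points.append(i)
    return points
```

Because the two criteria are combined with OR, a melody part whose LR difference has collapsed under a weak signal can still be kept by its power value, and vice versa.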
Fig. 6 shows the playlist (melody position information) generation flowchart of the record regeneration device 100 of the first embodiment. The playlist is a list indicating where in the recording file melodies are recorded.
First, the non-melody points TA(i) are read from the recording file or the like (S21). Next, the interval between adjacent TA(i) (for example, TA(1) - TA(0)) is calculated (S22). If the interval is TM seconds or longer (for example, 90 seconds or longer), TA(0) is recorded as the start point of a melody and TA(1) as its end point (S23). If it is shorter than TM seconds, i is incremented by 1 and processing returns to S22, where TA(2) - TA(1) is calculated and compared with TM seconds. This is repeated until no melody candidate point data remain (until "Yes" is judged in S26).
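The playlist generation of Fig. 6 can be sketched in Python as follows. The function name build_playlist and the representation of TA(i) as a sorted list of times in seconds are assumptions for illustration; the 90-second default corresponds to the example value of TM given above.

```python
def build_playlist(non_melody_points, min_length=90.0):
    """Return (start, end) pairs for gaps of at least min_length seconds.

    non_melody_points: times TA(i) in seconds from the start of recording,
    assumed to be sorted in ascending order.
    """
    playlist = []
    # Compare each non-melody point with the next one (S22).
    for start, end in zip(non_melody_points, non_melody_points[1:]):
        if end - start >= min_length:
            # A long gap between non-melody points is treated as a melody (S23).
            playlist.append((start, end))
    return playlist
```

For example, with non-melody points at 0, 10, 25, 240, 250 and 400 seconds, only the gaps 25 to 240 and 250 to 400 are long enough to be listed as melodies.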
Fig. 7 shows the regeneration (playback) flowchart of the record regeneration device 100 of the first embodiment. The time of the start point of the first melody recorded in the recording file is read from the playlist (S31), and regeneration begins from there (S32). When regeneration reaches the end point of the first melody ("Yes" in S33), regeneration stops. The time of the start point of the second melody is then read, and regeneration begins again. This is repeated until no melody start/end point data remain in the playlist ("Yes" in S34).
<Second embodiment>
Next, the record regeneration device 100a, which is the second embodiment of the present invention, is described in detail with reference to the drawings. The second embodiment is a concrete example that judges the melody part or non-melody part using the feature found by the applicant and described above (change points occur more often in non-melody parts such as talk than in melody parts).
Fig. 8 shows the hardware structure of the record regeneration device 100a of the second embodiment, which is one embodiment of the present invention. Fig. 8 corresponds to Fig. 1, which shows the record regeneration device 100 of the first embodiment; structures identical to those in Fig. 1 are given the same symbols, and their detailed explanation is omitted.
The record regeneration device 100a of this embodiment comprises: FM tuner 1, AM tuner 1a, A/D portion 2, DSP 3a, D/A portion 4, CPU 5, memory 6, and recording medium 7.
AM tuner 1a demodulates the AM broadcast wave and outputs an analog sound signal. A/D portion 2 converts the analog sound signals output from FM tuner 1 and AM tuner 1a into digital audio signals. DSP 3a comprises a melody extraction unit and a sound codec portion, but the structure and operation of its melody extraction unit differ from those of DSP 3 of the record regeneration device 100 of the first embodiment (details are given later). D/A portion 4 converts the digital audio signal into an analog sound signal and outputs it. CPU 5, memory 6, and recording medium 7 are the same as in the record regeneration device 100 of the first embodiment.
In Fig. 8, AM tuner 1a is illustrated as outputting the non-stereo signal obtained by demodulation as two channels, M1 and M2, but it may instead output a single-channel non-stereo signal. Similarly, A/D portion 2a and D/A portion 4 may be structured to output a single-channel non-stereo signal. In addition, although the illustrated structure has separate tuners for the broadcast waves to be processed (FM tuner 1 and AM tuner 1a) and shares the other parts (in particular A/D portion 2a and D/A portion 4), which structures are shared and which are separate can be changed freely. FM tuner 1 and AM tuner 1a may be startable simultaneously, or only one of them may be startable at a time.
Next, the melody extraction unit included in DSP 3a of the record regeneration device 100a of the second embodiment is described in detail with reference to the drawings.
Fig. 9 shows a functional block diagram of the main parts of the record regeneration device 100a of the second embodiment. Fig. 9 shows the parts associated with the operation of the melody extraction unit of DSP 3a.
The melody extraction unit included in DSP 3a of the record regeneration device 100a of this embodiment comprises: sound power calculating part 301, second variable quantity calculating part 302, second change point test section 303, second change point frequency computation portion 304, sound power average computation portion 305, differential signal calculating part 306, differential signal average computation portion 307, and melody interval detection unit 308.
Like the record regeneration device 100 of the first embodiment, sound power calculating part 301 calculates the sound power from the voice signal (see Fig. 3). For example, the sound power can be calculated by squaring the signal value of one channel of the voice signal. Sound power calculating part 301 may also calculate the sound power using the signal values of several channels of the voice signal. For example, the channels of the voice signal can first be combined into one channel by averaging or by a known monaural conversion process, and the sound power can then be calculated. The record regeneration device 100 of the first embodiment can calculate the sound power by the same method.
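The two options described above (squaring one channel, or averaging several channels into one before squaring) can be sketched in Python as follows. The function names to_mono and sound_power and the use of plain lists of sample values are assumptions for illustration.

```python
def to_mono(channels):
    """Average several channels sample by sample into one channel."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]


def sound_power(samples):
    """Instantaneous sound power: the square of each signal value."""
    return [s * s for s in samples]
```

A stereo signal would be handled as sound_power(to_mono([left, right])), while a monaural AM signal can be passed to sound_power directly.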
Like the record regeneration device 100 of the first embodiment, second variable quantity calculating part 302 calculates the second variable quantity of the sound power calculated by sound power calculating part 301 (in this embodiment it is called the second variable quantity to distinguish it from the variable quantity of the first embodiment; the same applies below) (see Fig. 3). For example, the second variable quantity can be calculated as the magnitude (for example, a positive value) of the variation of the sound power within the first time described later. The record regeneration device 100 of the first embodiment can calculate its variable quantity by the same method, but the time over which it is calculated is not limited to the first time.
Like the record regeneration device 100 of the first embodiment, second change point test section 303 detects second change points, at which the second variable quantity calculated by second variable quantity calculating part 302 is at or above a second set value (see Fig. 3). (In this embodiment, the terms second set value and second change point are used to distinguish them from the set value and change point of the first embodiment; the same applies below.)
Second change point frequency computation portion 304 calculates the frequency of the second change points detected by second change point test section 303. For example, it counts the number of second change points contained in the second time described later and takes this number as the frequency of second change points.
Like the record regeneration device 100 of the first embodiment, sound power average computation portion 305 calculates the mean value of the sound power by averaging, over a prescribed time, the sound power calculated by sound power calculating part 301 (see Fig. 3). For example, it calculates the mean value by averaging the sound power over the first time described later. The record regeneration device 100 of the first embodiment can calculate the mean value of the sound power by the same method, but the time over which it is calculated is not limited to the first time.
Like the record regeneration device 100 of the first embodiment, differential signal calculating part 306 calculates the differential signal by taking the difference (for example, as a positive value) between the signal values of several channels of the voice signal (see Fig. 4).
Like the record regeneration device 100 of the first embodiment, differential signal average computation portion 307 calculates the mean value of the differential signal by averaging, over a prescribed time, the differential signal calculated by differential signal calculating part 306 (see Fig. 4). For example, it calculates the mean value by averaging the differential signal over the first time described later. The record regeneration device 100 of the first embodiment can calculate the mean value of the differential signal by the same method, but the time over which it is calculated is not limited to the first time.
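The differential signal and its mean value can be sketched in Python as follows. The function names lr_difference and mean are assumptions, and taking the absolute value is one way of obtaining the positive value mentioned above.

```python
def lr_difference(left, right):
    """Magnitude of the per-sample difference between two channels."""
    return [abs(l - r) for l, r in zip(left, right)]


def mean(values):
    """Arithmetic mean over one averaging interval (for example, one first time)."""
    return sum(values) / len(values)
```

When both channels carry the same monaural signal, every difference value is zero, which is why this criterion alone cannot find melody parts in an AM broadcast.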
Like the record regeneration device 100 of the first embodiment, melody interval detection unit 308 judges the melody part or non-melody part based on the magnitude of the sound power (the power value above) and the magnitude of the differential signal (the difference value above). Specifically, when it confirms at least one of the following, it judges at least part of the confirmed time to be a melody part: the mean value of the sound power calculated by sound power average computation portion 305 is at or above its threshold (see Fig. 3 and Fig. 5), or the mean value of the differential signal calculated by differential signal average computation portion 307 is at or above its threshold (see Fig. 4 and Fig. 5). Conversely, when it confirms both that the mean value of the sound power is below its threshold (see Fig. 3 and Fig. 5) and that the mean value of the differential signal is below its threshold (see Fig. 4 and Fig. 5), it judges at least part of the confirmed time to be a non-melody part.
Furthermore, in the record regeneration device 100a of this embodiment, melody interval detection unit 308 judges the melody part or non-melody part based on the frequency with which the variation of the sound power reaches or exceeds a prescribed magnitude. An overview of this judgment method is described with reference to the drawings.
Fig. 10 illustrates the waveform of the voice signal and the frequency of second change points. As described above and as shown in Fig. 10, the frequency with which the variation of the sound power reaches or exceeds the prescribed magnitude (that is, is detected as a second change point by second change point test section 303) is high (the points are dense) in non-melody parts (for example, talk parts) and low (the points are sparse) in melody parts.
For this reason, when melody interval detection unit 308 confirms that the frequency of second change points calculated by second change point frequency computation portion 304 is at or below its threshold, it judges at least part of the confirmed time to be a melody part. When it confirms that the frequency of second change points is greater than the threshold, it judges at least part of the confirmed time to be a non-melody part.
That is, when melody interval detection unit 308 confirms at least one of the following, it judges at least part of the confirmed time to be a melody part: the mean value of the sound power is at or above its threshold, the mean value of the differential signal is at or above its threshold, or the frequency of second change points is at or below its threshold. Conversely, when it confirms all of the following, it judges at least part of the confirmed time to be a non-melody part: the mean value of the sound power is below its threshold, the mean value of the differential signal is below its threshold, and the frequency of second change points is greater than its threshold.
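The combined rule of melody interval detection unit 308 can be sketched in Python as follows. The function name is_melody and its parameter names are assumptions for illustration, and the sketch ignores the fact that the three criteria may refer to slightly different times.

```python
def is_melody(mean_power, mean_diff, change_freq,
              power_thr, diff_thr, freq_thr):
    """Melody if any one of the three criteria indicates melody.

    mean_power:  mean sound power over the first time
    mean_diff:   mean differential signal over the first time
    change_freq: number of second change points in the second time
    """
    return (mean_power >= power_thr
            or mean_diff >= diff_thr
            or change_freq <= freq_thr)
```

A non-melody judgment therefore requires all three criteria to fail at once, which favors keeping melody parts over discarding them.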
With the above structure, the melody part or non-melody part of the voice signal is judged based on the state of the sound power. Therefore, the melody part or non-melody part of the voice signal can be judged accurately even when the received electric field strength is low or when a broadcast that transmits only monaural data is received. This applies not only to the record regeneration device 100a of this embodiment but also to the record regeneration device 100 of the first embodiment.
In the record regeneration device 100a of this embodiment, melody interval detection unit 308 judges the melody part or non-melody part of the voice signal on three criteria: the magnitude of the sound power, the magnitude of the differential signal, and the frequency with which the variation of the sound power becomes large. However, the judgment based on at least one of the magnitude of the sound power and the magnitude of the differential signal may be omitted. That is, the structure may omit at least one of sound power average computation portion 305, differential signal calculating part 306, and differential signal average computation portion 307. Likewise, the record regeneration device 100 of the first embodiment may omit the judgment based on the magnitude of the differential signal.
However, judging the melody part or non-melody part of the voice signal by several judgment methods is preferable because, as also described for the first embodiment, it allows accurate judgment. In addition, as described above, if a part is judged to be a melody part whenever any one of the judgment methods judges it so, the melody parts of the voice signal can be judged without omission.
Next, a concrete operation example of the record regeneration device 100a of the second embodiment shown in Fig. 8 and Fig. 9 is described in detail with reference to the drawings. Fig. 11 shows a flowchart of the recording process of the record regeneration device 100a of the second embodiment. Fig. 11 corresponds to Fig. 2, which shows the flowchart of the recording process of the record regeneration device 100 of the first embodiment.
As shown in Fig. 11, the record regeneration device 100a of this embodiment first starts at least one of FM tuner 1 and AM tuner 1a and begins acquiring the voice signal (S41). It also starts the encoder in DSP 3a and begins encoding the voice signal and recording it in the recording file on recording medium 7 (S42). In addition, it initializes a variable n (for example, sets it to 1) used to identify the timing of the judgment (the first time and second time described later). This variable n is managed by, for example, CPU 5 or DSP 3a.
Next, the voice signal output from A/D portion 2a is read sequentially into audio FIFO (First In First Out) 61 (S43). The melody extraction unit of DSP 3a then performs the above judgment on the voice signal read out sequentially from audio FIFO 61. Audio FIFO 61 can be regarded as part of memory 6.
First, sound power calculating part 301 calculates the sound power as described above (S44), and differential signal calculating part 306 calculates the differential signal as described above (S45). These calculations continue until the voice signal of first time T1(n) has been processed (until "Yes" in S46).
First time T1(n) is a unit time used to divide the voice signal into prescribed lengths for processing (judgment). One first time is, for example, several tens of ms (milliseconds).
Once the sound power and the differential signal of the voice signal of first time T1(n) have been calculated, sound power average computation portion 305 calculates the mean value of the sound power of first time T1(n) as described above (S47). Differential signal average computation portion 307 calculates the mean value of the differential signal of first time T1(n) as described above (S48). Second variable quantity calculating part 302 calculates the second variable quantity c(n) of the sound power of first time T1(n) as described above (S49).
If the second variable quantity c(n) is at or above the threshold ("Yes" in S50), the data "1", indicating that a second change point exists, is recorded in change point FIFO 62 (S51). If the second variable quantity c(n) is below the threshold ("No" in S50), the data "0", indicating that no second change point exists, is recorded in change point FIFO 62 (S52). Change point FIFO 62 can be regarded as part of memory 6.
Second change point frequency computation portion 304 then calculates the frequency of second change points by referring to the data recorded in change point FIFO 62 (S53). At this point, change point FIFO 62 holds at least the data of the second change points detected from the voice signal of second time T2(n). Second change point frequency computation portion 304 reads the data of second time T2(n) from change point FIFO 62 and counts the number of "1" data, which indicate that a second change point exists, to obtain the frequency of second change points.
Like first time T1(n), second time T2(n) is a unit time used to divide the voice signal into prescribed lengths for processing (judgment). One second time T2(n) is, for example, several s (seconds). Because second time T2(n) is the time over which the frequency of second change points is calculated, it is preferably at least longer than first time T1(n).
The first time T1(n) and the second time T2(n) are described in detail with reference to the drawings. Fig. 12 is a schematic diagram of the first time and the second time. As shown in Fig. 12, the second time T2(n) comprises the k+1 first times T1(n-k) to T1(n) (k is a natural number). In S50 to S52, data is recorded (updated) sequentially in the change point FIFO 62, so the second time T2(n+1) that follows the second time T2(n) is shifted by exactly one first time. That is, the second time T2(n+1) comprises the k+1 first times T1(n-k+1) to T1(n+1).
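The relation between the two time units can be checked with a small index calculation. The helper below is a hypothetical illustration that lists which first times belong to a given second time when each second time holds k+1 first times and ends at T1(n).

```python
def first_times_in_second_time(n, k):
    """Return the indices m of the first times T1(m) that make up the second time T2(n)."""
    return list(range(n - k, n + 1))
```

With k = 3, T2(10) covers T1(7) to T1(10) and T2(11) covers T1(8) to T1(11): consecutive second times overlap and differ by a single first time.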
In addition, as described above, the melody interval detection unit 308 judges whether the voice signal is a melody part or a non-melody part on the basis of three aspects: the size of the sound power, the size of the differential signal, and the frequency with which the variable quantity of the sound power becomes large (S54). As in the record regeneration device 100 of the first embodiment, the melody interval detection unit 308 may output non-melody points TA(i) as the result of determination.
The time of the voice signal that the melody interval detection unit 308 judges on the basis of the size of the sound power and the size of the differential signal is at least a part of the first time T1(n) (for example, the moment at the approximate center of the first time T1(n)). On the other hand, the time judged on the basis of the frequency with which the variable quantity of the sound power becomes large is at least a part of the second time T2(n) (for example, the moment at the approximate center of the second time T2(n)).
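The two kinds of judgement can be sketched as simple comparisons that follow the rules stated in claims 8 and 9. The function names and the threshold values passed in are hypothetical, and `True` is used here to stand for "melody".

```python
def judge_by_size(mean_power, mean_diff, power_threshold, diff_threshold):
    """Melody if either value reaches its own threshold; non-melody if both fall short."""
    return mean_power >= power_threshold or mean_diff >= diff_threshold


def judge_by_frequency(change_point_count, count_threshold):
    """Melody if the number of second change points in T2(n) is at or below the threshold."""
    return change_point_count <= count_threshold
```

The first function applies to a first time T1(n) and the second to a second time T2(n), so their results refer to different moments of the voice signal.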
Thus, in the record regeneration device 100a of the present embodiment, the time of the voice signal that the melody interval detection unit 308 judges may differ between the decision methods. For this reason, for example, the results of determination obtained in sequence (for example, each result of determination based on the size of the sound power and the size of the differential signal) may be held in the result of determination maintaining part 63, and the final result of determination may be output once the results obtained by all three of the above methods are available. The result of determination maintaining part 63 may be interpreted as a part of the memory 6.
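One way to picture the role of the result of determination maintaining part 63 is a small buffer keyed by the judged time. The sketch below is an assumption about how such buffering could work, not a description of the actual device; the class name and the method labels are hypothetical.

```python
METHODS = ("power", "difference", "frequency")


class ResultHolder:
    """Keeps partial results per time index until every method has reported."""

    def __init__(self):
        self._pending = {}

    def add(self, time_index, method, is_melody):
        """Store one result; return all three once the set for that time is complete."""
        if method not in METHODS:
            raise ValueError(f"unknown method: {method}")
        results = self._pending.setdefault(time_index, {})
        results[method] = is_melody
        if len(results) == len(METHODS):
            # Complete: hand the set over and stop holding it.
            return self._pending.pop(time_index)
        return None
```

Because the frequency-based result for a given moment arrives later than the size-based results, `add` returns `None` until the last of the three has been stored.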
Once the voice signal has been judged in S54, the variable n is incremented by 1, for example by the CPU 5 or the DSP 3a (S55). The above judgement (S43 to S55) is then repeated until a recording stop instruction is given ("Yes" in S56).
When a recording stop instruction is given ("Yes" in S56), encoding is stopped, the result of determination (for example, the non-melody points TA(i)) is saved, and the recording file is closed (S57). The result of determination may be saved in the recording file, distinguished from the compressed sound data, or it may be saved as a file separate from the recording file.
With such a structure, the decision methods based respectively on the size of the sound power, the size of the differential signal, and the frequency with which the variable quantity of the sound power becomes large can be combined and carried out satisfactorily.
At the beginning and at the end of the judgement, sufficient data (the data of the second time T2(n) needed for the judgement) may not be recorded in the change point FIFO 62. In such a case, for example, the result of determination of the other decision methods (the judgement based on the size of the sound power and the size of the differential signal) may be adopted, the judgement may be made by referring to the data recorded in the change point FIFO 62 for a time shorter than the second time T2(n), or the missing data may be supplemented with dummy data before the judgement is made.
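The option of supplementing missing data with dummy data could look like the following sketch. The function name and the choice of padding value are assumptions; the source does not say which value the dummy data should take.

```python
def count_with_padding(flags, window_length, pad_value=0):
    """Count change points over a full window, padding missing entries with dummy data.

    flags holds the 0/1 entries recorded so far, oldest first.
    """
    missing = max(0, window_length - len(flags))
    padded = [pad_value] * missing + list(flags)
    # Only the most recent window_length entries belong to the current second time.
    return sum(padded[-window_length:])
```

Padding with 0 biases an incomplete window toward fewer change points, while padding with 1 biases it toward more, so the choice affects which judgement is favoured while the record is still filling up.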
In addition, the result of determination of a decision method with high judgement precision may be given priority over the results of determination of the other decision methods. In this case, a priority (weight) may be assigned to the result of determination of each decision method, and the final judgement may be made by merging the results of determination of the decision methods.
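A weighted merge of this kind might be sketched as a signed vote. The weights, the method labels and the rule that a tie counts as non-melody are all assumptions made for illustration; the source only states that priorities (weights) may be assigned.

```python
def merge_results(results, weights):
    """Merge per-method results (True = melody) into one judgement by weighted vote."""
    score = sum(
        weights[method] if is_melody else -weights[method]
        for method, is_melody in results.items()
    )
    # Assumption: a score of exactly zero is treated as non-melody.
    return score > 0
```

Giving one method a weight larger than the sum of the others makes its result decisive, which corresponds to giving that decision method strict priority.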
In addition, when the melody interval detection unit 308 outputs non-melody points TA(i) as the result of determination, the record regeneration device 100a of the present embodiment may also use the playlist generation method (see Fig. 6) and the regeneration method (see Fig. 7) of the record regeneration device 100 of the first embodiment.
<Another example of the second embodiment>
In the record regeneration device 100a of the second embodiment, the melody interval detection unit 308 may adopt the same decision method as the record regeneration device 100 of the first embodiment for each of the judgements based on the size of the sound power and the size of the differential signal. The structure in this case is described in detail with reference to the drawings.
Fig. 13 is a functional block diagram of the main part of the record regeneration device 100a of the second embodiment (another example). Fig. 13 corresponds to Fig. 9, which shows the record regeneration device 100a of the ordinary example of the second embodiment; structures identical to those in Fig. 9 are given the same symbols, and their detailed explanation is omitted.
The melody extraction unit contained in the DSP 3a of the record regeneration device 100a of this example comprises: a sound power calculating part 301, a second variable quantity calculating part 302, a second change point test section 303, a second change point frequency computation portion 304, a sound power average computation portion 305b, a differential signal calculating part 306, a differential signal average computation portion 307b, a melody interval detection unit 308b, a first variable quantity calculating part 309b, and a first change point test section 310b.
The first variable quantity calculating part 309b calculates the same variable quantity as in the record regeneration device 100 of the first embodiment (hereinafter, the first variable quantity) (see Fig. 3). In addition, the first change point test section 310b detects the same change points as in the record regeneration device 100 of the first embodiment (hereinafter, first change points) (see Fig. 3).
Then, as in the record regeneration device 100 of the first embodiment, the sound power average computation portion 305b calculates the mean value of the sound power in a fixed time before and after each first change point detected by the first change point test section 310b (see Fig. 3).
In addition, as in the record regeneration device 100 of the first embodiment, the differential signal average computation portion 307b calculates the mean value of the differential signal in a fixed time before and after each first change point detected by the first change point test section 310b (see Fig. 4).
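Averaging over a fixed time before and after a change point can be illustrated as below. The sketch treats the signal as a list of per-frame values and measures the fixed time in frames; the function name and this framing are assumptions, and the same helper could be applied to either the sound power or the differential signal.

```python
from statistics import fmean


def means_around(values, change_index, span):
    """Return (mean before, mean after) a change point, each over at most span values.

    A side with no values (for example at the very start of the signal) yields None.
    """
    before = values[max(0, change_index - span):change_index]
    after = values[change_index:change_index + span]
    return (
        fmean(before) if before else None,
        fmean(after) if after else None,
    )
```

For the values `[1, 1, 3, 3, 5, 5]` with a change point at index 2 and a span of 2, the mean before is 1.0 and the mean after is 3.0.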
As in the record regeneration device 100 of the first embodiment, the melody interval detection unit 308b judges the moment of each first change point of the voice signal on the basis of the size of the sound power and the size of the differential signal. In addition, as in the ordinary example of the record regeneration device 100a of the second embodiment, the melody interval detection unit 308b judges the time of at least a part of the second time T2(n) (for example, the moment at the approximate center of the second time T2(n)) on the basis of the frequency with which the second variable quantity of the sound power becomes large (the number of second change points in the second time T2(n)).
Even with such a structure, the decision methods based respectively on the size of the sound power, the size of the differential signal, and the frequency with which the variable quantity of the sound power becomes large can be combined and carried out.
In addition, the second setting that the second change point test section 303 uses to detect second change points may be set smaller than the setting that the first change point test section 310b uses to detect first change points (see Fig. 3; hereinafter, the first setting).
With this configuration, first change points and second change points suited to the respective decision methods can be detected, so the judgement precision of each decision method can be improved. Specifically, for example, in the decision method based on the size of the sound power and the size of the differential signal, the judgement precision can be improved by increasing the first setting to the point where a detected change point can be judged with high certainty to be a boundary between a melody part and a non-melody part. Likewise, in the decision method based on the frequency with which the variable quantity of the sound power becomes large, the judgement precision can be improved by reducing the second setting to the point where the sparse and dense states can be clearly distinguished (that is, the difference in the number of second change points between the states becomes large).
In addition, in this example, the second variable quantity calculating part 302 and the first variable quantity calculating part 309b may be made common. Likewise, the second change point test section 303 and the first change point test section 310b may be made common. With such a structure, the processing load of the DSP 3a can be reduced.
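When the variable quantity calculation is shared in this way, detecting both kinds of change point reduces to comparing one sequence against two settings. The sketch below assumes a single list of variable quantities and hypothetical setting values, with the second setting no larger than the first.

```python
def detect_change_points(variable_quantities, first_setting, second_setting):
    """Return (first change point indices, second change point indices).

    Both are detected from the same variable quantities; only the setting differs.
    """
    if second_setting > first_setting:
        raise ValueError("second setting is expected not to exceed the first setting")
    first = [i for i, c in enumerate(variable_quantities) if c >= first_setting]
    second = [i for i, c in enumerate(variable_quantities) if c >= second_setting]
    return first, second
```

Because the second setting is the smaller one, every first change point found this way is also a second change point, while the reverse does not hold.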
<Modifications>
In the record regeneration devices 100 and 100a according to an embodiment of the present invention, part or all of the operation of the DSP 3, 3a, etc. may be performed by a control device such as a microcomputer. Furthermore, all or part of the functions realized by such a control device may be described as a program, and all or part of those functions may be realized by executing the program on a program execution device (for example, a computer).
Moreover, without being limited to the above, the record regeneration devices 100 and 100a shown in Figs. 1, 8, 9 and 13 can be realized by hardware or by a combination of hardware and software. When part of the record regeneration device 100 or 100a is configured using software, a block for a part realized by software represents a functional block of that part.
The descriptions of the above embodiments are intended to illustrate the present invention and should not be construed as limiting the invention described in the claims or narrowing its scope. Furthermore, the structures of the present invention are not limited to the above embodiments, and various modifications are of course possible within the technical scope described in the claims.

Claims (11)

1. A melody extraction element, comprising:
a sound power calculating part that calculates sound power from a voice signal; and
a detection unit that judges a melody part or a non-melody part on the basis of the state of the sound power.
2. The melody extraction element according to claim 1, characterized in that
the melody extraction element further comprises:
a differential signal calculating part that calculates a differential signal between a plurality of sound channels of the voice signal,
and the detection unit judges the melody part or the non-melody part on the basis of the sound power and the differential signal.
3. The melody extraction element according to claim 2, characterized in that
the detection unit
judges melody when the size of either the differential signal or the sound power is at or above its respective threshold, and
judges non-melody when the sizes of both the differential signal and the sound power are below their respective thresholds.
4. The melody extraction element according to claim 2, characterized in that
the melody extraction element further comprises:
a first variable quantity calculating part that calculates a variable quantity of the sound power,
and the detection unit makes the judgement on the basis of the sound power and the differential signal before and after a first change point at which the variable quantity calculated by the first variable quantity calculating part becomes equal to or greater than a first setting.
5. The melody extraction element according to claim 4, characterized in that
the detection unit judges, as a melody interval, an interval of the voice signal that lies between first change points judged to be non-melody and that is at least a prescribed time long.
6. The melody extraction element according to claim 1, characterized in that
the melody extraction element further comprises:
a second variable quantity calculating part that calculates a variable quantity of the sound power,
and the detection unit makes the judgement on the basis of the frequency with which the variable quantity calculated by the second variable quantity calculating part becomes equal to or greater than a second setting.
7. The melody extraction element according to claim 1, characterized in that
the melody extraction element further comprises:
a second variable quantity calculating part that calculates a variable quantity of the sound power; and
a differential signal calculating part that calculates a differential signal between a plurality of sound channels of the voice signal,
and the detection unit makes the judgement on the basis of the size of the sound power in a first time, the size of the differential signal in the first time, and the frequency with which the variable quantity calculated by the second variable quantity calculating part becomes equal to or greater than a second setting in a second time.
8. The melody extraction element according to claim 7, characterized in that
the detection unit
judges at least a part of the first time to be melody when the size of either the differential signal or the sound power in the first time is at or above its respective threshold, and
judges at least a part of the first time to be non-melody when the sizes of both the differential signal and the sound power in the first time are below their respective thresholds.
9. The melody extraction element according to claim 6, characterized in that
the detection unit
counts second change points at which the variable quantity calculated by the second variable quantity calculating part becomes equal to or greater than the second setting,
judges at least a part of a second time to be melody when the number of second change points in the second time is at or below a threshold, and
judges at least a part of the second time to be non-melody when the number of second change points in the second time is greater than the threshold.
10. The melody extraction element according to claim 9, characterized in that
the detection unit judges the moment at the approximate center of the second time by counting the second change points in the second time.
11. A melody recording device, comprising:
the melody extraction element according to claim 1; and
a recording portion that records the voice signal of the intervals that the melody extraction element has judged to be melody.
CN2010102943740A 2009-09-28 2010-09-21 Music extraction device and music recording device Pending CN102034471A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2009223066 2009-09-28
JP2009-223066 2009-09-28
JP2010195431A JP2011090290A (en) 2009-09-28 2010-09-01 Music extraction device and music recording apparatus
JP2010-195431 2010-09-01

Publications (1)

Publication Number Publication Date
CN102034471A true CN102034471A (en) 2011-04-27

Family

ID=43887276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102943740A Pending CN102034471A (en) 2009-09-28 2010-09-21 Music extraction device and music recording device

Country Status (1)

Country Link
CN (1) CN102034471A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20110427