JP2000099097A - Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal - Google Patents

Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal

Info

Publication number
JP2000099097A
Authority
JP
Japan
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP27024498A
Other languages
Japanese (ja)
Inventor
Noboru Murabayashi
Takao Takahashi
昇 村林
孝夫 高橋
Original Assignee
Sony Corp
ソニー株式会社
Application filed by Sony Corp, ソニー株式会社 filed Critical Sony Corp
Priority to JP27024498A priority Critical patent/JP2000099097A/en
Publication of JP2000099097A publication Critical patent/JP2000099097A/en

Abstract

(57) [Abstract]
[Problem] To provide an audio signal reproducing device capable of converting the reproduction speed of an audio signal without lowering the degree of understanding of its contents.
[Solution] An audio signal reproducing device 10 includes a waveform extraction circuit 12 that cuts out the signal waveform of an audio signal for each predetermined time unit, a level detection circuit 13 that detects the signal level of the cut-out signal waveform, a feature extraction circuit 14 that extracts a feature of the cut-out signal waveform, and a waveform processing circuit 15 that processes the waveform of the audio signal by deleting and/or adding signal waveforms for each predetermined time unit, based on the signal level detected by the level detection circuit 13 and the feature extracted by the feature extraction circuit 14. The audio signal reproducing device 10 converts the reproduction speed of the audio signal without changing its pitch by deleting silent portions and portions in which the feature is continuous.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a signal reproducing apparatus and method that reproduce a signal recorded on a recording medium, convert the time axis of the signal, and output the signal, and to an audio signal reproducing apparatus and an audio signal speed conversion method for performing special reproduction of an audio signal faster or slower than the normal reproduction speed.

[0002]

2. Description of the Related Art In general, a home video tape recorder (VTR) is provided with a fast-forward playback function that plays a video tape faster than the normal playback speed so that the user can view the recorded video and audio in a short time. However, when a video tape is reproduced using this fast-forward playback function, the pitch of the output audio signal changes, and the contents cannot be understood even when the audio is heard. Therefore, recent VTRs perform speech speed conversion processing during fast-forward playback: silent sections of the audio signal are deleted so that the pitch of the output audio signal is the same as at normal playback, and the contents can be understood by listening to the audio.

[0003]

However, in a VTR that performs speech speed conversion processing by deleting silent sections, fast-forward playback at higher speeds works for voices that contain a relatively large number of silent sections, but for voices with few silent sections, voiced sections have to be deleted, and the contents cannot be understood by listening to the output voice.

The present invention has been made in view of such circumstances, and has as its object to provide a signal reproducing apparatus and method capable of converting a time axis without deleting an effective portion of a signal.

It is another object of the present invention to provide an audio signal reproducing apparatus and an audio signal speed conversion method capable of converting an audio signal reproduction speed without lowering the degree of understanding of contents.

[0006]

A signal reproducing apparatus according to the present invention comprises: reproducing means for reproducing a signal from a recording medium; signal extraction means for cutting out the signal waveform of the reproduced signal reproduced by the reproducing means at predetermined time units; level detection means for detecting the signal level of the signal waveform cut out by the signal extraction means; feature extraction means for extracting the characteristics of the signal waveform cut out by the signal extraction means; and time axis converting means for processing the waveform of the reproduced signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the signal level of the predetermined time-unit signal waveform detected by the level detection means and the characteristics of the predetermined time-unit signal waveform extracted by the feature extraction means, thereby converting the time axis of the reproduced signal.

In this signal reproducing apparatus, a signal level and a characteristic are detected from the signal waveform of the reproduced signal cut out at every predetermined time unit, and the signal waveform is deleted and/or added at every predetermined time unit, so that the waveform of the reproduced signal is processed and the time axis of the reproduced signal is converted.

In the signal reproducing method according to the present invention, a signal is reproduced from a recording medium, the signal waveform of the reproduced signal is cut out at predetermined time units, the signal level of the cut-out signal waveform is detected, and the characteristics of the cut-out signal waveform are extracted. The signal waveform is then deleted and/or added for each predetermined time unit, based on the detected signal level and the extracted characteristics of the signal waveform in the predetermined time unit, so that the waveform of the reproduced signal is processed and the time axis of the reproduced signal is converted.

In this signal reproducing method, a signal level and a characteristic are detected from the signal waveform of the reproduced signal cut out at every predetermined time unit, and the signal waveform is deleted and/or added at every predetermined time unit, so that the waveform of the reproduced signal is processed and the time axis of the reproduced signal is converted.

An audio signal reproducing apparatus according to the present invention comprises: signal extraction means for cutting out the signal waveform of an audio signal at predetermined time units; level detection means for detecting the signal level of the signal waveform cut out by the signal extraction means; feature extraction means for extracting a characteristic of the signal waveform cut out by the signal extraction means; and speed conversion means for processing the waveform of the audio signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the signal level of the predetermined time-unit signal waveform detected by the level detection means and the characteristic of the predetermined time-unit signal waveform extracted by the feature extraction means, thereby converting the reproduction speed of the audio signal.

In this audio signal reproducing apparatus, a signal level and a characteristic are detected from the signal waveform of the audio signal cut out at every predetermined time unit, and the signal waveform is deleted and/or added at every predetermined time unit, so that the waveform of the audio signal is processed and the reproduction speed of the audio signal is converted.

In the method for converting the speed of an audio signal according to the present invention, the signal waveform of the audio signal is cut out at predetermined time units, the signal level of the cut-out signal waveform is detected, and the characteristics of the cut-out signal waveform are extracted. The signal waveform is then deleted and/or added for each predetermined time unit, based on the detected signal level and the extracted characteristic of the signal waveform in the predetermined time unit, so that the waveform of the audio signal is processed and the reproduction speed of the audio signal is converted.

In this method of converting the speed of an audio signal, a signal level and a characteristic are detected from the signal waveform of the audio signal cut out at every predetermined time unit, and the signal waveform is deleted and/or added at every predetermined time unit, so that the waveform of the audio signal is processed and the reproduction speed of the audio signal is converted.

[0014]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS First, a speech speed conversion apparatus according to a first embodiment of the present invention will be described.

FIG. 1 shows a block diagram of a speech speed converter according to a first embodiment of the present invention.

The speech speed conversion device 10 shown in FIG. 1 is used, for example, in an audio output stage of a video tape recorder or the like, and performs speech speed conversion of an audio signal when fast-forward reproduction or the like is performed.

The speech speed conversion device 10 includes an analog/digital (A/D) conversion circuit 11, a waveform extraction circuit 12, a level detection circuit 13, a feature extraction circuit 14, a waveform processing circuit 15, and a digital/analog (D/A) conversion circuit 16.

The speech speed converter 10 is supplied with, for example, an analog audio signal reproduced from a video tape.

The A/D conversion circuit 11 converts the input analog audio signal into a digital audio signal.

The waveform extraction circuit 12 is supplied with the audio signal converted into digital data by the A/D conversion circuit 11. The waveform extraction circuit 12 divides the temporally continuous audio signal into predetermined time units and cuts out the signal waveform of the audio signal within each time unit. The time unit for cutting out the signal waveform is, for example, about 10 msec to 200 msec. This time unit is hereinafter referred to as an audio block.
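The division into audio blocks can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_into_blocks`, the 20 msec block length, and the handling of a shorter final block are not taken from the text, which specifies only a time unit of about 10 msec to 200 msec.

```python
def split_into_blocks(samples, sample_rate_hz, block_ms):
    """Split a sample sequence into consecutive audio blocks of block_ms each.

    The last block may be shorter than the others (an assumption; the text
    does not say how a trailing partial block is handled).
    """
    block_len = int(sample_rate_hz * block_ms / 1000)
    if block_len <= 0:
        raise ValueError("block length must be at least one sample")
    return [samples[i:i + block_len] for i in range(0, len(samples), block_len)]


# 0.1 s of dummy samples at 48 kHz, cut into 20 msec audio blocks.
blocks = split_into_blocks(list(range(4800)), sample_rate_hz=48000, block_ms=20)
```

With these values each audio block holds 960 samples, so the 4800 input samples yield five blocks.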

The level detection circuit 13 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The level detection circuit 13 detects the level of the audio signal for each audio block and determines whether the audio block is a silent section or a sound section. For example, the level detection circuit 13 obtains the average power P or the average level M of the audio signal for each audio block, and determines the audio block to be a sound section if the average power P or the average level M is higher than a predetermined threshold, and a silent section if it is lower than the threshold. The average power P and average level M of the audio signal in the audio block can be calculated as follows.

Average power: $P = \frac{1}{N} \sum i^2$

Average level: $M = \frac{1}{N} \sum |i|$

Here the summation is taken over the audio block, N is the number of sampling data in the audio block, and i is the signal level (amplitude) of the audio signal.
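The two measures and the silent/sound decision can be sketched as below. The function names and the threshold value are hypothetical; the text says only that a predetermined threshold is used.

```python
def average_power(block):
    """P = (1/N) * sum of squared sample amplitudes in the audio block."""
    return sum(x * x for x in block) / len(block)


def average_level(block):
    """M = (1/N) * sum of absolute sample amplitudes in the audio block."""
    return sum(abs(x) for x in block) / len(block)


def is_sound_section(block, power_threshold):
    """True if the block's average power exceeds the (assumed) threshold."""
    return average_power(block) > power_threshold


block = [1, -1, 2, -2]
p = average_power(block)   # (1 + 1 + 4 + 4) / 4
m = average_level(block)   # (1 + 1 + 2 + 2) / 4
```

Either measure can drive the decision; power weights large amplitudes more heavily than the average level does.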

The feature extraction circuit 14 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The feature extraction circuit 14 extracts features of the audio signal for each audio block from the signal waveform of the supplied audio signal. The features of the audio signal include, for example, the pitch of the audio signal and the frequency characteristics of the audio signal.

The waveform processing circuit 15 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The waveform processing circuit 15 is also supplied with the result of the determination by the level detection circuit 13 as to whether each audio block is a sound section or a silent section, and with the features of the audio signal for each audio block extracted by the feature extraction circuit 14.

The waveform processing circuit 15 performs waveform processing on the basis of the result of determining whether the audio block is a sound section or a silent section and the characteristics of the audio signal, and converts the reproduction speed of the audio signal.

The D/A conversion circuit 16 is supplied with the digital audio signal whose waveform has been processed by the waveform processing circuit 15. The D/A conversion circuit 16 converts the digital audio signal into an analog signal and outputs it.

In the speech speed conversion device 10 having the above-described configuration, when the video tape is reproduced in fast-forward, the waveform processing circuit 15 processes the audio signal so that its pitch does not change, which speeds up the speech rate and enables rapid listening.

The processing contents of the waveform processing circuit 15 will be described in more detail.

The waveform processing circuit 15 deletes audio blocks in silent sections. Further, the waveform processing circuit 15 deletes some of the audio blocks in a run of audio blocks whose audio signals have similar features. At this time, the waveform processing circuit 15 determines the deletion ratio of the audio blocks according to the reproduction speed of the audio signal. That is, the waveform processing circuit 15 determines the number of blocks to be deleted according to the speed of the fast-forward playback. The waveform processing circuit 15 then connects and outputs the remaining audio blocks that have not been deleted, thereby converting the reproduction speed of the audio signal, that is, the speech speed.

Here, if the waveform processing circuit 15 simply connected the remaining audio blocks directly to one another, an auditory problem would arise. Therefore, the waveform processing circuit 15 connects them smoothly as follows.

For example, as shown in FIG. 2, suppose there are temporally continuous audio blocks A, B, and C, and the middle audio block B is to be deleted. If audio block A and audio block C are simply connected, the last part of the signal waveform of audio block A (the signal at time t1) and the earliest part of the signal waveform of audio block C (the signal at time t2) become discontinuous, and this discontinuity may become noise.

Therefore, the waveform processing circuit 15 generates, for example, a waveform connection weighting function fa(t) that is 1 at the earliest part of audio block A (time ta) and 0 at the last part of audio block B (time t2), and multiplies the audio signal of audio block A by it. Likewise, it multiplies the audio signal of audio block C by a waveform connection weighting function fc(t) that is 0 at the earliest part of audio block B (time t1) and 1 at the last part of audio block C (time tc). The weighted audio signals of audio block A and audio block C are then connected.

More specifically, fa(t), which weights the audio signal of audio block A, and fc(t), which weights the audio signal of audio block C, are as follows.

$f_a(t) = -\frac{t}{t_2 - t_a} + \frac{t_2}{t_2 - t_a}$

$f_c(t) = \frac{t}{t_c - t_1} - \frac{t_1}{t_c - t_1}$

Then, letting AB(t) be the audio signal of audio blocks A to B and BC(t) be the audio signal of audio blocks B to C, the signal AC(t) after the waveform connection is as follows.

$AC(t) = AB(t) \cdot f_a(t) + BC(t) \cdot f_c(t)$

By performing processing with the waveform connection weighting functions in this manner, the waveform processing circuit 15 can smoothly connect the audio blocks that have not been deleted, giving an audio signal connection that is relatively free of auditory discomfort.
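The weighting functions can be sketched directly from the formulas. The `crossfade` helper below is an assumption added for illustration: the text does not spell out how AB(t) and BC(t) are aligned in time after block B is removed, so the helper simply overlaps two equal-length segments sample by sample.

```python
def fa(t, ta, t2):
    """Fade-out weight: 1 at t = ta, falling linearly to 0 at t = t2."""
    return -(t / (t2 - ta)) + t2 / (t2 - ta)


def fc(t, t1, tc):
    """Fade-in weight: 0 at t = t1, rising linearly to 1 at t = tc."""
    return (t / (tc - t1)) - t1 / (tc - t1)


def crossfade(fade_out, fade_in):
    """Weighted sum of two equal-length segments (assumed alignment).

    Sample k of the result is fade_out[k] * fa + fade_in[k] * fc, with both
    weights evaluated over the index range 0 .. n - 1.
    """
    n = len(fade_out)
    if n != len(fade_in) or n < 2:
        raise ValueError("segments must have equal length of at least 2")
    return [fade_out[k] * fa(k, 0, n - 1) + fade_in[k] * fc(k, 0, n - 1)
            for k in range(n)]


mixed = crossfade([1.0] * 5, [3.0] * 5)
```

Because the two weights sum to 1 at every index in this sketch, a constant segment at level 1 blends into a constant segment at level 3 without a jump.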

Although a linear function such as f(t) = at + b has been described as an example of the waveform connection weighting function f(t), a quadratic function such as f(t) = at^2 + b or an exponential function such as f(t) = a·exp(−t/τ) + b may also be used.

As shown in FIG. 3, when the point a serving as the connection point of audio block A is connected to the audio signal of audio block C, the waveform processing circuit 15 may, instead of connecting at a point c1 where the sign of the differential coefficient differs from that at point a, delete audio block B so that the connection is made at a point c2 where the differential coefficient has the same sign as at point a.

Next, signal waveforms obtained when an actual voice signal is subjected to speech speed conversion by the speech speed conversion device 10 will be described with reference to the waveform diagrams of FIGS. 4 to 7.
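One way to picture the search for such a connection point is the sketch below. It approximates the differential coefficient by the difference between neighbouring samples and returns the first index in audio block C whose slope sign matches the slope at the end of audio block A. The function names and the fallback to index 0 are assumptions made for illustration.

```python
def slope_sign(samples, k):
    """Sign (+1, 0, or -1) of the forward difference samples[k+1] - samples[k]."""
    d = samples[k + 1] - samples[k]
    return (d > 0) - (d < 0)


def find_matching_start(block_a, block_c):
    """Index in block_c where the slope sign equals that at the end of block_a.

    Falls back to 0 when no such index exists (an assumed policy).
    """
    target = slope_sign(block_a, len(block_a) - 2)
    for k in range(len(block_c) - 1):
        if slope_sign(block_c, k) == target:
            return k
    return 0


# Block A ends while rising; block C starts falling and rises from index 2.
start = find_matching_start([0, 1, 2, 3], [5, 4, 3, 4, 5])
```

Here `start` is 2, so the samples of block C before that index would be dropped together with block B.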

FIG. 4 is a diagram showing the waveform of an input audio signal. FIG. 5 is a diagram showing the waveform of the output audio signal after the speech speed conversion. FIG. 6 is an enlarged view of a part of the waveform of the input audio signal shown in FIG. 4. FIG. 7 is an enlarged view of a part of the waveform of the output audio signal shown in FIG. 5. In FIGS. 4 and 5, one scale division on the horizontal axis indicates 5000 samples, which is about 0.104 seconds. In FIGS. 6 and 7, one scale division on the horizontal axis indicates 1000 samples, which is about 0.0208 seconds. The audio signal has a sampling frequency fs = 48 kHz and 16 quantization bits.

As can be seen from these figures, the pitch of the output voice signal is unchanged from that of the input voice signal even after the waveform deletion, speech speed conversion, and waveform connection processing, and the audio waveforms are connected smoothly.

Here, the X part of the waveforms shown in FIGS. 8 and 9 is a consonant section of the voice signal, and the Y part of the waveform is a vowel section. Comparing the consonant sections of the input speech signal and the output speech signal, it can be seen that the consonant section of the output speech signal after the speech speed conversion processing is shorter. Similarly, comparing the vowel sections of the input speech signal and the output speech signal, it can be seen that the vowel section of the output speech signal after the speech speed conversion processing is shorter.

Therefore, when the speech waveform is deleted by performing the speech speed processing, if the rate of deletion is too large, the contents of the conversation voice cannot be understood.

According to experimental results, for example, when a vowel section of the input voice signal spans about 10 pitch periods (10λ), reducing it to about 1/2 to 1/3 of that length in the output voice signal causes little deterioration, and the conversation content remains easy to understand. The same applies to consonant sections.

As described above, the speech speed converter 10 according to the first embodiment of the present invention detects the signal level of the audio signal and the characteristics of the audio signal from the signal waveform of the audio signal cut out for each audio block. By performing the waveform processing of the audio signal by deleting the audio block, the reproduction speed of the audio signal can be converted without lowering the degree of understanding of the content.

When audio blocks are deleted in the waveform processing circuit 15, for example in double-speed reproduction, it suffices to reduce the number of blocks to 1/2 of the whole; blocks are not deleted beyond what is necessary merely because a silent section is very long. Also, the characteristics of the audio signal may be fed back to a video tape rotation speed controller or the like (not shown) to perform variable-speed playback, in which very fast fast-forward reproduction such as triple or quadruple speed is performed in portions with many silent sections, and fast-forward reproduction is otherwise performed at a speed at which the contents can be understood.
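The block budget described here can be sketched as follows. The rounding rule and the choice to delete the earliest silent blocks first are assumptions; the text states only that the total is reduced to match the speed and that no more blocks than necessary are deleted.

```python
def blocks_to_delete(total_blocks, speed):
    """Number of blocks to remove so that about total_blocks / speed remain."""
    keep = round(total_blocks / speed)
    return total_blocks - keep


def choose_silent_deletions(is_silent, speed):
    """Pick silent blocks for deletion, never exceeding the speed budget.

    is_silent holds one boolean per audio block. Silent blocks are taken in
    order of appearance (an assumed policy).
    """
    budget = blocks_to_delete(len(is_silent), speed)
    silent_indices = [i for i, silent in enumerate(is_silent) if silent]
    return silent_indices[:budget]


# Eight blocks, five of them silent, double-speed playback.
flags = [False, True, True, True, True, True, False, False]
deleted = choose_silent_deletions(flags, speed=2.0)
```

With eight blocks at double speed the budget is four, so one silent block is kept even though five are available.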

The waveform processing circuit 15 may change the method of deleting audio blocks according to the reproduction speed of the fast-forward reproduction. For example, as shown in the table below, the deletion method may be changed for each of low speed (1 to 1.5 times speed), medium speed (1.5 to 2.5 times speed), and high speed (2.5 times speed or more).

[0045]

[Table 1]

If the predetermined speed is not reached by the above processing, then, for example, a vowel section may be deleted at low speed, a consonant section at medium speed, and a part with a low audio level at high speed.
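The three speed ranges and the fallback deletion targets named above can be written as a small lookup. Which range the boundary values 1.5 and 2.5 belong to is an assumption, since the stated ranges overlap at their edges, and the names used here are hypothetical.

```python
# Fallback deletion target per speed range, as listed in the text.
FALLBACK_TARGET = {
    "low": "vowel section",
    "medium": "consonant section",
    "high": "part with a low audio level",
}


def speed_tier(speed):
    """Classify a fast-forward speed factor as low, medium, or high.

    Boundary values are assigned to the higher range (an assumption).
    """
    if speed < 1.0:
        raise ValueError("speed factor must be at least 1.0")
    if speed < 1.5:
        return "low"
    if speed < 2.5:
        return "medium"
    return "high"


target = FALLBACK_TARGET[speed_tier(2.0)]
```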

As for the detection of vowel sections and consonant sections in high-speed reproduction, a vowel section is detected first, and the voice section preceding it is treated as a consonant section. A vowel section is detected by performing pitch detection using an autocorrelation function and treating a section in which a pitch is detected as a vowel section. In this case, besides applying the autocorrelation processing to the audio signal as it is, the pitch may be detected by applying the autocorrelation processing after logarithmic spectrum processing.
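Pitch detection by autocorrelation can be sketched as below. The normalization by the zero-lag value, the 0.5 threshold, and the lag search range are assumptions chosen for the illustration; the text says only that an autocorrelation function is used.

```python
import math


def autocorrelation(block, lag):
    """Sum of products of the block with itself shifted by lag samples."""
    return sum(block[n] * block[n + lag] for n in range(len(block) - lag))


def detect_pitch_lag(block, min_lag, max_lag, threshold=0.5):
    """Lag with the strongest normalized autocorrelation, or None.

    None means no lag exceeded the threshold, i.e. no pitch was found and
    the block would not be treated as a vowel section.
    """
    energy = autocorrelation(block, 0)
    if energy == 0:
        return None
    best_lag, best_val = None, threshold
    for lag in range(min_lag, max_lag + 1):
        val = autocorrelation(block, lag) / energy
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# A pure tone with a period of 50 samples stands in for a voiced sound.
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
lag = detect_pitch_lag(tone, min_lag=20, max_lag=100)
```

For the test tone the strongest peak falls at a lag of 50 samples, the period of the tone; a block of zeros returns None.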

Next, a speech speed conversion device according to a second embodiment of the present invention will be described.

FIG. 8 shows a block diagram of a speech speed converter according to a second embodiment of the present invention. In describing the speech speed conversion device of the second embodiment, the same circuits as those of the speech speed conversion device 10 of the first embodiment are denoted by the same reference numerals in the drawings. Detailed description is omitted. The same applies to the third and subsequent embodiments.

The speech speed conversion device 20 shown in FIG. 8 is used, for example, in an audio output stage of a video tape recorder or the like, and performs speech speed conversion of an audio signal when fast-forward reproduction or the like is performed.

The speech speed conversion device 20 includes an analog/digital (A/D) conversion circuit 11, a waveform extraction circuit 12, a level detection circuit 13, a correlation detection circuit 21, a waveform processing circuit 15, and a digital/analog (D/A) conversion circuit 16. The waveform processing circuit 15 includes a thinning circuit 22 and a waveform connection circuit 23.

The correlation detection circuit 21 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The correlation detection circuit 21 obtains the correlation function between the audio blocks cut out by the waveform extraction circuit 12.

The thinning circuit 22 deletes the audio blocks found to be correlated by the correlation detection circuit 21 and the audio blocks of silent sections found by the level detection circuit 13.

The waveform connection circuit 23 performs connection processing so that the audio waveforms of the audio blocks remaining without being deleted are connected smoothly.

In the speech speed conversion device 20 according to the second embodiment of the present invention, detecting the correlation between waveforms as described above makes it possible to detect portions where the audio signal is similar, and speech speed conversion can be performed by deleting the detected similar portions of the audio signal.
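A rough sketch of correlation-based thinning follows. It uses the normalized correlation at zero lag between each audio block and the one before it, with a hypothetical threshold of 0.9; the text does not specify the correlation measure or the threshold in this detail.

```python
import math


def normalized_correlation(x, y):
    """Zero-lag correlation of two blocks, scaled to the range -1 .. 1."""
    n = min(len(x), len(y))
    num = sum(x[k] * y[k] for k in range(n))
    den = math.sqrt(sum(v * v for v in x[:n]) * sum(v * v for v in y[:n]))
    return num / den if den else 0.0


def correlated_block_indices(blocks, threshold=0.9):
    """Indices of blocks that closely resemble the block just before them."""
    return [i for i in range(1, len(blocks))
            if normalized_correlation(blocks[i - 1], blocks[i]) >= threshold]


blocks = [[1, 2, 3], [1, 2, 3], [3, -2, 1], [3, -2, 1]]
candidates = correlated_block_indices(blocks)
```

Blocks 1 and 3 repeat their predecessors and become deletion candidates, while block 2 differs from block 1 and is kept.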

Next, a speech speed conversion device according to a third embodiment of the present invention will be described.

FIG. 9 shows a block diagram of a speech speed converter according to a third embodiment of the present invention.

The speech speed conversion device 30 shown in FIG. 9 is used, for example, in an audio output stage of a video tape recorder or the like, and performs speech speed conversion of an audio signal when fast-forward reproduction or the like is performed.

The speech speed conversion device 30 includes an analog/digital (A/D) conversion circuit 11, a waveform extraction circuit 12, a level detection circuit 13, a level comparison circuit 31, a waveform processing circuit 15, and a digital/analog (D/A) conversion circuit 16.

The level comparison circuit 31 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The level comparison circuit 31 detects the level of the audio signal for each audio block.

The thinning circuit 32 deletes audio blocks in silent sections and, based on the per-block level detection result of the level comparison circuit 31, deletes some of the audio blocks when audio blocks of the same level continue for a predetermined number of blocks. The number of blocks to be deleted may be determined according to the reproduction speed or the like. Alternatively, among consecutive audio blocks of the same level, the two audio blocks adjacent to audio blocks of a different level may be left and the other audio blocks deleted.

The waveform connection circuit 23 performs connection processing so that the audio blocks remaining without being deleted are connected smoothly to one another.

For example, as shown in (5), when audio blocks A to F are input to the speech speed conversion device 30 and audio blocks B, C, and D have the same level, audio block C is deleted and waveform connection processing is performed between B and D. When audio blocks B, C, D, and E have the same level, audio blocks C and D are deleted and waveform connection processing is performed between B and E.

In the speech speed conversion device 30 according to the third embodiment of the present invention, detecting the voice level in this manner makes it possible to detect portions where the voice signal is similar, and speech speed conversion can be performed by deleting the detected similar portions of the voice signal.
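The rule of keeping the two end blocks of an equal-level run and deleting the rest can be sketched as follows. Representing each block by a single quantized level value and using a minimum run length of three are assumptions; the text says only that the run must reach a predetermined number of blocks.

```python
def interior_of_equal_runs(levels, min_run=3):
    """Indices of blocks lying strictly inside runs of equal level.

    levels holds one (already quantized) level per audio block. The first
    and last block of each qualifying run are kept.
    """
    delete = []
    start = 0
    for i in range(1, len(levels) + 1):
        # A run ends at the end of the list or where the level changes.
        if i == len(levels) or levels[i] != levels[start]:
            if i - start >= min_run:
                delete.extend(range(start + 1, i - 1))
            start = i
    return delete


# Blocks A..F are indices 0..5. B, C, D share a level in the first case.
case_bcd = interior_of_equal_runs([1, 2, 2, 2, 3, 4])
# B, C, D, E share a level in the second case.
case_bcde = interior_of_equal_runs([1, 2, 2, 2, 2, 4])
```

The first case marks only index 2 (block C) for deletion and the second marks indices 2 and 3 (blocks C and D), matching the two examples in the text.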

Next, a speech speed converter according to a fourth embodiment of the present invention will be described.

FIG. 11 is a block diagram showing a speech speed converter according to a fourth embodiment of the present invention.

The speech speed conversion device 40 shown in FIG. 11 is used, for example, in an audio output stage of a video tape recorder or the like, and performs speech speed conversion of an audio signal when fast-forward reproduction or the like is performed.

The speech speed conversion device 40 includes an analog/digital (A/D) conversion circuit 11, a waveform extraction circuit 12, a level detection circuit 13, a frequency analysis circuit 41, a peak frequency detection circuit 42, a peak frequency continuity detection circuit 43, a waveform processing circuit 15, and a digital/analog (D/A) conversion circuit 16.

The frequency analysis circuit 41 is supplied, for each audio block, with the audio signal whose signal waveform has been cut out by the waveform extraction circuit 12. The frequency analysis circuit 41 performs a frequency analysis of the audio signal for each audio block.

The peak frequency detection circuit 42 detects a peak frequency and a second peak frequency based on the analysis result of the frequency analysis circuit 41.

The peak frequency continuity detection circuit 43 detects, based on the detection result of the peak frequency detection circuit 42, whether or not audio blocks having the same peak frequency continue.

The thinning circuit 44 deletes audio blocks in silent sections and, based on the detection result of the peak frequency continuity detection circuit 43, deletes some of the audio blocks when a predetermined number of audio blocks having the same peak frequency continue. The number of blocks to be deleted may be determined according to the reproduction speed or the like. Alternatively, among consecutive audio blocks having the same peak frequency, the two audio blocks adjacent to audio blocks having a different peak frequency may be left and the other audio blocks deleted.

The waveform connection circuit 23 performs connection processing so that the audio waveforms of the audio blocks remaining without being deleted are connected smoothly.
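Peak frequency detection per audio block can be sketched with a plain discrete Fourier transform. The naive O(N^2) transform, the 64-sample block, and the helper names are assumptions for illustration only; the text does not say which frequency analysis method is used.

```python
import cmath
import math


def peak_frequency_bin(block):
    """Index (1 .. N/2) of the DFT bin with the largest magnitude."""
    n = len(block)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        acc = sum(block[m] * cmath.exp(-2j * math.pi * k * m / n)
                  for m in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def bin_to_hz(bin_index, block_len, sample_rate_hz):
    """Centre frequency of a DFT bin in hertz."""
    return bin_index * sample_rate_hz / block_len


def tone(cycles, n=64):
    """Test block holding a whole number of sine cycles."""
    return [math.sin(2 * math.pi * cycles * m / n) for m in range(n)]


# Three blocks: the first two share a peak, the third does not.
bins = [peak_frequency_bin(b) for b in (tone(4), tone(4), tone(9))]
same_as_previous = [bins[i] == bins[i - 1] for i in range(1, len(bins))]
```

Runs of blocks for which `same_as_previous` stays true correspond to the continuity that the peak frequency continuity detection circuit 43 looks for.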

In the speech speed conversion device 40 according to the fourth embodiment of the present invention, detecting the peak frequency in this manner makes it possible to detect portions where the voice signal is similar, and speech speed conversion can be performed by deleting the detected similar portions of the audio signal.

Next, a fifth embodiment of the present invention will be described. The fifth embodiment is a disk reproducing apparatus to which the speech speed converter according to the first to fourth embodiments of the present invention is applied.

FIG. 12 is a block diagram showing a disk reproducing apparatus according to a fifth embodiment of the present invention.

The disk reproducing apparatus 50 shown in FIG. 12 reproduces, for example, an optical disk or the like on which video and audio are digitally recorded. When the user performs a fast-forward reproduction operation or the like, it reproduces the video signal at high speed and converts the speech speed of the audio signal.

The disk reproducing apparatus 50 includes a reproduction processing circuit 52 that reads digital data from an optical disk 51, and a demultiplexer 53 that separates the digital data read by the reproduction processing circuit 52 into a video signal and an audio signal.

The disk reproducing device 50 is supplied with the video signal separated by the demultiplexer 53, and performs a video signal processing circuit 54 for performing a decoding process, an error correction process, and the like of the video signal. The video reproduction system includes an image processing circuit 55 that performs a thinning process and the like, and a video D / A conversion circuit 56 that converts a video signal processed by the image processing circuit 44 into an analog signal and outputs the analog signal. In this video reproduction system, when the user performs a fast-forward reproduction operation, the image processing circuit 55 performs a frame thinning process or the like, and outputs a video signal at a predetermined reproduction speed.

As its audio reproduction system, the disk reproducing apparatus 50 includes an audio signal processing circuit 57 that receives the audio signal separated by the demultiplexer 53, performs decoding, error correction, and the like on it, and cuts its waveform into audio blocks of a predetermined time unit; a level detection circuit 13; a feature extraction circuit 14; a waveform processing circuit 15; and an audio D/A conversion circuit 16.

The disk reproducing apparatus 50 also has a system controller 58 that controls each circuit and accepts operation input from the user, and a servo control circuit 59 that performs servo control of the optical disk 51 under the control of the system controller 58.

In the disk reproducing apparatus 50 configured in this way, when the user performs a fast-forward reproduction operation, the image processing circuit 55 processes the image data and outputs a fast-forward reproduced image. In addition, the waveform processing circuit 15 deletes silent audio blocks and audio blocks whose features are continuous, so that an audio signal with a reproduction speed corresponding to the fast-forward reproduced image is output. Further, the system controller 58 controls the rotation speed of the optical disk 51 via the servo control circuit 59, based on the determination by the level detection circuit 13 of whether each audio block is silent or voiced and on the features of the audio signal from the feature extraction circuit 14. As a result, for example, a portion with many silent sections can be fast-forwarded very quickly, at triple or quadruple speed, while a voiced section can be fast-forwarded at a speed at which the content of the audio signal can still be understood.
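The speed control described above can be illustrated with a minimal sketch. It assumes each audio block has already been classified as silent or voiced and as having continuous features or not. The function name `choose_playback_speed` and the specific multipliers (4.0, 3.0, 1.5) are hypothetical values chosen for illustration; the text only mentions triple or quadruple speed for silent portions and an understandable speed for voiced ones.

```python
def choose_playback_speed(is_silent, features_continuous,
                          silent_speed=4.0, redundant_speed=3.0,
                          speech_speed=1.5):
    """Pick a playback-speed multiplier for one audio block.

    All three multipliers are illustrative assumptions, not values
    taken from the patent text.
    """
    if is_silent:
        # Nothing to understand in silence, so skip through it quickly.
        return silent_speed
    if features_continuous:
        # A sustained sound carries little new information per block.
        return redundant_speed
    # Voiced block with changing features: keep it intelligible.
    return speech_speed


# Each tuple is (is_silent, features_continuous) for one block.
blocks = [(True, False), (True, False), (False, True), (False, False)]
speeds = [choose_playback_speed(s, c) for s, c in blocks]
print(speeds)
```

In an actual player the chosen multiplier would be passed on to whatever controls the read rate of the medium; that part is outside this sketch.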

In the disk reproducing apparatus 50 according to the fifth embodiment of the present invention, the speech speed can thus be converted for an audio signal recorded on the optical disk 51 as well.

The first to fifth embodiments of the present invention have been described. In the speech speed conversion device and the disk reproducing apparatus of each embodiment, in addition to removing silent portions, features of the audio signal such as the waveform correlation, the peak frequency characteristic, and the level change are detected, and portions where a feature is continuous are deleted. As a result, the reproduction speed of the audio signal can be changed without changing the voice pitch from that of normal reproduction.
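The deletion scheme summarized above can be sketched as follows. This is a simplified illustration, not the circuits of the embodiments: it assumes silence is judged by average power against a threshold and that "continuous features" means a high normalized correlation with the immediately preceding block. The function names and both threshold values are hypothetical, and blocks are simply concatenated without the smoothing a real implementation would apply at the joins.

```python
import math


def average_power(block):
    """Mean squared amplitude of one block."""
    return sum(x * x for x in block) / len(block)


def normalized_correlation(a, b):
    """Correlation of two equal-length blocks, scaled to [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def fast_playback(samples, block_len,
                  power_threshold=1e-4, corr_threshold=0.95):
    """Shorten a signal by dropping silent and repetitive blocks.

    A trailing partial block is discarded. Thresholds are assumptions.
    """
    kept = []
    prev = None  # previous voiced block, used for the similarity check
    for i in range(0, len(samples) - block_len + 1, block_len):
        block = samples[i:i + block_len]
        if average_power(block) <= power_threshold:
            prev = None  # silent block: delete it and reset the history
            continue
        if prev is not None and \
                normalized_correlation(prev, block) >= corr_threshold:
            prev = block  # same feature continues: delete this block
            continue
        kept.extend(block)
        prev = block
    return kept


silence = [0.0, 0.0, 0.0, 0.0]
tone_a = [1.0, -1.0, 1.0, -1.0]
tone_b = [1.0, 1.0, -1.0, -1.0]
shortened = fast_playback(silence + tone_a + tone_a + tone_b, block_len=4)
print(len(shortened))
```

Because whole blocks are removed rather than resampled, the samples that remain are played at their original rate, which is why the pitch is unchanged.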

In each embodiment, an example has been described in which high-speed reproduction is realized by deleting part of the audio signal. Conversely, low-speed reproduction slower than normal speed can be realized by, for example, adding a further silent portion when a silent portion is detected, or adding a signal with similar features when a portion with continuous similar features is detected. Specifically, this can be realized by having the waveform processing circuit 15 add audio blocks and connect the added audio blocks.
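Low-speed reproduction by adding blocks can be sketched in the same style. The sketch assumes a caller-supplied predicate decides which blocks may be repeated (for example, silent blocks). The names `slow_playback` and `is_repeatable` and the `repeats` parameter are hypothetical, and the added blocks are joined by plain concatenation.

```python
def slow_playback(samples, block_len, is_repeatable, repeats=1):
    """Lengthen a signal by repeating blocks that are safe to repeat.

    is_repeatable is a caller-supplied test on one block; which blocks
    qualify (silence, sustained sounds) is an assumption left to it.
    """
    out = []
    for i in range(0, len(samples), block_len):
        block = samples[i:i + block_len]
        out.extend(block)
        # Only full-length blocks are repeated; a short tail passes through.
        if len(block) == block_len and is_repeatable(block):
            for _ in range(repeats):
                out.extend(block)
    return out


def is_silent(block):
    """Treat a block as silent if every sample is near zero."""
    return all(abs(x) < 1e-3 for x in block)


stretched = slow_playback([0, 0, 1, 2, 0, 0], block_len=2,
                          is_repeatable=is_silent)
print(stretched)
```

Repeating a block with similar features, as the text also suggests, would use a predicate that compares each block with its neighbor instead of testing for silence.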

In these embodiments, examples in which the speed of an audio signal is converted have been described. However, the present invention is not limited to audio signals and may also be applied to, for example, an image signal.

[0087]

According to the present invention, a signal level and a feature are detected from the signal waveform of a reproduced signal cut out for each predetermined time unit, and the waveform of the reproduced signal is processed by deleting and/or adding the signal waveform for each predetermined time unit, thereby converting the time axis of the reproduced signal. As a result, in the present invention, the time axis can be converted without deleting the effective portion of the signal.

Further, in the present invention, the signal level and the feature are detected from the signal waveform of an audio signal cut out for each predetermined time unit, and the waveform of the audio signal is processed by deleting and/or adding the signal waveform for each predetermined time unit, thereby converting the reproduction speed of the audio signal. Thus, in the present invention, with a simple configuration, fluctuations in the voice pitch can be eliminated, and the reproduction speed of the audio signal can be converted without lowering the degree of understanding of the contents.
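When a block is deleted, the blocks on either side of it must be joined. The description of FIG. 2 and claims 3, 11, 18, and 25 refer to connecting them with a weighting function. The sketch below assumes a linear crossfade as that weighting function, which is only one possible choice; the function name `crossfade_join` and the `overlap` parameter are hypothetical.

```python
def crossfade_join(before, after, overlap):
    """Join two blocks by fading one out while fading the other in.

    The last `overlap` samples of `before` are mixed with the first
    `overlap` samples of `after` using linear weights (an assumption).
    The result is `overlap` samples shorter than plain concatenation.
    """
    if overlap <= 0 or overlap > min(len(before), len(after)):
        raise ValueError("overlap must fit inside both blocks")
    head = list(before[:len(before) - overlap])
    tail = list(after[overlap:])
    mixed = []
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)  # weight of `after`, rising toward 1
        b = before[len(before) - overlap + n]
        mixed.append((1.0 - w) * b + w * after[n])
    return head + mixed + tail


joined = crossfade_join([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], overlap=3)
print(joined)
```

The gradual weights avoid the abrupt step that plain concatenation would produce at the join, which would otherwise be heard as a click.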

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech speed conversion device according to a first embodiment of the present invention.

FIG. 2 is a diagram for explaining connection processing of an audio signal using a weighting function by a waveform processing circuit of the speech speed conversion device.

FIG. 3 is a diagram for explaining connection processing of an audio signal using a differential code by a waveform processing circuit of the speech speed conversion device.

FIG. 4 is a waveform diagram showing an example of an audio signal input to the speech speed conversion device.

FIG. 5 is a waveform diagram showing an audio signal after the audio signal shown in FIG. 4 is subjected to speech speed conversion by the speech speed conversion device.

FIG. 6 is an enlarged view of the audio signal shown in FIG.

FIG. 7 is an enlarged view of the audio signal shown in FIG.

FIG. 8 is a block diagram of a speech speed conversion device according to a second embodiment of the present invention.

FIG. 9 is a block diagram of a speech speed conversion device according to a third embodiment of the present invention.

FIG. 10 is a diagram for explaining an audio signal deletion process performed by the waveform processing circuit of the speech speed conversion device according to the third embodiment.

FIG. 11 is a block diagram of a speech speed conversion device according to a fourth embodiment of the present invention.

FIG. 12 is a block diagram of a disc reproducing apparatus according to a fifth embodiment of the present invention.

[Explanation of symbols]

10, 20, 30, 40 speech rate converter, 12 waveform extraction circuit, 13 level detection circuit, 14 feature extraction circuit, 15 waveform processing circuit, 21 correlation detection circuit, 31 level comparison circuit, 41 frequency analysis circuit, 42 peak frequency detection circuit, 43 peak frequency continuity detection circuit, 50 disc playback device

Claims (30)

[Claims]
1. A signal reproducing apparatus comprising: reproducing means for reproducing a signal from a recording medium; signal extracting means for extracting a signal waveform of the reproduced signal reproduced by the reproducing means in predetermined time units; level detecting means for detecting a signal level of the signal waveform extracted by the signal extracting means; feature extracting means for extracting a feature of the signal waveform extracted by the signal extracting means; and time axis converting means for processing the waveform of the reproduced signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the signal level of the signal waveform in the predetermined time unit detected by the level detecting means and the feature of the signal waveform in the predetermined time unit extracted by the feature extracting means, thereby converting the time axis of the reproduced signal.
2. The signal reproducing apparatus according to claim 1, further comprising control means for controlling reproduction of the recording medium according to the signal level of the signal waveform in the predetermined time unit detected by the level detecting means and the feature of the signal waveform in the predetermined time unit extracted by the feature extracting means.
3. The signal reproducing apparatus according to claim 1, wherein the time axis converting means connects the signal waveforms before and after a deleted signal waveform by performing waveform processing using a weighting function.
4. The signal reproducing apparatus according to claim 1, wherein the time axis converting means connects the signal waveforms before and after a deleted signal waveform based on a differential value of the signal waveform.
5. The signal reproducing apparatus according to claim 1, wherein the level detecting means detects the signal level from an average power and/or an average level of the signal waveform for each predetermined time unit, and the time axis converting means deletes and/or adds, for each predetermined time unit, a signal waveform whose signal level is equal to or less than a predetermined threshold value.
6. The signal reproducing apparatus according to claim 1, wherein the feature extracting means extracts the feature of the signal waveform from a waveform correlation of the signal waveform for each predetermined time unit, and the time axis converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
7. The signal reproducing apparatus according to claim 1, wherein the feature extracting means frequency-analyzes the signal waveform for each predetermined time unit and extracts the feature of the signal waveform from the continuity of a peak frequency, and the time axis converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
8. The signal reproducing apparatus according to claim 1, wherein the feature extracting means extracts the feature of the signal waveform from a level change of the signal waveform for each predetermined time unit, and the time axis converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
9. A signal reproducing method comprising: reproducing a signal from a recording medium; cutting out a signal waveform of the reproduced signal in predetermined time units; detecting a signal level of the cut-out signal waveform; extracting a feature of the cut-out signal waveform; and processing the waveform of the reproduced signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the detected signal level of the signal waveform in the predetermined time unit and the extracted feature of the signal waveform in the predetermined time unit, thereby converting the time axis of the reproduced signal.
10. The signal reproducing method according to claim 9, wherein reproduction of the recording medium is controlled according to the detected signal level of the signal waveform in the predetermined time unit and the extracted feature of the signal waveform in the predetermined time unit.
11. The signal reproducing method according to claim 9, wherein the signal waveforms before and after a deleted signal waveform are waveform-processed using a weighting function and connected to convert the time axis of the reproduced signal.
12. The signal reproducing method according to claim 9, wherein the signal waveforms before and after a deleted signal waveform are connected based on a differential value of the signal waveform to convert the time axis of the reproduced signal.
13. The signal reproducing method according to claim 9, wherein the signal level is detected from an average power and/or an average level of the signal waveform for each predetermined time unit, and a signal waveform whose signal level is equal to or less than a predetermined threshold value is deleted and/or added for each predetermined time unit.
14. The signal reproducing method according to claim 9, wherein the feature of the signal waveform is extracted from a waveform correlation of the signal waveform for each predetermined time unit, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
15. The signal reproducing method according to claim 9, wherein the signal waveform is frequency-analyzed for each predetermined time unit, the feature of the signal waveform is extracted from the continuity of a peak frequency, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
16. The signal reproducing method according to claim 9, wherein the feature of the signal waveform is extracted from a level change of the signal waveform for each predetermined time unit, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
17. An audio signal reproducing apparatus comprising: signal extracting means for extracting a signal waveform of an audio signal in predetermined time units; level detecting means for detecting a signal level of the signal waveform extracted by the signal extracting means; feature extracting means for extracting a feature of the signal waveform extracted by the signal extracting means; and speed converting means for processing the waveform of the audio signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the signal level of the signal waveform in the predetermined time unit detected by the level detecting means and the feature of the signal waveform in the predetermined time unit extracted by the feature extracting means, thereby converting the reproduction speed of the audio signal.
18. The audio signal reproducing apparatus according to claim 17, wherein the speed converting means connects the signal waveforms before and after a deleted signal waveform by performing waveform processing using a weighting function.
19. The audio signal reproducing apparatus according to claim 17, wherein the speed converting means connects the signal waveforms before and after a deleted signal waveform based on a differential value of the signal waveform.
20. The audio signal reproducing apparatus according to claim 17, wherein the level detecting means detects the signal level from an average power and/or an average level of the signal waveform for each predetermined time unit, and the speed converting means deletes and/or adds, for each predetermined time unit, a signal waveform whose signal level is equal to or less than a predetermined threshold value.
21. The audio signal reproducing apparatus according to claim 17, wherein the feature extracting means extracts the feature of the signal waveform from a waveform correlation of the signal waveform for each predetermined time unit, and the speed converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
22. The audio signal reproducing apparatus according to claim 17, wherein the feature extracting means frequency-analyzes the signal waveform for each predetermined time unit and extracts the feature of the signal waveform from the continuity of a peak frequency, and the speed converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
23. The audio signal reproducing apparatus according to claim 17, wherein the feature extracting means extracts the feature of the signal waveform from a level change of the signal waveform for each predetermined time unit, and the speed converting means deletes and/or adds, for each predetermined time unit, signal waveforms having similar features.
24. A speed conversion method for an audio signal, comprising: cutting out a signal waveform of the audio signal in predetermined time units; detecting a signal level of the cut-out signal waveform; extracting a feature of the cut-out signal waveform; and processing the waveform of the audio signal by deleting and/or adding the signal waveform for each predetermined time unit, based on the detected signal level of the signal waveform in the predetermined time unit and the extracted feature of the signal waveform in the predetermined time unit, thereby converting the reproduction speed of the audio signal.
25. The speed conversion method for an audio signal according to claim 24, wherein the signal waveforms before and after a deleted signal waveform are waveform-processed using a weighting function and connected to convert the reproduction speed of the audio signal.
26. The speed conversion method for an audio signal according to claim 24, wherein the signal waveforms before and after a deleted signal waveform are connected based on a differential value of the signal waveform to convert the reproduction speed of the audio signal.
27. The speed conversion method for an audio signal according to claim 24, wherein the signal level is detected from an average power and/or an average level of the signal waveform for each predetermined time unit, and a signal waveform whose signal level is equal to or less than a predetermined threshold value is deleted and/or added for each predetermined time unit.
28. The speed conversion method for an audio signal according to claim 24, wherein the feature of the signal waveform is extracted from a waveform correlation of the signal waveform for each predetermined time unit, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
29. The speed conversion method for an audio signal according to claim 24, wherein the signal waveform is frequency-analyzed for each predetermined time unit, the feature of the signal waveform is extracted from the continuity of a peak frequency, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
30. The speed conversion method for an audio signal according to claim 24, wherein the feature of the signal waveform is extracted from a level change of the signal waveform for each predetermined time unit, and signal waveforms having similar features are deleted and/or added for each predetermined time unit.
JP27024498A 1998-09-24 1998-09-24 Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal Pending JP2000099097A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP27024498A JP2000099097A (en) 1998-09-24 1998-09-24 Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP27024498A JP2000099097A (en) 1998-09-24 1998-09-24 Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal

Publications (1)

Publication Number Publication Date
JP2000099097A true JP2000099097A (en) 2000-04-07

Family

ID=17483564

Family Applications (1)

Application Number Title Priority Date Filing Date
JP27024498A Pending JP2000099097A (en) 1998-09-24 1998-09-24 Signal reproducing device and method, voice signal reproducing device, and speed conversion method for voice signal

Country Status (1)

Country Link
JP (1) JP2000099097A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003019535A1 (en) * 2000-02-28 2003-03-06 Linguamaster Corporation Data structure, generating method, reproducing method, recording method, recording medium and distributing method of voice data, and reproducing method of multimedia
KR100656968B1 (en) 2003-05-27 2006-12-13 가부시끼가이샤 도시바 Speech rate conversion apparatus, method and computer-readable record medium thereof
US7418393B2 (en) 2000-05-26 2008-08-26 Fujitsu Limited Data reproduction device, method thereof and storage medium
US8153879B2 (en) 2005-04-15 2012-04-10 Sony Corporation Data processing apparatus, data reproduction apparatus, data processing method and data processing program
US8391669B2 (en) 2009-06-04 2013-03-05 Canon Kabushiki Kaisha Video processing apparatus and video processing method
JP2013148654A (en) * 2012-01-18 2013-08-01 Nippon Hoso Kyokai <Nhk> Speech speed conversion device and program thereof, and recording medium recording program

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20050617

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20071030

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20071106

A521 Written amendment

Effective date: 20071225

Free format text: JAPANESE INTERMEDIATE CODE: A523

A02 Decision of refusal

Effective date: 20081224

Free format text: JAPANESE INTERMEDIATE CODE: A02