EP1136981A1

EP1136981A1 - Method of sampling pitch period of voice signal and device for time-axis compression/decompression of voice signal

Info

Publication number: EP1136981A1
Application number: EP99944864A
Authority: EP
Inventors: Takeo INOUE
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1998-09-29
Filing date: 1999-09-27
Publication date: 2001-09-26
Also published as: JP2000305581A; WO2000019407A1; JP3639461B2; CN1320256A; CN1158640C; EP1136981A4; CA2345712A1

Abstract

In a voice signal pitch period detecting method for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, a voice signal pitch period detecting method is characterized by reducing, when the detected pitch period is not more than a predetermined reference value, the number of times of pitch period detecting processing by considering the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected the same as the currently detected pitch period.

Description

The present invention relates generally to a voice signal pitch period detecting method, a voice signal pitch period detecting device, a voice signal time-axis compressing device, a voice signal time-axis decompressing device, and a voice signal time-axis compressing/decompressing device.

When a voice signal is compressed or decompressed, the pitch period of the voice waveform must be found. The pitch period generally represents the height (pitch) of the voice. One pitch period detecting method utilizes auto-correlation.
An example of the pitch period detecting method using auto-correlation is a method of finding auto-correlation (short-time auto-correlation), assuming that a signal is time-limited, that is, a signal exists only within a section having a time period Ts, while no signal exists (always zero) outside the section having the time period Ts. When a voice waveform is represented by digital voice data x(n), a short-time auto-correlation value Rn(k) is as follows, as also described in "Digital Signal Processing of Voice" (First Volume), written by L. R. Rabiner & R. W. Schafer, translated by Suzuki Hisayoshi, and issued by CORONA PUBLISHING CO., LTD., pp. 152 to 152.
$$R_n(k) = \sum_{m=0}^{T_s-1-k} x(n+m)\,x(n+m+k)$$
where m = 0, 1, 2, ..., Ts-1-k
In this case, Ts denotes the time section in which a voice signal is assumed to exist, k denotes the delay time period by which the voice waveform is delayed when the short-time auto-correlation value Rn(k) is calculated, and the relationship Ts >> k holds. The value of k at which the short-time auto-correlation value Rn(k) is the maximum is the pitch period.
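The search for the delay k that maximizes Rn(k) can be sketched as follows. This is a minimal illustration only, assuming a rectangular section of length Ts starting at sample n and a lag search range k_min to k_max; the function names, the search range, and the test signal are hypothetical and are not taken from the specification.

```python
def short_time_autocorr(x, n, k, ts):
    """Rn(k): sum of x[n+m] * x[n+m+k] for m = 0 .. ts-1-k."""
    return sum(x[n + m] * x[n + m + k] for m in range(ts - k))


def detect_pitch_period(x, n, ts, k_min, k_max):
    """Return the lag k in [k_min, k_max] at which Rn(k) is the maximum.

    k_min and k_max are assumed bounds on the pitch period (in samples);
    the caller must ensure n + ts <= len(x).
    """
    return max(range(k_min, k_max + 1),
               key=lambda k: short_time_autocorr(x, n, k, ts))


# Hypothetical test signal: an impulse train that repeats every 50 samples.
x = [1.0 if i % 50 == 0 else 0.0 for i in range(400)]
period = detect_pitch_period(x, n=0, ts=200, k_min=20, k_max=100)
print(period)  # 50
```

Because the sum runs over fewer terms as k grows, lags at multiples of the true period score lower than the period itself in this example, so the smallest repeating lag is selected.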
A time period Ts for detecting pitch periods is generally set to twice the estimated maximum pitch (i.e., the longest pitch period). Generally, the pitch period of the input voice waveform is detected by taking two pitch periods on the basis of an input voice waveform of a time length corresponding to the time period Ts.
In the case of a waveform having a long pitch period, therefore, even if the pitch period is detected by taking two pitch periods for each time period Ts, as shown in Fig. 14(b), the time period during which the time periods Ts(1) and Ts(2) for detecting pitch periods overlap is short.
In the case of a waveform having a short pitch period, however, the time period during which the time periods Ts(1) and Ts(2) (and Ts(2) and Ts(3)) for detecting pitch periods overlap is lengthened, as shown in Fig. 14(a). The reason for this is that the time period Ts for detecting pitch periods is set to twice the estimated maximum pitch.
In the case of a waveform having a short pitch period, therefore, the number of times of pitch period detecting processing per unit time (the number of times a correlation value is calculated) is larger than in the case of a waveform having a long pitch period. Accordingly, the processing load on the processing means (a processor) that performs the pitch period detecting processing is heavy.
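The difference in processing load can be illustrated with a small count. The sketch below assumes that each detection pass uses a section of length Ts and that the next section starts two detected pitch periods later, as described for Fig. 14; the sample counts (Ts = 320, pitch periods of 40 and 150 samples, 8000 samples of input) are hypothetical values chosen only for illustration.

```python
def detection_stats(total_samples, ts, pitch, periods_per_pass=2):
    """Count detection passes and the overlap between consecutive sections.

    Each pass examines ts samples, then the start advances by
    periods_per_pass * pitch samples (assumed constant pitch).
    """
    hop = periods_per_pass * pitch
    passes = 0
    start = 0
    while start + ts <= total_samples:
        passes += 1
        start += hop
    overlap = max(ts - hop, 0)  # samples shared by consecutive sections
    return passes, overlap


print(detection_stats(8000, ts=320, pitch=40))   # (97, 240)
print(detection_stats(8000, ts=320, pitch=150))  # (26, 20)
```

With these assumed numbers, the short pitch period requires roughly four times as many correlation passes over the same input, and consecutive sections overlap by 240 of 320 samples instead of 20.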
The voice of a human being may, in some cases, be composed of a waveform which is repeated at the same pitch period. In the case of a voice composed of a waveform having a short pitch period (for example, a high voice of a woman or the like), the number of waveforms having the same pitch period within a predetermined time period is larger, as compared with a voice composed of a waveform having a long pitch period (for example, a low voice of a man or the like).
It has been found that, in the case of a voice composed of a waveform having a short pitch period, decreasing the number of times of pitch period detecting processing per unit time has little effect.
The present invention has been made on the basis of such a viewpoint and its object is to provide a voice signal pitch period detecting method and a voice signal pitch period detecting device, a voice signal time-axis compressing device, a voice signal time-axis decompressing device, and a voice signal time-axis compressing/decompressing device which can reduce a processing load and shorten a processing time period.

In a voice signal pitch period detecting method for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, a first voice signal pitch period detecting method according to the present invention is characterized by reducing, when the detected pitch period is not more than a predetermined reference value, the number of times of pitch period detecting processing by considering the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected the same as the currently detected pitch period.
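The first method can be sketched as a loop that skips every other detection pass while the detected pitch period stays at or below the reference value. This is a minimal sketch under stated assumptions, not the claimed implementation: the detector is a plain short-time auto-correlation maximum search, two pitch periods are consumed per pass, and the names, lag range, reference value, and test signal are all hypothetical.

```python
def detect(x, n, ts, k_min, k_max):
    """Lag in [k_min, k_max] that maximizes the short-time auto-correlation."""
    def r(k):
        return sum(x[n + m] * x[n + m + k] for m in range(ts - k))
    return max(range(k_min, k_max + 1), key=r)


def track_pitch(x, ts, k_min, k_max, reference, periods=2):
    """Return (segments, calls); each segment is (start, pitch, reused)."""
    segments, calls, pos = [], 0, 0
    while pos + ts <= len(x):
        pitch = detect(x, pos, ts, k_min, k_max)
        calls += 1
        segments.append((pos, pitch, False))
        pos += periods * pitch
        # Short pitch period: treat the next `periods` pitch periods as having
        # the same pitch period and omit the detection pass for them.
        if pitch <= reference and pos + periods * pitch <= len(x):
            segments.append((pos, pitch, True))
            pos += periods * pitch
    return segments, calls


# Hypothetical input: impulse train with a pitch period of 40 samples.
x = [1.0 if i % 40 == 0 else 0.0 for i in range(1000)]
_, calls_with_skip = track_pitch(x, ts=320, k_min=20, k_max=160, reference=80)
_, calls_without_skip = track_pitch(x, ts=320, k_min=20, k_max=160, reference=0)
print(calls_with_skip, calls_without_skip)  # 5 9
```

Setting the reference value to 0 disables the skipping, which makes the reduction in the number of detection passes directly visible for the same input.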
In a voice signal pitch period detecting method for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, a second voice signal pitch period detecting method according to the present invention is characterized by judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, and reducing, when it is judged that the detected pitch period is short, the number of times of pitch period detecting processing by considering the pitch period of a waveform of the predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected the same as the detected pitch period.
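The long/short judgment of the second method can be sketched as a ratio test. The threshold of 0.25 used below is a hypothetical value chosen only for illustration; the text above does not specify the ratio at which a pitch period is judged to be short, and the function names are likewise illustrative.

```python
def is_short_pitch(pitch, ts, ratio_threshold=0.25):
    """Judge the pitch period short when pitch / ts is at or below the threshold.

    ratio_threshold is an assumed value, not one given in the specification.
    """
    return pitch / ts <= ratio_threshold


def next_step(pitch, ts, ratio_threshold=0.25):
    """Decide how the next group of pitch periods is handled."""
    if is_short_pitch(pitch, ts, ratio_threshold):
        return "reuse"   # consider the next pitch periods the same; omit detection
    return "detect"      # run pitch period detecting processing again


print(next_step(40, 320))   # reuse
print(next_step(150, 320))  # detect
```

Expressing the test as a ratio to the predetermined time period keeps the judgment independent of the absolute section length in samples.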
A first voice signal pitch period detecting device according to the present invention is characterized by comprising first means for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period; second means for judging whether or not the detected pitch period is not more than a predetermined reference value; third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected; and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A second voice signal pitch period detecting device according to the present invention is characterized by comprising first means for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period; second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period; third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected; and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A first voice signal time-axis compressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; and time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A second voice signal time-axis compressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; and time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A first voice signal time-axis decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; and time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A second voice signal time-axis decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; and time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A first voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; and switching means for switching the time-axis compressing means and the time-axis decompressing means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A second voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; and switching means for switching the time-axis compressing means and the time-axis decompressing means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A third voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; a memory storing the pitch period detected by the pitch period detecting means and the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means; and time-axis decompressing means for reading out the pitch period and the voice waveform from the memory and time-axis decompressing the read voice waveform on the basis of the read pitch period, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A fourth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; a memory storing the pitch period detected by the pitch period detecting means and the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means; and time-axis decompressing means for reading out the pitch period and the voice waveform from the memory and time-axis decompressing the read voice waveform on the basis of the read pitch period, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A fifth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising first pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; a memory storing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means; second pitch period detecting means for reading out the voice waveform from the memory and detecting the pitch period of the read voice waveform; and time-axis decompressing means for time-axis decompressing the voice waveform read out of the memory on the basis of the pitch period detected by the second pitch period detecting means, each of the pitch period detecting means comprising first means for detecting the pitch period of the voice waveform by taking a predetermined number of pitch periods on the basis of the voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A sixth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising first pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; a memory storing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means; second pitch period detecting means for reading out the voice waveform from the memory and detecting the pitch period of the read voice waveform; and time-axis decompressing means for time-axis decompressing the voice waveform read out of the memory on the basis of the pitch period detected by the second pitch period detecting means, each of the pitch period detecting means comprising first means for detecting the pitch period of the voice waveform by taking a predetermined number of pitch periods on the basis of the voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A seventh voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; band dividing and coding means for dividing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means into a plurality of bands and coding the voice waveform for each of the obtained bands; a memory storing the pitch period detected by the pitch period detecting means and codes obtained by the band dividing and coding means; band dividing and decoding means for reading out the codes from the memory and decoding the read codes to obtain the voice waveform which has been time-axis compressed; and time-axis decompressing means for reading out the pitch period from the memory and time-axis decompressing the voice waveform obtained by the band dividing and decoding means on the basis of the read pitch period, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
An eighth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; band dividing and coding means for dividing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means into a plurality of bands and coding the voice waveform for each of the obtained bands; a memory storing the pitch period detected by the pitch period detecting means and codes obtained by the band dividing and coding means; band dividing and decoding means for reading out the codes from the memory and decoding the read codes to obtain the voice waveform which has been time-axis compressed; and time-axis decompressing means for reading out the pitch period from the memory and time-axis decompressing the voice waveform obtained by the band dividing and decoding means on the basis of the read pitch period, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A ninth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising first pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the first pitch period detecting means; band dividing and coding means for dividing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means into a plurality of bands and coding the voice waveform for each of the obtained bands; a memory storing codes obtained by the band dividing and coding means; band dividing and decoding means for reading out the codes from the memory and decoding the read codes to obtain the voice waveform which has been time-axis compressed; second pitch period detecting means for detecting the pitch period of the voice waveform obtained by the band dividing and decoding means; and time-axis decompressing means for time-axis decompressing the voice waveform obtained by the band dividing and decoding means on the basis of the pitch period detected by the second pitch period detecting means, each of the pitch period detecting means comprising first means for detecting the pitch period of a voice waveform by taking a predetermined number of pitch periods on the basis of the voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference 
value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A tenth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising first pitch period detecting means for detecting the pitch period of an input voice waveform; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means; band dividing and coding means for dividing the voice waveform, which has been time-axis compressed, obtained by the time-axis compressing means into a plurality of bands and coding the voice waveform for each of the obtained bands; a memory storing codes obtained by the band dividing and coding means; band dividing and decoding means for reading out the codes from the memory and decoding the read codes to obtain the voice waveform which has been time-axis compressed; second pitch period detecting means for detecting the pitch period of the voice waveform obtained by the band dividing and decoding means; and time-axis decompressing means for time-axis decompressing the voice waveform obtained by the band dividing and decoding means on the basis of the pitch period detected by the second pitch period detecting means, each of the pitch period detecting means comprising first means for detecting the pitch period of the voice waveform by taking a predetermined number of pitch periods on the basis of the voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform 
of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A third voice signal time-axis compressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; and time-axis compressing means for time-axis compressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
It is preferable that there is provided means for changing the reference value on the basis of the reproduction speed information inputted by the reproduction speed inputting means.
A fourth voice signal time-axis compressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; and time-axis compressing means for time-axis compressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
An example of the means for judging whether the detected pitch period is long or short is one for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period and the reproduction speed information inputted by the reproduction speed inputting means.
A third voice signal time-axis decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; and time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
It is preferable that there is provided means for changing the reference value on the basis of the reproduction speed information inputted by the reproduction speed inputting means.
A fourth voice signal time-axis decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; and time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
An example of the means for judging whether the detected pitch period is long or short is one for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period and the reproduction speed information inputted by the reproduction speed inputting means.
An eleventh voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means; time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means; and switching means for switching the time-axis compressing means and the time-axis decompressing means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether or not the detected pitch period is not more than a predetermined reference value, third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to 
the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
It is characterized in that there is provided means for changing the reference value on the basis of the reproduction speed information inputted by the reproduction speed inputting means.
A twelfth voice signal time-axis compressing/decompressing device according to the present invention is characterized by comprising pitch period detecting means for detecting the pitch period of an input voice waveform; reproduction speed inputting means for inputting reproduction speed information; time-axis compressing means for time-axis compressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means; time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the reproduction speed information inputted by the reproduction speed inputting means and the pitch period detected by the pitch period detecting means; and switching means for switching the time-axis compressing means and the time-axis decompressing means, the pitch period detecting means comprising first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period, third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
An example of the means for judging whether the detected pitch period is long or short is one for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to the predetermined time period and the reproduction speed information inputted by the reproduction speed inputting means.

Fig. 1 is a block diagram showing the configuration of a pitch period detecting device according to the present invention.
Fig. 2 is a flow chart for explaining time-axis compression processing.
Fig. 3 is a flow chart showing the operations of the pitch period detecting device according to the present invention.
Figs. 4 (a) and 4 (b) are diagrams for explaining time-axis compression processing.
Fig. 5 is a block diagram showing the configuration of a time-axis decompressing device according to the present invention.
Fig. 6 is a flow chart showing the operations of the time-axis decompressing device according to the present invention.
Fig. 7 is a diagram for explaining time-axis decompression processing.
Fig. 8 is a block diagram showing the configuration of a time-axis compressing/decompressing device according to the present invention.
Fig. 9 is a block diagram showing the configuration of a voice signal recording/reproducing apparatus according to the present invention.
Fig. 10 is a block diagram showing the configuration of another voice signal recording/reproducing apparatus according to the present invention.
Fig. 11 is a block diagram showing the configuration of still another voice signal recording/reproducing apparatus according to the present invention.
Fig. 12 is a block diagram showing the configuration of a voice signal recording/reproducing apparatus according to the present invention.
Figs. 13(a) and 13(b) are diagrams for explaining the effect of reducing strain.
Figs. 14(a) and 14(b) are diagrams showing conventional time-axis compression processing.

Referring now to Figs. 1 to 10, embodiments of the present invention will be described.

[1] Description of First Embodiment

Fig. 1 illustrates the configuration of a reproducing apparatus for performing fast listening (fast forward reproduction) of a voice signal.
Reference numeral 1 denotes pitch period detecting means for detecting a pitch period Tp representing the voice height of an inputted digital voice signal (hereinafter referred to as a "voice signal"). The pitch period detecting means 1 detects the pitch period Tp by a well-known pitch period detecting method using auto-correlation. The pitch period detecting means 1 detects the pitch period Tp on the basis of an input voice signal corresponding to a time period Ts by performing pitch period detecting processing once. Let Tp be the pitch period of each of the voice signals corresponding to two pitch periods (2Tp) from the head of the input voice signal corresponding to the time period Ts. The pitch period detected by the pitch period detecting means 1 is fed to a switch 5 and is also fed to a buffer 3.
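The auto-correlation based detection performed by the pitch period detecting means 1 can be illustrated by the following minimal Python sketch. It is not taken from the specification: the function name detect_pitch_period and the search bounds min_lag and max_lag are assumptions introduced for illustration, and the lag giving the largest short-time auto-correlation value inside the window Ts is simply taken as the pitch period Tp.

```python
import math


def detect_pitch_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the largest short-time auto-correlation.

    x        : samples of one analysis window (time period Ts)
    min_lag  : smallest candidate pitch period (hypothetical search bound)
    max_lag  : largest candidate pitch period (hypothetical search bound)
    """
    n = len(x)
    best_lag, best_value = min_lag, float("-inf")
    for k in range(min_lag, max_lag + 1):
        # The signal is treated as zero outside the window, so only the
        # n - k products that stay inside the window contribute.
        r = sum(x[m] * x[m + k] for m in range(n - k))
        if r > best_value:
            best_lag, best_value = k, r
    return best_lag


# A 240-sample window (Ts) holding a sine wave with a 40-sample period.
window = [math.sin(2 * math.pi * m / 40) for m in range(240)]
tp = detect_pitch_period(window, min_lag=20, max_lag=120)
```

Each call evaluates one sum of products per candidate lag over the whole window, which is why reducing the number of calls reduces the processing load.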
The buffer 3 temporarily stores the newest pitch period found by the pitch period detecting means 1. The switch 5 is for selecting one of the pitch period Tp obtained by the pitch period detecting means 1 and the pitch period Tp most newly detected in the past which is stored in the buffer 3 and feeding the selected pitch period to time-axis compressing means 4. When the switch 5 is switched to a contact a, the pitch period Tp obtained by the pitch period detecting means 1 is fed to the time-axis compressing means 4. When the switch 5 is switched to a contact b, the pitch period Tp most newly detected in the past which is stored in the buffer 3 is fed to the time-axis compressing means 4.
Reference numeral 2 denotes pitch period judging means for judging whether or not a branching variable upd calculated on the basis of the pitch period Tp detected by the pitch period detecting means 1 and a reproduction speed set by reproduction speed setting means 7 is larger than a threshold SH set by threshold setting means 6 and switching the switch 5 on the basis of the results of the judgment.
The pitch period judging means 2 switches the switch 5 to the contact a when the branching variable upd is more than the threshold SH, while switching the switch 5 to the contact b when the branching variable upd is not more than the threshold SH.
The time-axis compressing means 4 subjects the input voice signal to time-axis compression processing on the basis of the pitch period fed from the switch 5. The time-axis compressing means 4 performs the following time-axis compression processing when the reproduction speed is twice the standard speed, for example. That is, voice waveforms A and B, each of one pitch period Tp, are cut out in this order from the head of the input voice signal corresponding to the time period Ts, as shown in Fig. 2. A waveform A' obtained by multiplying the cut waveform A by a weighting factor S1 which linearly changes from one to zero and a waveform B' obtained by multiplying the cut waveform B by a weighting factor S2 which linearly changes from zero to one are produced. The obtained waveforms A' and B' are added, to obtain a waveform C of one pitch period Tp.
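The weighted addition described above can be sketched in Python as follows. This is an illustrative sketch only: the function name compress_two_periods is an assumption, and the weighting factors are sampled so that S1 is exactly one at the first sample and exactly zero at the last sample, which the specification does not state.

```python
def compress_two_periods(signal, tp):
    """Merge the first two pitch periods of `signal` into one period.

    signal : list of samples, at least 2 * tp long
    tp     : pitch period in samples
    """
    a = signal[:tp]          # waveform A: first pitch period
    b = signal[tp:2 * tp]    # waveform B: second pitch period
    c = []
    for i in range(tp):
        # S1 falls linearly from one to zero, S2 rises from zero to one.
        s1 = 1.0 - i / (tp - 1) if tp > 1 else 1.0
        s2 = 1.0 - s1
        c.append(a[i] * s1 + b[i] * s2)   # C = A' + B'
    return c


# Two 4-sample "periods" with constant levels 1.0 and 3.0.
c = compress_two_periods([1.0] * 4 + [3.0] * 4, tp=4)
```

The output starts on waveform A and ends on waveform B, so it joins smoothly to the samples before A and after B.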
Fig. 3 shows the operations of the reproducing apparatus shown in Fig. 1.
A user first operates the reproduction speed setting means 7 to set a reproduction speed (step 1). Set reproduction speed information is fed to the pitch period judging means 2. The reproduction speed may be set by the user selecting a desired reproduction speed from previously determined patterns, for example, 1.0 times the standard speed to 2.0 times the standard speed or may be set by the user numerically entering the reproduction speed.
When the reproduction speed is set, a threshold SH is set, and the initial value of a branching variable upd is set (step 2). As the threshold SH, a time period for detecting pitch periods Tp (a time period for finding auto-correlation) Ts is set. Here, 240 (samples) shall be set. As the initial value of the branching variable upd, a value larger than the threshold SH (the threshold SH < the branching variable upd) is set. Here, 300 (samples) shall be set as the initial value of the branching variable upd. "Sample" means the number of voice signals sampled in accordance with a desired sampling frequency when a voice signal is a digital signal.
Description is now made of a case where twice the standard speed is set as the reproduction speed.
When the threshold SH and the initial value of the branching variable upd are set, the pitch period detecting means 1 and the time-axis compressing means 4 start to read an input voice signal (step 3).
The first time the step 3 is carried out after the reproduction operation is started, the reading of 240 samples of the input voice signal, corresponding to the time period Ts for detecting pitch periods, is started. The second and subsequent times the step 3 is carried out, the reading of as many samples of the input voice signal as were compressed at the step 8, described later, is started. When data corresponding to 100 samples, for example, are compressed, the reading of data corresponding to 100 samples is started the next time the step 3 is carried out.
The pitch period judging means 2 compares the branching variable upd with the threshold SH (step 4). Ts < upd immediately after the initial value of the branching variable upd is set at the step 2. The first time the step 4 is carried out, therefore, it is judged that the branching variable upd is more than the threshold SH. Thereafter, the program proceeds to the step 5. At the step 5, the switch 5 is switched to the contact a. Further, the pitch period detecting processing by the pitch period detecting means 1 is performed (step 6). That is, the pitch period Tp is detected on the basis of voice signals corresponding to two pitch periods included in the time period Ts. The pitch period Tp obtained by the pitch period detecting means 1 is fed to the time-axis compressing means 4 through the switch 5, and is fed to the buffer 3 and stored therein.
The branching variable upd is updated to a factor Q1 times Tp (Q1 × Tp) (step 7). The factor Q1 is a value determined by the set reproduction speed, and is set to "4" when the reproduction speed is twice the standard speed, as shown in Table 1. Thereafter, the time-axis compressing means 4 subjects the input voice signal whose pitch period has been found to time-axis compression processing on the basis of the pitch period Tp fed from the pitch period detecting means 1 through the switch 5 (step 8).

Table 1
Reproduction speed (× standard speed)    Q1     Q2
2.0                                      4      2
1.75                                     4.5    2
1.5                                      5      2
1.3                                      6      2
1.2                                      6      2
The time-axis compressing means 4 performs time-axis compression processing corresponding to the reproduction speed. In this case, the reproduction speed is twice the standard speed. Accordingly, the time-axis compression processing, described using Fig. 2, is performed.
Thereafter, it is judged whether or not the reproduction processing is terminated (step 9). The reproduction processing is terminated when the user operates a stop button (not shown) so as to stop the reproduction of a voice, for example.
When the reproduction processing is not terminated (NO at step 9), the program is returned to the step 3. At the step 3, the reading of the input voice signals is started. The branching variable upd and the threshold SH are compared with each other (step 4). The second time the step 4 is carried out, the branching variable upd used is the value (upd = Q1 × Tp) obtained by the updating at the step 7. That is, it is judged whether or not Q1 times the pitch period Tp detected last time is larger than the time period Ts for detecting pitch periods (= the threshold SH). When the reproduction speed is twice the standard speed, it is judged whether or not four times the pitch period Tp detected last time is larger than the threshold SH.
When the branching variable upd is more than the threshold SH, it is judged that the pitch period is relatively long. Accordingly, the processing at the steps 5 to 9 is performed again. Thereafter, the program is returned to the step 3. The processing at the steps 3 to 9 is repeatedly performed until it is judged at the step 4 that the branching variable upd is not more than the threshold SH.
When it is judged at the step 4 that the branching variable upd is not more than the threshold SH, it is judged that the pitch period is relatively short. Accordingly, pitch detecting processing for an input voice waveform of two pitch periods subsequent to an input voice waveform of two pitch periods found last time is omitted. That is, it is assumed that the same pitch period as the pitch period found last time is further repeated twice. In this case, in order to perform time-axis compression processing, considering the pitch period most newly detected in the past which is stored in the buffer 3 as the pitch period of input voice signals corresponding to two pitch periods to be currently found, the switch 5 is switched to the contact b (step 10).
Furthermore, the branching variable upd is updated to a value {upd + (Q2 × Tp)} obtained by adding Q2 times the pitch period Tp most newly detected at the step 6 to the current branching variable upd (step 11).
Q2 is determined as "2" irrespective of the set reproduction speed, as shown in Table 1. Letting Tp be the pitch period most newly detected at the step 6, therefore, when the reproduction speed is twice the standard speed, the branching variable upd is updated from 4 Tp to 6 Tp the first time the step 11 is carried out.
Thereafter, the time-axis compressing means 4 subjects the input voice signal to time-axis compression processing on the basis of the pitch period Tp fed from the buffer 3 through the switch 5 (step 8). When the reproduction processing is not terminated (NO at step 9), the program is returned to the step 3. At the step 3, the reading of the input voice signals is started.
Processing at the steps 3, 4, 10, 11, 8, and 9 is repeated until it is judged at the step 4 that the branching variable upd is more than the threshold SH. In this case, the branching variable is updated to a value which is larger by (2Tp) at the step 11.
As described in the foregoing, when it is judged at the step 4 that the branching variable upd is not more than the threshold SH, no pitch period detecting processing is performed. Further, the switch 5 is switched to the contact b. Accordingly, the time-axis compression processing is performed on the basis of the pitch period most newly detected which is stored in the buffer 3.
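The branching at the steps 4 to 11 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not code from the specification: the names plan_detections and Q_FACTORS are hypothetical, and the list pitch_periods stands in for the value the pitch period detecting means 1 would report on each pass through the loop. The threshold SH (240 samples), the initial value of upd (300 samples) and the factors Q1 and Q2 are the values given in the text and in Table 1.

```python
# Q1 and Q2 per reproduction speed, as listed in Table 1.
Q_FACTORS = {2.0: (4, 2), 1.75: (4.5, 2), 1.5: (5, 2), 1.3: (6, 2), 1.2: (6, 2)}


def plan_detections(pitch_periods, speed, sh=240, upd_init=300):
    """Return one ("detect" | "reuse", Tp) decision per pass through the loop.

    pitch_periods : pitch period (samples) the detector would report on each
                    pass (hypothetical stand-in for real detection)
    speed         : reproduction speed, a key of Q_FACTORS
    sh            : threshold SH (= Ts) in samples
    upd_init      : initial branching variable; must exceed sh so that the
                    first pass always performs a detection
    """
    q1, q2 = Q_FACTORS[speed]
    upd = upd_init
    stored_tp = None                      # contents of the buffer 3
    decisions = []
    for reported_tp in pitch_periods:
        if upd > sh:                      # step 4: pitch period judged long
            stored_tp = reported_tp       # steps 5 and 6: contact a, detect
            upd = q1 * stored_tp          # step 7
            decisions.append(("detect", stored_tp))
        else:                             # step 4: pitch period judged short
            upd = upd + q2 * stored_tp    # step 11
            decisions.append(("reuse", stored_tp))   # step 10: contact b
    return decisions


short = plan_detections([40] * 6, speed=2.0)
long_ = plan_detections([100] * 4, speed=2.0)
```

With a 40-sample pitch period at twice the standard speed, upd takes the values 160, 240 and 320, so two passes reuse the stored pitch period before detection runs again. With a 100-sample pitch period, upd is 400 after every detection, so detection runs on every pass.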
By repeating the foregoing processing, when the pitch period Tp is short, as shown in Fig. 4 (a), the necessity of detecting the pitch period is eliminated with respect to the voice waveform of two pitch periods subsequent to the voice waveform of two pitch periods detected. Accordingly, a processing load on the pitch period detecting means 1 is reduced.
On the other hand, when the pitch period Tp is long, as shown in Fig. 4 (b), the number of times of pitch period detecting processing per unit time is small. Accordingly, a processing load on the pitch period detecting means 1 is not changed from before.
As shown in Table 1, the reason why the factor Q1 by which the variable Tp is multiplexed is changed depending on the reproduction speed (reproduction speed information from the reproduction speed setting means) will be described.
When the reproduction speed is twice the standard speed, for example, a waveform of the first pitch period and a waveform of the second pitch period are compressed into one waveform, to constitute the first output waveform, and a waveform of the third pitch period and a waveform of the fourth pitch period are then compressed, to constitute the second output waveform, as shown in Fig. 13 (a). If waveforms of four or more pitch periods are included in a time period Ts for detecting pitch periods (= a threshold SH), therefore, the pitch period need not be detected when the second output waveform is produced. Therefore, the factor Q1 by which the variable Tp is multiplexed is taken as "4", as shown in Table 1.
When the reproduction speed is 1.5 times the standard speed, for example, a waveform of the first pitch period and a waveform of the second pitch period are compressed into one waveform, to constitute the first output waveform, a waveform of the third pitch period is then taken as the second output waveform as it is, and a waveform of the fourth pitch period and a waveform of the fifth pitch period are then compressed, to constitute the third output waveform, as shown in Fig. 13 (b). If waveforms of five or more pitch periods are included in a range in which pitch periods are detected, therefore, the pitch period need not be detected when the second and third output waveforms are produced. Therefore, the factor Q1 by which the variable Tp is multiplexed is taken as "5", as shown in Table 1.
Suitable factors are given in the same manner for the other reproduction speeds. That is, it can be confirmed that distortion is reduced by setting the factor by which the variable Tp is multiplied to the most suitable value.
In Table 1, the factor Q1 by which the variable Tp is multiplied is changed depending on the reproduction speed information from the reproduction speed setting means. Alternatively, the threshold SH used at the step 4 may be changed depending on the reproduction speed, in which case substantially the same results can be realized.
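The effect of the factor Q1 can be illustrated with a short calculation. The following is only a sketch: the function name is hypothetical, and the threshold of 240 samples is borrowed from the example value given for the second embodiment, on the assumption that the same detection period Ts applies here. Detection is skipped when Q1 × Tp is not more than SH, so the longest pitch period for which skipping is possible is SH / Q1.

```python
def longest_skippable_pitch(sh, q1):
    """Largest pitch period Tp (in samples) satisfying q1 * Tp <= sh.

    sh : threshold SH, i.e. the detection period Ts in samples (assumed value)
    q1 : factor Q1 by which Tp is multiplied (4 at 2x speed, 5 at 1.5x speed)
    """
    return sh / q1


# Assumed threshold of 240 samples, as in the second embodiment's example.
SH = 240
print(longest_skippable_pitch(SH, 4))  # 2x speed: Tp up to 60 samples
print(longest_skippable_pitch(SH, 5))  # 1.5x speed: Tp up to 48 samples
```

Under these assumptions, a larger factor narrows the range of pitch periods for which detection is omitted, which matches the requirement that more pitch periods must fit inside the detection period at 1.5 times the standard speed than at twice the standard speed.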

[2] Description of Second Embodiment

Although in the above-mentioned first embodiment, description was made of a case where an inputted voice signal is time-axis compressed, the present invention is not limited to the same. The present invention is also applicable to a case where an inputted voice signal is time-axis decompressed.
Fig. 5 illustrates the configuration of a reproducing apparatus for performing slow listening (slow reproduction) of a voice signal. In Fig. 5, units corresponding to those shown in Fig. 1 are assigned the same reference numerals.
In the reproducing apparatus, the time-axis compressing means 4 shown in Fig. 1 is replaced with a time-axis decompressing means 8.
The time-axis decompressing means 8 performs the following time-axis decompression processing when the reproduction speed is one-half the standard speed, for example. That is, a voice waveform of three pitch periods is cut out, as shown in Fig. 7. A waveform A of the two pitch periods on the front side is multiplied by a weighting factor S1 which linearly changes from zero to one, for example, to produce a waveform A' of two pitch periods. Further, a waveform B of the two pitch periods on the rear side is multiplied by a weighting factor S2 which linearly changes from one to zero, for example, to produce a waveform B' of two pitch periods. The obtained waveforms A' and B' are added to obtain a waveform of two pitch periods comprising a waveform D of one pitch period and a waveform E of one pitch period.
A voice waveform of three pitch periods is then cut out in the same manner as described above from a point which is moved rightward by one pitch period, to obtain a waveform of two pitch periods. That is, a waveform of two pitch periods is obtained for each movement by one pitch period. Accordingly, the reproduction speed is one-half the standard speed.
The voice waveform of three pitch periods is then cut out from a position which is shifted by one pitch period, the waveform of two pitch periods on the front side and the waveform of two pitch periods on the rear side are respectively multiplied by the weights indicated by broken lines, and the results of the multiplication are added together to obtain a waveform of two pitch periods.
The voice waveform of three pitch periods is cut out from the position which is shifted by one pitch period and the same processing is performed, thereby converting a waveform of one pitch period into a waveform of two pitch periods. Consequently, it is possible to perform slow listening.
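The weighted addition described with reference to Fig. 7 can be sketched as follows. This is a minimal illustration rather than the apparatus itself: the function name is hypothetical, the waveform is assumed to be a plain list of samples, and the weights S1 and S2 are assumed to be exactly linear and complementary (S2 = 1 - S1).

```python
def expand_three_to_two(window, tp):
    """Produce 2 pitch periods of output from a 3-pitch-period input window.

    window : list of 3 * tp samples (three pitch periods)
    tp     : pitch period in samples
    """
    if tp < 1 or len(window) != 3 * tp:
        raise ValueError("window must hold exactly three pitch periods")

    n = 2 * tp
    front = window[:n]   # waveform A: the two pitch periods on the front side
    rear = window[tp:]   # waveform B: the two pitch periods on the rear side

    out = []
    for i in range(n):
        s1 = i / (n - 1)   # weighting factor S1: rises linearly from 0 to 1
        s2 = 1.0 - s1      # weighting factor S2: falls linearly from 1 to 0
        out.append(front[i] * s1 + rear[i] * s2)   # A' + B'
    return out


def slow_half_speed(signal, tp):
    """Shift the window by one pitch period per step; each step yields two."""
    out = []
    for start in range(0, len(signal) - 3 * tp + 1, tp):
        out.extend(expand_three_to_two(signal[start:start + 3 * tp], tp))
    return out
```

Because each step consumes one pitch period of input and emits two pitch periods of output, the output is roughly twice as long as the input, which corresponds to reproduction at one-half the standard speed. For a perfectly periodic input, waveforms A and B coincide and the output simply repeats the input period.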
Fig. 6 shows the operations of the reproducing apparatus shown in Fig. 5.
A user first operates reproduction speed setting means 7 to set a reproduction speed (step 21). Set reproduction speed information is fed to pitch period judging means 2.
When the reproduction speed is set, a threshold SH is set, and the initial value of a branching variable upd is set (step 22). As the threshold SH, a time period for detecting pitch periods (a time period for finding an auto-correlation value) Ts is set. Here, 240 (samples) shall be set. As the initial value of the branching variable upd, a value larger than the threshold SH (the threshold SH < the branching variable upd) is set. Here, 300 (samples) shall be set as the initial value of the branching variable upd.
Description is now made of a case where 0.5 times the standard speed is set as the reproduction speed.
When the threshold SH and the initial value of the branching variable upd are set, the pitch period detecting means 1 and the time-axis decompressing means 8 start to read an input voice signal (step 23).
At the step 23 carried out for the first time after the reproduction operation is started, the reading of input voice signals of 240 samples corresponding to the time period Ts for detecting pitch periods is started. At the step 23 carried out for the second and subsequent times, the reading of input voice signals corresponding to the number of samples which have been decompressed is started. When data corresponding to 100 samples, for example, have been decompressed, the reading of data corresponding to 100 samples is started at the step 23 next carried out.
The pitch period judging means 2 compares the branching variable upd with the threshold SH (step 24). Immediately after the initial value of the branching variable upd is set at the step 22, Ts < upd. At the step 24 carried out for the first time, therefore, it is judged that the branching variable upd is more than the threshold SH. Thereafter, the program proceeds to the step 25. At the step 25, a switch 5 is switched to a contact a. Further, pitch period detecting processing by the pitch period detecting means 1 is performed (step 26). The pitch period Tp obtained by the pitch period detecting means 1 is fed to the time-axis decompressing means 8 through the switch 5, and is fed to a buffer 3 and stored therein.
The branching variable upd is updated to a factor Q1' times Tp (Q1' × Tp) (step 27). The factor Q1' is a value determined by the set reproduction speed, and is set to "3" when the reproduction speed is 0.5 times the standard speed, as shown in Table 2. Thereafter, the time-axis decompressing means 8 subjects an input voice signal to time-axis decompression processing on the basis of the pitch period Tp fed from the pitch period detecting means 1 through the switch 5 (step 28).

Table 2

| Reproduction speed (× standard speed) | Q1' | Q2' |
|---|---|---|
| 0.8 | 6 | 2 |
| 0.75 | 5 | 2 |
| 0.7 | 4 | 2 |
| 0.6 | 3.5 | 1.5 |
| 0.5 | 3 | 1 |

The time-axis decompressing means 8 performs time-axis decompression processing corresponding to the reproduction speed. In this case, the reproduction speed is 0.5 times the standard speed. Accordingly, the time-axis decompression processing, described using Fig. 7, is performed.
Thereafter, it is judged whether or not the reproduction processing is terminated (step 29). When the reproduction processing is not terminated (No at step 29), the program is returned to the step 23. At the step 23, the reading of input voice signals is started. The branching variable upd and the threshold SH (= the time period Ts) are compared with each other (step 24). At the step 24 carried out for the second time, the branching variable (upd = Q1' × Tp) obtained by the updating at the step 27 is used as the branching variable upd. That is, it is judged whether or not Q1' times the pitch period Tp detected last time is larger than the time period Ts for detecting pitch periods (= the threshold SH). When the reproduction speed is 0.5 times the standard speed, it is judged whether or not three times the pitch period Tp detected last time is larger than the threshold SH.
When the branching variable upd is more than the threshold SH, it is judged that the pitch period is relatively long. Accordingly, the processing at the steps 25 to 29 is performed again. Thereafter, the program is returned to the step 23. The processing at the steps 23 to 29 is repeatedly performed until it is judged at the step 24 that the branching variable upd is not more than the threshold SH.
When it is judged at the step 24 that the branching variable upd is not more than the threshold SH, it is judged that the pitch period is relatively short. Accordingly, pitch detecting processing for an input voice waveform of one pitch period subsequent to an input voice waveform of three pitch periods used in the last decompression processing is omitted. That is, it is assumed that the same pitch period as the pitch period found last time is further repeated once. In this case, in order to perform time-axis decompression processing, considering the pitch period most newly detected in the past which is stored in the buffer 3 as the pitch period of the input voice signal corresponding to one pitch period to be currently found, the switch 5 is switched to a contact b (step 30).
Furthermore, the branching variable upd is updated to a value {upd + (Q2' × Tp)} obtained by adding Q2' times the pitch period Tp most newly detected at the step 26 to the current branching variable upd (step 31).
The factor Q2' is a value determined by the set reproduction speed, as shown in Table 2. When the reproduction speed is 0.5 times the standard speed, the factor Q2' is set to "1". Letting Tp be the pitch period most newly detected at the step 26, therefore, the branching variable upd is updated to 4Tp (= 3Tp + 1Tp) at the step 31.
Thereafter, the time-axis decompressing means 8 subjects the input voice signal to time-axis decompression processing on the basis of the pitch period Tp fed from the buffer 3 through the switch 5 (step 28). When the reproduction processing is not terminated (No at step 29), the program is returned to the step 23. At the step 23, the reading of the input voice signals is started.
Processing at the steps 23, 24, 30, 31, 28, and 29 is repeated until it is judged at the step 24 that the branching variable upd is more than the threshold SH. In this case, the branching variable is updated to a value which is larger by 1Tp at the step 31.
As described in the foregoing, when it is judged at the step 24 that the branching variable upd is not more than the threshold SH, no pitch period detecting processing is performed. Further, the switch 5 is switched to the contact b. Accordingly, the time-axis decompression processing is performed on the basis of the pitch period most newly detected which is stored in the buffer 3.
That is, when the pitch period is short, the necessity of detecting the pitch period of the voice signal at the step 26 is eliminated. Accordingly, a processing load on the pitch period detecting means 1 is reduced.
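The branching at the steps 24 to 31 can be sketched as a small simulation. The code below is an illustrative sketch only: the function and variable names are hypothetical, the detector is replaced by a caller-supplied function, the decompression at the step 28 is omitted, and the factors are those listed in Table 2.

```python
# (Q1', Q2') per reproduction speed, as listed in Table 2.
Q_TABLE = {0.8: (6, 2), 0.75: (5, 2), 0.7: (4, 2), 0.6: (3.5, 1.5), 0.5: (3, 1)}


def run_slow_playback(detect_pitch, speed, passes, sh=240, upd_init=300):
    """Return, per pass, whether the pitch period was detected or reused.

    detect_pitch : hypothetical stand-in for the pitch period detecting means
    speed        : reproduction speed as a multiple of the standard speed
    passes       : number of passes through the steps 23 to 29
    sh, upd_init : threshold SH and initial branching variable (step 22)
    """
    q1, q2 = Q_TABLE[speed]
    upd = upd_init
    buffered_tp = None           # plays the role of the buffer 3
    actions = []
    for _ in range(passes):
        if upd > sh:             # step 24: pitch period judged relatively long
            tp = detect_pitch()  # steps 25-26: switch to contact a and detect
            buffered_tp = tp
            upd = q1 * tp        # step 27
            actions.append("detect")
        else:                    # step 24: pitch period judged relatively short
            tp = buffered_tp     # step 30: switch to contact b, reuse buffer
            upd = upd + q2 * tp  # step 31
            actions.append("reuse")
        # step 28 (time-axis decompression using tp) is omitted in this sketch
    return actions


# A constant pitch period of 50 samples at 0.5 times the standard speed.
print(run_slow_playback(lambda: 50, 0.5, 6))
```

With a pitch period of 50 samples, upd takes the values 150, 200, and 250 before it again exceeds the threshold of 240, so detection runs on only one pass in three. With a pitch period of 100 samples, upd is 300 after every detection and detection is never skipped.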

[3] Description of Third Embodiment

Fig. 8 illustrates the configuration of a reproducing apparatus for performing fast listening of a voice signal and slow listening of a voice signal.
In Fig. 8, reference numeral 12 denotes ADPCM (Adaptive Differential Pulse Code Modulation) coding means for coding an inputted voice signal by existing ADPCM processing. Reference numeral 9 denotes a memory storing a signal coded by the ADPCM coding means 12. Reference numeral 13 denotes ADPCM decoding means for decoding the signal from the memory 9.
Reference numeral 15 denotes a fast listening reproducing apparatus described in Fig. 1. Reference numeral 16 denotes a slow listening reproducing apparatus described in Fig. 5. Reference numeral 14 denotes selecting means for selecting either one of the fast listening reproducing apparatus 15 and the slow listening reproducing apparatus 16.
When the fast listening reproducing apparatus 15 is selected by the selecting means 14, the signal from the ADPCM decoding means 13 is fed to the fast listening reproducing apparatus 15, where time-axis compression processing is performed. In this case, therefore, an output signal for fast listening is obtained.
When the slow listening reproducing apparatus 16 is selected by the selecting means 14, the signal from the ADPCM decoding means 13 is fed to the slow listening reproducing apparatus 16, where time-axis decompression processing is performed. In this case, therefore, an output signal for slow listening is obtained.

[4] Description of Fourth Embodiment

Fig. 9 illustrates an audio recording/reproducing apparatus.
In Fig. 9, the same units as those shown in Fig. 1 are assigned the same reference numerals and hence, the description thereof is not repeated.
In the voice signal recording/reproducing apparatus, a memory 9 and time-axis decompressing means 8 are added to the reproducing apparatus shown in Fig. 1. At the time of recording, a voice signal which has been time-axis compressed by time-axis compressing means 4 and a pitch period used in the case of time-axis compression are stored in the memory 9.
At the time of reproduction, the voice signal which has been time-axis compressed and the pitch period are read out of the memory 9, and are fed to the time-axis decompressing means 8. The time-axis decompressing means 8 decompresses and outputs the voice signal, which has been time-axis compressed, read out of the memory 9 on the basis of the pitch period read out of the memory 9.
The audio recording/reproducing apparatus is not an apparatus for performing fast listening or slow listening of a voice signal, but an apparatus for recording a large amount of voice signals in a small memory capacity by storing the voice signal in the memory in a time-axis compressed state.

[5] Description of Fifth Embodiment

Fig. 10 illustrates an audio recording/reproducing apparatus.
In Fig. 10, the same units as those shown in Fig. 9 are assigned the same reference numerals and hence, the description thereof is not repeated.
The voice signal recording/reproducing apparatus differs from the recording/reproducing apparatus shown in Fig. 9 in that a pitch period detecting device for detecting a pitch period used when time-axis decompression processing is performed by time-axis decompressing means 8 on the basis of a voice signal read out of a memory 9 is added, and only a voice signal which has been time-axis compressed by time-axis compressing means 4 is stored and a pitch period used in the case of the time-axis compression is not stored in the memory 9.
The added pitch period detecting device comprises pitch period detecting means 21, pitch period judging means 22, a buffer 23, threshold setting means 25, and a switch 24, and detects a pitch period in the same method as the pitch period detecting method described in Fig. 3 or 6. In the audio recording/reproducing apparatus, the pitch period need not be stored in the memory 9, thereby making it possible to reduce the capacity of the memory 9, as compared with that in the recording/reproducing apparatus shown in Fig. 9.

[6] Description of Sixth Embodiment

Fig. 11 illustrates an audio recording/reproducing apparatus. Band dividing and coding means 10 and band dividing and decoding means 11 are further added to the audio recording/reproducing apparatus shown in Fig. 9.
In the audio recording/reproducing apparatus, a voice signal which has been compressed along the time axis by time-axis compressing means 4 is also compressed along the frequency band by the band dividing and coding means 10, thereby making it possible to reduce the capacity of the memory 9, as compared with that in the recording/reproducing apparatus shown in Fig. 9.

[7] Description of Seventh Embodiment

Fig. 12 illustrates an audio recording/reproducing apparatus. The audio recording/reproducing apparatus differs from the audio recording/reproducing apparatus shown in Fig. 11 in that a pitch period detecting device for detecting a pitch period used when time-axis decompression processing is performed by time-axis decompressing means 8 on the basis of a voice signal read out of a memory 9 is added, and only a voice signal which has been time-axis compressed by time-axis compressing means 4 is stored and a pitch period used in the case of the time-axis compression is not stored in the memory 9.
The added pitch period detecting device comprises pitch period detecting means 21, pitch period judging means 22, a buffer 23, threshold setting means 25, and a switch 24, and detects a pitch period in the same method as the pitch period detecting method described in Fig. 3 or 6. In the audio recording/reproducing apparatus, the pitch period need not be stored in the memory 9, thereby making it possible to reduce the capacity of the memory 9, as compared with that in the recording/reproducing apparatus shown in Fig. 11.

Claims

In a voice signal pitch period detecting method for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, a voice signal pitch period detecting method characterized by
reducing, when the detected pitch period is not more than a predetermined reference value, the number of times of pitch period detecting processing by considering the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected the same as the currently detected pitch period.
In a voice signal pitch period detecting method for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period, a voice signal pitch period detecting method characterized by
judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to said predetermined time period, and reducing, when it is judged that the detected pitch period is short, the number of times of pitch period detecting processing by considering the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected the same as the detected pitch period.
A pitch period detecting device comprising:

first means for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period;

second means for judging whether or not the detected pitch period is not more than a predetermined reference value;

third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected; and

fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A pitch period detecting device comprising:

first means for detecting the pitch period of an input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period;

second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to said predetermined time period;

third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected; and

fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A voice signal time-axis compressing device comprising:

pitch period detecting means for detecting the pitch period of an input voice waveform; and

time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means,

the pitch period detecting means comprising

first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period,

second means for judging whether or not the detected pitch period is not more than a predetermined reference value,

third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and

fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A voice signal time-axis compressing device comprising:

pitch period detecting means for detecting the pitch period of an input voice waveform; and

time-axis compressing means for time-axis compressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means,

the pitch period detecting means comprising

first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period,

second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to said predetermined time period;

third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and

fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A voice signal time-axis decompressing device comprising:

pitch period detecting means for detecting the pitch period of an input voice waveform; and

time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means,

the pitch period detecting means comprising

first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period,

second means for judging whether or not the detected pitch period is not more than a predetermined reference value,

third means for causing, when it is judged that the detected pitch period is more than the predetermined reference value, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and

fourth means for determining, when it is judged that the detected pitch period is not more than the predetermined reference value, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.
A voice signal time-axis decompressing device comprising:

pitch period detecting means for detecting the pitch period of an input voice waveform; and

time-axis decompressing means for time-axis decompressing the input voice waveform on the basis of the pitch period detected by the pitch period detecting means,

the pitch period detecting means comprising

first means for detecting the pitch period of the input voice waveform by taking a predetermined number of pitch periods on the basis of the input voice waveform of a predetermined time period,

second means for judging whether the detected pitch period is long or short on the basis of the ratio of the detected pitch period to said predetermined time period;

third means for causing, when it is judged that the detected pitch period is long, the first means to detect the pitch period of a waveform of a predetermined number of pitch periods subsequent to a waveform of the predetermined number of pitch periods detected, and

fourth means for determining, when it is judged that the detected pitch period is short, the pitch period of the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected to be the same as the currently detected pitch period, and omitting the pitch period detecting processing by the first means with respect to the waveform of the predetermined number of pitch periods subsequent to the waveform of the predetermined number of pitch periods detected.