EP0702354A1 - Apparatus for modifying the time scale for speech modification - Google Patents

Apparatus for modifying the time scale for speech modification

Info

Publication number
EP0702354A1
EP0702354A1 EP95306302A EP95306302A EP0702354A1 EP 0702354 A1 EP0702354 A1 EP 0702354A1 EP 95306302 A EP95306302 A EP 95306302A EP 95306302 A EP95306302 A EP 95306302A EP 0702354 A1 EP0702354 A1 EP 0702354A1
Authority
EP
European Patent Office
Prior art keywords
speech
speed
section
time scale
modification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP95306302A
Other languages
German (de)
English (en)
Inventor
Takeshi Norimatsu
Masayuki Misaki
Koji Watanabe
Norikazu Ueno
Kazuhiko Sato
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP22013194A (external priority, patent JP3189587B2)
Priority claimed from JP26020694A (external priority, patent JP3189597B2)
Application filed by Matsushita Electric Industrial Co Ltd
Publication of EP0702354A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion

Definitions

  • the present invention relates to a speech time scale modification apparatus capable of varying a reproduction speed without changing pitch of an acoustic signal mainly of speech, and more particularly to a speech time scale modification apparatus used for variable speed reproduction of an acoustic signal in a video tape recorder (VTR) or a language learning system.
  • VTR video tape recorder
  • AV audio and visual
  • when the reproduction speed of speech recorded on a tape is varied, the pitch of the reproduced speech usually changes as well, whether the tape is reproduced at a high speed or a low speed, and the reproduced speech is very hard to hear. For example, when reproduced at the high speed, the pitch is higher, and when reproduced at the low speed, the pitch is lower. Therefore, such systems generally process the speech so as not to change the pitch at the time of variable speed reproduction.
  • the reproducing speed is fixed, and as the reproducing speed is further from the recording speed, it is harder to hear the speech.
  • when viewing the pictures of a VTR or the like slowly or quickly, changing the reproduction speed of the tape also changes the reproduction speed of the speech along with the pictures, and the speech is very hard to hear with the conventional speech time scale modification apparatus.
  • a primary object of the invention is to present a speech time scale modification apparatus which, when playing back an audio signal containing speech from a recording medium at a playback speed different from a recording speed, reproduces the speech at a speed close to the recording speed by sequentially changing a reproducing speed of a speech portion depending on a quantity of a speechless portion in the audio signal in a range between the playback speed and the recording speed, thereby making it possible to reproduce the speech at a clearly recognizable quality. It is another object of the invention to realize a speech time scale modification apparatus that, when playing back at the same speed as the recording speed, makes rapid speech easy to hear by properly changing the speech to a slow speed below the recording speed depending on the quantity of the speechless portion.
  • the invention provides a speech time scale modification apparatus capable of notably improving clarity of the speech in variable speed reproduction, by detecting the speechless portion of an acoustic signal being read out from a recording medium, and compressing or expanding the speechless portion, and sequentially changing the compressing or expanding ratio of speech portion depending on the quantity of the speechless portion.
  • a speech time scale modification apparatus comprises a recording and reproducing section for reproducing an acoustic signal recorded in a recording medium at a reproduction speed higher than a recording speed, a speech judging section for judging a speechless portion and a speech portion of the acoustic signal, a buffer memory for storing data of the reproduced acoustic signal, a write control section for controlling a write address of the buffer memory so as to write the data of the acoustic signal judged to be the speech portion in the speech judging section into the buffer memory, a read control section for controlling reading of the data from the buffer memory and a read address of the buffer memory, a residual storage data amount monitor section for monitoring a residual storage data amount in the buffer memory from a current write address of the buffer memory and a current read address of the buffer memory, an adaptive speed control section for determining a modification speed of the data depending on the residual storage data amount obtained from the residual storage data amount monitor section, and a time scale compressing section for compress
  • a speech time scale modification apparatus comprises a recording and reproducing section for reproducing an acoustic signal recorded in a recording medium at the same speed as a recording speed, a speech judging section to judge a speechless portion and a speech portion of the acoustic signal, a buffer memory for storing data of the acoustic signal, a write control section for controlling a write address of the buffer memory so as to write the data of the acoustic signal judged to be the speech portion in the speech judging section into the buffer memory, a read control section for controlling reading of the data from the buffer memory and a read address of the buffer memory, a residual storage data amount monitor section for monitoring a residual storage data amount in the buffer memory from a current write address of the buffer memory and a current read address of the buffer memory, an adaptive speed control section for determining a modification speed depending on the residual storage data amount from the residual storage data amount monitor section, and a time scale expanding section for expanding time scale of the acou
  • a speech time scale modification apparatus comprises a recording and reproducing section for reproducing an acoustic signal recorded in a recording medium at a reproduction speed lower than a recording speed, a speech judging section for judging a speechless portion and a speech portion of the acoustic signal, an input buffer for storing data of the acoustic signal, a time scale expanding section for expanding time scale of the data of the acoustic signal of the input buffer by independently setting a time scale expanding ratio to the speechless portion and a time scale expanding ratio to the speech portion from a judging result of the speech judging section, an output buffer for storing output data of the time scale expanding section, a residual storage data amount monitor section for monitoring a residual storage data amount being stored in the output buffer, and an expanding ratio control section for determining an expanding ratio of time scale modification of the speech portion and the speechless portion depending on the residual storage data amount.
  • Fig. 1 is a block diagram showing a constitution of a speech time scale modification apparatus in a first embodiment of the invention.
  • Fig. 2 (a) and Fig. 2 (b) are explanatory diagrams explaining measuring methods of residual storage data amounts in the first embodiment.
  • Fig. 3 (a) is an explanatory diagram of a speed setting method by a linear rule of an adaptive speed control section in the first embodiment.
  • Fig. 3 (b) is an explanatory diagram of a speed setting method by a nonlinear rule of the adaptive speed control section in the first embodiment.
  • Fig. 3 (c) is an explanatory diagram of a speed setting method by a staircase rule of the adaptive speed control section.
  • Fig. 4 is a circuit diagram of a time scale control section in the first embodiment.
  • Fig. 5 (a) shows a data row before processing data in the time scale control section in the first embodiment.
  • Fig. 5 (b) shows a data row after processing the data in the time scale control section in the first embodiment.
  • Fig. 6 is a flow chart showing other operation of a write control section in the first embodiment.
  • Fig. 7 is a block diagram showing a constitution of a speech time scale modification apparatus in a second embodiment of the invention.
  • Fig. 8 (a) is an explanatory diagram of a speed setting method by a linear rule of an adaptive speed control section in the second embodiment.
  • Fig. 8 (b) is an explanatory diagram of a speed setting method by a nonlinear rule of the adaptive speed control section in the second embodiment.
  • Fig. 8 (c) is an explanatory diagram of a speed setting method by a staircase rule of the adaptive speed control section in the second embodiment of the invention.
  • Fig. 9 is a circuit diagram of a time scale control section in the second embodiment.
  • Fig. 10 (a) shows a data row before processing data in the time scale control section in the second embodiment.
  • Fig. 10 (b) shows a data row after processing the data in the time scale control section in the second embodiment.
  • Fig. 11 is a flow chart showing other operation of a write control section in the second embodiment.
  • Fig. 12 is a block diagram showing a constitution of a speech time scale modification apparatus in a third embodiment of the invention.
  • Fig. 13 (a) is an explanatory diagram of a first expanding ratio setting table of an expanding ratio determining section in the third embodiment of the invention.
  • Fig. 13 (b) is an explanatory diagram of a second expanding ratio setting table of the expanding ratio determining section.
  • Fig. 14 (a), (b), (c) are principle diagrams showing operations of a time scale expanding section in the third embodiment.
  • the first embodiment relates to a speech time scale modification apparatus capable of sequentially changing to a speed below a reproduction speed depending on a quantity of a speechless portion when reproducing an audio signal recorded in a recording medium at a higher speed than a recording speed.
  • a speech portion and the speechless portion are detected, and only the speech portion is written into a buffer memory having a specific capacity.
  • the data is always output while processing speed modification.
  • a modification speed is properly altered on the basis of the memory remainder in the buffer memory so as to avoid overflow or underflow of the buffer memory. As a result, even in high speed reproduction, it is possible to reproduce the audio signal at a speed below the reproduction speed depending on the quantity of the speechless portion.
  • Fig. 1 is a block diagram showing a constitution of a speech time scale modification apparatus in the first embodiment.
  • an acoustic signal is read out from a recording and reproducing section 101 at a speed of M (M > 1) times the recording speed.
  • supposing the sampling period of recording in the recording and reproducing section 101 to be T, the acoustic signal reproduced at M times speed from the recording and reproducing section 101 is sequentially converted into a digital signal series with a sampling period T/M in an A/D converter 102.
  • the digital signal series is fed into a speech judging section 103, and the speech portion and the speechless portion of the digital signal series are judged. This speech judgement is done, for example, as follows.
  • supposing a pointer (hereinafter called a write pointer) indicating the address for storing the next data in a buffer memory 105 to be Pw
  • the sample value series is sequentially stored at the address in the buffer memory 105 indicated by the write pointer Pw by a write control section 104, and Pw is increased.
  • the write control section 104 stops storing the sample value series in the buffer memory 105. In this way, only data of speech portions are accumulated in the buffer memory 105.
  • the sample value series is judged herein to be the speech portion when the formula (1) is satisfied, and the speechless portion when the formula (1) is not satisfied, but a short sample value series judged to be speechless immediately before or after a sample value series satisfying the formula (1) may be included in the speech portion.
  • in a read control section 106, the data in the buffer memory 105 is read out sequentially in the period T and sent to a time scale control section 109.
  • Pr: a pointer (hereinafter called a read pointer) indicating the address of the next data to be read out from the buffer memory 105
  • in a residual storage data amount monitor section 107, the residual storage data amount not yet read out from the buffer memory 105 is measured sequentially from the positions of the write pointer Pw and the read pointer Pr.
  • Fig. 2 (a) and Fig. 2 (b) are explanatory diagrams explaining measuring methods of the residual storage data amount, and there are two cases, Fig. 2 (a) and Fig. 2 (b).
  • the residual storage data amount Z not read out yet is shown by the shaded areas in Fig. 2 (a) and Fig. 2 (b), and is calculated as follows. This is equivalent to handling the buffer memory 105 as a so-called cyclic memory.
  • Pw Pr
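The measurement of the residual storage data amount Z from the two pointers can be sketched as follows. This is a minimal illustration in Python, assuming the buffer memory is a cyclic memory holding `capacity` samples and that equal pointers mean an empty buffer; the function name and the `capacity` parameter are hypothetical and do not come from the source.

```python
def residual_amount(pw: int, pr: int, capacity: int) -> int:
    """Return the number of samples written but not yet read.

    pw is the write pointer, pr is the read pointer, and capacity is the
    assumed size of the cyclic buffer. The modulo covers both cases: the
    write pointer ahead of the read pointer, and the write pointer having
    wrapped around to a lower address than the read pointer.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return (pw - pr) % capacity
```

With this convention the buffer can hold at most `capacity - 1` unread samples, because a completely full buffer would be indistinguishable from an empty one.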
  • the speed of the time scale modification is set to a slow speed as close to the recording speed as possible when the residual storage data amount is small, or to a properly fast speed so that the write pointer Pw may not catch up with the read pointer Pr when the residual storage data amount is abundant.
  • the maximum value of the modification speed is 2, the same as the reproduction speed, and the minimum value of the modification speed is 1, the same as the recording speed.
  • Fig. 3 (a), (b), and (c) show the relation between the residual storage data amount and the modification speed; these are rules for setting the modification speed.
  • Fig. 3 (a) shows a rule of linear correspondence between the residual storage data amount and the modification speed.
  • the modification speed V is calculated in the following formula.
  • $V = \frac{Z}{n} + 1$
  • Fig. 3(b) shows an example of a rule of nonlinear correspondence between the residual storage data amount and the modification speed.
  • the modification speed V is calculated as follows.
  • V Z n + 1
  • the modification speed can be changed smoothly depending on increment or decrement of the residual storage data amount, while it is a feature of Fig. 3 (b) that it is stabilized near the recording speed 1 until the data is accumulated to a certain extent in the buffer memory 105.
  • Fig. 3 (c) relates to an example of defining the nonlinear correspondence on a staircase profile, and the modification speed V is calculated as follows.
  • a rule shown in Fig. 3 (c) can realize nearly the same control as the rule in Fig. 3 (b) in a smaller quantity of calculation and circuit scale.
  • the modification speed can be set at an easy-to-hear speed close to the recording speed 1 for an input signal containing more than a specified quantity of speechless portion, even when the signal is reproduced at double speed, or at the maximum modification speed 2 when a signal without speechless portions is reproduced, so that no data is lost.
  • the maximum value of the modification speed is 2 and the minimum value is 1, but the same rules can be applied if the maximum value is smaller than 2 (for example, 1.8) and the minimum value is greater than 1 (for example, 1.5).
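The three kinds of speed-setting rule described above (linear, nonlinear, staircase) can be sketched as follows. This is only an illustration in Python: the linear rule follows the stated range of 1 to 2, but the exact nonlinear and staircase formulas are not recoverable from this text, so the power-law curve, its `exponent`, and the number of `steps` are assumptions, as are all function and parameter names.

```python
def _fill_ratio(z: float, capacity: float) -> float:
    """Residual storage data amount as a fraction of capacity, clamped to [0, 1]."""
    return min(max(z / capacity, 0.0), 1.0)


def linear_speed(z, capacity, v_min=1.0, v_max=2.0):
    """Linear rule: v_min when the buffer is empty, v_max when it is full."""
    return v_min + (v_max - v_min) * _fill_ratio(z, capacity)


def nonlinear_speed(z, capacity, v_min=1.0, v_max=2.0, exponent=2.0):
    """Assumed nonlinear rule: stays near v_min until data has accumulated."""
    return v_min + (v_max - v_min) * _fill_ratio(z, capacity) ** exponent


def staircase_speed(z, capacity, v_min=1.0, v_max=2.0, steps=4):
    """Assumed staircase rule: the speed rises in `steps` equal increments."""
    level = int(_fill_ratio(z, capacity) * steps)
    return v_min + (v_max - v_min) * level / steps
```

For example, `linear_speed(50, 100)` gives 1.5, while `nonlinear_speed(50, 100)` gives 1.25, which is closer to the recording speed at the same fill level. Passing other `v_min` and `v_max` values covers the narrower speed ranges mentioned in the text.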
  • Fig. 4 is a block diagram showing a detailed constitution of the time scale compressing section 109.
  • reference numeral 401 denotes a control circuit for controlling the time scale compressing section
  • reference numeral 402 denotes a changeover circuit for changing over cross fade processing section or non-processing section for weighting and adding according to a command from the control circuit
  • reference numeral 403 denotes a latch circuit for temporarily holding the data
  • reference numeral 404 denotes a cross fade circuit for weighting addition processing
  • other sections are same as those in the same names in Fig. 1 and are hence identified with same reference numerals. Referring to Fig. 4, operation of the time scale compressing section 109 is described below.
  • the control circuit 401 first determines cross fade section length K and non-processing section length S in order to realize the modification speed V.
  • the cross fade section length K is fixed, but the K may be variable depending on the modification speed V.
  • Fig. 5 (a) and Fig. 5 (b) are schematic diagrams for explaining the time scale modification processing, and Fig. 5 (a) shows a data row before processing the data, and Fig. 5 (b) shows a data row after processing the data. Besides, a portion corresponding to the cross fade section length K of the data in Fig. 5 (b) shows cross fade processing of data A and data B.
  • the length S should be determined so that 1/V of length (2K + S) of a total of the data A, B, C before processing may be data length (K + S) after time scale processing.
  • the control circuit 401 changes over the changeover circuit 402 to the cross fade processing side, and instructs the read control section 106 to read out the data indicated by the read pointer Pr.
  • the data is fed to and held in the latch circuit 403.
  • the control circuit 401 instructs the read control section 106 to read out the data at the address Pr + K, which is K samples ahead, and this data is put directly into the cross fade circuit 404.
  • the cross fade circuit 404 executes weighted addition by using the data indicated by the read pointer Pr and the data indicated by the address of Pr + K.
  • the time scale modification for giving the modification speed V is realized.
  • when the modification speed set in the adaptive speed control section 108 is changed at a certain point, the non-processing section length is varied according to the expression (6), and similar processing is continued thereafter, thereby varying the modification speed as desired.
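The compression step can be illustrated with a short sketch. The text requires that a block of length 2K + S before processing becomes K + S after processing at speed V, that is, (2K + S) / V = K + S, which gives S = K (2 - V) / (V - 1) for 1 < V <= 2. The Python code below is a minimal sketch under that relation; the linear cross fade weights, the rounding of S, and the function names are assumptions, since the source only states that weighted addition is performed.

```python
def non_processing_length(k: int, v: float) -> int:
    """Solve (2K + S) / V = K + S for S, rounded to whole samples."""
    if not 1.0 < v <= 2.0:
        raise ValueError("this sketch assumes 1 < V <= 2")
    return round(k * (2.0 - v) / (v - 1.0))


def compress(samples, k: int, v: float):
    """Shorten `samples` by cross fading data A into data B in each block.

    Each block is A (K samples), B (K samples), C (S samples). A and B are
    merged into K samples by weighted addition; C is copied unprocessed.
    """
    s = non_processing_length(k, v)
    block = 2 * k + s
    out = []
    pr = 0  # read pointer
    while pr + block <= len(samples):
        for i in range(k):
            w = (i + 1) / (k + 1)  # assumed linear weight: A fades out, B fades in
            out.append((1.0 - w) * samples[pr + i] + w * samples[pr + k + i])
        out.extend(samples[pr + 2 * k : pr + block])  # non-processing section C
        pr += block
    out.extend(samples[pr:])  # incomplete final block is passed through
    return out
```

With K = 4 and V = 1.5 the sketch yields S = 4, so every 12 input samples become 8 output samples. Changing V only changes S, which matches the remark that a fixed cross fade length keeps the processing simple.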
  • the data row thus processed by time scale modification is finally converted into analog signal at the period T in the D/A converter 110, thereby obtaining an audio signal adaptively changing over the speed below the reproducing speed M at same pitch as in recording.
  • the apparatus for time scale modification of speech comprises the speech judging section 103, the residual storage data amount monitor section 107 for measuring the residual storage data amount from the positions of the write pointer and the read pointer, and the adaptive speed control section 108 for determining the speed of time scale modification depending on the residual storage data amount. The modification speed is controlled to be gradually slower when the residual storage data amount is small and gradually faster when it is large, so that an audio signal reproduced at a high speed can be heard at a slower speed below the reproducing speed, depending on the quantity of the speechless portion contained therein, and high-speed reproduction is achieved with almost no loss of information.
  • time scale modification at high quality is realized, and in particular when the cross fade section length is fixed at a preset value, an arbitrary speed of time scale modification is achieved only by changing the length of non-processing section, so that the speech time scale modification apparatus can be realized in a very simple constitution.
  • when the recording and reproducing section is accompanied by images, as in a VTR for example, the images can be reproduced at double speed while only the sound is reproduced at a slower speed below the double speed, and hence the effect is great.
  • Fig. 6 is a flow chart showing other operation of the write control section. Referring now to Fig. 6, the other operation of the write control section is described below.
  • the write control section 104 sequentially takes in the value of the residual storage data amount Z measured by the residual storage data amount monitor section 107 (S601), and compares it with the preset threshold value Zth (S602). If Z is greater than Zth, that is, if there is enough residual storage data, the result of the speech judging section 103 is used to judge whether the present input data is speech or speechless (S603); the data is written into the buffer memory 105 only in the case of the speech portion (S604), and the write pointer Pw is incremented (S605). If the judging condition at S602 is not satisfied, that is, if there is not enough residual storage data, the data is written into the buffer memory 105 regardless of the speech judgement, and the write pointer Pw is incremented. This series of processing ensures that, for a signal containing many speechless portions, the read pointer Pr does not catch up with the write pointer Pw in Fig. 2 (a), that is, the residual storage data amount does not become 0.
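The write control flow of Fig. 6 (steps S601 to S605) can be sketched as a single function. This is a hedged Python illustration: the function name, its arguments, and the wrap-around of the write pointer (which assumes a cyclic buffer) are assumptions made for the example, not details taken from the source.

```python
def write_control(sample, is_speech: bool, z: int, zth: int, buffer: list, pw: int) -> int:
    """Apply one step of the Fig. 6 flow and return the updated write pointer.

    z is the residual storage data amount (S601) and zth the preset
    threshold (S602). When z exceeds zth, speechless samples are skipped
    (S603); otherwise every sample is stored so the buffer does not run dry.
    """
    if z > zth and not is_speech:
        return pw  # enough data is stored, so the speechless sample is discarded
    buffer[pw] = sample            # S604: store the sample
    return (pw + 1) % len(buffer)  # S605: advance Pw, wrapping around (assumed)
```

A caller would recompute Z from the write and read pointers before each call, so that the decision always reflects the current buffer state.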
  • analog signals are recorded in the recording and reproducing section 101, but it may be realized similarly when handling digital signals.
  • the digital signals of the sampling period T are directly fed into the speech judging section 103, and the same processing is carried out thereafter, so that the signals adaptively modified in time scale are output.
  • Fig. 7 is a block diagram showing a constitution of a speech time scale modification apparatus in the second embodiment. An operation of the second embodiment is specifically described below.
  • This digital signal is sequentially fed into a speech judging circuit 103 to judge a speech or speechless portion, and only the signal judged to be a speech portion is written into a buffer memory 105 while a write control section 104 controls the pointer Pw of the address to be written in.
  • a read control section 106 reads out the data sequentially from the buffer memory 105 and sends out into a time scale expanding section 702, while controlling a read pointer Pr.
  • the residual storage data amount Z not yet read out is measured from the current read pointer Pr and the current write pointer Pw. So far, the operation is the same as in the first embodiment, except that the value of the reproduction speed M is different.
  • on the basis of the value of the residual storage data amount Z obtained in the residual storage data amount monitor section 107, an adaptive speed control section 701 sets the speed of time scale modification to a speed slower than the recording speed 1 when the residual storage data amount is small, or to a speed adequately close to the recording speed 1 when the residual storage data amount is large, so that the write pointer Pw does not catch up with the read pointer Pr.
  • the maximum value of the modification speed is supposed to be 1, the same as the reproducing speed, and the minimum value to be Vo (where 0 < Vo < 1).
  • Fig. 8 (a), Fig. 8 (b), and Fig. 8 (c) show the relation between the residual storage data amount and the corresponding modification speed, and present rules for setting the modification speed.
  • Fig. 8 (a) shows a rule of linear correspondence between the residual storage data amount and the modification speed.
  • Fig. 8 (b) shows an example of a rule of nonlinear correspondence between the residual storage data amount and the modification speed.
  • the modification speed can be smoothly changed depending on increment or decrement of the residual storage data amount, while in the case of Fig. 8 (b), it is stabilized nearly at the recording speed 1 until the data are accumulated to a certain extent in the buffer memory 105.
  • Fig. 8 (c) shows a case of staircase definition of the nonlinear correspondence, and the modification speed V can be calculated as follows.
  • the rule shown in Fig. 8 (c) can realize nearly same control as in the rule in Fig. 8 (b) in a smaller quantity of operation and circuit scale.
  • a slow speed Vo less than the recording speed may be realized in the signal input containing more than specified quantity of the speechless portion.
  • the maximum modification speed 1 is set, so that data missing does not occur.
  • the value of the modification speed V determined in the adaptive speed control section 701 is sent out into the time scale expanding section 702, and the time scale is modified depending on the modification speed V.
  • Fig. 9 is a block diagram showing a detailed description of the time scale expanding section 702.
  • reference numeral 901 is a control circuit for controlling the entire time scale expanding section
  • reference numeral 902 is a changeover circuit for changing over cross fade processing section or non-processing section for weighting and adding according to the command from the control circuit
  • reference numeral 903 is a latch circuit for temporarily holding the data
  • reference numeral 904 is a cross fade circuit for weighting addition processing
  • other sections are same as those in the same names in Fig. 1 and are hence identified with same reference numerals. Referring to Fig. 9, an operation of the time scale expanding section 702 is described below.
  • the control circuit 901 first determines the cross fade section length K and the non-processing section length S in order to realize the modification speed V.
  • the cross fade section length is fixed value K, but the value of K may be variable depending on the modification speed V.
  • Fig. 10 (a) and Fig. 10 (b) are schematic diagrams for explaining the time scale modification processing; Fig. 10 (a) shows the data before processing, and Fig. 10 (b) shows the data after processing. The portion of length K enclosed between data row A and data row B is the data row obtained by cross fade processing of the data row A and the data row B.
  • the length S should be determined so that 1/V of the length (2K + S) of the total of the data rows before processing A, B, C may be the data length (3K + S) after time scale processing.
  • the cross fade processing comprises three processes.
  • Fig. 11 is a flow chart showing part of the cross fade processing.
  • the control circuit 901 changes over the changeover circuit 902 to the non-processing side (S1101). It then commands the read control section 106 to read out the data indicated by the read pointer Pr (S1102).
  • the read data is put into the D/A converter 110 without being processed (S1103).
  • the read pointer Pr is increased (S1104). The same operation is repeated until data row A is processed completely.
  • the control circuit 901 commands the read control section 106 so that the read pointer Pr may indicate the beginning data of the data row A.
  • the control circuit 901 changes over the changeover circuit 902 to the cross fade processing side, and commands the read control section 106 to read out the data indicated by the pointer Pr.
  • the data are fed and held in the latch circuit 903.
  • the control circuit 901 commands the read control section 106 to read out the data at the address Pr + K, which is K samples ahead, and the data are directly put into the cross fade circuit 904.
  • the cross fade circuit executes weighted addition by using these two sets of data.
  • the read pointer Pr indicates the beginning of the data row B, and the same processing on the data row in the first process is conducted on the data row B. More specifically, the control circuit 901 changes over the changeover circuit 902 to the non-processing side. It also commands the read control section 106 to read out the data indicated by the read pointer Pr. The read data is put into the D/A converter 110 directly without being processed. Finally, the read pointer Pr is increased. This series of processing is repeated on the data row B.
  • the control circuit 901 changes over the changeover circuit 902 to the non-processing side, and the number of data corresponding to the length S determined in formula (11) is read out from the buffer memory 105 and directly put into the D/A converter 110.
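The expansion procedure described above (data row A unprocessed, a cross fade of A and B, data row B unprocessed, then S unprocessed samples) can be sketched as follows. The stated relation (2K + S) / V = 3K + S gives S = K (3V - 2) / (1 - V), which is non-negative only for V >= 2/3. The Python code is a minimal sketch under that relation; the linear weights, the direction of the fade, the rounding of S, and the function names are assumptions, since the source only says that weighted addition is executed.

```python
def expansion_non_processing_length(k: int, v: float) -> int:
    """Solve (2K + S) / V = 3K + S for S, rounded to whole samples."""
    if not 2.0 / 3.0 <= v < 1.0:
        raise ValueError("this sketch assumes 2/3 <= V < 1")
    return round(k * (3.0 * v - 2.0) / (1.0 - v))


def expand(samples, k: int, v: float):
    """Lengthen `samples` by inserting a cross faded copy between A and B."""
    s = expansion_non_processing_length(k, v)
    block = 2 * k + s
    out = []
    pr = 0  # read pointer
    while pr + block <= len(samples):
        a = samples[pr : pr + k]
        b = samples[pr + k : pr + 2 * k]
        out.extend(a)  # first process: data row A unprocessed
        for i in range(k):  # second process: Pr returns to the start of A
            w = (i + 1) / (k + 1)  # assumed weights: B fades out, A fades in
            out.append((1.0 - w) * b[i] + w * a[i])
        out.extend(b)  # third process: data row B unprocessed
        out.extend(samples[pr + 2 * k : pr + block])  # S unprocessed samples
        pr += block
    out.extend(samples[pr:])  # incomplete final block is passed through
    return out
```

With K = 4 and V = 0.8 the sketch yields S = 8, so every 16 input samples become 20 output samples. The fade direction is chosen so that the inserted section starts like the beginning of B, which naturally follows the end of A, and ends like the end of A, which naturally precedes the beginning of B.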
  • the data row thus modified in time scale is finally converted into an analog signal in the period T by the D/A converter 110, thereby obtaining the acoustic signal adaptively changing over the speed below the recording speed 1 at the same pitch as when recording.
  • the operation of the write control section 104 in Fig. 7 can be changed to the flow chart in Fig. 6 same as in the first embodiment.
  • with the apparatus comprising the speech judging section 103, the residual storage data amount monitor section 107, and the adaptive speed control section 701 for determining the speed of time scale modification depending on the residual storage data amount, the modification speed is controlled to be close to the reproducing speed 1 when the residual storage data amount is large, and gradually slower below 1 when the residual storage data amount is small, so that a sound signal reproduced at the recording speed can be heard at a slow speed below the recording speed depending on the quantity of the speechless portion contained therein. It is particularly effective for listening to sound signals of fast speech.
  • analog signals are recorded in the recording and reproducing section 101, but it may be realized similarly in the case of digital signals.
  • the digital signals of the sampling period T are directly fed into the speech judging section 103, and the same processing is carried out thereafter, so that the signals adaptively modified in time scale are output.
  • Fig. 12 is a block diagram showing a constitution of the speech time scale modification apparatus in the third embodiment. Its operation is described in detail below while referring to Fig. 12.
  • acoustic signals are read out at a speed of M times (0 < M < 1) the recording speed.
  • supposing the sampling period in recording in the recording and reproducing section 1201 to be T, the acoustic signals reproduced at M times speed from the recording and reproducing section 1201 are sequentially converted into a digital signal series at a sampling period T/M by the A/D converter, and written into an input buffer 1203.
  • the data being read out from the input buffer 1203 is fed into the speech judging section 1204, where the sample value row is judged to be the speech portion or the speechless portion.
  • the speech or speechless judgement may be done in the condition in the formula (1) explained in the first embodiment.
  • a time scale expanding section 1206 performs time scale expansion on the data read out from the input buffer 1203, and outputs the result to an output buffer 1208.
  • the residual storage data not yet output to a D/A converter 1211 is monitored at specific intervals in a residual storage data monitor section 1209, and depending on the remainder, an expanding ratio determining section 1210 determines an expanding ratio Es for the speechless portion and an expanding ratio Ev for the speech portion.
  • Fig. 13 (a) and Fig. 13 (b) are explanatory diagrams showing a method of setting the expanding ratio in the expanding ratio determining section 1210.
  • the example in Fig. 13 (a) relates the residual storage data to the expanding ratio by a linear function; it keeps the output buffer 1208 from becoming empty by raising the expanding ratio when the residual storage data Z obtained in the residual storage data monitor section 1209 is small, that is, when the output buffer 1208 is nearly empty.
  • the expanding ratios Es and Ev for the speechless portion and the speech portion are given by formulas (13) and (14), respectively.
  • $E_s = -\frac{1.5}{N} Z + 3$  (13)
  • $E_v = -\frac{0.5}{N} Z + 1$  (14)
  • the expanding ratio of the speechless portion is set larger than that of the speech portion so that the output buffer 1208 does not become empty even when the expanding ratio of the speech portion is kept low.
  • the expanding ratio is 1.0 as long as the residual storage data in the speech portion is not 0, that is, the speech portion is reproduced at the same speed as the recording speed.
  • for the speechless portion, the expanding ratio is related to the residual storage data by a quadratic function.
  • the expanding ratio determining section 1210 determines the expanding ratios Ev and Es of the speech and speechless portions at specific intervals according to the rule shown in Fig. 13, and sends them to the time scale expanding section 1206.
  • in the time scale expanding section 1206, on the basis of these expanding ratios, the time scale is expanded at the ratio Ev in the speech portion and at the ratio Es in the speechless portion.
  • Fig. 14 (a) shows the time series of input signals in recording
  • blocks 1, 2, 3 are the speechless portions
  • blocks 4, 5, 6 are the speech portions
  • the signal row after processing is shown for the case where the expanding ratio determining section 1210 gives an expanding ratio Ev of 1.0 for the speech portion and an expanding ratio Es of 2.0 for the speechless portion.
  • time scale modification at an expanding ratio of 2.0 is realized by inserting the cross-faded section given by formula (12), and the data is accumulated in the output buffer 1208.
  • the data is directly accumulated in the output buffer 1208.
  • when the expanding ratios obtained from the expanding ratio determining section 1210 change, the new ratios are set in the time scale expanding section 1206, and the time scale expanding process continues as shown in Fig. 14 (c).
  • the expanding ratio can be set independently for the speechless portion and the speech portion even if the proportion of speechless portions in the signal cannot be predicted.
  • according to the third embodiment, the time scale expanding ratio is set independently for the speech portion and the speechless portion depending on the residual storage data; the expanding ratio of the speech portion is set to 1/M when the residual storage data falls below a predetermined quantity, to prevent the output signal from being interrupted; and the expanding ratio is controlled so that the speech portion stays as close as possible to the sound speed in recording. As a result, easy-to-hear reproduced sound without a feeling of strangeness can be obtained even if the reproducing speed from the recording medium is slow.
  • analog signals are recorded in the recording and reproducing section 1201, but the apparatus may be realized similarly for digital signals.
  • the digital signals of the sampling period T are directly fed into the input buffer 1203, and the same processing as in the third embodiment is carried out thereafter, so that the signals adaptively modified in time scale are output.
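
The ratio rule of formulas (13) and (14) and the block-wise expansion of Fig. 14 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation of the apparatus: N is taken to be the capacity of the output buffer 1208 (this text does not define N), the constants are used as they are printed in formulas (13) and (14), and a simple linear cross fade stands in for formula (12), which is not reproduced here. All function names are hypothetical.

```python
def expanding_ratios(z, n):
    """Return (Es, Ev) for residual storage data z, with 0 <= z <= n.

    Assumption: n is the capacity of the output buffer, so z / n is its
    fill level. The constants are those printed in formulas (13) and (14).
    """
    if n <= 0 or not 0 <= z <= n:
        raise ValueError("expected n > 0 and 0 <= z <= n")
    es = 3.0 - 1.5 * z / n  # formula (13): speechless portion
    ev = 1.0 - 0.5 * z / n  # formula (14): speech portion
    return es, ev


def crossfade(a, b):
    """Linear cross fade from block a to block b (equal lengths)."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    length = len(a)
    return [a[k] * (1 - k / length) + b[k] * (k / length) for k in range(length)]


def expand_double(block, next_block):
    """Expand one block at ratio 2.0 by appending a cross-faded section.

    The inserted section starts like next_block (so it follows the end of
    block smoothly) and fades toward block (so next_block can follow it).
    This is an assumed stand-in for formula (12).
    """
    return list(block) + crossfade(next_block, block)


def expand_blocks(blocks, is_speech):
    """Apply Ev = 1.0 to speech blocks and Es = 2.0 to speechless blocks."""
    out = []
    for i, block in enumerate(blocks):
        if is_speech[i]:
            out.extend(block)  # ratio 1.0: copied to the output directly
        else:
            # The last block has no successor, so it is faded with itself.
            nxt = blocks[i + 1] if i + 1 < len(blocks) else block
            out.extend(expand_double(block, nxt))
    return out
```

For the situation of Fig. 14, with six blocks of four samples each and the first three blocks speechless, `expand_blocks` returns 3 × 8 + 3 × 4 = 36 samples: the speechless blocks are doubled in length while the speech blocks pass through unchanged.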

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Signal Processing Not Specific To The Method Of Recording And Reproducing (AREA)
EP95306302A 1994-09-14 1995-09-08 Appareil pour modifier l'échelle de temps pour la modification du langage Withdrawn EP0702354A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
JP22013294 1994-09-14
JP220132/94 1994-09-14
JP220131/94 1994-09-14
JP22013194A JP3189587B2 (ja) 1994-09-14 1994-09-14 音声時間軸変換装置
JP260206/94 1994-10-25
JP26020694A JP3189597B2 (ja) 1994-10-25 1994-10-25 音声時間軸変換装置

Publications (1)

Publication Number Publication Date
EP0702354A1 true EP0702354A1 (fr) 1996-03-20

Family

ID=27330404

Family Applications (1)

Application Number Title Priority Date Filing Date
EP95306302A Withdrawn EP0702354A1 (fr) 1994-09-14 1995-09-08 Appareil pour modifier l'échelle de temps pour la modification du langage

Country Status (1)

Country Link
EP (1) EP0702354A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997046999A1 (fr) * 1996-06-05 1997-12-11 Interval Research Corporation Modification non uniforme de l'echelle du temps de signaux audio enregistres
EP0907161A1 (fr) * 1997-09-18 1999-04-07 Victor Company Of Japan, Ltd. Appareil de traitement de signaux audio
EP0939401A1 (fr) * 1997-09-12 1999-09-01 Nippon Hoso Kyokai Procede de traitement de sons, processeur de sons, et dispositif d'enregistrement/de reproduction
WO2003001510A1 (fr) * 2001-06-21 2003-01-03 Sony Corporation Appareil et procede de traitement de signaux numeriques et systeme de reproduction/reception de signaux numeriques
MY118991A (en) * 1997-09-22 2005-02-28 Victor Company Of Japan Apparatus for processing audio signal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0144689A1 (fr) * 1983-10-27 1985-06-19 Nec Corporation Dispositif de comparaison de formes
EP0318858A2 (fr) * 1987-11-25 1989-06-07 Nec Corporation Système pour la reconnaissance de mots enchaînés comprenant des réseaux de neurones disposés le long d'un axe des temps associé au signal
EP0356568A1 (fr) * 1988-09-02 1990-03-07 Siemens Aktiengesellschaft Procédé et arrangement pour la reconnaissance du locuteur dans un central téléphonique

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997046999A1 (fr) * 1996-06-05 1997-12-11 Interval Research Corporation Modification non uniforme de l'echelle du temps de signaux audio enregistres
US5828994A (en) * 1996-06-05 1998-10-27 Interval Research Corporation Non-uniform time scale modification of recorded audio
AU719955B2 (en) * 1996-06-05 2000-05-18 Interval Research Corporation Non-uniform time scale modification of recorded audio
EP0939401A1 (fr) * 1997-09-12 1999-09-01 Nippon Hoso Kyokai Procede de traitement de sons, processeur de sons, et dispositif d'enregistrement/de reproduction
EP0939401A4 (fr) * 1997-09-12 2000-07-19 Japan Broadcasting Corp Procede de traitement de sons, processeur de sons, et dispositif d'enregistrement/de reproduction
US6360198B1 (en) * 1997-09-12 2002-03-19 Nippon Hoso Kyokai Audio processing method, audio processing apparatus, and recording reproduction apparatus capable of outputting voice having regular pitch regardless of reproduction speed
EP0907161A1 (fr) * 1997-09-18 1999-04-07 Victor Company Of Japan, Ltd. Appareil de traitement de signaux audio
US6035009A (en) * 1997-09-18 2000-03-07 Victor Company Of Japan, Ltd. Apparatus for processing audio signal
MY118991A (en) * 1997-09-22 2005-02-28 Victor Company Of Japan Apparatus for processing audio signal
WO2003001510A1 (fr) * 2001-06-21 2003-01-03 Sony Corporation Appareil et procede de traitement de signaux numeriques et systeme de reproduction/reception de signaux numeriques

Similar Documents

Publication Publication Date Title
US5611018A (en) System for controlling voice speed of an input signal
US7260306B2 (en) Editing method for recorded information
JP3053541B2 (ja) デジタル記録音声及びビデオの同期式可変速度再生
US4363049A (en) Method and apparatus for editing digital signals
DE69224783T2 (de) Gerät zur Aufzeichnung/Wiedergabe von Audiotönen mit einem Halbleiterspeicher
EP0155970B1 (fr) Appareil reproducteur de signaux audio
EP0939401B1 (fr) Procede de traitement de sons, processeur de sons, et dispositif d'enregistrement/de reproduction
EP0702354A1 (fr) Appareil pour modifier l'échelle de temps pour la modification du langage
US4301480A (en) Apparatus for monitoring reproduced audio signals during fast playback operation
US6085157A (en) Reproducing velocity converting apparatus with different speech velocity between voiced sound and unvoiced sound
US5864792A (en) Speed-variable speech signal reproduction apparatus and method
JP3162945B2 (ja) ビデオテープレコーダ
JP3189587B2 (ja) 音声時間軸変換装置
JP2874607B2 (ja) 音声時間軸変換装置
JPH08328586A (ja) 音声時間軸変換装置
JPH0573089A (ja) 音声再生方法
JP2764994B2 (ja) 編集処理装置
JP3189597B2 (ja) 音声時間軸変換装置
JPH05303400A (ja) 音声再生装置と音声再生方法
KR100201309B1 (ko) 3배속이상 변속재생시 음성신호 처리방법
JP3224906B2 (ja) 信号記録方法、信号記録装置、信号再生方法及び信号再生装置
JP3498319B2 (ja) 自動演奏装置
EP0641125B1 (fr) Appareil de reproduction de signaux vocaux
JP2962777B2 (ja) 音声信号の時間軸伸長圧縮装置
EP0811975A2 (fr) Méthode d'édition pour l'information enregistrée

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

17P Request for examination filed

Effective date: 19960517

17Q First examination report despatched

Effective date: 19981208

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19990420