Speech waveform modification

Info

Publication number
GB1068282A
Authority
GB
United Kingdom
Prior art keywords
peak
speech
pitch period
gate
signal
Legal status
Expired
Application number
GB20363/65A
Current Assignee
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date
1964-06-09
Filing date
1965-05-14
Publication date
1967-05-10
Application filed by International Business Machines Corp

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04: Time compression or expansion
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/06: Elementary speech units used in speech synthesisers; Concatenation rules
    • G10L13/07: Concatenation rules
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Analogue/Digital Conversion (AREA)

Abstract

1,068,282. Speech waveform modification. INTERNATIONAL BUSINESS MACHINES CORPORATION. May 14, 1965 [June 9, 1964], No. 20363/65. Heading H4R.

The time duration of an audio signal is modified, e.g. to make speech samples from different sources sound as if they came from the same source, by adjusting the pitch periods of the speech samples to a common length. Discontinuities caused by amplitude differences between the end of an adjusted pitch period and the beginning of the following period are eliminated by adding to the adjusted pitch period signal a "ramp" signal whose amplitude is zero at the commencement of the pitch period and equal to the amplitude difference at the end of the adjusted pitch period. The actual pitch period of the samples is determined by measuring the time of occurrence of the maximum peak-to-peak excursions of the speech waveform during time intervals set by a rough determination of the pitch period.

Figs. 3A, 3B and 3C show an embodiment in which a sample of speech from a source 2 is applied via a sampling switch 4 to a store 8, in which the speech sample circulates together with a synchronizing pulse from single-shot circuit 6 marking the start of the speech sample. On each repetition of the speech sample the synchronizing pulse resets the counter 22, which during the repeat of the sample provides a time scale by counting the output of oscillator 24. The speech sample is applied to a voicing detector 26, which produces a pulse at the beginning of a voiced sound, and to a conventional pitch extractor 10 to 16, which produces a count in counter 18 corresponding to the approximate pitch period. The pulse from the voicing detector is applied to gate 28 to gate a count, corresponding to the start of the voiced speech, from counter 22 into the register 30.
In addition, this count is fed from gate 28 to an "ADD" circuit 34, which is also fed with the count from counter 18 corresponding to the pitch period, and the resulting count is fed into register 36. During the following cycles the counts in registers 30 and 36 are compared in comparators 38 and 40 with the count from counter 22, and signals are produced to trigger the bi-stable 42 so that its output on lead Q is positive from the commencement of the voiced signal until approximately one pitch period later. During this time speech is fed via gate 44 to the positive and negative peak detectors 64 and 66, which feed the values of the respective peaks to the gates 56, 58, 60 and 62.

Initially, the synchronizing pulse sets bi-stable 72 so that the output 1a is energized. The first positive and negative peak values are therefore fed via gates 56 and 58 respectively to hold circuits 46 and 48, whose outputs are fed to a differential amplifier 90 to obtain a signal representing the first peak-to-peak excursion of the waveform; this signal is applied via an inverter 92 to adder 94. In addition, the time of occurrence of the positive peak is fed via gate 74 into the register 68. Since no input has yet been applied to gates 60 and 62, the output of differential amplifier 96 is zero; the output of adder 94 is therefore negative and passes via gate 98 and gate 100, operated by the delayed negative peak, to trigger bi-stable 72 so that output 1a is removed and 2a is energized. The following positive and negative peak values are then fed via gates 60 and 62 to hold circuits 50 and 52 and differential amplifier 96, while the time of occurrence of the positive peak is fed into register 70 via gate 104.
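As a rough software analogue of the window logic described above (registers 30 and 36, comparators 38 and 40, bi-stable 42), the following Python sketch models the Q signal as a function of the running count of counter 22. The function name `q_window` and its parameters are hypothetical and introduced only for illustration; the patent describes hardware, not code, and the sketch assumes a pitch period of at least one count.

```python
def q_window(voiced_start: int, approx_period: int, total_counts: int) -> list[bool]:
    """Model the Q output of bi-stable 42 for each count of counter 22.

    Assumption: approx_period >= 1, so the set and reset counts differ.
    """
    if approx_period < 1:
        raise ValueError("approx_period must be at least 1")

    reg30 = voiced_start                  # count latched at the start of voicing
    reg36 = voiced_start + approx_period  # output of the ADD circuit 34

    q = False
    trace = []
    for count in range(total_counts):
        if count == reg30:    # comparator 38 fires: set the bi-stable
            q = True
        elif count == reg36:  # comparator 40 fires: reset the bi-stable
            q = False
        trace.append(q)
    return trace
```

With `voiced_start=3` and `approx_period=4`, Q is high for counts 3 to 6 inclusive, i.e. for one approximate pitch period.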
The outputs of amplifiers 90 and 96 are then compared. Depending on their relative values, either a positive or a negative output results from adder 94, which is fed via gate 98 or 108 and gate 100 to trigger bi-stable 72 into such a condition that the following pair of positive and negative peaks is fed in via gates 56 and 58 or gates 60 and 62 to replace the values in the hold circuits corresponding to the smaller peak-to-peak swing. The process is repeated for the remaining duration of the Q signal, always discarding the smaller of the two peak-to-peak swings being compared. At the end of the Q period the negative-going signal detector 54 is energized and applies an output which is gated through the appropriate one of gates 110 and 112 to feed the output of the register 68 or 70, holding the position of the maximum peak-to-peak swing, into the computer 3.

The computer takes the count corresponding to the maximum peak-to-peak value and adds to it a count corresponding to half a pitch period, as stored in counter 18, and a count corresponding to one and a half pitch periods; the two resulting values replace the counts stored in registers 30 and 36 respectively. The determination of the maximum peak-to-peak swing is then carried out, as before, for the interval between the counts now stored in registers 30 and 36 to determine the position of the next pitch pulse. In a similar fashion the positions of the maximum peak-to-peak swings of the speech waveform are determined for the remainder of the speech sample stored in the circulating store 8, and these values are stored in the computer 3.

In order to adjust the pitch cycles to the required length the speech is fed to gates 122 and 140. Each pitch period is adjusted in length during a cycle of operations which entails two repeats of the speech sample from store 8. During the first repeat a pitch pulse from computer 3 on line 126 triggers bi-stable 124 to allow speech to pass through gate 122 to gate 134.
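The search for successive pitch pulses described above (keep the larger of two peak-to-peak swings within the Q window, then move the window to run from half a pitch period to one and a half pitch periods after the mark just found) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented circuit: the names `PeakPair`, `max_swing_time` and `pitch_marks` are hypothetical, the peak pairs are assumed to be already extracted, ties are resolved in favour of the earlier pair, and the approximate period is assumed to be at least two counts.

```python
from typing import NamedTuple, Optional


class PeakPair(NamedTuple):
    time: int   # counter value at the positive peak
    pos: float  # positive peak value
    neg: float  # following negative peak value


def max_swing_time(pairs: list[PeakPair], start: int, end: int) -> Optional[int]:
    """Time of the largest peak-to-peak swing with start <= time < end."""
    held = None  # stands in for the hold circuits and time registers
    for pair in pairs:
        if not (start <= pair.time < end):
            continue
        # Keep the larger swing and discard the smaller (earlier pair wins ties).
        if held is None or (pair.pos - pair.neg) > (held.pos - held.neg):
            held = pair
    return None if held is None else held.time


def pitch_marks(pairs: list[PeakPair], voiced_start: int, approx_period: int) -> list[int]:
    """Locate successive pitch marks by repeatedly moving the search window."""
    if approx_period < 2:
        raise ValueError("approx_period must be at least 2 counts")
    marks = []
    last_time = max((p.time for p in pairs), default=-1)
    start, end = voiced_start, voiced_start + approx_period
    while start <= last_time:
        mark = max_swing_time(pairs, start, end)
        if mark is None:
            break
        marks.append(mark)
        # New window: mark + T/2 to mark + 3T/2 (integer counts).
        start = mark + approx_period // 2
        end = mark + (3 * approx_period) // 2
    return marks
```

Placing the new window around the expected position of the next pitch pulse lets the search tolerate pitch periods that drift away from the rough estimate, since each mark is measured relative to the previous one.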
At the end of the delay time produced by delay 128, which is equal to the desired pitch period and is equal to or shorter than any actual pitch period in the sample, bi-stable 124 is reset to inhibit gate 122, and gate 134 is operated to apply the voltage value existing at the end of the modified pitch period to the hold circuit 136, where it is stored. The following pitch pulse on line 138 operates gate 140 to feed to hold circuit 142 the voltage value of the speech signal at the beginning of the next pitch period. The two signals from hold circuits 136 and 142 are applied to a differential amplifier 144 to obtain a signal representing the error between the amplitude of the signal at the end of the length-modified pitch period and the amplitude of the signal at the commencement of the following pitch period, this signal being applied to the input of the integrating amplifier circuit 146.

During the second repetition of the period being modified, gate 122 gates through the speech signal for the modified-length period to one input of "add" circuit 130. In addition, the output of bi-stable 124 opens switch 154 on the output of integrating amplifier 146 for the duration of the modified pitch period, so that the output of this amplifier is a ramp waveform which is zero at the beginning of the pitch period and equal to the output of differential amplifier 144 at the end of the modified pitch period. This signal is applied to the other input of "add" circuit 130 and added to the modified-length speech waveform sample, so that the resulting signal is continuous in amplitude with the following sample starting at the following pitch pulse. The output of adder 130 is converted to digital form in the analogue-to-digital converter 132 so that it may be stored in computer 3 to await the following length-modified pitch periods of the speech sample, which are processed in a similar way on subsequent cycles of the circulating store 8.
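The shortening-and-ramp step described above can be illustrated with a short discrete-time Python sketch. It is a sketch under stated assumptions rather than the circuit itself: the function name `shorten_with_ramp` is hypothetical, the signal is treated as a list of samples, the "end of the modified pitch period" is taken as the sample at index `target_len` (the instant of the cut), and the sign of the error is chosen so that adding the ramp makes the shortened period meet the next one.

```python
def shorten_with_ramp(period: list[float], next_start: float, target_len: int) -> list[float]:
    """Truncate one pitch period to target_len samples and add a linear ramp.

    period      samples of one pitch period, starting at its pitch pulse
    next_start  first sample of the following pitch period
    target_len  desired common length, no longer than the actual period
    """
    if not 1 <= target_len <= len(period):
        raise ValueError("target_len must be between 1 and len(period)")

    # Value of the signal at the cut instant (role of hold circuit 136).
    extended = list(period) + [next_start]
    end_value = extended[target_len]

    # Amplitude error to be removed (role of differential amplifier 144).
    error = next_start - end_value

    # Ramp is zero at the pitch pulse and would reach `error` at the cut instant
    # (role of integrating amplifier 146 and adder 130).
    return [period[k] + error * k / target_len for k in range(target_len)]
```

When `target_len` equals the actual period length the error is zero and the samples pass through unchanged, which matches the requirement that the desired period be equal to or shorter than every actual period.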
GB20363/65A 1964-06-09 1965-05-14 Speech waveform modification Expired GB1068282A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US373751A US3369077A (en) 1964-06-09 1964-06-09 Pitch modification of audio waveforms

Publications (1)

Publication Number Publication Date
GB1068282A (en) 1967-05-10

Family

ID=23473721

Family Applications (1)

Application Number Title Priority Date Filing Date
GB20363/65A Expired GB1068282A (en) 1964-06-09 1965-05-14 Speech waveform modification

Country Status (3)

Country Link
US (1) US3369077A (en)
DE (1) DE1472004C3 (en)
GB (1) GB1068282A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993009531A1 (en) * 1991-10-30 1993-05-13 Peter John Charles Spurgeon Processing of electrical and audio signals

Families Citing this family (25)

Publication number Priority date Publication date Assignee Title
US3828132A (en) * 1970-10-30 1974-08-06 Bell Telephone Labor Inc Speech synthesis by concatenation of formant encoded words
JPS5331323B2 (en) * 1972-11-13 1978-09-01
DE2349626C2 (en) 1973-10-03 1984-06-07 Robert Bosch Gmbh, 7000 Stuttgart Speech audiometer with a sound player
US3982070A (en) * 1974-06-05 1976-09-21 Bell Telephone Laboratories, Incorporated Phase vocoder speech synthesis system
US4214125A (en) * 1977-01-21 1980-07-22 Forrest S. Mozer Method and apparatus for speech synthesizing
US4464784A (en) * 1981-04-30 1984-08-07 Eventide Clockworks, Inc. Pitch changer with glitch minimizer
JPS6017120B2 (en) * 1981-05-29 1985-05-01 松下電器産業株式会社 Phoneme piece-based speech synthesis method
JPS602680B2 (en) * 1981-06-18 1985-01-23 三洋電機株式会社 speech synthesizer
US4601052A (en) * 1981-12-17 1986-07-15 Matsushita Electric Industrial Co., Ltd. Voice analysis composing method
US4618984A (en) * 1983-06-08 1986-10-21 International Business Machines Corporation Adaptive automatic discrete utterance recognition
US4757540A (en) * 1983-10-24 1988-07-12 E-Systems, Inc. Method for audio editing
JPH0754440B2 (en) * 1986-06-09 1995-06-07 日本電気株式会社 Speech analysis / synthesis device
US5216744A (en) * 1991-03-21 1993-06-01 Dictaphone Corporation Time scale modification of speech signals
DE69231266T2 (en) * 1991-08-09 2001-03-15 Koninklijke Philips Electronics N.V., Eindhoven Method and device for manipulating the duration of a physical audio signal and a storage medium containing such a physical audio signal
DE69228211T2 (en) * 1991-08-09 1999-07-08 Koninklijke Philips Electronics N.V., Eindhoven Method and apparatus for handling the level and duration of a physical audio signal
DE4425767C2 (en) * 1994-07-21 1997-05-28 Rainer Dipl Ing Hettrich Process for the reproduction of signals with changed speed
US5920842A (en) * 1994-10-12 1999-07-06 Pixel Instruments Signal synchronization
US6775372B1 (en) 1999-06-02 2004-08-10 Dictaphone Corporation System and method for multi-stage data logging
US6246752B1 (en) * 1999-06-08 2001-06-12 Valerie Bscheider System and method for data recording
US6249570B1 (en) * 1999-06-08 2001-06-19 David A. Glowny System and method for recording and storing telephone call information
US6252947B1 (en) 1999-06-08 2001-06-26 David A. Diamond System and method for data recording and playback
US6252946B1 (en) * 1999-06-08 2001-06-26 David A. Glowny System and method for integrating call record information
US6869644B2 (en) * 2000-10-24 2005-03-22 Ppg Industries Ohio, Inc. Method of making coated articles and coated articles made thereby
FR2907586A1 (en) * 2006-10-20 2008-04-25 France Telecom Digital audio signal e.g. speech signal, synthesizing method for adaptive differential pulse code modulation type decoder, involves correcting samples of repetition period to limit amplitude of signal, and copying samples in replacing block
CN109102821B (en) * 2018-09-10 2021-05-25 思必驰科技股份有限公司 Time delay estimation method, time delay estimation system, storage medium and electronic equipment

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US2921133A (en) * 1958-03-24 1960-01-12 Meguer V Kalfaian Phonetic typewriter of speech
US3133268A (en) * 1959-03-09 1964-05-12 Teleregister Corp Revisable data storage and rapid answer back system
US3158685A (en) * 1961-05-04 1964-11-24 Bell Telephone Labor Inc Synthesis of speech from code signals
US3183303A (en) * 1961-12-21 1965-05-11 Ibm System for voice answer-back from data processor

Also Published As

Publication number Publication date
DE1472004C3 (en) 1975-08-28
DE1472004B2 (en) 1975-01-16
DE1472004A1 (en) 1969-10-09
US3369077A (en) 1968-02-13

Similar Documents

Publication Publication Date Title
GB1068282A (en) Speech waveform modification
GB796677A (en) Improvements in or relating to circuits for the analysis of speech currents
GB1382524A (en) Process and apparatus for the recognition of a predetermined frequency in a mixture of frequencies
GB1525141A (en) Band compression device
GB1276138A (en) Improvements relating to sampling measurements
US3200338A (en) Automatic correction arrangements for periodic integrators
GB1383621A (en) Apparatus for detecting the fundamental frequency of a speech sound
GB1101721A (en) Improvements in or relating to machine recognition of speech
GB1172244A (en) Improvements relating to Voice Operated Apparatus
SU595681A2 (en) Quick-action video pulse shape analyzer
GB1156096A (en) Signal Sampling Circuit.
GB1217610A (en) A process and device for identifying frequencies by logical circuits
US3851158A (en) Method and apparatus for deriving the mean value of the product of a pair of analog quantities
JPS5483703A (en) Audio synthesizer
SU764124A1 (en) Binary code-to-time interval converter
SU917172A1 (en) Digital meter of time intervals
SU926588A1 (en) Ultrasonic velocity meter
GB1059015A (en) Improvements in analysers for acoustic signals
SU1168865A1 (en) Stroboscopic oscillographic recorder of single electric signals
SU928237A1 (en) Device for measuring voltage instantaneous values
SU1350841A2 (en) Device for measuring domination of discrete signals
SU953717A2 (en) Pulse programmable delay device
SU1278969A1 (en) Device for measuring parameters of movement of magnetic tape
SU920540A1 (en) Device for extremum moment determination
SU557359A1 (en) Device for displaying information