CA2251502A1 - Digital speech acquisition, transmission, storage and search system and method

Info

Publication number: CA2251502A1
Application number: CA002251502A
Authority: CA
Inventors: Marc Lutz; Mark Vange
Original assignee: LANCASTER EQUITIES Ltd
Current assignee: LANCASTER EQUITIES Ltd
Priority date: 1998-10-26
Filing date: 1998-10-26
Publication date: 2000-04-26
Also published as: EP1103954A1; AU6176999A; JP2001154686A

Abstract

A digital speech system processes acquired speech to determine speech element information, such as the phonemes in the speech, and the associated prosody information. This determined information is then encoded for transmission and/or storage.
The encoded information is decoded to recover the speech element information and the prosody information which can be provided to a speech generator to construct a facsimile of the original speech. Also, the speech element and prosody information can be searched to locate words or phrases of interest in the speech.
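
The encode and decode steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the source does not specify a record layout, so the `SpeechElement` fields, the 18-byte binary record, the ARPAbet-style phoneme labels and the `encode`/`decode` function names are all hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical record layout (not taken from the source):
# 4-byte phoneme label, uint32 start (ms), uint16 duration (ms),
# float32 pitch (Hz), float32 energy. Little-endian, no padding: 18 bytes.
_RECORD = struct.Struct("<4sIHff")
_HEADER = struct.Struct("<I")  # number of records that follow


@dataclass(frozen=True)
class SpeechElement:
    phoneme: str       # speech element label, e.g. "HH" (ARPAbet-style, assumed)
    start_ms: int      # timing information: onset in milliseconds
    duration_ms: int   # timing information: length in milliseconds
    pitch_hz: float    # prosody: fundamental frequency
    energy: float      # prosody: relative loudness


def encode(elements: List[SpeechElement]) -> bytes:
    """Pack speech element, prosody and timing information into bytes."""
    out = bytearray(_HEADER.pack(len(elements)))
    for e in elements:
        label = e.phoneme.encode("ascii")
        if len(label) > 4:
            raise ValueError(f"phoneme label too long: {e.phoneme!r}")
        out += _RECORD.pack(label, e.start_ms, e.duration_ms, e.pitch_hz, e.energy)
    return bytes(out)


def decode(data: bytes) -> List[SpeechElement]:
    """Recover the speech element, prosody and timing information."""
    (count,) = _HEADER.unpack_from(data, 0)
    elements = []
    offset = _HEADER.size
    for _ in range(count):
        label, start, duration, pitch, energy = _RECORD.unpack_from(data, offset)
        # struct pads short labels with NUL bytes; strip them on the way out.
        phoneme = label.rstrip(b"\x00").decode("ascii")
        elements.append(SpeechElement(phoneme, start, duration, pitch, energy))
        offset += _RECORD.size
    return elements
```

Under this assumed layout each element occupies 18 bytes, so a two-element utterance encodes to 40 bytes including the 4-byte count header. Pitch and energy are stored as 32-bit floats, so values that are not exactly representable in that width come back slightly rounded.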

Claims

We claim:

1. A digital speech system, comprising:
speech element determination means operable on speech in a digital format to determine speech element information from said speech;
speech prosody determination means operable on said speech to determine prosody information from said speech;
encoding means to encode speech information comprising said determined speech element information, said determined speech prosody information and timing information relating thereto in a digital form;
decoding means to decode said encoded speech information to obtain said determined speech element information, said determined speech prosody information and said timing information;
comparison means to compare said determined speech element information and said determined speech prosody information to a database of speech elements to select speech sound elements which correspond thereto; and
speech generating means to assemble said selected speech sound elements to construct a facsimile of said speech.

2. The digital speech system of claim 1 further comprising:
speech acquisition means to acquire an analog electronic representation of said speech; and
digitization means to convert said analog representation of said speech to said digital format.

3. The digital speech system of claim 1 further comprising:
an output means to produce an output of said facsimile in a manner audible to a user.

4. The digital speech system of claim 1 wherein said comparison means further comprises recognition means to identify undesired speech characteristics, said selection of speech sound elements being performed to reduce the presence of said identified undesired speech characteristics in said facsimile.

5. The digital speech system of claim 1 further comprising search means operable to receive an input representing a word or phrase of interest and to examine said determined speech element information to locate occurrences of said word or phrase therein.

6. A method of acquiring and constructing digital speech, comprising the steps of:
(i) examining digitized speech to determine speech element information relating to said speech;
(ii) examining said digitized speech to determine prosody information relating to said speech;
(iii) encoding speech information corresponding to said determined speech element information, said determined prosody information and timing information relating thereto;
(iv) receiving and decoding said encoded speech information to obtain said timing information, said determined speech element information and said determined prosody information;
(v) comparing said decoded speech element information and prosody information to a database to select corresponding speech sound elements; and
(vi) assembling said selected speech sound elements to construct a facsimile of said speech.

7. The method of claim 6 further comprising the steps of:
acquiring an electronic representation of speech in an analog form; and
digitizing said analog representation of speech to obtain said digitized speech for step (i).

8. The method of claim 6 wherein step (v) further comprises comparing said decoded speech element information to a predefined database of undesired speech characteristics and selecting speech sound elements which reduce said undesired speech characteristics in said facsimile.

9. The method of claim 6 further comprising the step of receiving from a user a word or phrase of interest and searching said determined speech element information and said determined prosody information to locate occurrences of said received word or phrase and to identify said locations to said user.
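
The search recited in claims 5 and 9 can be illustrated with a minimal sketch, under stated assumptions. The source does not say how a word or phrase of interest is mapped to speech elements, so the pronunciation lexicon `LEXICON`, the ARPAbet-style labels, the `(phoneme, start_ms)` stream format and the function names below are all hypothetical. The sketch matches on speech element information only and does not use prosody.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical pronunciation lexicon (ARPAbet-style labels, assumed).
LEXICON: Dict[str, List[str]] = {
    "hi": ["HH", "AY"],
    "there": ["DH", "EH", "R"],
}


def phrase_to_phonemes(phrase: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Expand a word or phrase into its phoneme sequence via the lexicon."""
    phonemes: List[str] = []
    for word in phrase.lower().split():
        phonemes.extend(lexicon[word])  # raises KeyError for unknown words
    return phonemes


def find_phrase(
    stream: Sequence[Tuple[str, int]],
    phrase: str,
    lexicon: Dict[str, List[str]] = LEXICON,
) -> List[int]:
    """Return the start time (ms) of every occurrence of `phrase`.

    `stream` is the decoded speech element information as
    (phoneme, start_ms) pairs in time order.
    """
    target = phrase_to_phonemes(phrase, lexicon)
    if not target:
        return []
    labels = [label for label, _ in stream]
    hits = []
    # Slide a window of the target length over the phoneme labels.
    for i in range(len(labels) - len(target) + 1):
        if labels[i:i + len(target)] == target:
            hits.append(stream[i][1])
    return hits
```

Because the match is made on phoneme labels rather than on audio samples, the search in this sketch runs over the decoded information directly and no facsimile has to be constructed first. An exact-match window is the simplest choice; a tolerant match that allows for misrecognized phonemes would be a natural extension but is not shown here.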