CA1202418A - System for synthesizing sounds associated with printed materials

Info

Publication number: CA1202418A
Application number: CA000416715A
Authority: CA
Inventors: Michael Nitefor
Original assignee: Individual
Current assignee: Individual
Priority date: 1982-11-30
Filing date: 1982-11-30
Publication date: 1986-03-25

Abstract

INVENTION: SYSTEM FOR SYNTHESIZING SOUNDS ASSOCIATED WITH PRINTED MATERIALS
INVENTOR: MICHAEL NITEFOR

ABSTRACT OF THE DISCLOSURE

A speaking device adapted for use with display material consisting of visual data (such as printed text) and associated bar code is described.
The speaking device includes an optical wand for sensing the bar code and producing a control signal intended to trigger the reproduction of speech sounds associated with the visual data. A speech synthesizer stores sets of control data for use in audibly reproducing the speech sounds. Speech synthesizing circuitry responds to each control data set by releasing the speech sounds associated with the control data set. A microprocessor is programmed to respond to the control signals generated by the optical wand to select from memory the control data set corresponding to the required speech sounds.

Description

FIELD OF INVENTION

The invention relates generally to sound synthesizing devices, and in particular to the manner in which data is inputted into sound synthesizing devices and in which such devices are adapted to reproduce sounds associated with such data.

BACKGROUND OF THE INVENTION

Electronic speech synthesizers which produce speech with near-human likeness are well known.

In the past, speech synthesizing systems have been proposed which attempt to read printed text and synthesize associated speech. For example, in U.S. patent No. 3,114,980, which issued on December 24, 1963 to John H. Davis, there is proposed a speech synthesizing system which reads letter-press text positioned before a scanning cathode ray tube and focusing lens. Means are provided for moving the printed text relative to the scanning tube and focusing lens to effect a scanning of the lines of the text. Electrical circuitry is then used to process the information obtained by scanning the printed text and to synthesize the associated speech.
The complexity of attempting to move directly from printed work to spoken speech, and the fallibility of such a system, will be readily apparent to those skilled in the art.

A different approach is suggested in U.S. patent No. 4,128,737 to Mark V. Dorais, in which digitally coded input data, pre-processed for example by a computer to contain all information necessary for synthesis of speech, is delivered to a speech synthesizer appropriately programmed to handle such input data. Such a system avoids problems inherent in optically scanning visual display materials and structuring associated speech, but has limited application.

A problem with the prior art devices resides in the manner in which input data is received. In particular, data cannot be inputted in a reliable manner which can be user controlled on an incremental basis. Such devices would consequently not readily lend themselves to use as educational tools, for example in language studies in which a student might wish words or phrases regarding certain visual display materials such as pictures or foreign language text to be enunciated on a repetitive or user-selected basis. It is an object of the present invention to provide a speech synthesizing system and associated device which can overcome such limitations.

BRIEF SUMMARY OF THE INVENTION

The invention provides a sound synthesizing system which involves use of display material consisting of visual data which can be understood or recognized by a user and associated coded data in a "machine-readable" form which identifies words, expressions or, more generally, sounds associated with the visual data. For example, the visual data may consist of lines of foreign language text with alternate lines of machine-readable coded data (preferably bar code) which can be used to trigger or control the audible reproduction by a sound synthesizing device of the foreign language text or even a translation. It is particularly advantageous in such applications to provide a pick-up device such as an optical wand or pen with which the user may initiate the pronunciation of the text or translation thereof on a selective or incremental basis.

Such a sound synthesizing system includes manually operable pick-up means such as the above-mentioned optical wand capable of sensing the coded data and producing a control signal which identifies sounds. The device also includes a sound synthesizer which stores control data sets containing information required for electronic or electromagnetic reproduction of the sounds which are to be associated with the visual data, and sound synthesizing means which can be made to respond to each control data set to reproduce the sounds associated with the control data set. The synthesizer also includes selection means such as a microprocessor which respond to the control signal generated by the pick-up means to select the particular control data sets corresponding to the sounds identified by the control signal and which cause the sound synthesizing means to audibly reproduce the sounds identified by the control signal.

With regard to reproduction of speech associated with visual information, the control signal can identify speech sounds in a variety of ways. For example, individual phonemes can be identified, and the sound synthesizer adapted to store control data sets for reproducing the individual phonemes and to then reproduce speech sounds by concatenating the phonemes. In many cases, however, particularly in educational applications, it may be preferable simply to store control data sets which completely specify the reproduction of a word, phrase or sentence. A correspondence will of course be effected between the coded data of the display material and the control signal which regulates the operation of the speech synthesizer.

BRIEF DESCRIPTION OF THE DRAWINGS

The sound synthesizing system of the invention will be better understood with reference to the drawings in which:

Figs. 1 and 2 illustrate exemplary display materials; and,

Fig. 3 schematically illustrates a speech synthesizing system adapted to pronounce words associated with the display materials.

DESCRIPTION OF PREFERRED EMBODIMENT

Figs. 1 and 2 illustrate typical display materials which can be used in a speech synthesizing system embodying the invention. In Fig. 1, a sheet 10 of materials suited for French language instruction is illustrated (fragmented). The sheet 10 carries a line of French language text 12 (symbolically illustrated) beneath which appears a line of bar code 14 (also symbolically illustrated). To obtain a pronunciation of the French language text 12, a student need only scan the line of bar code 14 with a quick left-to-right movement of an optical pen 16 which forms part of a speaking device 18. The speaking device 18 then pronounces the line of French text 12. An optional mode of instructional operation would involve providing an additional line of bar code 20 immediately below the bar code 14 which can be scanned by the optical pen 16 to cause the speaking device 18 to pronounce the English translation of the French text 12.

Fig. 2 illustrates instructional display material 22 suitable for use in the education of young children. Two objects 24, 26 are illustrated on the display material, and bar codes 28, 30 are used to identify the names of the objects 24, 26. To have the names of the objects 24, 26 pronounced by the speaking device 18, a student simply scans the bar codes 28, 30 individually with the optical pen 16.

The relative functioning and arrangement of the components of the speaking device 18 will be described below; however, the exact nature of each component will not be described in detail, as these components and alternative devices which can be substituted therefor are well known.

The overall operation of the speaking device 18 is specifically tailored to particular display materials. Each of the bar code symbols used represents a unit of speech, which may be a sound, word or phrase associated with the display materials, or more accurately describes the location in an electronic memory of a set of control data necessary for the electronic pronunciation of a unit of speech. For example, the bar code 28 would effectively identify the memory address of control information which is an electronic representation of the pronunciation of the name of the object 24. Consequently, all words, phrases or sounds which are to be associated with the display materials are effectively stored in predetermined locations in the electronic memory. Although this system has a finite vocabulary of words or expressions which can be pronounced, the system is simple, reliable and well suited to language studies. Phoneme stringing devices, on the other hand, although possessing a greater vocabulary potential, generally do not produce speech of adequate quality, particularly for educational purposes. Additionally, measures have been taken in the design of the speaking device 18 to enhance versatility and these will be described below.
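The address-based look-up described above can be pictured with a minimal sketch. It assumes the vocabulary memory can be modelled as a mapping from address to control data set; the addresses, the placeholder byte strings and the name `fetch_control_data` are hypothetical and do not come from the disclosure.

```python
from typing import Dict

# Hypothetical stand-in for the vocabulary memory: address -> control data set.
# The addresses and byte strings below are illustrative placeholders only.
VOCABULARY_MEMORY: Dict[int, bytes] = {
    0x10: b"<control data: name of object 24>",
    0x11: b"<control data: name of object 26>",
}


def fetch_control_data(address: int) -> bytes:
    """Return the control data set stored at the given memory address."""
    try:
        return VOCABULARY_MEMORY[address]
    except KeyError:
        # The vocabulary is finite, so an unknown address is an error.
        raise LookupError(f"no unit of speech stored at address {address:#04x}") from None
```

Because each bar code names a fixed address, the vocabulary is limited to what has been stored, which matches the trade-off noted above between a finite vocabulary and simple, reliable operation.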

The speaking device 18 includes a general purpose microprocessor 32 which is model No. 8085 of Intel Corporation. The microprocessor 32 is programmed with instructions provided in an erasable, programmable, read-only memory (EPROM) 34. The EPROM 34 conditions the microprocessor 32 to effectively recognize units of bar code sensed by the optical pen 16 as identifying a unit of speech to be pronounced and the memory address of an electronic representation of that unit of speech in a vocabulary memory 36. Each unit of bar code representing a unit of speech consequently commences with indicia intelligible to the microprocessor 32 as the starting point in a unit of bar code, followed by a multi-bit bar code segment which encodes the memory address of the unit of speech. The microprocessor 32 deciphers the encoded memory address, and then delivers to a speech synthesizer 38 a control signal which identifies the memory address of the required unit of speech and initiates audible reproduction thereof.
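The decoding step can be illustrated with a short sketch. The disclosure does not specify the start indicia or the width of the address segment, so the three-bit start pattern, the eight-bit address and the most-significant-bit-first ordering below are assumptions chosen purely for illustration, as is the name `decode_unit`.

```python
from typing import Sequence

START_PATTERN = (1, 0, 1)  # assumed start indicia; not specified in the source
ADDRESS_BITS = 8           # assumed width of the address segment


def decode_unit(bits: Sequence[int]) -> int:
    """Decode one scanned unit of bar code into a memory address (MSB first)."""
    start_len = len(START_PATTERN)
    if tuple(bits[:start_len]) != START_PATTERN:
        raise ValueError("unit does not begin with the start pattern")
    segment = bits[start_len:start_len + ADDRESS_BITS]
    if len(segment) != ADDRESS_BITS:
        raise ValueError("address segment is incomplete")
    address = 0
    for bit in segment:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        address = (address << 1) | bit  # shift in the next bit
    return address
```

Under these assumptions, `decode_unit([1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0])` returns 16, which would then be passed to the speech synthesizer as the address of the required unit of speech.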

The speech synthesizer 38 can be a device sold under the trade mark DIGITALKER by National Semiconductor Corporation. Additionally, the vocabulary memory 36 may be a read-only vocabulary device such as model No. DT1050 manufactured by National Semiconductor Corporation and containing the electronic representation of the pronunciation of about 140 items. It will be appreciated that for greater flexibility any suitable speech synthesizer may be substituted together with, for example, a programmable memory unit in which can be stored control data sets for use in reproducing various units of speech.

Upon receiving the control signal from the microprocessor 32, the speech synthesizer 38 obtains from the vocabulary memory 36 the sets of control information relating to the units of speech specified by the control signal, and then generates a speaker drive signal. A filter 40 removes undesirable frequency components (in a known manner), and the filtered signal is received by an amplifier 42 capable of driving a speaker 44.
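The output chain of synthesizer, filter and amplifier can be modelled loosely in software. The sketch below is only an analogy: the disclosure describes hardware components and does not state the filter type, so the moving-average low-pass filter, the gain and the clipping limit are all assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int = 3) -> List[float]:
    """Crude low-pass filter: average each sample with its recent predecessors."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out: List[float] = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def amplify(samples: Sequence[float], gain: float = 2.0, limit: float = 1.0) -> List[float]:
    """Scale the samples and clip them to the range the speaker can accept."""
    return [max(-limit, min(limit, s * gain)) for s in samples]


def drive_speaker(raw: Sequence[float]) -> List[float]:
    """Pass a raw drive signal through the filter stage, then the amplifier stage."""
    return amplify(moving_average(raw))
```

The ordering mirrors the text: the filter stage acts on the synthesizer output first, and only the filtered signal reaches the amplifier stage.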

Other modes of operation besides memory look-up of complex units of speech are contemplated. For example, the bar code can be used to identify individual phonemes, and phoneme stringing can be provided. In such circumstances, the vocabulary memory would contain control information sets which relate to the production of individual phonemes. Such a method of synthesizing operation is well known in the art, and its adaptation for use in speech synthesizing systems and speaking devices of the present invention will be readily apparent to those skilled in the art.
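Phoneme stringing amounts to looking up one control information set per phoneme and concatenating the results in order. The following sketch assumes a small table keyed by phoneme code; the codes, the placeholder byte strings and the name `string_phonemes` are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical phoneme table: code -> control information for one phoneme.
PHONEME_MEMORY: Dict[int, bytes] = {
    1: b"[k]",
    2: b"[ae]",
    3: b"[t]",
}


def string_phonemes(codes: Iterable[int]) -> bytes:
    """Concatenate the control information of each phoneme code, in order."""
    return b"".join(PHONEME_MEMORY[code] for code in codes)
```

Here `string_phonemes([1, 2, 3])` yields `b"[k][ae][t]"`. Compared with whole-word look-up, the table stays small while the bar code must carry a sequence of codes rather than a single address.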

It will be appreciated that particular embodiments of a speech synthesizing system and speaking device have been disclosed, and that modifications may be made therein without departing from the spirit of the invention and scope of the appended claims. In particular, the invention also contemplates systems adapted to produce sounds not related to human speech, for example, where visual display materials are to instruct or identify for a user sounds produced by animals, musical instruments, vehicles, machinery, etc.

Additionally, hard-wired logic circuitry can be used instead of a programmable microprocessor to relate control signals to memory locations of control data sets for sound reproduction, although versatility of the overall system may be somewhat reduced.

Claims

THE EMBODIMENTS OF AN INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:

1. A learning system permitting a user to selectively reproduce audio information associated with visual learning materials, comprising:
learning material including visual data intelligible to a person associated with machine-readable digitally coded data in predetermined units, each unit of digitally coded data corresponding to preselected sounds associated with a preselected portion of the visual data;
manually operable pick-up means for sensing the digitally coded data on a unit basis and producing from each sensed unit of digitally coded data a control signal identifying the preselected sounds associated with a particular preselected portion of the visual data; and, a sound synthesizer including (a) storage means for storing control data sets representing the preselected sounds associated with the visual data, (b) sound synthesis means which respond to each control data set to audibly reproduce the preselected sounds represented by the control data set, (c) selection means responsive to the control signal for selecting the control data sets representing the preselected sounds identified by the control signal, the selection means controlling the sound synthesis means to audibly reproduce the preselected sounds identified by the control signal.

2. A learning system as claimed in claim 1 in which the pick-up means comprise an optical wand.

3. A learning system as claimed in claim 2 in which the pick-up means are adapted to sense bar code.

4. A learning system as claimed in claim 1 in which:
the visual data includes printed words;
the digitally coded data identify speech sounds associated with the printed words; and, the sound synthesizer is adapted to reproduce sounds associated with human speech;
whereby, speech associated with the printed words can be audibly reproduced.

5. A learning system as claimed in claim 1 in which:
the visual data includes words printed in a first language;
the digitally coded data identify speech sounds associated with a translation of the printed words into a second language; and, the sound synthesizer is adapted to reproduce sounds associated with human speech;
whereby, speech representing a translation of the printed words into the second language can be audibly reproduced.

6. A learning device for use with learning material including visual data intelligible to a person associated with machine-readable digitally coded data in predetermined units, each unit of digitally coded data corresponding to preselected sounds associated with a preselected portion of the visual data, comprising:
manually operable pick-up means for sensing the digitally coded data on a unit basis and producing from each sensed unit of digitally coded data a control signal identifying the preselected sounds associated with a particular preselected portion of the visual data; and, a sound synthesizer including (a) storage means for storing control data sets representing the preselected sounds associated with the visual data, (b) sound synthesis means which respond to each control data set to audibly reproduce the preselected sounds represented by the control data set, (c) selection means responsive to the control signal for selecting the control data sets representing the preselected sounds identified by the control signal, the selection means controlling the sound synthesis means to audibly reproduce the preselected sounds identified by the control signal;
whereby, a user can selectively reproduce preselected audio information associated with visual data.

7. A learning device as claimed in claim 6 in which the pick-up means comprise an optical wand.

8. A learning device as claimed in claim 7 in which the pick-up means are adapted to sense bar code.

9. A learning device as claimed in claim 6 adapted for use with learning materials including visual data comprising printed words and associated digitally coded data, in which:
the pick-up means are adapted to produce a control signal identifying speech sounds associated with the printed words; and, the sound synthesizer is adapted to audibly reproduce sounds associated with human speech;
whereby, speech associated with the printed words can be audibly reproduced.

10. A learning device as claimed in claim 6 adapted for use with learning materials including visual data comprising words printed in a first language and associated digitally coded data, in which:
the pick-up means are adapted to produce a control signal identifying speech sounds associated with a translation of the printed words into a second language; and, the sound synthesizer is adapted to audibly reproduce sound associated with human speech in the second language;
whereby, speech representing a translation of the printed words into the second language can be audibly reproduced.

11. Learning material adapted to permit a user to reproduce on a selective and repetitive basis predetermined audio information associated with the learning material, comprising:
visual data intelligible to a person; and, machine-readable digitally coded data in predetermined units, each unit of digitally coded data corresponding to preselected sounds associated with a preselected portion of the visual data;
whereby, manually operable pick-up means can be used to sense the digitally coded data on a unit basis, and to produce from each sensed unit of digitally coded data a control signal identifying the preselected sounds associated with a particular preselected portion of the visual data and controlling audible reproduction of the preselected sounds by a sound synthesizer.