WO2000067249A1 - System for storing, distributing and coordinating displayed text of books with voice synthesis - Google Patents
- Publication number
- WO2000067249A1 (application PCT/US2000/011182)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- computer
- words
- book
- text
- word
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
Definitions
- This invention relates to a system and a method for the storage of a multiplicity of books (or other text based documents) in a repository (e.g., a computer database), which allows a user to select one of such books and to produce a spoken version of such book.
- This selected book will then be stored on media (e.g., a CD disk or other suitable computer readable media) so that a text-to-speech version of the book may be stored and be re-played at will by an
- PCs Personal Computers
- CD compact disk
- Compact disk recorders (CDR) are now capable of 4X recording speeds and CD readers are now capable of 24X and higher speeds.
- A typical 20 Mbyte file can be recorded in less than 36 seconds.
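The recording time quoted above can be checked with simple arithmetic. The sketch below assumes the conventional 1X CD data rate of 150 Kbytes per second, a figure that is not stated in the text; the function name is illustrative only.

```python
# Assumption (not from the text): a 1X CD drive moves 150 Kbytes per second.
BASE_RATE_KBYTES_PER_S = 150


def recording_seconds(file_mbytes: float, speed_multiple: int) -> float:
    """Estimate the time to write a file at an NX recording speed."""
    file_kbytes = file_mbytes * 1024
    return file_kbytes / (BASE_RATE_KBYTES_PER_S * speed_multiple)


seconds = recording_seconds(20, 4)
print(f"20 Mbytes at 4X: about {seconds:.0f} seconds")
```

Under that assumption the estimate stays below the 36 second figure given for a 20 Mbyte file.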
- Data transmission rates of 20 Mbytes per second are now achievable and cost effective.
- A book of the present invention can be compressed to 1 Kbyte per page.
- the system of the present invention converts text-based documents (e.g., books or other such documents) from text to speech.
- the system of the present invention stores a plurality of documents in a central repository, for example, in a central computer data base or the like, and then distributes a selected one of such stored documents to an end user.
- the system of this invention allows the end user to selectively play back such selected document (or a portion thereof) in a spoken format.
- this system includes a program for converting such text-based documents into a digital format, means for storing a plurality of such documents in a digital format in the above-noted central repository, means for selecting one of such stored documents stored in such repository, means for recording or storing such selected document onto a medium which allows such document to be played back on computer means, means for encrypting the selected document as it is recorded onto such medium so as to prevent copying of such document, and means for selectively playing back the stored document, the play back means including means for deciphering the encrypted document stored on the medium, and means for converting such stored document to speech upon play back of the document.
- the system of this invention is intended for use with a computer so as to play back a text file in a spoken format.
- the system comprises a medium (e.g., a CDROM or the hard drive of the computer) on which at least part of a book or other text file is recorded in a format readable by the computer.
- a program is provided for reading the words of the text file which are to be played back.
- the program has access to a plurality of digitized words that have been spoken by a person. Generally, there is a digitized spoken word for each word in the text file.
- the computer has a sound system for audibly playing back the digitized word files where the computer reads each word in a text file and plays back the digitized words corresponding to the word in the text file over the sound system so that the end user hears the spoken words in the order they appear in the text file.
- the method of the present invention involves storing a multiplicity of books or other text-based documents in a digital format in a central repository in a digital computer.
- An end user selects one of the multiplicity of such books stored in the computer. Then, the selected book is downloaded and stored on suitable playback media. As the one book is recorded on the media, it is protected against copying.
- the end user selectively plays back the one book recorded on the media on a suitable play back computer.
- the play back computer reads an encryption key and converts the text of the selected book from text to speech.
- Fig. 1 is a block diagram illustrating the steps of storing a plurality of text based documents (e.g., books) in a central computer repository;
- Fig. 2 is a block diagram illustrating the steps of managing the documents stored in the central computer repository;
- Fig. 3 is a block diagram of the main steps or components of a point of sale terminal (which could include an end user's computer) which may be connected via a wide area network (e.g., the Internet) to the central book (or other text based document) repository shown in Fig. 2 and which may be used to browse through the documents stored in the central repository, to order a selected one of such documents, to download this one selected document (or a portion thereof) from the central computer repository, and to record such selected document on suitable playback media (e.g., on a CDROM or on the hard drive of the end user's computer);
- Fig. 4 is a block diagram illustrating the main components or steps in playing back the selected one document recorded on such playback media in a spoken mode;
- Fig. 5 is a flow diagram depicting the steps for creating a master dictionary and for updating (adding words to) this master dictionary;
- Fig. 5A is a flow diagram of a program for preparing an optimal database for the digitized word files for use with the present invention;
- Fig. 6 is a flow diagram for checking the master dictionary to ensure that sound files exist in the master dictionary corresponding to all of the words in a book or other text based document to be played over the system and method of this invention;
- Figs. 7 and 7A are flow diagrams for a book reader program for use with the present invention;
- Fig. 8 is a flow diagram in accordance with this invention for the preparation of a minimized book database;
- Fig. 9 is a flow diagram for a minimized version of a book reader program for use with the present invention;
- Fig. 10 is another flow diagram for preparing a book to be entered in the data base;
- Figs. 11 and 11A show another flow diagram illustrating the steps in accordance with this invention for reading a book.
- This invention relates to a system and a method for allowing a text based document, such as a book, stored in a digital computer, to be stored on a central data base and distributed to an end user, either on computer storage media, such as a CDROM disk, or over a wide area network, such as the Internet, so that the text based document may be stored within an end user's computer (e.g., on suitable portable media, such as a CD ROM disk or a floppy disk, or on the hard drive of the end user computer), and then played back on the sound system of the end user computer so that the user may hear the words of the text in a spoken format.
- This invention involves the creation of a database in a central computer (also referred to as the central book repository, as shown in Fig. 2) of optically read (scanned) books (which preferably are converted into text by, for example, performing an optical character reading (OCR) routine on the scanned book or document) or other computer readable text-based documents.
- the text of the document is stored as a simple ASCII text file (or as any other text file, such as a conventional word processing format, as a PDF file, or in a vector based format such as an Adobe PostScript® file) in the computer data base.
- the text may optionally be encrypted (as will be hereinafter described) to prevent piracy.
- the data representing a book may be compressed (as will be hereinafter described in detail) to an average of 1 Kbyte per page. An "average" book (or other document) of 250 pages in length thus requires 250 Kbytes of storage. Due to the low cost of computer disk storage, the available database of books may be stored at a single site and/or partially distributed to the points of sale. A small PC can now easily contain 28 Gbytes of storage. This would translate to 112,000 books stored in compressed format. Local storage could provide caching of often requested books, thus reducing the demand on the central repository.
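The capacity figure follows directly from the per-page and per-book sizes given above. A minimal sketch of that arithmetic, assuming decimal units (1 Gbyte taken as 1,000,000 Kbytes) and a hypothetical helper name:

```python
KBYTES_PER_PAGE = 1      # average compressed size per page, from the text
PAGES_PER_BOOK = 250     # length of an "average" book, from the text


def books_that_fit(storage_gbytes: int) -> int:
    """Number of average compressed books that fit in the given storage."""
    # Assumption: decimal units, so 1 Gbyte = 1,000,000 Kbytes.
    storage_kbytes = storage_gbytes * 1_000_000
    return storage_kbytes // (KBYTES_PER_PAGE * PAGES_PER_BOOK)


print(books_that_fit(28))
```

With 28 Gbytes of storage this gives the 112,000 books stated in the text.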
- the central book repository used for the central repository database may be a personal computer having a Pentium processor manufactured by Intel, preferably having a speed of 200 MHz or more and a random access memory (RAM) of 64 Mbytes or more.
- Such a computer will have adequate speed to service requests coming from multiple other computers requesting copies of books stored on the central computer.
- the central computer would preferably have a relatively large hard disk storage capability, such as about 7 - 20 Gigabytes, having a SCSI-WIDE interface thus allowing fast access to the data.
- SCSI interfaces allow "daisy chaining" of numerous disk units such that with current technology, perhaps as much as 70 Gigabytes of storage may be provided on a PC, which with current data compression technology, would allow approximately 250,000 books (at about 250 pages/book) to be stored in the central book repository.
- the central computer is provided with an Internet interface, such as a BayStack Instant Internet interface, and would preferably have at least a T1 connection to allow data transmission at the rate of about 1 Mbit/second or more. It will be understood that the central book repository computer would preferably be part of a wide area network which could include the Internet. In this manner, the books or documents stored in the data base would be remotely accessible by a customer or other authorized person or entity.
- the process of audio book distribution involves creating a central repository, as shown in Fig. 2, in which a multiplicity of books or other text based documents are stored on the central computer which, as noted above, may contain compressed text of a plurality (multiplicity) of books (or other text-based documents) that have been entered into the repository either by scanning and optical character recognition or by electronic transmission from book publishers where these electronic files (in a suitable format) may now exist.
- If the book contains graphics (e.g., photographs or charts), a written description of the graphics in the book or document may be added to enhance the understandability of the book or document as it is audibly played back by the user of the system and method of the present invention.
- the manner in which such book data is stored could be any one of a number of suitable data storage methods.
- the book data can be stored on a hard drive (as described above), on CD disks in a juke box player, or any other suitable method.
- the central book repository computer may be located remotely from a point of sale (POS) system or end user computer, as shown in Fig. 3, and connected to it by the above-mentioned wide area network or by the Internet.
- a point of sale system includes a customer (or end user) computer from which an end user may review the books, or may review sales information concerning the books in the data base.
- sales information may include such information as book reviews, summaries, abstracts, key words, card catalog, cross reference to other books by the same author or on the same subject, or other information stored on the central repository and associated with the books.
- This sales information may be used by the end user (e.g., a potential customer connected to the central book repository computer by means of the Internet) or by a clerk in a store or an authorized person at a school or order processing center to select a book or other document from the data base. Then the customer or the clerk may order a copy of any book or other document stored on the central repository.
- the customer's computer may be located remotely from the central computer C and may be connected to the central repository computer by means of a satellite link, a connection to the Internet, or by a direct link, as part of a wide area network or a local area network.
- book ordering and selecting information may be available both as words that appear on the screen and may be printed or they may be played back over the sound system of the customer or end user's computer by means of a text to speech program.
- When the customer on the Internet selects a book to be ordered, the customer transmits transaction data, such as billing information (e.g., credit card information) along with shipping instructions.
- the text of the selected book (or other document) may be copied onto a recordable compact disk, typically referred to as a CDR (or on other suitable media) so the selected book may be shipped to the customer.
- the selected book may be downloaded to the customer's computer over the Internet or other network such that the use of the above-described media is not required.
- the CDR contains the text of the selected book, which may be in an encrypted format. An encryption key is recorded on a copy protected diskette, which is shipped to the customer along with the CDR by mail or by other suitable delivery service.
- distributing a book to a customer or an end user would encompass recording the selected book or document on computer media, such as on a CD ROM disk or on a floppy disk or the like, and physically shipping or giving the media to the end user, or allowing the end user's computer to connect to an electronic computer file over a wide area network, such as the Internet, and allowing the selected text file to be downloaded onto storage media in the end user's computer (i.e., on the hard drive or onto a suitable floppy or the like).
- the term “recording the book or document on suitable media” or words to that effect would include both recording the text file on a CD ROM or floppy disk or recording it on the hard drive of another computer.
- the text stored on the central computer may be encrypted to prevent unauthorized copying of copyrighted materials.
- a variety of encryption techniques may be used to encrypt the data stored on computer C.
- the encryption technique shown below in C language is preferred because it offers reasonable copy protection, but operates quickly so as to not substantially slow down the transfer of data.
- An encryption program may be as follows:
- the text of the selected book stored in the data base may be delivered electronically to the customer or end user via the wide area network or via the Internet. Such electronic delivery would make any of the books or documents stored in the data base virtually instantly available from anywhere in the world.
- the central repository also contains software and hardware to accept orders, transmit orders either by Internet, direct telecommunications, surface mail or satellite link, and to provide billing information both to the purchaser of the book selected and to the publisher of the book.
- the central repository has the capability of transmitting data corresponding to a selected point of sale or customer computer where the above-noted CDR and the encryption key diskette (described above) may be recorded.
- an operating system such as Microsoft's Windows 95 or Windows NT.
- an Internet server software system (ISS) is provided.
- One ISS system that has been found to work well is Webstar commercially available from QuarterDeck®, or NetServe® available from NetScape®.
- Custom software to interface the ISS with the tasks of fetching and transmitting requested files, as well as maintaining transaction history for billing and publisher notification, is included as part of a CGI engine program. This last-noted program is launched by the ISS or communicated with by the ISS through Dynamic Data Exchange (DDE).
- When a request comes to the central computer from a point of sale or customer computer via the Internet, the ISS notifies the CGI engine that a request has been made and informs the CGI engine of the name of the file it has created that contains the information about the request and the name of the file that will contain the requested data.
- When the CGI engine is finished, it notifies the ISS that it is done and the ISS transmits the data back to the requesting point of sale or customer computer by an appropriate data transmission link, as shown in Fig. 2.
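The hand-off between the ISS and the CGI engine is essentially an exchange of two file names: one file describing the request and one file that will receive the requested data. The sketch below illustrates that pattern only. The JSON request format, the file names, the sample repository, and the use of a direct function call in place of DDE notification are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the central book repository.
REPOSITORY = {"sample-book": b"compressed book data"}


def cgi_engine(request_path: Path, response_path: Path) -> None:
    """Read the request file, fetch the book, and fill the response file."""
    request = json.loads(request_path.read_text())
    data = REPOSITORY.get(request["title"], b"")
    response_path.write_bytes(data)


def server_handle(title: str, workdir: Path) -> bytes:
    """Play the role of the ISS for a single incoming request."""
    request_path = workdir / "request.json"    # describes the request
    response_path = workdir / "response.bin"   # will hold the requested data
    request_path.write_text(json.dumps({"title": title}))
    # A plain call stands in for the DDE notification; returning means "done".
    cgi_engine(request_path, response_path)
    return response_path.read_bytes()          # data sent back to the requester


with tempfile.TemporaryDirectory() as tmp:
    payload = server_handle("sample-book", Path(tmp))
```

Keeping the two sides coupled only through file names means the engine needs no knowledge of the network connection that carried the request.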
- one or more point of sale terminals or customer computers are shown to be located remotely from the central repository computer and are interconnected to the central computer by a suitable network or communication link.
- Such point of sale terminals or customer computers may be connected directly to the central computer such that upon an order being received from a remote location via the Internet, the selected book is transmitted to the point of sale computer and the CDR with the book data thereon and the appropriate encryption key may be recorded.
- the point of sale terminal or customer computer includes a suitable personal computer (PC) terminal which allows the user to browse or to otherwise interrogate the central repository system for books (or other documents) available from the central repository, along with ordering information, such as the cost of ordering such a book.
- the computer used in the point of sale terminal should have about 7 Gigabytes of disk space and a moderate memory (e.g., 32 Mbytes of RAM). Any current Pentium®-based PC will be sufficient.
- such computers have a monitor, a keyboard, and a sound system including a suitable sound card for playing digital sound recordings over speakers connected to the computer.
- the point of sale terminal or a similar computer located in conjunction with the central repository would preferably have both a floppy disk drive and a compact disk recorder/reader (CDR), such as are readily commercially available from any number of companies, for copying the book data (which may be in an encrypted format) transmitted from the central repository computer upon receipt of an order for a selected book.
- CDR disk containing the book or document will have sufficient space such that the suitable text-to-speech conversion program of the present invention (as will be hereinafter described) may also be transmitted with and stored on the disk.
- the end user or point of sale computer have a modem for connection to the Internet.
- point of sale computer located in a kiosk or located at a school or other "public " location
- point of sale location also have a label printer such that labels for the floppy disk and the CDR disk for the book selected can be printed with information to identify the book recorded on the disks.
- the point of sale or customer computer is preferably provided with software to enable it to run and to connect to the central computer and to allow an end user or clerk to order a book, to transmit the book from the central computer to the remote or customer computer, and to record the book on the CDR.
- software would include a basic operating system, such as Microsoft's Windows 95, and a suitable Internet browser.
- custom software may be provided that will run on such operating system. These functions include:
- a query function that will allow a clerk (via an appropriate communication link) or end user (via the Internet) to query or browse the local database of titles.
- This database may include the above-described sales information relating to the books stored on the central computer. Such sales information may include a synopsis of the book and sufficient information such that the end user may search the database for books that may relate to certain topics or categories so as to aid the end user in finding books that may be of interest. Included in this database will be searchable fields of titles, author names, topics, and the like to aid the clerk/customer in the selection of a book to be ordered. If the full text of the books is stored in a searchable format, full text searches of the books in the repository may be conducted, as well. Of course, this end user or browser database is updated as new books are added to the central repository.
- An order function that will transmit the text of a selected book from the central repository computer to a point of sale or end user computer in response to an order being placed for the selected book.
- a receiving function that accepts the data transmitted from the central computer.
- a copying function that takes the transmitted book data and copies this data onto a blank recordable CDR.
- An optional encryption key diskette is also created, preferably on a 3-1/2" floppy diskette with one or more copy protected sectors, on which an encryption key transmitted from the central computer is placed.
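Of the functions listed above, the query function is the simplest to illustrate. The following is a minimal sketch of a search over the title, author, topic, and synopsis fields of a local title database. The `CatalogEntry` type, the `query` function, and the sample entries are hypothetical and are not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    title: str
    author: str
    topics: tuple
    synopsis: str = ""


def query(catalog, term):
    """Return entries whose title, author, topics, or synopsis contain term."""
    needle = term.lower()
    hits = []
    for entry in catalog:
        searchable = [entry.title, entry.author, entry.synopsis, *entry.topics]
        if any(needle in text.lower() for text in searchable):
            hits.append(entry)
    return hits


# Illustrative sample data only.
catalog = [
    CatalogEntry("Moby-Dick", "Herman Melville", ("whaling", "sea")),
    CatalogEntry("Walden", "Henry David Thoreau", ("nature",), "Life in the woods."),
]
matches = query(catalog, "melville")
```

A production catalog would more likely use indexed database fields, but the matching logic would be the same in outline.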
- the end user may readily replay the book in spoken form.
- In order to replay the book in spoken form, an end user must have a suitable personal computer that can read the floppy diskette and the CDR (or other media upon which the book has been recorded) containing the book data and, preferably, a suitable text-to-speech conversion program, preferably the text to speech program as hereinafter described which constitutes a part of the present invention.
- the text-to-speech program may be resident on the end user's PC.
- the end user's PC must have a suitable sound card and speakers.
- the resulting book page is then preferably stored on the hard disk drive of the central computer in an ASCII format.
- the book pages are proofed and a written description of any graphics appearing on a page may be added to aid the listener of the book upon playback in its spoken version.
- an electronic file in a word processing or in a vector based file such as a PostScript or other widely known computer file format
- the book data may optionally be encrypted, using, for example, the above described encryption program.
- the text file containing the book is compressed and stored in the central repository computer data base.
- a typical transaction would comprise a clerk at a point of sale terminal connected to the central computer via an appropriate communications link, or an end user at a remote computer connected directly to the central computer via the Internet.
- At a point of sale terminal, it is anticipated that a sighted person would use the terminal.
- For an end user terminal, it is anticipated that it may be equipped for use by sight impaired persons.
- the end user or a clerk could browse the books on the central repository to locate books of interest and to select one or more books to be ordered. In order to initiate the ordering process, the end user or clerk enters a financial portion of the transaction in a known manner.
- the point of sale or end user computer would transmit the financial data (e.g., credit card information) via the Internet to the central computer along with a positive identification code for the point of sale or end user terminal.
- The central computer would, upon receiving an order request, first verify that the requesting point of sale or end user terminal was in good standing, and would then fetch the requested book data from the central repository and transmit this book data over a suitable link back to the point of sale or end user terminal.
- The central computer would also record the transaction for billing purposes and for notifying the publisher of the selected book that an order for that book has been filled.
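The order handling described above has three steps: verify the requesting terminal, fetch the book data, and record the transaction. A minimal sketch of that sequence follows. The class name, field names, and sample values are hypothetical, and real billing and publisher notification are reduced here to appending a record to a list.

```python
from dataclasses import dataclass, field


@dataclass
class CentralComputer:
    books: dict                # title -> stored book data (bytes)
    good_standing: set         # identification codes of approved terminals
    transactions: list = field(default_factory=list)

    def handle_order(self, terminal_id: str, title: str, billing: str) -> bytes:
        # Step 1: verify that the requesting terminal is in good standing.
        if terminal_id not in self.good_standing:
            raise PermissionError(f"terminal {terminal_id!r} is not in good standing")
        # Step 2: fetch the requested book data from the repository.
        data = self.books[title]
        # Step 3: record the transaction for billing and publisher notification.
        self.transactions.append(
            {"terminal": terminal_id, "title": title, "billing": billing}
        )
        return data


central = CentralComputer(
    books={"Walden": b"compressed book data"},
    good_standing={"POS-001"},
)
book_data = central.handle_order("POS-001", "Walden", "card-on-file")
```

An order from a terminal that is not in the approved set raises an error before any book data is fetched or any transaction is recorded.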
- the point of sale terminal or end user terminal receiving the book data would then record the book file along with a "playing" program (i.e., a suitable text-to-speech (TTS) program) onto a compact disk (CD) using a CDR recorder.
- a copy protected floppy disk may be recorded on which the encryption key is included.
- the order be filled at the central computer (or some other location under the control of the system owner) so that unauthorized copying of the books can be controlled.
- the CDR and the encryption key (if the file is encrypted) on the diskette can then be mailed or shipped by express courier to the customer.
- Upon a customer or end user receiving the CDR on which the book data and the playing program are recorded and the floppy including the encryption key for the CDR, the customer inserts the CDR and the floppy in the corresponding drives of a personal computer, as above-described. Since the operating system (e.g., Windows 95 or the like) autosenses the presence of a compact disk in the CD drive of the computer, an autorun program is initiated which launches the playing (TTS) program contained on the CDR. The TTS program would then search for a diskette in the floppy drive of the computer for an encryption key. Of course, if such an encryption key is not found, the book data on the CD cannot be accessed and thus cannot be copied or played.
- the program would load, decompress the book data, decrypt, and convert the text of the book to speech which is audibly played back (spoken) via the sound card and speakers of the end user's PC.
- TTS programs are commercially available, such as The Open Book program.
- Other such TTS software such as is available from Ref Software-Quelle technik GmbH, may be utilized.
- .WAV files corresponding to each word in the text file could be issued to the computer's sound card. Of course, each page of the book could be played sequentially.
- the computer may keep track of the current page being read (spoken) on the writable key disk for future startup.
- bookmarks aid in returning to where the end user left off in a previous session.
- the end user has random access to any page of the book, and is able to flag selected pages and to search the pages of the book for key words or the like.
- the software for the end product performs the following functions.
- the program is loaded into computer memory and given control (i.e., load and execute).
- Upon initialization, the program would look for any "last opened" bookmarks (as will be hereinafter described) indicating that the user had stopped "reading" the book prior to finishing.
- the "pages" of the book are stored as individual files.
- a bookmark is the file number of the page as well as the offset into that file that represented the last word "read".
- the page is expanded by reading the encryption key that was placed in the copy-proof segment of the disk/CD, and then decrypted into ASCII text (or other suitable format) in memory.
- the program would then parse and convert the text word by word into speech by either current conventional text-to-speech means or by the process described below.
- a file used as the aforementioned "bookmark" is updated as to the current file and offset. Subsequent pages are "opened" by the above process, clearing memory of the preceding page and loading and decrypting the data for the next, until the last of the sequentially numbered files is "read". Upon reaching the end of the book, the bookmark is closed.
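A bookmark as described above is just a pair of numbers, the page file number and the offset of the last word read, kept in a small file. The sketch below shows one way such a file might be saved and reloaded. The JSON format, the file name, and the default of page 1, offset 0 when no bookmark exists are assumptions, not details from the text.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Bookmark:
    page_file: int   # number of the page file being read
    offset: int      # offset into that file of the last word read


def save_bookmark(path: Path, bookmark: Bookmark) -> None:
    """Record the current page file and offset."""
    path.write_text(json.dumps(asdict(bookmark)))


def load_bookmark(path: Path) -> Bookmark:
    """Return the saved position, or the start of the book if none exists."""
    if not path.exists():
        return Bookmark(page_file=1, offset=0)
    return Bookmark(**json.loads(path.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    bookmark_path = Path(tmp) / "bookmark.json"
    fresh = load_bookmark(bookmark_path)            # no prior session
    save_bookmark(bookmark_path, Bookmark(12, 348))
    resumed = load_bookmark(bookmark_path)          # next session resumes here
```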
- An alternative to processing a number of files corresponding to the number of pages is to compress each page of text into a binary storage field of a database.
- each page in a book is contained in a separate record of a database with a field related to the page number and a field containing the textual data of that page. Additional searchable fields pertaining to and describing said data on that page could be included for search capabilities.
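The database alternative can be sketched with one record per page, holding a page number field, a searchable field, and a binary field with the compressed text. The example below uses SQLite and zlib purely as convenient stand-ins; the text does not name a database engine or a compression method, and the table and column names are hypothetical.

```python
import sqlite3
import zlib


def open_book_db() -> sqlite3.Connection:
    """Create an in-memory database with one record per page."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "page_number INTEGER PRIMARY KEY, "
        "keywords TEXT, "    # additional searchable field
        "body BLOB)"         # compressed text of the page
    )
    return conn


def store_page(conn, page_number: int, text: str, keywords: str = "") -> None:
    compressed = zlib.compress(text.encode("utf-8"))
    conn.execute(
        "INSERT INTO pages (page_number, keywords, body) VALUES (?, ?, ?)",
        (page_number, keywords, compressed),
    )


def load_page(conn, page_number: int) -> str:
    row = conn.execute(
        "SELECT body FROM pages WHERE page_number = ?", (page_number,)
    ).fetchone()
    if row is None:
        raise KeyError(page_number)
    return zlib.decompress(row[0]).decode("utf-8")


conn = open_book_db()
store_page(conn, 1, "Call me Ishmael.", keywords="narrator")
first_page = load_page(conn, 1)
```

Storing pages as records rather than as individual files gives random access by page number through a single indexed lookup.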
- The terms "spoken word" and "spoken format" are herein defined as a sound file, preferably a digitized sound file, which, upon the program reading a word in a text based document, such as a book, is played over the sound system of the end user's computer such that a sound recording of a human voice (not a synthesized voice) is heard by the user.
- This dictionary involves a database with a field for the text to be matched to the selected word in the "book" along with one or more fields that would contain recorded, digitized speech for that word.
- the words in this dictionary are created by a person speaking the words and recording the words on tape or the like.
- the analog recording of the spoken words is then converted into a digital format, and the digitized words may be edited or "trimmed" so as to place the words in a uniform format.
- This dictionary will contain tens of thousands of such words, such that all of the words required to play back a book in spoken form are included in the dictionary. Further, similar dictionaries using Male, Female, Child, and other voices for each word in the dictionary may be provided thus allowing the user to select the voice style most pleasing to the user.
- Fast record indexing routines are available to find the matching word in the dictionary and send its digitized pattern to the sound card. If no match is found, the program reverts to spelling the selected word, using the digitized letters also contained in the dictionary.
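The lookup with its spelling fallback can be illustrated with a small sketch. A Python dict stands in for the indexed dictionary database, and the function name `sounds_for` and the lower-casing of keys are assumptions made for the example.

```python
def sounds_for(word, dictionary):
    """Return the list of sound clips to play for one word.

    `dictionary` maps lower-case words and single letters to recorded
    clips (bytes here, as a placeholder for digitized audio).
    """
    key = word.lower()
    if key in dictionary:
        # Whole-word recording found.
        return [dictionary[key]]
    # No match: fall back to spelling the word with the recorded letters.
    return [dictionary[ch] for ch in key if ch in dictionary]
```

For example, with a dictionary holding `"cat"` plus the letters `d`, `o`, `g`, the word "Cat" yields one clip and "dog" yields three letter clips.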
- FIG. 5 is a flow diagram illustrating all necessary steps for operating and using the system of this invention for playing back a text file, such as a book or other text based document, in the above-described spoken format.
- FIG. 5A shows the steps for the creation of a master dictionary of digitized words, as spoken by a person, and for updating this master dictionary. It is believed that the steps shown in Figs. 5 and 5A would fully describe the process of creating the master dictionary to one skilled in the art and thus further description herein is not required.
- the book database includes a record layout that may have a first text field including processing instructions (PROCESS), a second text field (35 characters) for the text to be displayed (DISPLAY), and a memo field for compressed digitized sound to the corresponding text (SOUND).
- PROCESS processing instructions
- DISPLAY second text field
- SOUND memo field for compressed digitized sound to the corresponding text
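The PROCESS / DISPLAY / SOUND record layout could be modelled as below. This is a sketch under assumptions: zlib stands in for the unspecified sound compression, and the class and helper names are hypothetical. Only the three field roles and the 35-character DISPLAY width come from the text.

```python
import zlib
from dataclasses import dataclass

DISPLAY_WIDTH = 35  # width of the DISPLAY text field given in the description


@dataclass
class BookRecord:
    process: str  # PROCESS: processing instructions
    display: str  # DISPLAY: text to be displayed
    sound: bytes  # SOUND: compressed digitized sound for the text

    def __post_init__(self):
        if len(self.display) > DISPLAY_WIDTH:
            raise ValueError("DISPLAY text exceeds 35 characters")


def make_record(display, pcm, process=""):
    """Build a record, compressing the raw audio bytes for the SOUND field."""
    return BookRecord(process, display, zlib.compress(pcm))
```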
- a book reader program of this invention reads a text file of a book (or a passage therefrom) and commands the end user's computer to speak each word in the book (or in a selected passage of the book or other document) over the sound system so that each word may be heard by the end user.
- the basic reading system comprises a reader program, a complete master dictionary including digitized sound files of all of the words in the dictionary (which may number hundreds of thousands of such sound files), and a text file in an ASCII or other suitable format.
- a special dictionary (referred to as a Book Database in Fig. 10) may be constructed for each book which includes only the words in the book to be read, thereby minimizing the size of the dictionary and increasing speed with which words can be looked up in the dictionary by the computer.
- the time interval between the words, the rate (speed) at which the program reads the words, and the pitch of the spoken words may be selectively varied by the end user to suit the speed and sound desired by the end user. It will be recognized that some end users may want the words read by the computer at a faster pace than other users. However, this is not merely a matter of increasing the speed or rate at which the computer reads the words.
- the system of this invention has a routine that allows the pitch of the words to be selectively adjusted up or down by the end user so that the rate (speed) at which the computer plays back (speaks) the words of the text may be increased or decreased and yet the words remain understandable to the end user.
- This is shown in Figs. 7A, 9A, and 11.
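One way to read the interplay of rate and pitch: if a recording is simply played back faster by resampling, every frequency in it rises by the same factor, so a compensating downward pitch adjustment keeps the voice sounding natural. The helper below computes that compensation in semitones. It is an illustration of this general relationship, not a routine taken from the patent, and the function name is hypothetical.

```python
import math


def pitch_correction_semitones(rate):
    """Semitone shift that cancels the pitch change of faster playback.

    Playing a recording `rate` times faster by resampling multiplies its
    pitch by `rate`, i.e. raises it by 12 * log2(rate) semitones; the
    correction is the negative of that amount.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return -12.0 * math.log2(rate)
```

Doubling the playback rate, for instance, calls for a correction of one octave (12 semitones) downward.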
- the text to speech program of this invention will spell the letters of the word so that the end user may discern the word.
- the steps for spelling such words that are not in the dictionary of digitized sound files are shown, for example, in Figs. 7A, 9A or
- Fig. 7 illustrates that the system of this invention allows the end user to place a bookmark at desired locations in the text as the text is being read aloud by the computer so that the end user may readily return to a desired passage in the book.
- the system and method of this invention may convert text to spoken words in two ways.
- a master dictionary of a very large number of words, i.e., digitized sound recordings of words spoken by a person and stored or recorded in a digitized format.
- the computer searches the master dictionary for the digitized sound file corresponding to the word being read, and if such word is found, the sound will be played back over the sound system of the computer, and then the next word in the text will be read.
- a minimized dictionary for a particular book may be prepared such that the dictionary supplied with a particular book text file contains only the words in that book. This requires a new dictionary for each book.
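Preparing such a minimized dictionary amounts to filtering the master dictionary by the words that occur in the book. The sketch below assumes a dict-based master dictionary and a simple regular-expression tokenizer; both, along with the function names, are illustrative choices rather than details from the source.

```python
import re


def words_in(text):
    """Split text into lower-case words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def build_book_dictionary(text, master):
    """Return (book_dictionary, missing_words) for one book text.

    `book_dictionary` holds only the master entries the book actually uses;
    `missing_words` lists words with no recording in the master dictionary.
    """
    book, missing = {}, set()
    for word in words_in(text):
        if word in master:
            book[word] = master[word]
        else:
            missing.add(word)
    return book, missing
```

The `missing` set indicates which words would still need to be recorded or spelled out at playback.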
- the entire master dictionary may be supplied to the end user so that no matter what book or document is being read by the computer, the master dictionary will have substantially all of the words contained in the book or document.
- special dictionaries may be created for technical or specialized vocabularies or for different languages. While the system of this invention using such dictionaries works well when the end user's computer has a fast CD reader, such dictionaries place a premium on the response time of the CD reader. It will be recognized, however, that this system of providing a master dictionary with a very large number of digitized sounds allows virtually any text file to be converted from text to speech merely by using a text file with the text to speech program of this invention.
- a second method of reading the words of a book is, upon making up the file for the book, to record the digitized sounds for the words in the order the words appear in the book.
- the program for creating or preparing such book file is shown in Fig. 10 and the program for playing such a book file is shown in Fig. 11.
- In Fig. 10, there is a master dictionary which contains all of the digitized words.
- Each word in the book text file is read by the computer and, if a corresponding digitized word exists in the master dictionary, the digitized sound from the master dictionary is laid down on the book file. This process is repeated for each word in the book.
- Such a book file is read back by the program shown in Fig. 11.
- This program, where the digitized word is recorded for each word in the book file, eliminates the need to look up the digitized sounds in the master dictionary and thus speeds up the reading of the book.
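The second method trades storage for speed: sounds are resolved once, at preparation time, and stored in reading order. A rough sketch follows, with hypothetical names; storing `None` for words missing from the master dictionary and spelling them at playback is an assumption about how gaps would be handled.

```python
def build_book_file(words, master):
    """Lay down each word's sound in reading order (preparation step)."""
    return [(word, master.get(word.lower())) for word in words]


def play_book_file(book_file, play, spell):
    """Play a prepared book file with no dictionary lookups at read time."""
    for word, sound in book_file:
        if sound is not None:
            play(sound)
        else:
            # Word had no recording when the file was prepared.
            spell(word)
```

Playback touches only the book file itself, which is what removes the per-word search of the master dictionary.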
- the end user may make notes regarding passages in the book, and at a later time, the end user may play back and hear the words of the notes.
- notes may be entered by the end user by typing on the keyboard of the end user computer.
- a sight impaired student may study a book and may take appropriate notes and then play back the notes to aid in studying the text.
- bookmarks may be used by the user to find certain passages in the book. It will also be recognized that since the book file contains all of the words in the book, simple word searches may be employed to find desired passages in the book. This is a distinct advantage over recording books on audio tape.
- as the words of a book file are being converted to speech and played back over the sound system of the computer, the word, or (preferably) a string of words containing the word being spoken so that the word can be seen in context, may be displayed in an enlarged font on the monitor screen of the end user's computer. It will be recognized that many sight impaired persons do have partial vision and the enlarged font display aids these persons in comprehending the spoken words. Also, as the notes a person has made are played back, such notes may be displayed in the enlarged format.
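Selecting the string of words to show around the word being spoken can be sketched as a sliding window. The window sizes, the function name, and the use of upper case to mark the current word are assumptions for illustration; an actual display would render the string in an enlarged font.

```python
def context_window(words, index, before=2, after=2):
    """Return the words around position `index`, marking the spoken word.

    Upper case stands in for whatever emphasis the display would apply.
    """
    start = max(0, index - before)
    end = min(len(words), index + after + 1)
    return " ".join(
        w.upper() if i == index else w
        for i, w in enumerate(words[start:end], start)
    )
```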
- the speech generated by the computer sound system in speaking the words of the book passage or other text file sound as much like a human voice as possible.
- the use of the above-described digitized sound files of a human voice offers substantial advantages over current voice synthesized sounds.
- Another feature of this invention which makes the speech generated by the system sound more like a human voice is the use of the punctuation in the text to add pauses between sentences or phrases, or to change the pitch of the last word or the last syllable of a word at the end of a sentence or phrase.
- the reading program of this invention will increase the time interval until the next word is read and spoken. If the punctuation is a "question mark", the pitch of the last word may be increased or inflected to generate a sound of the word indicating an interrogative sound. Likewise, if the punctuation is an "exclamation point", the pitch and/or the loudness of the last word may be increased.
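The punctuation rules can be expressed as a small mapping from a word's trailing punctuation to playback adjustments. All numeric values below, the set of marks that lengthen the pause, and the function name are assumptions chosen for the example; the source states only the direction of each adjustment.

```python
def prosody_for(token, base_gap_ms=150):
    """Return pause, pitch and volume adjustments for one text token.

    `pitch` and `volume` are multipliers applied to the last word;
    `gap_ms` is the interval before the next word is spoken.
    """
    gap, pitch, volume = base_gap_ms, 1.0, 1.0
    last = token[-1:]  # empty string for an empty token
    if last in {".", ";", ":"}:
        gap = base_gap_ms * 3          # longer pause at a sentence end
    elif last == ",":
        gap = base_gap_ms * 2          # shorter pause between phrases
    elif last == "?":
        gap, pitch = base_gap_ms * 3, 1.2   # rising, interrogative pitch
    elif last == "!":
        gap, pitch, volume = base_gap_ms * 3, 1.1, 1.3  # higher and louder
    return {"gap_ms": gap, "pitch": pitch, "volume": volume}
```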
- the dictionary may include definitions of the words that may be selectively played back by the end user upon the end user commanding the computer to do so, as with key strokes or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU46652/00A AU4665200A (en) | 1999-04-29 | 2000-04-26 | System for storing, distributing, and coordinating displayed text of books with voice synthesis |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13154199P | 1999-04-29 | 1999-04-29 | |
US60/131,541 | 1999-04-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000067249A1 (fr) | 2000-11-09 |
Family
ID=22449900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/011182 WO2000067249A1 (fr) | 1999-04-29 | 2000-04-26 | System for storing, distributing, and coordinating displayed text of books with voice synthesis |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU4665200A (fr) |
WO (1) | WO2000067249A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1122716A2 (fr) * | 2000-01-29 | 2001-08-08 | Deutsche Telekom AG | Apparatus for converting printed text into speech |
EP2316076A2 (fr) * | 2008-08-13 | 2011-05-04 | Hewlett-Packard Development Company, L.P. | Bookmarks for flexible integrated access to a published document |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4700322A (en) * | 1983-06-02 | 1987-10-13 | Texas Instruments Incorporated | General technique to add multi-lingual speech to videotex systems, at a low data rate |
US5384893A (en) * | 1992-09-23 | 1995-01-24 | Emerson & Stern Associates, Inc. | Method and apparatus for speech synthesis based on prosodic analysis |
US5475399A (en) * | 1990-05-21 | 1995-12-12 | Borsuk; Sherwin M. | Portable hand held reading unit with reading aid feature |
US5661635A (en) * | 1995-12-14 | 1997-08-26 | Motorola, Inc. | Reusable housing and memory card therefor |
US5820379A (en) * | 1997-04-14 | 1998-10-13 | Hall; Alfred E. | Computerized method of displaying a self-reading child's book |
US5864823A (en) * | 1997-06-25 | 1999-01-26 | Virtel Corporation | Integrated virtual telecommunication system for E-commerce |
US5991594A (en) * | 1997-07-21 | 1999-11-23 | Froeber; Helmut | Electronic book |
2000
- 2000-04-26 AU AU46652/00A patent/AU4665200A/en not_active Abandoned
- 2000-04-26 WO PCT/US2000/011182 patent/WO2000067249A1/fr active Application Filing
Non-Patent Citations (1)
Title |
---|
ZoomText Xtra User's Guide, Version 6.1. Manchester Center, Vermont, USA: Ai Squared, 1997, pp. 85, 88, 96, 106-108, 117-124, XP002929376 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1122716A2 (fr) * | 2000-01-29 | 2001-08-08 | Deutsche Telekom AG | Apparatus for converting printed text into speech |
EP1122716A3 (fr) * | 2000-01-29 | 2001-11-14 | Deutsche Telekom AG | Apparatus for converting printed text into speech |
EP2316076A2 (fr) * | 2008-08-13 | 2011-05-04 | Hewlett-Packard Development Company, L.P. | Bookmarks for flexible integrated access to a published document |
EP2316076A4 (fr) * | 2008-08-13 | 2011-08-10 | Hewlett Packard Development Co | Bookmarks for flexible integrated access to a published document |
Also Published As
Publication number | Publication date |
---|---|
AU4665200A (en) | 2000-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6334104B1 (en) | Sound effects affixing system and sound effects affixing method | |
Hockey | Electronic texts in the humanities: principles and practice | |
US5773741A (en) | Method and apparatus for nonsequential storage of and access to digital musical score and performance information | |
US5957697A (en) | Printed book augmented with an electronic virtual book and associated electronic data | |
US6095418A (en) | Apparatus for processing symbol-encoded document information | |
US20020120635A1 (en) | Apparatus and method for providing an electronic book | |
US20040017390A1 (en) | Self instructional authoring software tool for creation of a multi-media presentation | |
US20070182990A1 (en) | Reproduction of documents into requested forms | |
Jones et al. | Experiments in spoken document retrieval | |
US20020120651A1 (en) | Natural language search method and system for electronic books | |
Lesk | Going digital | |
Ekman et al. | Technology and scholarly communication | |
US20090083621A1 (en) | Method and system for abstracting electronic documents | |
WO2000067249A1 (fr) | System for storing, distributing, and coordinating displayed text of books with voice synthesis | |
Rawlins | The new publishing: Technology's impact on the publishing industry over the next decade | |
US20030040966A1 (en) | Marketing system | |
Romano | E-books and the challenge of preservation | |
Hawkins et al. | Forces shaping the electronic publishing industry of the 1990s | |
Pobiak | Adjustable access electronic books | |
Cook | Bible and Computer | |
Bevan | Transient technology? The future of CD‐ROMs in libraries | |
Spring et al. | The document processing revolution | |
Raine et al. | Rare Book Records in Online Systems | |
AU2005255044B2 (en) | Reproduction of documents into requested forms | |
Green | Some Issues Concerning Access to Information by Blind and Partially Sighted Pupils. |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
| | 122 | EP: PCT application non-entry in European phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |