WO1999024969A1 - Reading system that displays an enhanced image representation - Google Patents

Reading system that displays an enhanced image representation

Info

Publication number
WO1999024969A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
document
representation
computer
image
Prior art date
Application number
PCT/US1998/024134
Other languages
English (en)
Inventor
Raymond C. Kurzweil
Colin Day
Original Assignee
Kurzweil Educational Systems, Inc.
Priority date
Filing date
Publication date
Application filed by Kurzweil Educational Systems, Inc. filed Critical Kurzweil Educational Systems, Inc.
Priority to AU14021/99A
Publication of WO1999024969A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/169: Annotation, e.g. comment data or footnotes

Definitions

  • This invention relates to the display of electronic representations of documents on computer systems.
  • Computer systems are used to display on a monitor an image file representation of a document obtained from an optical scanner or the like. Often the viewed image is not of high quality. For instance, lines or edges of characters displayed on the monitor often have a jagged appearance, particularly when magnified.
  • Reading machines have been used to improve the educational attainment of individuals with learning disabilities.
  • Reading machines are computer based, with specialized software that processes an input source document and generates synthetic speech, enabling the user to hear the computer read through the document a word, line, sentence, and so forth at a time.
  • These reading machines include a scanner to provide one technique for inputting source documents to the reader.
  • The scanner provides an image file representation of a scanned document.
  • The personal computer, using optical character recognition software, produces an OCR file that includes the generated text information.
  • The OCR file is used by the display system software to display a text-based representation of the scanned document on the monitor.
  • The OCR file text is also used by speech synthesis software to synthesize speech.
  • A computer program product residing on a computer readable medium includes instructions for causing a computer to display an image representation of a scanned document on a computer monitor and to selectively replace text image representations of the scanned document with text rendered in a scalable, mathematically defined font.
  • The product also causes the computer to manipulate the displayed, replaced scalable font representation of the document by using positional information associated with a converted text file generated by converting the image file representation of the document into the converted text file. In this manner a more pleasing display of the image representation of each word is provided, and by performing this operation repetitively over all of the words in the currently viewed portion of the image, that portion of the image can be replaced with a scalable font.
  • With a scalable font the text will not appear jagged when zooming in or out on the text. If the zoom value or the viewed section of the image is changed, the process can be repeated to re-replace the scanned text images with their scalable font counterparts.
  • FIG. 1 is a block diagram view of a general purpose computer system which in one embodiment is a reading machine system;
  • FIG. 2 is a flow chart showing steps used in displaying a scanned image representation of a document for use in the system of FIG. 1 when configured as a general purpose computer or as the reading system embodiment;
  • FIG. 3 is a flow chart showing steps used to associate user-selected text on the displayed image representation with OCR generated text to permit voice synthesis and highlighting of the image representation for the reading system embodiment;
  • FIGS. 3A-3B are flow charts showing the steps used to enhance the displayed image of a document on the general purpose computer or reading machine systems of FIG. 1;
  • FIGS. 4A-4C are flow charts which show steps used in calculating a nearest word for use in the process described in conjunction with FIG. 3;
  • FIG. 4D is a pictorial illustration of a portion of an image representation of text displayed on a monitor, useful in understanding the process of FIGS. 4A-4C;
  • FIG. 5 is a flow chart showing steps used to highlight a selected word for use in the process described in conjunction with FIG. 3;
  • FIG. 6 is a diagrammatical representation of a data structure used in the process of FIG. 3;
  • FIGS. 7-9 are diagrammatical views of detailed portions of the data structure of FIG. 6;
  • FIGS. 10A-10C are flow charts of an alternative embodiment for determining the nearest word; and
  • FIG. 11 is a pictorial illustration of a portion of an image representation of text displayed on a monitor, useful in understanding the process of FIGS. 10A-10C.
  • A reading machine 10 includes a computer system 12 such as a personal computer.
  • The computer system 12 includes a central processor unit (not shown) that is part of a processor 14.
  • A preferred implementation of the processor 14 is a Pentium®-based system from Intel Corporation, Santa Clara, California, although other processors could alternatively be used.
  • The processor includes main memory, cache memory and bus interface circuits (not shown).
  • The computer system 12 includes a mass storage element 16, here typically the hard drive associated with personal computer systems.
  • The reading system 10 further includes a standard PC-type keyboard 18, a sound card (not shown), a standard monitor 20 as well as speakers 22, a pointing device such as a mouse 19, and a scanner 24, all coupled to various ports of the computer system 10 via appropriate interfaces and software drivers (not shown).
  • The computer system 12 here operates under a Windows NT® operating system from Microsoft Corporation, although other operating systems could alternatively be used.
  • Image display and conversion software 30 (FIG. 2) controls the display of a scanned image provided from scanner 24.
  • The software 30 permits the user to control various features of the reader by referencing the image representation of the document displayed by the monitor.
  • The steps used in the image display and conversion software 30 include scanning an input document in a conventional manner to provide an image file 31 (step 32).
  • The image file 31 is operated on by an optical character recognition (OCR) module 34.
  • The OCR module 34 uses conventional optical character recognition techniques (typically software based) on the data provided from the scanned image 32 to produce an output data structure 35.
  • Image-like representations other than scanner output can also be used as a source, such as a stored bit-mapped version of a document.
  • The array of OCR data structures, generally denoted as 35, produced by step 34 includes information corresponding to the textual information or OCR converted text, as well as positional and size information associated with each particular text element.
  • The positional and size information associates the text element with its location in the image representation of the document as displayed on the monitor 20.
  • A data structure element 140 includes, for a particular word, an OCR text representation of the word stored in field 142.
  • The data structure 140 also has positional information, including X-axis coordinate information stored in field 143, Y-axis coordinate information stored in field 144, height information stored in field 145 and width information stored in field 146.
  • This positional information defines the bounds of an imaginary rectangle enclosing an area associated with the corresponding word. That is, if a pointing device such as a mouse has coordinates within the area of this rectangle, then the mouse can be said to point to the word within the defined rectangle.
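  • For illustration, the word-level entry and its hit test might be sketched as follows. This is a minimal sketch, not the patent's code: the class and function names are hypothetical, and screen-style coordinates are assumed, with (x, y) the top-left corner of the word's rectangle and y increasing downward.

```python
from dataclasses import dataclass

@dataclass
class WordEntry:
    """One element 140 of the array 35 of OCR data structures."""
    text: str    # field 142: OCR-generated text of the word
    x: int       # field 143: X-axis coordinate of the word's rectangle
    y: int       # field 144: Y-axis coordinate of the word's rectangle
    height: int  # field 145
    width: int   # field 146

def points_to(entry: WordEntry, px: int, py: int) -> bool:
    """True when pointer coordinates (px, py) fall inside the imaginary
    rectangle that the positional fields define around the word."""
    return (entry.x <= px <= entry.x + entry.width and
            entry.y <= py <= entry.y + entry.height)
```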
  • The image file 31 is fed to a display system 38 which, in a conventional manner, processes the image file to display the document that it represents on the monitor at step 39.
  • The text file 35 provides one input, along with commands driven by the operating system (not shown), to a module 40 which is used to associate user-initiated actions with an image representation of a scanned document.
  • Both the image file 31 and the text file 35 are stored in the reading system for use during the session and can be permanently stored.
  • The files are stored using generally conventional techniques common to Windows NT® or other types of operating systems.
  • At step 36, if a user has selected an option to enhance the image, the process jumps to step 37 and calls the image display enhancement software, as will be described in conjunction with FIGS. 3A and 3B.
  • The user controls operation of the reading system embodiment 10 with reference to the image displayed on the monitor 20 by the steps generally shown by the software module 40.
  • A user initiates reading of the scanned document at the beginning of the document by selecting a reading mode.
  • Alternatively, the user can have the document start reading from any point in the document by illustratively pointing to the image representation of an item from the scanned document displayed on the monitor at step 42.
  • The document item is the actual image representation of the scanned document rather than the conventional text file representation.
  • The item can be a single word of text, a line, sentence, paragraph, region and so forth.
  • The user activates a feature to enable the reading machine to generate synthesized speech associated with the selected image representation of the document item. For purposes of explanation, it will be assumed that the document item is a word.
  • A pointer such as a mouse can point within the text in an image in other ways that emulate the pointer behavior typically used in computer text displays and word processing programs. For instance, simply pointing to a word selects a position in the text before the word, pointing to a word and clicking a mouse button twice causes the word to be selected, and pointing to a word and clicking an alternate mouse button selects several words, starting at a previously determined point and ending at the word pointed to. The user can use a mouse or other type of pointing device to select a particular word. Once the selection is made, the software fetches the coordinates associated with the location pointed to by the mouse 19 (FIG. 1) at step 44. Using these coordinates, the word or other document item nearest to the coordinates of the mouse is determined. The information in the data structure 100 is used to generate highlighting of the word as it appears on the display, as well as synthesized speech, as will be described.
  • The searching step 46, as will be further described in conjunction with FIGS. 4A-4C, searches for the nearest word.
  • Alternatively, a searching step 46', as will be described with FIGS. 10A-10C, can also be used.
  • The search operation performed by searching step 46' is based upon various attributes of a scanned image.
  • Highlighting is applied to an area associated with the item or word at step 48.
  • The text corresponding to the nearest document item is also extracted at step 50.
  • The software then calls one of several possible routines to provide an enrichment feature for the reading system 10.
  • For example, the user can request a definition of a word and call a dictionary program, can have the current word read aloud, can have the current word spelled, can be provided with a synonym for the current word, or can look up a translation of the current word in a foreign dictionary, and so forth.
  • A text representation of an element is used by these routines. This element may be the word that was retrieved at step 42, or material corresponding to the definition of the word, the spelling of the word, and so forth.
  • A routine 37 used to enhance the displayed image of the scanned document replaces text in the viewed, displayed portion of the document with corresponding text presented in a mathematically defined, scalable font, commonly referred to as a "true type" font. Any particular font face can be used. Such fonts are obtainable from various sources. In essence, true type fonts are characterized as being scalable; that is, they can be displayed and printed in any size. Such fonts are stored as mathematical descriptions of the individual characters.
  • A font face is selected.
  • The software determines whether there are any words in the image. The software retrieves a next word and examines it to determine whether it is a word or a null. If there are no words (a null) in the image, the process terminates at the "End" of the routine and exits. If there are words in the image, however, the process calculates the viewed portion of the entire image. The coordinates of the viewed portion of the entire image are available from the image display software 38. That is, the process determines the boundaries of the text information that is currently displayed in the document by reference to the data provided in the OCR text file 31.
  • The software 37 fetches from the OCR text file a first one of the words as displayed in the image.
  • The software fetches the locational information of the word. Since the OCR text file exactly describes the location of the word as it is contained in the image, the software calculates at step 256 whether the coordinates of the word fetched from the OCR text file fall within the coordinates circumscribed by the currently viewed portion of the image. The word is visible if its coordinates are inside the coordinates of the view bounding rectangle, and not visible otherwise.
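  • Step 256 amounts to a rectangle-containment test. A minimal sketch, assuming rectangles are (left, top, right, bottom) tuples in document coordinates (names hypothetical):

```python
def word_is_visible(word_rect, view_rect) -> bool:
    """A word is visible when its bounding rectangle lies entirely
    inside the view bounding rectangle (the step 256 test)."""
    wl, wt, wr, wb = word_rect
    vl, vt, vr, vb = view_rect
    return vl <= wl and vt <= wt and wr <= vr and wb <= vb
```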
  • The process 37 determines whether there are any additional words in the document at step 260 and, if there are, fetches the next word at step 262. Again, at step 256, the process determines whether the next word is within the current view of the document. When any one of the words is within the current view of the document, as determined at step 256, the process branches to a routine 258 which replaces the image with the selected true type font. That is, the process 37 passes the text of the word received from the OCR text file, as well as the selected font, into routine 258 to generate a representation of the word which will fit in the space occupied by the image representation of the word as displayed. The process 258 replaces the image representation with the corresponding true type font.
  • The process 258 calculates at step 263 a word height with the default point size, and a word rectangle, so that the word can be viewed as a true type font in the region currently occupied by the image representation of the word on the display.
  • The operating system can provide the bounding rectangle of the text, given the font and point size.
  • The visible bounding rectangle of a word (R3') is given by the bounding rectangle of the word (R3) on the entire document (R1), offset to the view bounding rectangle (R2) and multiplied by a zoom factor (Z):
  • R3' = (R3 - (R2 - R1)) * Z
  • The process computes the point size necessary to fit the height of the true type font replacement of the image of the word into the computed rectangle. Given the bounding rectangle of the viewed image of the word (R5), the size of the default true type font text replacement image (R6), and the default point size (P), the new point size (Ph) is calculated as:
  • Ph = P * (R5.height / R6.height)
  • where R5.height and R6.height are the heights of the respective word image and text replacement rectangles.
  • The process determines whether the width of the word will fit into the computed rectangle in a similar manner as for the height, i.e., by computing Pw = P * (R5.width / R6.width).
  • If the width will not fit, step 268 branches to step 270 to compute a point size such that the width of the word will fit into the computed rectangle.
  • The process then branches to step 272, where the image of the word is erased from the displayed view of the document and true type font text is clipped or pasted into the viewed rectangle, using the computed size, centered in the word rectangle as determined above. Erasing the image is accomplished by filling the word's view bounding rectangle with the background color and drawing in the word in the true type font using calls into the operating system.
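  • The sizing arithmetic of steps 263-272 can be illustrated as follows. This is a sketch of the two formulas above, assuming the width constraint mirrors the height constraint (the text says only "in a similar manner") and that rectangles are (left, top, right, bottom) tuples; all names are hypothetical:

```python
def visible_word_rect(word_rect, view_origin, doc_origin, zoom):
    """R3' = (R3 - (R2 - R1)) * Z: offset the word's document-space
    rectangle (R3) by the view offset (R2 - R1), then apply the zoom."""
    dx = view_origin[0] - doc_origin[0]
    dy = view_origin[1] - doc_origin[1]
    l, t, r, b = word_rect
    return tuple(round(v * zoom) for v in (l - dx, t - dy, r - dx, b - dy))

def fit_point_size(p, image_h, image_w, default_h, default_w):
    """Scale the default point size P so the true type replacement fits
    both dimensions of the word's image rectangle; the smaller of the
    two candidate sizes is the one that fits."""
    ph = p * (image_h / default_h)  # Ph = P * (R5.height / R6.height)
    pw = p * (image_w / default_w)  # assumed width analogue (steps 268-270)
    return min(ph, pw)
```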
  • Described below are processes used to determine a nearest word in the image, as well as a process used to highlight a word or apply double highlighting to a word. In essence, these processes can operate on a display of the document by use of the image file or on a displayed enhanced image representation provided with the true type font.
  • The process described below is used in conjunction with the scanned image. However, it should be appreciated that either the scanned image or the enhanced image representation provided by the above-mentioned replacement of the image with the true type font can be used.
  • The software makes reference to the OCR data structure to determine positional information, in order to associate the reading software, highlighting software or other software with commands by the user.
  • Although the above enhanced image representation could be saved in a file for later use by building a data structure incorporating the true-type-font substituted text, it is desirable not to do so.
  • This is because the displayed window can be maximized, minimized or otherwise scaled to a particular size, and this scaling of the window necessitates a recalculation of the font sizes and fitting of the font in the proper position on a document view.
  • The image enhancement software can also be used on a conventional general purpose computer which displays image representations of scanned or otherwise obtained documents, where it is desirable to improve the readability and image quality of the displayed representation of the document.
  • A pointer is initialized and a maximum value is loaded into a displacement field 51b of structure 51 (FIG. 4D).
  • The displacement field 51b is used to store the smallest displacement between a word boundary and the coordinates of the pointing device.
  • The pointer initialized at step 60 is a pointer or index into the OCR generated data structure 35 (FIG. 6).
  • The software 46 retrieves each word entry in the data structure 35 to determine, for that word, in accordance with the image-relative position information (which is the same whether or not the true type font replacement is used) associated with the OCR text generated word, whether or not that particular word is the closest word to the coordinates associated with the user's pointing device.
  • The coordinates associated with a first one of the words are fetched.
  • The coordinates associated with the first one of the fetched words are used to determine whether the pointing device is pointing to a location within a box 65₅ that is defined around the word.
  • The mouse points to a spot 61 having coordinates (Xᵢ, Yⱼ).
  • An imaginary box, here 65₅, is assumed to exist about the word "IMAGE" in FIG. 4D.
  • The word "IMAGE", as well as the other words, can be in the replaced true type font representation.
  • If the pointing device coordinates fall within the box 65₅, the pointing device would be considered to point to the document item "IMAGE" associated with the box 65₅.
  • Each of the words will have associated therewith the OCR text converted from the image file 31, as well as position and size data that identify the position and size of the word as it appears on the original document. Accordingly, this information also locates the word as it appears in the displayed image or replaced true type font representation of the document. Thus, when determining the closest word to a position pointed to by a mouse, it is necessary to determine the boundaries of the box that the particular word occupies.
  • The software determines whether or not point 61 falls within the box by considering whether the X and Y coordinates of the point lie between the two opposite corners, (a, b) and (c, d), of the box.
  • If the point is within the box at step 66, control will pass directly to step 50 (FIG. 4B).
  • The point (c, d) can be determined by subtracting the height of the box from the y coordinate (b) associated with the image and adding the width of the box to the x coordinate (a) of the image. This is true also if the word is presented in a true type font. If, however, the point 61 is not within the box, as is shown, then the software 46 determines the word which is nearest to the point 61 at step 68 by one of several algorithms.
  • A first algorithm which can be used is to compute the distance from a consistent corner of the box associated with the word to the position of the mouse pointer 61.
  • The distance (S) to a consistent corner would be computed with the "Pythagorean" technique as S = sqrt((X - a)^2 + (Y - b)^2), where (X, Y) is the position of the pointer and (a, b) is the consistent corner of the word's box.
  • Alternatively, this equation can be used at each corner of each box, and further processing can be used to determine which one of the four values provided from each corner is in fact the lowest value for each box.
  • The computed value (S) is compared to the previous value stored in displacement field 51b. Initially, field 51b has a maximum value stored therein, and the smaller of the two values is stored in field 51b at step 72. Accordingly, the first computed value and the index associated with the word are stored in the structure 51, as shown in FIG. 4C.
  • At step 74 it is determined whether or not this is the end of the data structure. If it is the end of the data structure, control branches to step 50 and hence to step 52. If it is not the end of the data structure, the pointer is incremented at step 76 and the next word in the data structure, as determined by the new pointer value, is fetched at step 62.
  • On each pass, step 72 determines whether the previously stored value (Sp) in fields 51a, 51b is greater than or less than the currently calculated value (Sc) for the current word. If the current value (Sc) is less than the previous value (Sp), then the current value replaces the previous value in field 51b and the index associated with the current value replaces the previous index stored in field 51a.
  • In this manner, the structure 51 keeps track of the smallest calculated distance (S) and the index (i.e., word) associated with that distance. The process continues until the positional data for all of the words in the data structure associated with the particular image have been examined. The values which remain in the data structure 51 at the end of the process thus correspond to the closest word to the location pointed to by the mouse coordinates 61.
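  • Pulled together, the FIG. 4A-4C search might look as follows; a sketch only, reusing the WordEntry and points_to helpers from the earlier sketch and measuring the distance to one consistent corner rather than all four:

```python
import math

def nearest_word(words, px, py):
    """Return the index of the word nearest to pointer (px, py).
    A direct hit (step 66) wins immediately; otherwise a running
    minimum of corner distances is kept, as structure 51 does in
    fields 51a (index) and 51b (displacement)."""
    best_index = None
    best_dist = math.inf          # field 51b starts at a maximum value
    for i, w in enumerate(words):
        if points_to(w, px, py):  # pointer inside the word's box
            return i
        s = math.hypot(px - w.x, py - w.y)  # Pythagorean distance S to the top-left corner
        if s < best_dist:         # step 72: keep the smaller displacement
            best_index, best_dist = i, s
    return best_index
```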
  • The process 40 applies highlighting as appropriate to the selected item.
  • Prior techniques for providing highlighting would simply highlight a line or a paragraph in the text representation displayed on the monitor; the highlighting would be of the current word that is being read aloud to the user. Although this is acceptable, a preferred approach as described herein applies double highlighting, preferably to an image or true type replaced representation of a scanned document.
  • The highlighting process 48 is shown to include a step 80 in which an event is awaited by the software 48.
  • The event is typically an operating system interrupt-type driven operation that indicates any one of a number of operations, such as a user of the reading machine 10 initiating speech synthesis of a word, sentence or paragraph.
  • The highlighting process 48 remains in that state until an event occurs. When an event occurs, all previous highlighting is turned off at step 82.
  • The previous highlighting is turned off by sending a message (not shown) to the display system 38, causing the display system to remove the highlighting.
  • The highlighting process checks whether a unit of text has been completed.
  • A unit can be a word, line, sentence, or paragraph, for example, as selected by the user.
  • If the unit has been completed, highlighting of the unit is also turned off at step 90.
  • The software checks for an exit condition at step 91 after the coordinates have been fetched.
  • An exit condition, as shown in step 91, can be any one of a number of occurrences, such as reaching the last word in the array of OCR data structures 35 or a user command to stop coming from the keyboard 18 or other input device. If an exit condition has occurred at step 91, the routine 48 exits to step 92.
  • The next unit is determined at step 93.
  • The next unit of text is determined by using standard parsing techniques on the array of OCR text structures 35. Thus, the next unit is determined by looking, for example, for periods to demarcate the ends of sentences, and for indents and blank lines to find paragraphs. In addition, changes in the Y coordinate can be used to give hints about sentences and lines. Other document structure features can also be used.
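  • As a concrete illustration of such parsing, a sentence boundary can be found by scanning the flat word array for terminating punctuation; a minimal sketch under that single heuristic (function name hypothetical):

```python
def next_sentence(words, start):
    """Return (start, end) index bounds of the next sentence in the
    array of OCR word entries, using trailing punctuation on a word
    as the end-of-sentence hint described above."""
    end = start
    while end < len(words):
        end += 1
        if words[end - 1].text.rstrip().endswith(('.', '!', '?')):
            break
    return start, end
```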
  • The next unit is then highlighted at step 94 by instructing the display system software 38 (FIG. 2) to apply a transparent color to the selected next unit. This is a first level of highlighting provided on a unit of the image or true type font replaced representation of the scanned document. Control then transfers back to step 86.
  • At step 86, which is arrived at directly from step 84 or from step 92, the coordinates of the next word that is to be synthesized and highlighted are fetched.
  • The software checks for an exit condition at step 88 after the coordinates have been fetched.
  • An exit condition, as shown in step 88, can be any one of a number of occurrences, such as reaching the last word in the array of OCR data structures 35 or a user command to stop provided from the keyboard 18 or other input device. If an exit condition has occurred at step 88, the routine 48 exits to step 89. Otherwise, at step 96, a second highlight is applied to the image or true type font replaced representation, here preferably with a different transparent color and applied only to the word which is to be synthesized by the speech synthesizer 52.
  • The pointer to the next word in the data structure 35 is then incremented at step 98 to obtain the next word.
  • The second highlighting is provided by sending a message to the display system software 38 containing the positional information retrieved from the data structure. This process continues until an exit condition occurs at step 88.
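  • The overall dual-highlighting loop can be pictured as below. This sketch compresses steps 80-98 into one loop; the display and synth objects stand in for the display system 38 and the speech synthesizer, and their methods, the colors, and next_sentence (from the earlier sketch) are all hypothetical:

```python
def read_aloud(words, display, synth):
    """Apply a first transparent highlight to the current unit and a
    second, differently colored highlight to the single word being
    spoken, clearing each highlight as the reading moves on."""
    i = 0
    while i < len(words):                      # exit condition: last word reached
        start, end = next_sentence(words, i)   # determine the next unit (step 93)
        unit = words[start:end]
        display.highlight(unit, color="pale_yellow")    # first highlight (step 94)
        for w in unit:
            display.highlight([w], color="pale_green")  # word highlight (step 96)
            synth.speak(w.text)                         # speak the highlighted word
            display.unhighlight([w])                    # clear previous highlight (step 82)
        display.unhighlight(unit)              # unit completed (step 90)
        i = end
```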
  • The single and dual highlighting above were described as applying two distinct, transparent colors to selected image or true type font representations of the displayed document.
  • Alternatively, other highlighting indicia can be used, such as bold text, font style or size changes, italics, boxing in selected text, and underlining.
  • Combinations of these other indicia, with or without colors, could also be used.
  • Referring now particularly to FIGS. 6-9, the data structure 35 is hierarchically organized. At the top of the data structure is a page, data structure 110.
  • The page includes pointers 110a-110e to each one of a plurality of regions 120.
  • A region is here a rectangular-shaped area that is comprised of one or more rectangular lines of text. If there are multiple lines of text in a region, the lines do not overlap in the vertical direction. That is, starting with the top line, the bottom of each line is above the top of the next line.
  • The regions may include headers, titles, columns and so forth. The headers may or may not straddle more than one column, and so forth.
  • The regions likewise include a plurality of pointers 120a-120e to each one of the corresponding lines 130 shown in the data structure 130.
  • The lines correspondingly have pointers 130a-130e to each of the words contained within the line.
  • The detail structures of items 140, 130 and 120 each include a plurality of fields.
  • The detail structure for a word (FIG. 7) includes the text field 142, which has the OCR generated text, fields 143 and 144 which provide rectangular coordinate information x and y, respectively, and fields 145 and 146 which provide height and width information. Similar data are provided for the lines, as shown in FIG. 8, as well as for the regions, as shown in FIG. 9.
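  • The page/region/line/word hierarchy of FIGS. 6-9 maps naturally onto nested records; a minimal sketch (type names hypothetical, numerals keyed to the figures):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    """The x, y, height and width fields common to FIGS. 7-9."""
    x: int
    y: int
    height: int
    width: int

@dataclass
class Word:                    # data structure 140 (FIG. 7)
    text: str                  # field 142: OCR-generated text
    box: Box                   # fields 143-146

@dataclass
class Line:                    # data structure 130 (FIG. 8)
    box: Box
    words: List[Word] = field(default_factory=list)   # pointers 130a-130e

@dataclass
class Region:                  # data structure 120 (FIG. 9)
    box: Box
    lines: List[Line] = field(default_factory=list)   # pointers 120a-120e

@dataclass
class Page:                    # data structure 110
    regions: List[Region] = field(default_factory=list)  # pointers 110a-110e
```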
  • Pointers are again initialized to a first one of the regions, as shown by step 180, and the coordinates of the region's boundary box are fetched at step 182 from the data structure 120.
  • The position (X, Y) of the pointer is calculated to determine whether or not it falls within a box defining a region.
  • FIG. 11 shows a sample region containing a plurality of lines of image or true type font replaced representations of text, with boxes illustrated about the region, the lines and a word. Three sample positions 61, 61a, 61b of the pointing device (not shown) are also illustrated.
  • The calculation for a region is performed in a similar manner as the calculation of a box for a word described in conjunction with FIGS. 4A-4C, except that the positional information contained within the region data structure 120 is used to determine a box or other boundary associated with the region. Coordinates (r₆, s₆) and (t₆, u₆) denote the imaginary box about the illustrated region in FIG. 11. If at step 186 it is determined that the coordinates of the pointer fall within the box (as for points 61 and 61a-61d, FIG. 11), the process branches to determine the nearest line at step 201 (FIG. 10B). Otherwise, processing continues to step 187 to determine whether or not the process has reached the last region in the region data structure 120.
  • If it has not reached the last region, the pointer is incremented at step 194 to point to the next region in the data structure 120. If the process 46' has reached the last structure, the coordinates of the pointer device do not point to any word; in that case a previously determined word is used, and the process exits. If at step 186 it was determined that the coordinates fall within a region's box, then at step 201 a similar process is used to determine the nearest line, except that the line data associated with the data structure 130 (FIG. 8) is used for positional information and index information, such as coordinates (l₄, m₄) and (n₄, o₄).
  • This positional information is used to determine whether the coordinates of the pointing device are within a box defined about the line by the positional information associated with the line. If the coordinates of the pointing device fall above the box associated with the line, as for point 61a, the software chooses the first word of the line, here the word "TEXT". If the coordinates fall above the bottom of the line box, as for point 61b, the software branches to step 220.
  • The software initializes a pointer to the top line in the region (at step 201) and fetches the coordinates of the line at step 202.
  • The coordinates which are fetched correspond to the top and bottom coordinates of an imaginary box positioned about the line.
  • The software calculates whether the Y coordinate of the pointing device is above the line. This is accomplished by comparing the value of the Y coordinate of the pointing device to the Y coordinate (m₄) of the uppermost point defining the box about the line, as shown for point 61b. If at step 206 it is determined that the Y coordinate is above the box defined about the line, the software chooses the first word on the line at step 208 and is done.
  • Otherwise, the software determines whether the Y coordinate is above the bottom of the box defining the line, using a similar approach as for the top of the line except using, for example, the coordinate (o₄). If it is determined that the Y coordinate is equal to or above the bottom of the box defining the line, as for point 61b, the software branches to step 220 (FIG. 10C).
  • The X coordinate of the pointer is already known to be in the region and is not checked here. This allows short lines to be detected; lines are often shorter than the width of the region. For example, short lines may occur at the beginning and end of paragraphs, or in text that is not justified to form a straight right margin. Otherwise, the process continues to step 212, where it is determined whether the current line is the last line in the data structure 130. If it is not the last line in data structure 130, the pointer is incremented at step 216 to point to the next lower line in the region. If it is the last line in the data structure and the Y coordinate was neither above the top of the line nor above the bottom of the line, the software chooses at step 214 the word after the word in the last line, as for point 61c, and is done.
  • Pointers are again initialized to a first one of the words on a line, as shown by step 220, and the coordinates of the word are fetched at step 222 from the data structure 140.
  • The position X of the pointer is calculated at step 224 to determine whether or not it falls at or to the left of the current word's right side, as for point 61a. This calculation is performed by comparing the X value of the pointer coordinate to the X value of the right side of the box defined about the word, here the coordinate a₅ of point (a₅, b₅).
  • If it does, the pointing device is considered to point to that word, since it is at or to the left of the word's right side.
  • Otherwise, the process determines whether or not it has reached the last word in the data structure 140. If it has not reached the last word in the data structure 140, the pointer is incremented at step 234 to point to the next word to the right. If it has reached the last word in the data structure 140, the software at step 230 chooses the word after the last word in the line (not illustrated), and the process is done. The chosen word is forwarded on to step 48 of FIG. 3. In this manner, double highlighting, as described in conjunction with FIG. 5, and speech synthesis, as described above, are performed on the word chosen by this process.
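  • The FIG. 10A-10C descent, region to line to word, might be sketched as follows, using the hierarchy classes above. Screen coordinates with y increasing downward are assumed, and the "word after the last word" cases are simplified to the last word itself; a sketch, not the patent's code:

```python
def nearest_word_by_structure(page, px, py):
    """Process 46' in outline: find the region whose box contains the
    pointer, walk its lines top to bottom, then walk the chosen line's
    words left to right (steps 180-234)."""
    for region in page.regions:                       # steps 180-194
        if not box_contains(region.box, px, py):
            continue
        for line in region.lines:                     # steps 201-216
            if py < line.box.y:                       # pointer above this line
                return line.words[0]                  # step 208: first word of the line
            if py <= line.box.y + line.box.height:    # pointer within the line's band
                for word in line.words:               # steps 220-234
                    if px <= word.box.x + word.box.width:  # step 224 comparison
                        return word
                return line.words[-1]                 # pointer past the last word (step 230)
        return region.lines[-1].words[-1]             # pointer below the last line (step 214)
    return None                                       # no region hit: keep the previous word

def box_contains(box, px, py):
    """True when (px, py) lies inside the box."""
    return (box.x <= px <= box.x + box.width and
            box.y <= py <= box.y + box.height)
```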

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)

Abstract

This reading machine contains a computer program residing on a computer readable medium, operable with an operating system that produces windows on a display unit. The program contains instructions that cause the computer to issue a first user-selected call to a first user-selected subroutine, in response to the user selecting a first word (250) for which the user seeks assistance in understanding, and to produce a text file for the first selected word comprising data that supports the user's call. An operating system subroutine instructs the operating system to create a new window on the display unit and fills the new window with the information contained in the text file (252, 254-260). A second user-selected call to a user-selected subroutine can be issued for a word for which the user seeks assistance in understanding, in order to obtain a recursive enrichment feature (262).
PCT/US1998/024134 1997-11-12 1998-11-12 Reading system that displays an enhanced image representation WO1999024969A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU14021/99A AU1402199A (en) 1997-11-12 1998-11-12 Reading system that displays an enhanced image representation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US96864397A 1997-11-12 1997-11-12
US08/968,643 1997-11-12

Publications (1)

Publication Number Publication Date
WO1999024969A1 (fr) 1999-05-20

Family

ID=25514554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/024134 WO1999024969A1 (fr) 1997-11-12 1998-11-12 Reading system that displays an enhanced image representation

Country Status (2)

Country Link
AU (1) AU1402199A (fr)
WO (1) WO1999024969A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1858005A1 * 2006-05-19 2007-11-21 Texthelp Systems Limited Speech streaming with server-generated synchronized markup
US8913138B2 (en) 2012-12-21 2014-12-16 Technologies Humanware Inc. Handheld magnification device with a two-camera module
US9298661B2 (en) 2012-12-21 2016-03-29 Technologies Humanware Inc. Docking assembly with a reciprocally movable handle for docking a handheld device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5500919A (en) * 1992-11-18 1996-03-19 Canon Information Systems, Inc. Graphics user interface for controlling text-to-speech conversion
US5509092A (en) * 1992-12-03 1996-04-16 International Business Machines Corporation Method and apparatus for generating information on recognized characters
US5623679A (en) * 1993-11-19 1997-04-22 Waverley Holdings, Inc. System and method for creating and manipulating notes each containing multiple sub-notes, and linking the sub-notes to portions of data objects


Also Published As

Publication number Publication date
AU1402199A (en) 1999-05-31

Similar Documents

Publication Publication Date Title
US5875428A (en) Reading system displaying scanned images with dual highlights
US6052663A (en) Reading system which reads aloud from an image representation of a document
US6137906A (en) Closest word algorithm
US6199042B1 (en) Reading system
US5999903A (en) Reading system having recursive dictionary and talking help menu
US8588528B2 (en) Systems and methods for displaying scanned images with overlaid text
US5350303A (en) Method for accessing information in a computer
US6068487A (en) Speller for reading system
US5327342A (en) Method and apparatus for generating personalized handwriting
US5978754A (en) Translation display apparatus and method having designated windows on the display
US6256610B1 (en) Header/footer avoidance for reading system
EP0609996A2 (fr) Method and device for creating, indexing and viewing summarized documents
JPH05233630A (ja) Method for writing Japanese or Chinese
KR910003523A (ko) Document data processing method using image data
US20150138220A1 (en) Systems and methods for displaying scanned images with overlaid text
JPH05151254A (ja) Document processing method and system
US6140913A (en) Apparatus and method of assisting visually impaired persons to generate graphical data in a computer
WO1999024969A1 (fr) Reading system that displays an enhanced image representation
JP6731011B2 (ja) Device for creating display data for electronic books
JPH0630107B2 (ja) Document processing device
JPS61240361A (ja) Document creation device using handwritten characters
EP0631677B1 (fr) Method and apparatus for storing and viewing documents
JP2901525B2 (ja) Character creation method
JP2003167503A (ja) Electronic learning machine, learning support method, and computer-readable recording medium storing a program for executing the learning support method
JPH10240120A (ja) Sign language learning device, sign language learning method, and sign language learning data storage medium

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase