WO2000013144A1 - Graphical display system and method

Info

Publication number: WO2000013144A1
Application number: PCT/US1999/019820
Authority: WO
Inventors: Robert H. Thibadeau
Original assignee: Antique Books, Inc.
Priority date: 1998-08-31
Filing date: 1999-08-27
Publication date: 2000-03-09
Also published as: EP1025548A1; JP2002523845A

Abstract

A computer-implemented method of preparing an image for graphical display. The image has a high spatial frequency component and a low spatial frequency component. The method includes receiving a scanned representation of the image and extracting the high spatial frequency component and the low spatial frequency component from the scanned representation of the image. The method also includes compressing the high spatial frequency component using a spatially lossless compression technique and compressing the low spatial frequency component using a spatially lossy compression technique.

Description

GRAPHICAL DISPLAY SYSTEM AND METHOD

INVENTOR Robert H. Thibadeau

CROSS-REFERENCE TO RELATED APPLICATIONS

(Not Applicable)

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

(Not Applicable)

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention is directed generally to a method and system of graphically displaying data with a high spatial frequency on a background with a low spatial frequency and, more particularly, to a system and method for graphically displaying text on a color background.

Description of the Background

A user-oriented approach to the graphical display of books, and especially antique books, has fundamentally different requirements from other approaches to digital libraries. For example, the problem of displaying antique books is not at all similar to the problem faced by a librarian. A librarian must be interested in gaining access to the book, not in the reader's experience of reading it. It is also not at all similar to the problem faced by the archivist who wishes to have views preserved for scholarly research. Instead, the approach is that of a publisher, or re-publisher, who wishes to preserve, as best as possible, the intent of the original publisher both in page layout and in gaining audience.

There are many projects underway that seek to digitize rare and precious old books, such as that undertaken by IBM as detailed in Gladney, H., "Safeguarding Digital Library Contents and Users: Digital Images of Treasured Antiquities", D-Lib Magazine, July/August 1997. There are also projects, such as the Gutenberg Project, which is detailed in Sanchez, R., "The Digital Press", Internet World, vol. 6, no. 9, September 1995, to convert old books to ASCII or machine readable form. The Universal Library Project, as detailed at www.ul.cs.cmu.edu/adlrc/index.html, has recently experimented with improving viewing fidelity applied to the problem of old books, whether the old books are precious or not. Another method is the "Adobe Capture" procedure, in which all the print on a page is substituted with ASCII text and matching fonts. The aforementioned projects have the disadvantage that they only convert the books to ASCII and do not preserve the pleasures of reading an old book. This is similar in spirit to the common practice of preserving antique furniture and other items even though a particular antique may not be of museum quality. The aforementioned projects also have the disadvantage that they do not provide a reader the option of trading off different grades of viewing fidelity against page display speed, even while reading the book.

Thus, there is a need for a graphical display system and method which presents the pages of a book as they actually appear with discolored pages, pictures and print, in full color, thus approximating the experience of reading the actual book. There is also a need for a graphical display system and method which preserves high-resolution, archival quality scans that are universally accessible on the Internet and in electronic form. There is also a need for a graphical display system and method which allows the reader of the book to trade off different grades of viewing fidelity against page display speed while reading the graphics.

SUMMARY OF THE INVENTION

The present invention is directed to a computer-implemented method of preparing an image for graphical display. The image has a high spatial frequency component and a low spatial frequency component. The method includes receiving a scanned representation of the image and extracting the high spatial frequency component and the low spatial frequency component from the scanned representation of the image. The method also includes compressing the high spatial frequency component using a spatially lossless compression technique and compressing the low spatial frequency component using a spatially lossy compression technique.

The present invention represents a substantial advance over prior graphical display systems and methods. The present invention has the advantage that it presents the pages of a book as they actually appear, with discolored pages, pictures and print, in full color, thus approximating the experience of reading the actual book. The present invention also has the advantage that it preserves high-resolution, archival quality scans that are universally accessible on the Internet and in electronic form. The present invention has the further advantage that it allows the reader of the book to trade off different grades of viewing fidelity against page display speed while reading the book.

BRIEF DESCRIPTION OF THE DRAWING

For the present invention to be clearly understood and readily practiced, the present invention will be described in conjunction with the following figures, wherein:

FIG. 1 is a diagram illustrating a graphical display system; and

FIGS. 2A - 2B are diagrams illustrating the flow through the system illustrated in FIG. 1.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a graphical display system 10. The system 10 includes a computer 12, which executes the various modules comprising the software portion of the system 10. The computer 12 may be any type of computing system suitable to execute the modules such as, for example, an IBM-compatible PC, an Apple Macintosh, a mainframe computer, a workstation, a personal digital assistant (PDA), or an application specific integrated circuit (ASIC).

A document scanner 14 scans an image from a document 15 and transmits a scanned representation of the image to the computer 12 via a communications link 16. The scanner 14 can be any suitable type of scanner such as, for example, any type of document scanner manufactured by Hewlett-Packard, Fujitsu, or Agfa that is capable of 300 DPI resolution with 8 bits each of red, green, and blue. The communications link 16 can be any suitable type of link such as, for example, hardwired RCA-type cables or BNC-type cables, or a wireless RF connection.

An extract module 18, a text module 20, a background module 22, and a merge module 24 are resident on and executable by the computer 12. The extract module 18 extracts high and low spatial frequency components from the scanned representation of the image. The text module 20 processes the high spatial frequency component of the scanned representation of the image and compresses it using a lossless compression technique such as, for example, graphics interchange format (GIF) compression. The background module 22 processes the low spatial frequency component of the scanned representation of the image and compresses it using a lossy compression technique such as, for example, Joint Photographic Experts Group (JPEG) compression. The merge module 24 merges the compressed low spatial frequency component and the compressed high spatial frequency component into a form for display on, for example, a video monitor of the computer 12.

The modules 18, 20, 22, and 24 can be implemented in any suitable type of computer language such as, for example, C or C++, and may be implemented using, for example, object-oriented techniques.

FIGS. 2A and 2B are diagrams illustrating the flow through the system 10 of FIG. 1.

At step 26, the document 15 is scanned, or digitally photographed, at a resolution of, for example, at least 600 dots per inch (about 24 per millimeter) in full 24-bit color to produce, for example, a raw RGB representation 28. Such a high resolution is often necessary because even old book printing technologies were capable of generating fine detail. Modern printing technologies, such as Offset and Gravure, commonly yield detail that would require scanning at 2400 dots per inch. Furthermore, while 600 dots per inch is usually sufficient for letterpress with carved plate engravings common to most books that are out of copyright, detail on a U.S. dollar bill, printed with Intaglio technology that is almost two centuries old, goes as fine as 0.001 inch. The rule of thumb, derived from standard signal theory, is to have two pixels spanning the finest distinct feature to be recovered. This implies that a U.S. dollar bill should be imaged at 2000 dots per inch if the objective is to preserve all the fine lines. The fine detail in modern Offset and Gravure can be significant (repeatable) even to 0.0001 inch in halftones, but, as the term "halftone" implies, the effective detail can usually be captured fully by a color camera scanning in the 2400 dot per inch range. The scan step 26 must be capable of fully and clearly capturing the print detail in the original image, with a bias towards oversampling.
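The two-pixel rule of thumb above reduces to simple arithmetic. The following Python sketch is illustrative only; the function name and its parameters are hypothetical and do not come from the described system.

```python
def required_dpi(finest_feature_inch, pixels_per_feature=2):
    """Scan resolution, in dots per inch, that places `pixels_per_feature`
    pixels across the finest feature to be recovered."""
    if finest_feature_inch <= 0:
        raise ValueError("feature size must be positive")
    return pixels_per_feature / finest_feature_inch


if __name__ == "__main__":
    # Line work as fine as 0.001 inch, as on the dollar bill example.
    print(round(required_dpi(0.001)))
```

With the default of two pixels per feature, a 0.001 inch line calls for a scan in the neighborhood of 2000 dots per inch, which matches the figure given for the dollar bill.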

Print imperfections are commonly manifest in excessively fine detail. Such imperfections, when present in old books, are often smaller than can be intentionally produced through the printing processes. However, a 600 dots per inch image capture for letterpress books typically provides enough visual accuracy to render print imperfections in a satisfactory, if not completely accurate, fashion. This same principle, that some non-focal aspects of the image can be successfully approximated while others need to be highly accurate, also holds for the non-printed components of the image.

At step 30, the high spatial frequency (text and illustrations 32) and low spatial frequency (background 44) components of the raw RGB representation 28 are extracted. High spatial frequency (fine detail) is typically associated with the printed detail of a book. The background, paper, and defect portions of the book typically do not require high spatial frequency. Because print is typically black, monochrome, duochrome, or some variant thereof, and the paper and any defects are rich in color, the background color subtleties can be preserved without preserving background detail. Step 30 thus gives an "improved view" quality to the scanned image.

Step 30 is slightly counterintuitive because the print is not extracted from the image and the result processed as background. Instead, the print is extracted from the image, and a background is independently generated by blanking print from the image. Independent parameters for the two operations are computed based on generally accepted image processing principles. Examples of the parameters are the pixel thresholds t₁ for extracting the text and illustrations 32 and t₂ for extracting the background 44, where t₁ > t₂. The reason for the bifurcation is that the independence gives great control over the eventual appearance of the rendering. Furthermore, because the display resolution is several times (more than two times) less than the scan resolution, aliasing artifacts that would be normal to such independent processing will tend to vanish at display resolution (typically 72 dots per inch).

At step 34, the text and illustrations 32 are enhanced by, for example, the application of a first Laplacian addition. A Laplacian addition, or image sharpening, is a standard signal processing technique that simultaneously smoothes and makes edges sharper. Laplacian addition techniques are described in Hall, "Computer Image Processing and Recognition", pp. 394 et seq., which is incorporated herein by reference. Other edge enhancement techniques such as, for example, unsharp masking and difference of Gaussians, can be used to enhance the text and illustrations 32.
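The independent extraction of step 30, with its two thresholds (t₁ for the print, t₂ for the background, t₁ > t₂), might be sketched as follows. This is only one possible reading: the description does not say what quantity the thresholds apply to, so the sketch assumes they apply to pixel darkness (255 minus an integer luminance estimate). The function names and the use of None as a marker are hypothetical.

```python
def luminance(rgb):
    """Integer luminance estimate (0 = black, 255 = white)."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000


def split_layers(image, t1, t2):
    """Split rows of (r, g, b) tuples into a text layer and a background layer.

    Assumption: thresholds act on darkness = 255 - luminance.
    Text layer: pixels darker than t1 are kept, the rest become None.
    Background layer: pixels darker than t2 are blanked (None) for later
    interpolation. With t1 > t2 the blanked area covers the print and its
    lighter fringe.
    """
    if not t1 > t2:
        raise ValueError("expected t1 > t2")
    text, background = [], []
    for row in image:
        text_row, bg_row = [], []
        for px in row:
            darkness = 255 - luminance(px)
            text_row.append(px if darkness > t1 else None)
            bg_row.append(None if darkness > t2 else px)
        text.append(text_row)
        background.append(bg_row)
    return text, background
```

Under this reading, a pixel on the soft edge of a letter is left out of the text layer yet is still blanked from the background, so each layer can be tuned without affecting the other.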

At step 36, the enhanced text and illustrations are averaged down to display resolution, using an averaging filter. At step 38, a second Laplacian addition is applied to the text and illustrations 32 at display resolution. The first enhancement at step 34 tends to cause detail to be preserved through the averaging process, and the second enhancement at step 38 helps remove the "defocussed" look that is common for averaging techniques.
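The enhance, average, enhance sequence of steps 34, 36, and 38 can be illustrated with a minimal Python sketch on a grayscale image held as a list of rows. It assumes the simplest form of Laplacian addition (subtracting a 4-neighbor Laplacian from each pixel) and an integer reduction factor; the kernel, the `strength` parameter, and all function names are assumptions, not details taken from the description or from Hall.

```python
def laplacian_addition(img, strength=1.0):
    """Sharpen: out = pixel - strength * (4-neighbor Laplacian), clamped to 0..255.
    Edge pixels reuse their own row or column as the missing neighbor."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            c = img[y][x]
            lap = (img[max(y - 1, 0)][x] + img[min(y + 1, h - 1)][x]
                   + img[y][max(x - 1, 0)] + img[y][min(x + 1, w - 1)] - 4 * c)
            out[y][x] = min(255, max(0, round(c - strength * lap)))
    return out


def average_down(img, factor):
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[round(sum(img[by * factor + dy][bx * factor + dx]
                       for dy in range(factor) for dx in range(factor))
                   / (factor * factor))
             for bx in range(w)] for by in range(h)]


def prepare_text_layer(img, factor, strength=1.0):
    """Step 34 (sharpen), step 36 (average down), step 38 (sharpen again)."""
    sharpened = laplacian_addition(img, strength)
    reduced = average_down(sharpened, factor)
    return laplacian_addition(reduced, strength)
```

A factor of 8, for instance, would take a 600 dots per inch scan to 75 dots per inch, close to the typical 72 dots per inch display resolution; a non-integer ratio would need a resampling filter instead of block averaging.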

At step 40 in FIG. 2B, the enhanced and reduced text and illustrations 32 are compressed using a lossless technique such as, for example, graphics interchange format (GIF). The result of the compression step, a GIF file 42, can be saved as a "transparent GIF" file. Using the GIF format has the advantage that it is a supported default format for most web browsers and does not require any "plugins" or other code downloads to a web client. Using GIF compression nevertheless achieves excellent digital compression of the original image. For example, for an original page image over twenty megabytes, the resultant processed image using the teachings of the present invention, both print and background, is approximately 50 kilobytes (400 to 1 compression is achieved by this technique). Because the techniques of the present invention focus on achieving display resolution, a higher resolution source (e.g. 2000 dots per inch) will only yield better compression ratios while preserving acceptable visual evidence of detail. As display technology improves to provide higher resolution displays, the effective compression ratios will decrease, but it is expected that Internet bandwidth will also increase, thus compensating for the decrease in effective compression ratios. Also, the fundamental properties of the present invention should not change until display resolution approximates the scan resolution (viz., is more than half scan resolution).

The GIF format is also useful for another reason. GIF provides for "transparency" (more generally, a special case of "alpha channels"). The print GIF is actually a transparent GIF in the browser. The background shows through the transparent part when displayed.
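The compression figures quoted for step 40 can be checked with a short calculation. In the Python sketch below, the 4 inch by 6 inch page size is a hypothetical example, not a value from the description, and the function names are illustrative.

```python
def raw_rgb_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Size in bytes of an uncompressed 24-bit RGB scan of a page."""
    return round(width_in * dpi) * round(height_in * dpi) * bytes_per_pixel


def compression_ratio(raw_bytes, compressed_bytes):
    """Ratio of original size to processed size."""
    return raw_bytes / compressed_bytes


if __name__ == "__main__":
    # Twenty megabytes reduced to about 50 kilobytes.
    print(compression_ratio(20_000_000, 50_000))
    # Hypothetical 4 x 6 inch page scanned at 600 and at 2000 dots per inch.
    for dpi in (600, 2000):
        raw = raw_rgb_bytes(4, 6, dpi)
        print(dpi, raw, compression_ratio(raw, 50_000))
```

Because the output is sized for the display and not for the scan, the processed size stays roughly constant while the raw size grows with the square of the scan resolution, which is why a higher resolution source yields a better ratio.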

The background 44 is processed such that it is converted into a true HTML "background" type. The nature of the processing of the background 44 is fundamentally different from the nature of the processing for the text and illustrations 32. Because there is not enough background data in a region to give a result that does not appear overly dark, the text and illustration areas of the background 44 are blanked at step 46. The blanking is performed through an interpolation process on the text and illustration areas, which computes a color value based on the distances and colors of the other, non-text pixels in the vicinity. Any number of interpolation strategies will work because the objective is to preserve background value in a subsequent averaging step needed for display resolution. For example, the pixel values of the background 44 can be lightened by subtracting black. This is the same as increasing the brightness of the background 44 by adding a constant to the red, green, and blue components of the background 44. If the background interpolation yields a background 44 that is too black, a repeating pattern of background pixel values may be generated from common light background pixel values. Because few of the pattern pixels appear in the final rendering of the background 44, a wide range of pattern techniques that preserve gradual color variation will work such as, for example, tiling a circularly grated square tile of, for example, a 16 X 16 pixel square. At step 48, a low pass filtering process is performed to remove fine detail to improve compression performance after an averaging step 50, which reduces the background 44 to display resolution.
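The blanking at step 46 computes a color for each text pixel from the distances and colors of nearby non-text pixels. Since the description notes that any number of interpolation strategies will work, the Python sketch below picks one, inverse-distance weighting within a square window, purely as an assumption. The `radius` parameter, the function name, and the use of None to mark blanked pixels are hypothetical.

```python
def blank_by_interpolation(layer, radius=3):
    """Fill blanked pixels of a background layer.

    layer: rows of (r, g, b) tuples, with None where print was removed.
    Each None is replaced by the inverse-distance-weighted mean of the known
    pixels within `radius`; a pixel with no known neighbor stays None.
    """
    h, w = len(layer), len(layer[0])
    out = [row[:] for row in layer]
    for y in range(h):
        for x in range(w):
            if layer[y][x] is not None:
                continue
            weight_sum = 0.0
            acc = [0.0, 0.0, 0.0]
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    px = layer[ny][nx]
                    if px is None:
                        continue  # skips the pixel itself and other blanks
                    weight = 1.0 / ((ny - y) ** 2 + (nx - x) ** 2) ** 0.5
                    weight_sum += weight
                    for i in range(3):
                        acc[i] += weight * px[i]
            if weight_sum > 0:
                out[y][x] = tuple(round(v / weight_sum) for v in acc)
    return out
```

Pixels left as None after this pass (for example, inside a large illustration) are the case in which the description falls back to a repeating pattern of common light background values.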

At step 52, the background 44 is compressed using a lossy compression technique such as, for example, the Joint Photographic Experts Group (JPEG) standard. The background 44 can be compressed to three alternative "quality" settings (as defined in the JPEG standard) having to do with preserving detail spatially and in subtlety in color. The JPEG standard is supported by most web browsers and tends to yield better compression of continuous tone (low spatial frequency) image data. Also, JPEG is a common HTML "background" MIME type.

At a display step 56, the compressed text and illustrations 32 and the compressed background 44 are merged for display by, for example, a web browser by aligning the text and illustrations 32 and the background 44 and overlaying the text and illustrations 32 on the compressed background 44. A user can select the quality by inputting a quality selection 58 to vary the fidelity of the view of the original page. The quality selection 58 selects, for example, one of the three JPEG background compressions for display. The different JPEG quality levels can represent, for example, the number of Discrete Cosine Transform (DCT) parameters permitted to represent a given block such as, for example, a 16 X 16 or an 8 X 8 area of pixels. The fewer the DCT parameters, the lower the fidelity of the display but the more compact the file and thus the quicker the time to display.
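The trade-off between the number of DCT parameters and fidelity can be illustrated on a single 8 X 8 block. The Python sketch below is a simplification: it keeps only the lowest-frequency coefficients of an orthonormal DCT and measures the reconstruction error, whereas actual JPEG encoders typically control quality through quantization of the coefficients. The ordering rule and all names are assumptions made for illustration.

```python
import math

N = 8  # block size


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block; result indexed [u][v]."""
    return [[_scale(u) * _scale(v) * sum(block[y][x] * _basis(y, u) * _basis(x, v)
                                         for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2; result indexed [y][x]."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(y, u) * _basis(x, v)
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]


def keep_lowest(coeffs, count):
    """Zero all but the `count` lowest-frequency coefficients (ordered by u + v)."""
    order = sorted(((u, v) for u in range(N) for v in range(N)),
                   key=lambda uv: (uv[0] + uv[1], uv[0]))
    kept = set(order[:count])
    return [[coeffs[u][v] if (u, v) in kept else 0.0 for v in range(N)]
            for u in range(N)]


def rms_error(a, b):
    return math.sqrt(sum((a[y][x] - b[y][x]) ** 2
                         for y in range(N) for x in range(N)) / (N * N))


if __name__ == "__main__":
    # A smooth gradient, standing in for a patch of discolored paper.
    block = [[100 + 10 * x + 5 * y for x in range(N)] for y in range(N)]
    for count in (1, 6, 64):
        approx = idct2(keep_lowest(dct2(block), count))
        print(count, rms_error(block, approx))
```

For smooth background data the error falls quickly as coefficients are added, which is why a low spatial frequency background tolerates the lower quality settings well.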

At step 56, the compressed text and illustrations and the compressed background images are aligned. The current HTML specification does not allow the explicit alignment of a background image with an overlaying transparent GIF image. Thus, the HTTP client or the user must be prompted to input the browser and the platform that is being employed in viewing the displayed image.

The text and illustrations 32 are overlaid on the background 44 at step 56. The GIF compression format provides for assigning an 8-bit pixel value as transparent. The JPEG compression standard provides that all pixel values be displayed. At step 56, the text and illustrations 32 are defined as the foreground image over the background 44 to ensure that the text and illustrations 32 pixel values replace the background 44 pixel values except where the text and illustrations 32 pixel values are defined as transparent. The result of the display step 56 can be stored in a memory device or on a storage device such as, for example, a floppy disk or a compact disc. The result of the display step 56 can also be processed by standard optical character recognition techniques for creating full text indices of scanned images.
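The overlay rule of step 56, in which foreground pixels replace background pixels except where the foreground is transparent, can be expressed in a few lines. The Python sketch below works on already decoded and aligned pixel arrays; the function name and the way the transparent value is passed are hypothetical.

```python
def overlay(foreground, background, transparent):
    """Composite a foreground layer over an aligned background layer.

    Wherever a foreground pixel equals `transparent`, the background pixel
    is used; everywhere else the foreground pixel replaces it.
    """
    if len(foreground) != len(background) or any(
            len(f_row) != len(b_row)
            for f_row, b_row in zip(foreground, background)):
        raise ValueError("layers must be aligned and equal in size")
    return [[b if f == transparent else f for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(foreground, background)]
```

In a browser this compositing is performed by the renderer itself when a transparent GIF is drawn over a page background; the sketch only makes the per-pixel rule explicit.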

While the present invention has been described in conjunction with preferred embodiments thereof, many modifications and variations will be apparent to those of ordinary skill in the art. For example, although the system and method has been described hereinabove as being used for the display of old books, the teachings of the present invention may be used to graphically display any type of image. The foregoing description and the following claims are intended to cover all such modifications and variations.

Claims

I claim:

1. A computer-implemented method of preparing an image for graphical display, the image having a high spatial frequency component and a low spatial frequency component, comprising: receiving a scanned representation of the image; extracting the high spatial frequency component and the low spatial frequency component from said scanned representation of the image; compressing the high spatial frequency component using a spatially lossless compression technique; and compressing the low spatial frequency component using a spatially lossy compression technique.

2. The method of claim 1, further comprising scanning the image to generate said scanned representation of the image before receiving said scanned representation of the image.

3. The method of claim 1, further comprising combining said compressed high spatial frequency component with said compressed low spatial frequency component to create a graphical image.

4. The method of claim 3, wherein combining said compressed high spatial frequency component with said compressed low spatial frequency component includes aligning said compressed high spatial frequency component with said compressed low spatial frequency component.

5. The method of claim 3, wherein combining said compressed high spatial frequency component with said compressed low spatial frequency component includes overlaying said compressed high spatial frequency component on said compressed low spatial frequency component.

6. The method of claim 3, further comprising displaying said graphical image.

7. The method of claim 6, wherein displaying the graphical image includes selecting the display quality of the graphical image.

8. The method of claim 1, wherein compressing the high spatial frequency component using a spatially lossless compression technique includes compressing the high spatial frequency component using GIF compression.

9. The method of claim 1, wherein compressing the low spatial frequency component using a spatially lossy compression technique includes compressing the low spatial frequency component using JPEG compression.

10. The method of claim 1, further comprising enhancing the high spatial frequency component before compressing the high spatial frequency component.

11. The method of claim 10, further comprising reducing the high spatial frequency component to display resolution after enhancing the high spatial frequency component.

12. The method of claim 11, further comprising enhancing the high spatial frequency component after reducing the high spatial frequency component to display resolution.

13. The method of claim 1, further comprising interpolating the low spatial frequency component before compressing the low spatial frequency component.

14. The method of claim 13, further comprising filtering the low spatial frequency component after interpolating the low spatial frequency component.

15. The method of claim 14, further comprising reducing the low spatial frequency component to display resolution after filtering the low spatial frequency component.

16. The method of claim 1, wherein compressing the high spatial frequency component using a spatially lossless compression technique includes compressing the high spatial frequency component using transparent GIF compression.

17. A graphical display system, comprising: a text module for receiving a high spatial frequency component of a scanned representation of an image and for compressing said high spatial frequency component using a lossless compression technique; a background module for receiving a low spatial frequency component of said scanned representation and for compressing said low spatial frequency component using a lossy compression technique; and a merge module in communication with said text module and said background module, said merge module for merging said compressed high spatial frequency component with said low spatial frequency component.

18. The system of claim 17, further comprising an extract module in communication with said text module and said background module for extracting said high frequency component and for extracting said low frequency component from said scanned representation of said image.

19. A graphical display system, comprising: a processor; and a memory, coupled to said processor, and storing a set of ordered data and a set of instructions which, when executed by said processor cause said processor to perform the steps of: receiving a scanned representation of an image, said image including a high spatial frequency component and a low spatial frequency component; extracting said high spatial frequency component and said low spatial frequency component from said scanned representation of the image; compressing said high spatial frequency component using a spatially lossless compression technique; and compressing said low spatial frequency component using a spatially lossy compression technique.

20. The system of claim 19, further comprising: a communications link connected to said processor; and a scanner connected to said communications link.

21. A computer readable medium having stored thereon instructions which, when executed by a processor, cause the processor to perform the steps of: receiving a scanned representation of an image, said image including a high spatial frequency component and a low spatial frequency component; extracting said high spatial frequency component and said low spatial frequency component from said scanned representation of the image; compressing said high spatial frequency component using a spatially lossless compression technique; and compressing said low spatial frequency component using a spatially lossy compression technique.

22. A graphical display apparatus, comprising: means for receiving a scanned representation of an image, said image including a high spatial frequency component and a low spatial frequency component; means for extracting said high spatial frequency component and said low spatial frequency component from said scanned representation of the image; means for compressing said high spatial frequency component using a spatially lossless compression technique; and means for compressing said low spatial frequency component using a spatially lossy compression technique.

23. A computer-implemented method of preparing an image of an antique book for graphical display, the image having high frequency spatial text and illustration component and a low frequency spatial background component, comprising: receiving a scanned representation of the image; extracting the text and illustration component and the background component from said scanned representation of the image; compressing said extracted text and illustration component using GIF compression; and compressing said extracted background component using JPEG compression.

24. The method of claim 23, further comprising combining said compressed text and illustration component with said compressed background component to create a graphical image of the book.

25. The method of claim 24, further comprising displaying said graphical image.