WO2005119572A2 - Image acquisition and interpretation system - Google Patents

Image acquisition and interpretation system

Info

Publication number
WO2005119572A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
volumetric
layers
layer
accordance
Prior art date
Application number
PCT/US2004/030474
Other languages
English (en)
Other versions
WO2005119572A3 (fr)
Inventor
David Falck
Jeffrey Sack
Shelly Mujtaba
Original Assignee
David Falck
Jeffrey Sack
Shelly Mujtaba
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by David Falck, Jeffrey Sack, Shelly Mujtaba
Publication of WO2005119572A2
Publication of WO2005119572A3

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition

Definitions

  • This invention relates generally to the acquisition of a workable image of an object of interest, and is more particularly directed toward a system that extracts and processes image and textual information from three-dimensional volumetric scans of printed material.
  • The system of the Haas invention acquires an image of the letter using ordinary x-ray technology, for which the lead-core pencil writing provides excellent contrast.
  • The image may be processed for delivery to the intended recipient in electronic form.
  • Haas suggests that multiple pages of text may in fact be scanned provided that the pages are written with marking materials having discrete contrast levels, so that each page in the document is characterized by text having its own unique contrast level.
  • U.S. Patent No. 5,522,921, to Peter Custer also suggests a conventional x-ray technique for reading pages inside a sealed envelope or even within a bound document. The Custer technique also depends upon the use of special marking materials.
  • Custer suggests that a marking medium opaque to x-rays may be suitable for single pages.
  • Custer also discusses scanning techniques involving ultraviolet (UV) light in which the selected marking medium either fluoresces upon exposure to UV, or is UV absorptive.
  • U.S. Patent No. 5,288,994 to William Berson suggests that most types of paper and envelopes used in ordinary correspondence are transparent to near infrared radiation, and that workable scans of such documents can be obtained using infrared scans of sufficient intensity.
  • Yet another methodology is revealed in U.S. Patent No. 5,811,792 to Gerrit Verschuur et al.
  • Computed tomography in its simplest form refers to an x-ray technique in which an object of which the image is desired is interposed between an x-ray source and x-ray detector.
  • The output signal of the detector is a measure of the attenuation experienced by the x-rays due to the mass of heterogeneous radio-density in their path.
  • Each volume element, or discrete 3D representation of physical space, or voxel, of the object may have an attenuation coefficient associated with it, where the attenuation coefficient may correspond to an intensity value of relative density.
  • The x-ray source and detector are linearly translated and rotated to span the entire object.
  • The above-described simple form (translate-rotate) has been further enhanced by second-generation tomography techniques that employ multiple x-ray beams and an array of detectors, significantly reducing acquisition time.
  • Further advancements in the field, commonly referred to as third-generation tomographs, include an array of detectors arranged in one or more rows in the shape of a circular arc sufficiently large that the object is within the detector field of view at all times.
  • A method is disclosed for acquiring and interpreting a three-dimensional (3D) volumetric scan of an object, wherein the object includes a plurality of surfaces and radio-densities containing information of interest.
  • The method comprises the steps of acquiring three-dimensional volumetric data related to the object, analyzing the volumetric data to determine physical boundaries of the surfaces containing information of interest, and extracting two-dimensional image components of the information of interest.
  • The step of acquiring three-dimensional volumetric data further comprises the step of scanning the object to acquire layers of multiple data points, wherein each layer represents a unique slice of the object oriented approximately parallel to a predetermined planar reference.
  • Scanning of the object may be accomplished through tomographic scans of the object to acquire the volumetric data.
  • The step of analyzing the layers of data further comprises digitizing each layer of volumetric data and storing the extracted information in a data structure. Digitizing each layer of volumetric data may further comprise the step of organizing the volumetric data in the data structure into voxels, wherein each voxel has an associated metric.
  • The metric may be an attenuation coefficient value derived from the volumetric scan of the object, based on differences in radio-density of a voxel in relation to adjacent voxels.
  • Any obtainable property or attribute of the voxel relative to adjacent voxels that allows discernment between ink, paper, and air may also be stored and utilized for analysis.
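As an illustration of how the layer-by-layer digitization into voxels with an associated metric might look, here is a minimal Python sketch. The density cut-offs `AIR_MAX` and `PAPER_MAX` are hypothetical placeholders for the empirically established ranges the text assumes, and all names are illustrative rather than from the patent:

```python
import numpy as np

# Hypothetical attenuation-coefficient cut-offs; the text assumes the
# relative density ranges for air, paper, and ink are established empirically.
AIR_MAX = 0.10
PAPER_MAX = 0.60  # anything denser is treated as ink

def classify_voxel(attenuation):
    """Map a voxel's attenuation coefficient to a material label."""
    if attenuation <= AIR_MAX:
        return "air"
    if attenuation <= PAPER_MAX:
        return "paper"
    return "ink"

# Stack digitized slices (2-D arrays of attenuation values) into one 3-D
# data structure: axis 0 = slice depth (z), axis 1 = y, axis 2 = x.
slices = [np.full((4, 4), 0.05),   # a slice of pure air
          np.full((4, 4), 0.40)]   # a slice of pure paper
volume = np.stack(slices, axis=0)

# Per-voxel material labels derived from the stored metric.
labels = np.vectorize(classify_voxel)(volume)
```

Any other discriminating per-voxel attribute could replace the attenuation coefficient here; only the classification function would change.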
  • The step of analyzing the layers of data further comprises the steps of searching through each layer of volumetric data, beginning at a data point known to represent a point outside the object, to establish a threshold value; comparing attenuation coefficient values associated with each point within the layers of volumetric data against the threshold value to establish physical boundaries of the surfaces within the object containing information of interest; and storing the data points representing those surfaces as a data structure of surface-related data points, or voxels.
  • The process may also include normalizing the data structure of surface-related data points to approximate the actual orientation of the surfaces within the object, allowing for programmatic correction of surface curvature.
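The threshold-from-a-known-outside-point search described above can be sketched as follows. The function name and the 0.05 noise margin are assumptions made for illustration, not values from the text:

```python
import numpy as np

def find_surface_voxels(volume, outside=(0, 0, 0), margin=0.05):
    """Establish a threshold from a data point known to lie outside the
    object (i.e., in air), then keep every voxel whose attenuation
    coefficient exceeds that threshold as a surface-related data point."""
    threshold = volume[outside] + margin
    surface = {idx: float(volume[idx])
               for idx in np.ndindex(volume.shape)
               if volume[idx] > threshold}
    return threshold, surface

# Tiny volume: air everywhere (0.02) except one paper layer at z = 1 (0.5).
vol = np.full((3, 2, 2), 0.02)
vol[1] = 0.5
thr, surf = find_surface_voxels(vol)
```

The returned dictionary plays the role of the "data structure of surface-related data points"; a normalization pass over it could then re-orient the surfaces and correct for curvature.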
  • The step of extracting two-dimensional image components further comprises the steps of searching the layers of data along each surface of the object, determining relative attenuation coefficient values associated with each voxel disposed within each surface, and constructing a two-dimensional data structure in a predetermined format.
  • The two-dimensional image component that represents the surface of a page can be created along any plane, or set of planes, oriented in any dimension of the three-dimensional data structure.
  • The predetermined format of the two-dimensional image component may comprise a memory-based data object or a physical representation such as a TIFF image file.
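A minimal sketch of constructing such a two-dimensional image component from one surface of the 3D data structure. The names are hypothetical, the page is assumed flat at a single depth index (curvature correction would adjust the depth per point), and only the memory-based data object is built; producing an actual TIFF file would ordinarily use an imaging library such as Pillow:

```python
import numpy as np

def extract_page_image(volume, z):
    """Slice one page surface out of the 3-D matrix and scale its relative
    attenuation values to an 8-bit grey image (dense ink comes out dark on
    a light paper background)."""
    layer = volume[z].astype(float)
    lo, hi = layer.min(), layer.max()
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    grey = 255.0 - (layer - lo) * scale        # invert: dense = dark
    return np.clip(grey, 0, 255).astype(np.uint8)

# Hypothetical page layer: paper background (0.4) with two ink voxels (0.9).
vol = np.full((2, 4, 4), 0.4)
vol[1, 1, 1] = vol[1, 2, 2] = 0.9
page = extract_page_image(vol, 1)
```

The same routine applies along any plane of the volume, matching the text's point that the component can be created in any dimension of the data structure.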
  • The method may also include analyzing the two-dimensional image components using optical character recognition (OCR) to create a text file derived from the image components.
  • The method may further comprise subjecting the two-dimensional image components to additional analysis to extract information, and creating an image file corresponding to the two-dimensional image components.
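The OCR stage might be wired up as below. To keep the sketch self-contained, the OCR engine is passed in as a callable; `pages_to_text` is a hypothetical name, and the stub engine in the demonstration stands in for a real one (the text does not name a specific OCR engine; in practice this could be, for example, pytesseract's image-to-string call):

```python
def pages_to_text(page_images, ocr):
    """Run each extracted 2-D page image through an OCR engine and join
    the recognized text into one document, with a form feed between
    pages. `ocr` is any callable mapping an image to a string."""
    return "\f".join(ocr(img) for img in page_images)

# Demonstration with a stub engine: "images" are strings, and the "OCR"
# step merely upper-cases them.
text = pages_to_text(["page one", "page two"], ocr=str.upper)
```

Because the engine is injected, the same pipeline can feed a language-specific recognizer, a translator, or any other per-page analysis without change.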
  • Additional output formats may include file formats that store both text and image information. Apparatus for accomplishing the recited method is also disclosed. Further objects, features, and advantages of the present invention will become apparent from the following description and drawings.
  • FIGS. 1(A) and 1(B) represent, schematically, a system in accordance with the present invention
  • FIG. 2 is a perspective view of a bound volume shown from the top
  • FIG. 3 is a side elevational view of a bound volume
  • FIG. 4 is a stylized isometric view of a bound volume illustrating the orientations of the x-, y-, and z-axes
  • FIG. 5 is a cross-sectional view of a book along the x-z plane
  • FIG. 6 represents a 3D portion of a volume being processed, with each spherical element representing a voxel of volumetric data
  • FIG. 7 illustrates the curvature correction and path taken by the algorithm along the page surface, as well as the voxel layers representing the pages of a bound document;
  • FIG. 8 is a flowchart of the data acquisition process and algorithms used to determine page separation and to extract page data from tomographic scans;
  • FIG. 9 is a sample of a sub-section of a book scan after processing with the custom algorithm;
  • FIG. 10 is another sample of a sub-section of a book scan after processing; and
  • FIG. 11 illustrates page separations.
  • In FIG. 1, an x-ray source A is directed so that x-rays C pass through an object such as a bound document E.
  • An array of detectors B is disposed along a circular arc such that x-rays D that have passed through the volume E impinge upon the detectors B.
  • The bound volume is placed within a chamber F, while shielding G is provided for the sake of safety.
  • The shielding G may also be part of an external harness to which the x-ray source A and detectors B are attached to enable rotation of each around the bound document E.
  • FIG. 2 is a perspective view of a bound volume or book 201 demarcated in three dimensions (x corresponds to page width, y to page length, and z to the depth of the volume).
  • A three-dimensional (3D) tomographic scan of the book 201 is made by taking multiple slices of the book along the x-y plane or the x-z plane.
  • FIG. 4 illustrates the orientations of the x-axis 401, y-axis 402, and z-axis 403, as well as the axial nature of the tomographic slices 404.
  • Each slice is digitized by a scanner and saved as an individual image file (such as a TIFF file, for example).
  • FIG. 5 is a cross-sectional view of a book along the x-z plane.
  • Custom algorithms determine page separation based on page slope, page thickness, and relative differences in the attenuation coefficient of air, paper, and ink.
  • Each slice is analyzed programmatically, and each voxel of the data is assigned a value based upon its intensity value of relative radio-density.
  • The slices are combined and the collected data points are stored in a three-dimensional matrix. Assume that the relative density ranges for air, paper, and ink have already been empirically established.
  • FIG. 6 represents a 3D portion of a volume being processed, with each spherical element 601 representing a voxel of volumetric data. For illustrative purposes, only the voxels representing pages are displayed. Page curvature, which can be accounted for programmatically, is illustrated.
  • The algorithm can be further described as follows (refer to FIG. 8 for a flowchart): 1. Travel to the x+1 data point from S(x,y,z), i.e., S(x+1,y,z). If E_up(x+1,y,z) is true, store this data point in the 2-D matrix as M(x+1,y,z) (see FIG. 8, block 807). 2. Otherwise, travel to S(x+1,y,z-1) and S(x+1,y,z+1); one of these should return true for E_up (see FIG. 8, block 810). Store this value in the 2-D matrix (see FIG. 8, block 811). (This is the simplest form of the algorithm; in a more complex form, the variation along the z-axis could be wider than two data points, depending on a predetermined z-value threshold.)
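Under the assumption that E_up(x, y, z) is a predicate indicating that the voxel at (x, y, z) lies on the upper surface of a page, steps 1 and 2 above can be sketched in Python (all names are illustrative, not from the text):

```python
def trace_page_row(e_up, x_start, x_end, y, z):
    """Follow one row of a page surface. From a known surface point
    S(x_start, y, z), step along x; at each step accept S(x+1, y, z) if
    E_up holds there, otherwise try z-1 and z+1 (the simplest form, where
    z varies by at most one voxel per step). Accepted points are what the
    text stores in the 2-D matrix M; tracing stops if the surface is lost."""
    stored = []
    for x in range(x_start + 1, x_end + 1):
        for dz in (0, -1, 1):
            if e_up(x, y, z + dz):
                z += dz                 # track the page curvature
                stored.append((x, y, z))
                break
        else:
            break                        # no candidate matched: surface lost
    return stored

# Synthetic curved page: the surface rises one voxel every three x steps.
surface_z = {x: 5 + x // 3 for x in range(10)}
e_up = lambda x, y, z: z == surface_z[x]
row = trace_page_row(e_up, x_start=0, x_end=9, y=0, z=5)
```

Widening the inner loop's `dz` candidates would give the more complex form the text mentions, where z may vary by more than two data points up to a predetermined threshold.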
  • FIG. 7 illustrates the curvature correction and path taken by the algorithm along the page surface, as well as the voxel layers representing the pages of a bound document.
  • The system of the present invention has the capability to recognize characters in any language, as well as being able to differentiate information printed on both sides of each page. Pages can simply be imaged, or the raw text can be manipulated to produce translations, for example. The raw text can be exported in any desired format.
  • The algorithm can be designed to compensate for page sections that may not align properly due to the age or condition of the original.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The present invention concerns an image acquisition and interpretation system in which textual and image-based data are extracted and processed from printed materials in three dimensions. The system allows the user to extract data from bound materials without physically handling the documents in any way. Applications of the invention include the optical reading, processing, and specialized searching of ancient or fragile texts (books, scrolls, etc.) as well as legal or medical records. This allows the system's user to manipulate, search, and hyperlink within physical documents in the same way as with an electronic document. In one embodiment of the invention, an x-ray source (A) is directed so that x-rays (C) pass through an object such as a bound document (E). An array of detectors (B) is arranged in a circular arc so that x-rays (D) that have passed through the volume (E) impinge upon the detectors (B).
PCT/US2004/030474 2004-05-25 2004-09-17 Image acquisition and interpretation system WO2005119572A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US57405404P 2004-05-25 2004-05-25
US60/574,054 2004-05-25

Publications (2)

Publication Number Publication Date
WO2005119572A2 true WO2005119572A2 (fr) 2005-12-15
WO2005119572A3 WO2005119572A3 (fr) 2006-10-26

Family

ID=35463586

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/030474 WO2005119572A2 (fr) 2004-05-25 2004-09-17 Image acquisition and interpretation system

Country Status (1)

Country Link
WO (1) WO2005119572A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2938065A1 (fr) * 2008-11-05 2010-05-07 I2S Method for the three-dimensional digitization of books using terahertz waves

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6591004B1 (en) * 1998-09-21 2003-07-08 Washington University Sure-fit: an automated method for modeling the shape of cerebral cortex and other complex structures using customized filters and transformations

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2938065A1 (fr) * 2008-11-05 2010-05-07 I2S Method for the three-dimensional digitization of books using terahertz waves
WO2010052427A1 (fr) * 2008-11-05 2010-05-14 I2S Method for the three-dimensional digitization of books using terahertz waves
US8564855B2 (en) 2008-11-05 2013-10-22 I2S Method for the three-dimensional digitization of books using terahertz radiation

Also Published As

Publication number Publication date
WO2005119572A3 (fr) 2006-10-26

Similar Documents

Publication Publication Date Title
CN110569832B (zh) Real-time text localization and recognition method based on a deep learning attention mechanism
Aradhye A generic method for determining up/down orientation of text in roman and non-roman scripts
JP3289968B2 (ja) Apparatus and method for electronic document processing
US20100033772A1 (en) Multi-page Scanner/Copier and technique/method to simultaneously scan without separating pages or uncoupling documents or books
US8645819B2 (en) Detection and extraction of elements constituting images in unstructured document files
EP0544431B1 (fr) Methods and apparatus for selecting semantically significant images in document images without decoding the image content
JP3576570B2 (ja) Comparison method
US7970171B2 (en) Synthetic image and video generation from ground truth data
EP0738987B1 (fr) Processing of machine-readable forms
Joshi et al. Script identification from Indian documents
JP2006285612A (ja) Information processing apparatus and method
JP4785655B2 (ja) Document processing apparatus and document processing method
GB2383223A (en) Correction of orientation of electronic document.
EP2291994A2 (fr) Discovering the image capture date of a hard-copy medium
Applbaum et al. The use of medical computed tomography (CT) imaging in the study of ceramic and clay archaeological artifacts from the ancient near east
US8605297B2 (en) Method of scanning to a field that covers a delimited area of a document repeatedly
CN106529597A (zh) Scanned image file generation device
WO2005119572A2 (fr) Image acquisition and interpretation system
EP0585074A2 (fr) Automatic image generation by merging a text image with a form image
US20060109526A1 (en) Case divider for organizing patient films
Tsuji Document image analysis for generating syntactic structure description
US10890543B2 (en) Book digitization apparatus and book digitization method
EP2289023A1 (fr) Determination of the orientation of a scanned paper print
EP3005676B1 (fr) Classification of scanned paper media
Trieu et al. Machine printed and handwritten text discrimination in korean document images

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 EP: PCT application non-entry into European phase