EP1421555A1 - Processing of digital images - Google Patents
Processing of digital images
Info
- Publication number
- EP1421555A1 EP02741593A
- Authority
- EP
- European Patent Office
- Prior art keywords
- subareas
- threshold
- image
- subarea
- values
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/142—Image acquisition using hand-held instruments; Constructional details of the instruments
- G06V30/1423—Image acquisition using hand-held instruments; Constructional details of the instruments the instrument generating sequences of position coordinates corresponding to handwriting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
Definitions
- the present invention relates in general to processing of digital images, in particular thresholding or binarization thereof.
- the invention is particularly, but not exclusively, aimed at preliminary image processing prior to calculation of position information on the basis of the shape and/or location of an object in a digital image.
- In digital image processing, it is sometimes desirable to separate some form of structure from a background in a digital image. This can be achieved by so-called thresholding or binarization, in which the luminance values of the pixels of the digital image are compared to a threshold value. More particularly, luminance values above the threshold value are set to 1, while luminance values below the threshold value are set to 0, or vice versa. With a well-selected threshold value, the thresholding results in a binary image with defined, real structures.
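The comparison described above can be sketched in a few lines. This is only an illustration: the function name `binarize_global`, the `invert` flag and the sample patch are assumptions made for the example and do not come from the patent text.

```python
def binarize_global(image, threshold, invert=False):
    """Compare every pixel with a single global threshold value.

    `image` is a list of rows of luminance values (for example 0..255).
    Pixels above the threshold become 1 and all others 0; with
    invert=True the mapping is reversed (the "vice versa" case).
    """
    above, below = (0, 1) if invert else (1, 0)
    return [[above if pixel > threshold else below for pixel in row]
            for row in image]


# Hypothetical 3 x 4 greyscale patch: a dark dot on a bright background.
patch = [
    [200, 210, 205, 198],
    [202,  40,  35, 201],
    [199, 207, 203, 200],
]
binary = binarize_global(patch, threshold=120)
```

With a threshold of 120, the two dark pixels in the middle row map to 0 and the bright background maps to 1.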
- a sequence of images is processed in a number of steps.
- One of the introductory steps can be the above-mentioned thresholding, which aims on the one hand to locate relevant structures and, on the other, to reduce the amount of data that is processed in subsequent steps.
- it is desirable for the thresholding to be carried out with high precision, as errors will otherwise propagate in the subsequent processing steps.
- the thresholding means that the luminance value of each pixel in a current image is compared with a global threshold value.
- global thresholding requires, however, extensive control of the image recording so as to avoid luminance variations and noise.
- there are often variations within each image, for example with regard to the background luminance, signal-to-noise ratio and sharpness.
- with global thresholding, such variations can lead to structures being missed or to fictitious structures being identified, particularly at the periphery of the images.
- An example where the above considerations arise is in calculating a position based on images of a pattern on a base.
- the pattern contains individual symbols, the shape and/or relative location of which code said position.
- the images can, for example, be recorded optically by a sensor in a hand-held apparatus, for example in the form of a pen.
- a pen for position determination is described, for example, in US-A-5 051 736, US-A-5 477 012 and WO 00/73983 and can be used to record handwritten information digitally.
- the above-mentioned images can be processed in a data processing unit, such as a suitably programmed microprocessor, an ASIC, an FPGA, etc, which receives a sequence of digital greyscale images, converts these to binary for identification of the above-mentioned symbols, and calculates a position on the basis of each binarised image.
- a threshold matrix is used that contains a threshold value for each pixel in the greyscale image.
- a derivative matrix is calculated by convolution of the greyscale image with a suitable derivative mask, whereupon the pixel values of the derivative matrix are multiplied by the pixel values of the greyscale image to create a product matrix. Thereafter the derivative matrix and the product matrix are divided into subareas, within which a respective sum of the pixel values is calculated.
- US-A-5 764 611 describes a thresholding method to be applied to greyscale images containing a pattern of dark dots against a bright background.
- the greyscale image is divided into subareas, within which the pixel values are summed to create a sum matrix.
- a low-pass filter is then applied to this sum matrix to create a background matrix, which after multiplication by a suitable fraction value is considered to form a matrix of local threshold values.
- this thresholding method is hampered by being sensitive to lack of sharpness in the greyscale image. Such lack of sharpness must be eliminated by extensive pre-processing of the greyscale image.
- the contrast matrix is then used to produce coefficients for a subarea-specific contrast function, which is finally allowed to operate on the greyscale image in order to eliminate the lack of sharpness in the same.
- This pre-processing comprises several time-consuming operations and is, in addition, memory-intensive in that it requires the intermediate storage of both the greyscale image and the result of the high-pass filtering.
- Prior-art technique also includes US-A-4 593 325, which describes a method for adaptive thresholding of a greyscale image prior to binary duplication of the same.
- An object of the present invention is thus to demonstrate a technique which permits the identification of individual objects in a digital image in a quick and memory-efficient way.
- a further object is to demonstrate a technique that is relatively unaffected by variations in luminance and/or sharpness within an image.
- a reference image which is representative of the digital image which is to be processed, for the calculation of the threshold matrix.
- This reference image is given two predetermined, overlapping divisions, into first and second subareas. For each first subarea, a background luminance value is estimated, and for each second subarea, an object luminance value is estimated. Based on these estimated values, a threshold value is calculated for each overlapping first and second subarea.
- the threshold values form a threshold matrix which is used for binarization of the digital image.
- the invention enables rapid extraction of a low resolution background image and a low resolution object image from the reference image, after which the threshold matrix is created by threshold values being determined, according to some criterion, to be between associated values in the background and object images.
- the method according to the invention is a direct method which can be carried out quickly and with relatively low demands as to available memory capacity, since the threshold values can be calculated directly, subarea by subarea, based on the estimated background and object luminance values.
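A minimal sketch of the subarea-by-subarea calculation, under simplifying assumptions that are not fixed by the text: the first and second subareas coincide, the background estimate is the greatest luminance and the object estimate the least luminance in each subarea (dark objects on a bright background), and the threshold is placed a fixed `fraction` of the contrast below the background. The function name and the sample image are hypothetical.

```python
def threshold_matrix(image, block, fraction=0.5):
    """Return one threshold per block x block subarea of `image`.

    Assumes image dimensions are multiples of `block`.
    """
    rows, cols = len(image), len(image[0])
    matrix = []
    for top in range(0, rows, block):
        matrix_row = []
        for left in range(0, cols, block):
            pixels = [image[r][c]
                      for r in range(top, top + block)
                      for c in range(left, left + block)]
            background = max(pixels)   # background luminance estimate
            obj = min(pixels)          # object luminance estimate
            contrast = background - obj
            # Place the threshold between the object and background values.
            matrix_row.append(background - fraction * contrast)
        matrix.append(matrix_row)
    return matrix


# Hypothetical 4 x 4 image: well-lit left half, dimmer right half.
image = [
    [200, 200, 100, 100],
    [200,  40, 100,  20],
    [180, 180,  90,  90],
    [ 60, 180,  90,  30],
]
T = threshold_matrix(image, block=2)   # [[120.0, 60.0], [120.0, 60.0]]
```

Each threshold follows the local background, so the dimmer right half receives a lower threshold than the well-lit left half.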
- the method can be carried out without any calculation-intensitive operations, such as convolutions and divisions, even if the method, when necessary, can be supplemented with such an operation, for instance for filtering.
- individual objects can be identified in a digital image also with variations in luminance and/or sharpness within the same.
- the estimated background luminance values represent the variation in background luminance over the reference image, or at least a relevant portion thereof, which means that the threshold values can be determined in adequate relation to the background.
- the estimated object luminance values represent, in combination with the estimated background luminance values, the variation in contrast over the reference image or said portion, which means the threshold values can also be determined in adequate relation to the sharpness.
- the division of the reference image, or a portion thereof, into subareas does not have to be a physical division.
- the subareas are as a rule intended and used for extraction of data from the reference image.
- the first and second subareas conveniently extend over one and the same continuous portion of the reference image, and in particular so that each second subarea at least partly comprises a first subarea. This guarantees that a threshold value can be calculated for each subarea.
- the first subareas are mutually exclusive and the second subareas are mutually exclusive.
- Such first and second subareas do not overlap, and are suitably arranged side by side within the relevant portion of the reference image.
- the first and second subareas may be identical, as regards size as well as relative position, i.e. the first and second subareas coincide.
- the first subareas partly overlap each other, and/or the second subareas partly overlap each other.
- Such an embodiment makes it possible to calculate the threshold matrix with greater accuracy and higher resolution. Moreover, the appearance of artefacts in the joint between subareas is minimised.
- the size of the first and second subareas may be adjusted to optimal estimation of the background luminance values and object luminance values respectively, as will be discussed in more detail below.
- threshold values are generated in a threshold matrix.
- the term threshold matrix is to be interpreted figuratively to relate to an assembly of threshold values which are related to one part each of the reference image.
- Such a threshold matrix can be stored in a memory unit as a matrix, a plurality of vectors, or an assembly of individual threshold values.
- the calculated threshold values may be written one upon the other in one and the same memory space in the memory unit, or be given as a sequence of values which is directly used for binarisation of the digital image.
- the above-mentioned reference image may be any image which is representative of the digital image which is to be binarised.
- the reference image may consist of an image in this sequence of digital images.
- the reference image consists of the digital image which is to be binarised.
- the digital image thus is received, after which the background and object luminance values are estimated for the subareas and the threshold matrix is calculated. Then the threshold matrix is used for binarisation of the digital image.
- the threshold matrix is calculated intermittently on the basis of the luminance values of a current image in the sequence of digital images, which threshold matrix can then be applied for the binarisation of one or more subsequent images in the sequence of digital images.
- the calculation of the threshold matrix can be carried out in parallel with the actual thresholding, whereby more images can be processed per unit of time.
- the need for intermediate storage of the digital images is avoided, as these can be processed by direct comparison with an already-calculated threshold matrix. This is due to the fact that the algorithms according to the invention are sufficiently robust to permit calculation of the threshold values from a reference image which is similar but not identical to the image to which the thresholding is to be applied.
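The reuse of an already-calculated threshold matrix on subsequent images can be sketched as follows. This is a sequential stand-in for processing that the text says can run in parallel, and every name here (`binarize_sequence`, `midpoint`, `compare`, the sample frames and the initial value 128) is a hypothetical placeholder. For brevity each frame is a flat list and the "matrix" is a single value.

```python
def binarize_sequence(frames, compute_thresholds, apply_thresholds, initial):
    """Binarise each frame with thresholds computed from the previous frame."""
    thresholds = initial
    for frame in frames:
        # The current frame is compared directly with thresholds that already
        # exist, so the frame itself need not be stored in the meantime.
        yield apply_thresholds(frame, thresholds)
        # The current frame then acts as the reference image for later frames.
        thresholds = compute_thresholds(frame)


def midpoint(frame):
    """Toy one-value threshold 'matrix': midway between darkest and brightest."""
    return (max(frame) + min(frame)) / 2


def compare(frame, threshold):
    return [1 if pixel > threshold else 0 for pixel in frame]


frames = [[200, 40, 210], [190, 50, 205]]
binary_frames = list(binarize_sequence(frames, midpoint, compare, initial=128))
```

The second frame is binarised with the value derived from the first frame, which works here because the two frames are similar but not identical.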
- the method according to the invention can be based on a simple and calculation-efficient estimate of the background and object luminance values for each subarea.
- the background luminance value is estimated on the basis of first order statistics of the luminance values of the pixels within the first subarea.
- First order statistics for example comprising the least value, the greatest value, the median value, the mean value and the sum of the pixels' luminance values within a subarea, can be extracted from a greyscale image in a calculation-efficient way.
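The first order statistics listed above are cheap to compute. A short sketch using Python's standard `statistics` module; the function name and the pixel values are made up for illustration.

```python
from statistics import mean, median


def first_order_stats(pixels):
    """First order statistics of the luminance values in one subarea."""
    return {
        "least": min(pixels),
        "greatest": max(pixels),
        "median": median(pixels),
        "mean": mean(pixels),
        "sum": sum(pixels),
    }


# Hypothetical subarea flattened to a list of luminance values.
stats = first_order_stats([200, 198, 40, 202, 35])
```

All five values can be obtained in a single pass over the subarea except the median, which requires sorting or a selection step.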
- the object luminance value can be estimated on the basis of first order statistics of the luminance values of the pixels within the second subarea.
- the background luminance value is estimated on the basis of the greatest luminance value of the pixels within the first subarea.
- a luminance value can be extracted from a greyscale image quickly and in a calculation-efficient way.
- the background luminance value is estimated on the basis of the mean value of the luminance values of the pixels within the first subarea.
- the background luminance value is estimated on the basis of a percentile value, for example in the range [...]. In a corresponding way, the object luminance value can be estimated on the basis of the least luminance value of the pixels within the second subarea.
- the subareas can be designed so that each of the second subareas comprises a whole number of the first subareas, whereby the threshold matrix is minimised since this only needs to contain one threshold value for each of the first subareas.
- the second subareas are designed in such a way that they each contain at least a part of at least one of the objects that are to be identified. Accordingly, each second subarea contains with certainty at least one value of the luminance within an object. All threshold values in the resultant threshold matrix will thereby be related both to the objects and to the background against which the objects appear, and this is achieved without a separate selection for the elimination of threshold values belonging to subareas that do not contain any part of an object. It is preferable that the second subareas are designed in such a way that they each contain at least one object in its entirety, which guarantees that each second subarea contains the object's extreme value in luminance. This simplifies the calculation of an adequate threshold value.
- the image is preferably divided into first subareas that are larger than the objects that are to be identified, whereby each first subarea contains with certainty at least one value of the background luminance between the objects, in relation to which the threshold value can be set.
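A sketch of two divisions with different subarea sizes, where each second subarea comprises a whole number of first subareas and the threshold matrix holds one value per first subarea. The greatest and least luminance are used as the background and object estimates, and the threshold sits a fixed `fraction` of the contrast below the background; these choices, the function names and the sample image are assumptions made for the example.

```python
def block_reduce(image, block, reducer):
    """Apply `reducer` (for example max or min) to each block x block subarea."""
    return [[reducer(image[r][c]
                     for r in range(top, top + block)
                     for c in range(left, left + block))
             for left in range(0, len(image[0]), block)]
            for top in range(0, len(image), block)]


def thresholds_two_divisions(image, first, second, fraction=0.5):
    """`first` and `second` are subarea side lengths; `second` is a multiple of `first`."""
    if second % first != 0:
        raise ValueError("second subareas must contain a whole number of first subareas")
    ratio = second // first
    background = block_reduce(image, first, max)   # one value per first subarea
    objects = block_reduce(image, second, min)     # one value per second subarea
    return [[bg - fraction * (bg - objects[i // ratio][j // ratio])
             for j, bg in enumerate(row)]
            for i, row in enumerate(background)]


image = [
    [200, 200, 180, 180],
    [200,  40, 180, 180],
    [190, 190, 170, 170],
    [190, 190, 170,  50],
]
T = thresholds_two_divisions(image, first=2, second=4)
# T == [[120.0, 110.0], [115.0, 105.0]]
```

Three of the four first subareas contain no dark pixel of their own, yet each still receives an object estimate from the enclosing second subarea.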
- the objects are positioned relative to an invisible raster or grid of known dimensions.
- This application can serve for the calculation of positions described by way of introduction, for digital recording of handwritten information.
- a position is coded by the shape and/or location of one or more individual objects in an image.
- as an example, the position-coding pattern described in Applicant's patent publications WO 01/16691, WO 01/26032 and WO 01/26033, which are herewith incorporated by reference, can be mentioned.
- This position-coding pattern is constructed of marks, for example dots, which are located at a fixed distance from raster dots belonging to an invisible raster. The value of each mark is provided by its position relative to the associated raster dot.
- a plurality of such marks together code a position that is given by two or more coordinates.
- All of the position-coding patterns stated above contain objects that are a known distance apart, which is given by the distance between the raster dots and the distance between the objects and the raster dots.
- the subareas can thereby be designed in relation to the known dimensions of the raster, for example in such a way that each subarea contains at least a part of at least one object or at least one object in its entirety. If the current image is recorded with a certain inclination of the sensor relative to the position-coding pattern, the distance between the objects will vary across the image, for which reason this type of perspective distortion must be taken into account when dimensioning the subareas.
- WO 00/73983 should also be mentioned, which describes a position-coding pattern containing marks of two different sizes, where each mark is centred at a respective raster dot of an invisible raster.
- US-A-5 221 833 describes a coding pattern containing marks in the form of lines with different angles of rotation around their dot of symmetry, where each such mark is centred at a respective raster dot of an invisible raster. Also in these cases, it is possible to size the subareas taking into account the dimensions of the raster, so that each subarea contains with certainty at least a part of an object or an object in its entirety.
- the classification of the subareas into at least a first category with a high signal-to-noise ratio and a second category with a low signal-to-noise ratio.
- the classification is used for calculating the threshold value for each subarea, by setting the threshold value at a larger relative distance or contrast depth from the background luminance value in subareas belonging to the second category than in subareas belonging to the first category.
- the danger of the threshold value being at the level of noise in the proximity of the object is reduced.
- Such a threshold value could generate a number of fictitious structures around the actual object.
- the threshold value for subareas belonging to the first and second categories is set at a relative distance to the background luminance value of approximately 40-60% and approximately 60-80% of the contrast, respectively.
- the subareas can be further classified into a third category with a high signal-to-noise ratio and a low contrast.
- the subareas belonging to the third category are overexposed, for which reason the luminance depth of the object is given by only one or a few pixels.
- the threshold value for a subarea belonging to the third category is set at a relative distance to the background luminance value of approximately 30-50% of the contrast.
- the classification of the subareas with regard to their signal-to-noise ratio can be carried out based on a statistical variation measure, for example normalised standard deviation. For images comprising dark objects against a bright background, however, the signal-to-noise ratio is related to the background luminance value. Therefore in a calculation-efficient embodiment, the classification can be carried out by comparison of the greatest luminance value within the subarea and a limit level.
- This limit level can, for example, correspond to the mean value of the greatest luminance values of all the subareas.
- the limit level can, for example, consist of a predetermined fixed value.
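The classification and the category-dependent threshold placement can be sketched as below. The relative distances are midpoints of the ranges quoted above. The text does not state how low contrast is detected for the third category, so the `contrast_limit` parameter is a hypothetical criterion, as are the function names and sample values.

```python
# Relative distance of the threshold from the background luminance value,
# as a fraction of the contrast (midpoints of the quoted ranges).
RELATIVE_DISTANCE = {"high_snr": 0.5, "low_snr": 0.7, "overexposed": 0.4}


def classify(maxima, minima, contrast_limit=None):
    """Classify subareas from their greatest and least luminance values."""
    # Limit level: mean of the greatest luminance values of all subareas.
    limit = sum(maxima) / len(maxima)
    categories = []
    for background, obj in zip(maxima, minima):
        if background < limit:
            categories.append("low_snr")        # second category
        elif contrast_limit is not None and background - obj < contrast_limit:
            categories.append("overexposed")    # third category (assumed test)
        else:
            categories.append("high_snr")       # first category
    return categories


def classified_thresholds(maxima, minima, contrast_limit=None):
    categories = classify(maxima, minima, contrast_limit)
    return [bg - RELATIVE_DISTANCE[cat] * (bg - obj)
            for bg, obj, cat in zip(maxima, minima, categories)]


maxima = [220, 230, 100]   # greatest luminance per subarea (hypothetical)
minima = [40, 200, 20]     # least luminance per subarea (hypothetical)
categories = classify(maxima, minima, contrast_limit=50)
thresholds = classified_thresholds(maxima, minima, contrast_limit=50)
```

A predetermined fixed limit level could be used instead by replacing the computed `limit` with a constant.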
- a characteristic level is estimated for the luminance values within each subarea, where the relative distance between the threshold value and the background luminance value is set depending upon this characteristic level, which suitably is indicative of the signal-to-noise ratio in the subarea.
- the characteristic level can also indicate any overexposure of a subarea, with the ensuing danger of reduced contrast.
- the threshold value can be adapted to the conditions in the subarea, as indicated by the characteristic level.
- the characteristic level can be produced on the basis of the mean value, the median value or the sum of the luminance values of the pixels within a subarea. In certain cases, the least or the greatest luminance value within a subarea can represent the characteristic level of the subarea.
- the relative distance between the threshold value and the background luminance value is a monotonically decreasing function of the characteristic level.
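One possible monotonically decreasing function is a linear ramp. The text does not prescribe the shape of the function, so the ramp, its end points (0.8 at level 0 and 0.4 at level 255) and the function names are assumptions for illustration only.

```python
def relative_distance(level, low=0.4, high=0.8, max_level=255):
    """Decrease linearly from `high` at level 0 to `low` at `max_level`."""
    level = min(max(level, 0), max_level)   # clamp to the valid range
    return high - (high - low) * level / max_level


def threshold_from_level(background, obj, level):
    """Place the threshold below the background by a level-dependent fraction."""
    return background - relative_distance(level) * (background - obj)


# Dim subareas (low level) get a threshold further from the background
# than bright subareas (high level).
dim = relative_distance(60)
bright = relative_distance(240)
```

A step function with two or three levels would reproduce the category-based variant, since a non-increasing step function is also monotonically decreasing in the weak sense.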
- the calculated threshold value can be more or less representative of the pixels within the current subarea.
- the threshold value can be less representative at the edges of the subarea.
- a subsequent smoothing step is preferably implemented, in which each calculated threshold value is updated on the basis of adjacent calculated threshold values in the threshold matrix.
- the threshold matrix is given further threshold values in the smoothing step, by interpolation of adjacent calculated threshold values in the threshold matrix. In the interpolation, the threshold matrix is thus given further threshold values which are used for the thresholding of an associated part of the greyscale image.
- the interpolation can be of any kind, for example linear, and can be implemented in one or more steps.
- the smoothing step comprises, alternatively or additionally, a low-pass filtering of the threshold matrix.
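Both smoothing variants can be sketched briefly. The 3 x 3 mean filter and the midpoint interpolation along one row are example choices, not the filter or interpolation scheme of the patent; the function names and sample values are hypothetical.

```python
def low_pass(matrix):
    """Replace each threshold by the mean of its 3 x 3 neighbourhood.

    At the borders only the neighbours that exist are averaged.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            neighbours = [matrix[r][c]
                          for r in range(max(0, i - 1), min(rows, i + 2))
                          for c in range(max(0, j - 1), min(cols, j + 2))]
            out_row.append(sum(neighbours) / len(neighbours))
        out.append(out_row)
    return out


def interpolate_row(values):
    """Insert a linearly interpolated threshold between adjacent thresholds."""
    out = [values[0]]
    for prev, nxt in zip(values, values[1:]):
        out += [(prev + nxt) / 2, nxt]
    return out


finer = interpolate_row([100, 120, 80])          # [100, 110.0, 120, 100.0, 80]
smoothed = low_pass([[100, 120], [80, 100]])     # every cell becomes 100.0
```

Applying `interpolate_row` to rows and then to columns would roughly double the resolution of the threshold matrix in both directions.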
- the invention also relates to a device, a computer program product and a hardware circuit for the identification of individual objects in a digital image, and a hand-held apparatus for position determination.
- Fig. 1A shows schematically an example of 4 x 4 marks that are used to code a position
- Fig. 1B shows schematically a digital pen that is used to detect the marks in Fig. 1A and to calculate a position on the basis of these
- Fig. 2 shows greyscale images, recorded by the pen in Fig. 1B, of a position-coding pattern of the type shown in Fig. 1A,
- Fig. 3 shows schematically the construction of a data processor in the pen in Fig. 1B
- Figs 4A - 4B illustrate schematically the division of an image into subareas for the calculation of a threshold matrix according to a first and a second embodiment, respectively
- Fig. 5A illustrates schematically the classification of objects in the image
- Fig. 5B illustrates schematically a generalisation of the embodiment in Fig. 5A
- Fig. 6 shows luminance values along a line in a greyscale image, in which associated threshold values in the threshold matrix are indicated by broken lines, and
- Fig. 7 shows the greyscale images in Fig. 2 after the thresholding thereof according to the invention.
- the description below concerns position determination based on images of a position-coding pattern.
- the position-coding pattern can be of any type, for example any one of the patterns mentioned by way of introduction. In the following, however, the invention is exemplified in connection with the pattern that is described in Applicant's Patent Publications WO 01/16691, WO 01/26032 and WO 01/26033. This pattern will be described briefly below with reference to Fig. 1A.
- the position-coding pattern comprises a virtual raster or grid 1, which is thus neither visible to the human eye nor can be detected directly by a device which is to determine positions on the surface, and a plurality of marks 2, each of which, depending upon its position, represents one of four values "1" to "4".
- the value of the mark 2 depends upon where it is placed in relation to its nominal position 3.
- the nominal position 3, which can also be called a raster dot, is represented by the intersection of the raster lines. In one embodiment, the distance between the raster lines is 300 µm and the angle between the raster lines is 90 degrees.
- other raster intervals are possible, for example 254 µm to suit printers and scanners which often have a resolution which is a multiple of 100 dpi, which corresponds to a distance between dots of 25.4 mm/100, that is 254 µm.
- each mark 2 is, at its centre of gravity, displaced relative to its nominal position 3, that is no mark is located at the nominal position. In addition, there is only one mark 2 per nominal position 3.
- the marks 2 are displaced relative to the nominal positions 3 by 50 µm along the raster lines.
- the displacement is preferably 1/6 of the raster interval, as it is then relatively easy to determine to which nominal position a particular mark belongs.
- the displacement should be at least approximately 1/8 of the raster interval, otherwise it becomes difficult to determine a displacement, that is the requirements for resolution become great.
- the displacement should be less than approximately 1/4 of the raster interval, in order for it to be possible to determine to which nominal position a mark belongs.
- Each mark 2 consists of a more or less circular dot with a radius which is approximately the same size as the displacement or somewhat less.
- the radius can be 25% to 120% of the displacement. If the radius is much larger than the displacement, it can be difficult to determine the raster lines. If the radius is too small, a greater resolution is required to record the marks.
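The figures quoted above follow from simple arithmetic, reproduced here as a check. The variable and function names are chosen for the example.

```python
MICRONS_PER_INCH = 25_400   # 25.4 mm


def dot_pitch_um(dpi):
    """Distance between dots, in micrometres, at a given resolution."""
    return MICRONS_PER_INCH / dpi


raster_um = 300
pitch_100dpi = dot_pitch_um(100)            # 254.0 µm
preferred_displacement = raster_um / 6      # 50.0 µm
min_displacement = raster_um / 8            # 37.5 µm
max_displacement = raster_um / 4            # 75.0 µm
# Mark radius of 25% to 120% of the displacement.
min_radius = preferred_displacement * 25 / 100    # 12.5 µm
max_radius = preferred_displacement * 120 / 100   # 60.0 µm
```

The 50 µm displacement of the 300 µm embodiment therefore sits between the approximate lower bound of 37.5 µm and the approximate upper bound of 75 µm.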
- the marks do not, however, need to be circular or round, but any suitable shape can be used, such as square, triangular, elliptical, open or closed, etc.
- the pattern described above can be designed to code a very large number of absolute positions. For example, the pattern can be such that 6 x 6 adjacent marks together code a position, in the form of an x-coordinate and a y-coordinate.
- Fig. 1B shows a hand-held apparatus 10, below called a pen, that is used for optical detection of the position-coding pattern in Fig. 1A.
- the pen 10 has a casing 11 in the shape of a pen, which has an opening 12 at one end. This end is intended to abut against or to be held a short distance from the surface on which the position determination is to be carried out.
- One or more infrared light-emitting diodes 13 are arranged in the opening 12 for illuminating the surface area which is to be imaged, and an area sensor 14, sensitive to infrared light, for example a CCD or CMOS sensor, is arranged for recording a two-dimensional image of the surface area.
- the area sensor 14 is connected to a data processor 15 which is arranged to determine a position on the basis of the image recorded by the sensor 14.
- the data processor 15 can contain one or more processors (not shown), programmed to record images from the sensor 14 or from a buffer memory (not shown) associated with the sensor, and to carry out position determination on the basis of these images.
- the pen 10 also has a pen point 16 which deposits pigment ink on the product. Using this, the user can write physically on the product, while at the same time what is being written is recorded digitally via optical detection of the position-coding pattern.
- the pigment ink is suitably transparent to infrared light, while the marks 2 of the position-coding pattern (Fig. 1A) absorb infrared light. This means that the pigment ink does not interfere with the detection of the pattern.
- When the pen 10 is passed over the position-coding pattern, the area sensor 14 thus records a sequence of digital greyscale images which are transmitted to the data processor 15 for position determination.
- Fig. 2 shows examples of such greyscale images I. These contain 96 x 96 pixels, the luminance values of which are given with 8-bit resolution. To achieve an adequate temporal resolution for the digitally recorded information, the images are read off from the area sensor 14 at a frequency of approximately 100 Hz, that is approximately 10 ms is available per image I for calculating a position. In the images in Fig. 2, the marks 2 appear as dark dots against a bright background. Normally each mark or object covers several pixels in the image.
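The time and data budget implied by these figures can be worked out directly. The frame rate is taken as exactly 100 Hz for the calculation, although the text gives it only approximately; the variable names are chosen for the example.

```python
width, height, bits_per_pixel = 96, 96, 8
frame_rate_hz = 100   # "approximately 100 Hz" in the text

bytes_per_image = width * height * bits_per_pixel // 8   # 9216 bytes
time_per_image_ms = 1000 / frame_rate_hz                 # 10.0 ms
bytes_per_second = bytes_per_image * frame_rate_hz       # 921600 bytes/s
```

Roughly 9 kB of pixel data must therefore be thresholded and decoded within about 10 ms, which is why avoiding intermediate storage and calculation-intensive operations matters.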
- the sharpness varies within the image as a result of the pen and thus the area sensor being angled against the base when writing down information.
- the contrast can also vary within the image as a result of uneven scattering properties of the base.
- the illumination of the base is uneven.
- the images are well illuminated at their central parts, with, however, varying image sharpness, while the peripheral parts have low signal-to-noise ratios, due to insufficient illumination.
- Fig. 3 shows the data processor in the pen in greater detail.
- the data processor 15 comprises a preprocessing unit 20, a threshold calculation unit 21 and a position determination unit 22.
- the pre-processing unit 20 comprises in this case a hardware circuit (ASIC) which records a current greyscale image I from the area sensor 14, obtains a threshold matrix T from the threshold calculation unit 21, and generates a binary image B.
- The luminance value of each pixel in the current image is compared with an associated threshold value in the threshold matrix T. If the luminance value is greater than the threshold value, the corresponding luminance value in the binary image is set to one (1), otherwise to zero (0).
- The output binary image B thus contains dark objects (value 0), which ideally constitute the marks, against a bright background (value 1).
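The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binarise`, the `block` parameter (side length of a square subarea) and the sample values are hypothetical, and the actual pre-processing unit is a hardware circuit rather than software.

```python
def binarise(image, thresholds, block):
    """Return a binary image: 1 where luminance > threshold, else 0.

    image      -- list of rows of luminance values
    thresholds -- threshold matrix, one value per block x block subarea
    block      -- side length of a square subarea in pixels (assumed)
    """
    return [
        [1 if pixel > thresholds[y // block][x // block] else 0
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


# Tiny 4 x 4 example with 2 x 2 subareas (values are made up).
image = [
    [200, 190, 90, 80],
    [60, 210, 85, 30],
    [120, 125, 250, 240],
    [40, 118, 245, 100],
]
thresholds = [[130, 55], [80, 170]]
binary = binarise(image, thresholds, block=2)
```

Dark marks come out as 0 and the background as 1, matching the convention of the binary image B.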
- the pre-processing unit 20 also contains a statistics module which generates image statistical data S for given subareas or partitions in the current greyscale image I.
- This image statistical data S is stored in a memory 23, from which the threshold calculation unit 21 can obtain the relevant image statistical data S when it is to commence the calculation of a new threshold matrix T.
- the threshold calculation unit 21 thus generates the threshold matrix T based on this image statistical data S, as will be described in greater detail below.
- The position determination unit 22 receives the binary image B from the pre-processing unit 20, identifies the marks in the binary image and calculates position coordinates (x,y) on the basis of the positions of the marks in relation to the virtual raster.
- The threshold calculation unit 21 and the position determination unit 22 consist of software which is executed in a microprocessor (not shown).
- The decoding in the position determination unit will not be described here in greater detail, as the present invention relates to the preliminary processing step, more particularly the binarisation of the greyscale images I.
Embodiment 1
- A first embodiment is based on a partitioning of greyscale images, each of which contains 96 x 126 pixels, into 63 (7 x 9) square subareas I_s, each of which contains 14 x 14 pixels, as indicated by thin lines in Fig. 4A.
- This division is used on the one hand by the statistics module for the generation of statistical data S, and, on the other, by the threshold calculation unit 21 for the calculation of the threshold matrix T.
- The size of the subareas is set with knowledge of the images that are to be binarised. In the greyscale images the distance between the raster lines is known, in the present case approximately 7.5 pixels. On the basis of this information, the size of the subareas I_s can be selected in such a way that each subarea I_s with great certainty contains at least a part of a mark 2, as also shown in Fig. 4A. Since the images are taken using the hand-held apparatus 10, which is used like a pen, it must also be taken into account that the angle between the pen and the base, that is the perspective in the images, can vary depending upon the writing posture of the user.
- A raster is used with a distance between the raster lines of approximately 300 µm, together with circular dots with a displacement and a radius of approximately 50 µm.
- Square subareas I_s with a side length of approximately 120% of the raster interval should then be sufficiently large to guarantee that each subarea I_s in each image I contains at least a part of a dot. Also taking into account the varying perspective, side lengths in the range 150% - 300% of the raster interval have been found to give satisfactory results.
- The subareas I_s are thereby so large that they essentially always contain at least one dot in its entirety, which simplifies the production of an adequate threshold value for each subarea.
- The upper limit for the size of the subareas I_s is given by the lowest acceptable resolution of the threshold matrix, which depends among other things upon the spatial size of the luminance variations in the images. It is also possible to make the subareas I_s smaller, in order to increase the resolution of the threshold matrix. In this case, certain subareas will only contain background and thus will not contain part of any mark, for which reason a correct threshold value cannot be calculated. These subareas should therefore be identified and allocated a correction value which, for example, is calculated on the basis of the threshold values for the surrounding subareas.
- The statistics module derives statistical data S in the form of the greatest luminance value (max) and the least luminance value (min) within each subarea I_s.
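A minimal sketch of how such min/max statistics could be gathered in a single pass over the pixels, in the order in which they would arrive from a sensor. The function name `subarea_min_max`, the `block` parameter and the sample image are assumptions made for illustration; the 8-bit range 0..255 follows the luminance resolution stated in the text.

```python
def subarea_min_max(image, block):
    """Return (mins, maxs): least and greatest luminance per subarea."""
    rows, cols = len(image) // block, len(image[0]) // block
    mins = [[255] * cols for _ in range(rows)]  # 8-bit luminance assumed
    maxs = [[0] * cols for _ in range(rows)]
    # One pass over the pixels; partial subareas at the edges are ignored.
    for y, row in enumerate(image[:rows * block]):
        for x, pixel in enumerate(row[:cols * block]):
            by, bx = y // block, x // block
            mins[by][bx] = min(mins[by][bx], pixel)
            maxs[by][bx] = max(maxs[by][bx], pixel)
    return mins, maxs


image = [
    [200, 190, 90, 80],
    [60, 210, 85, 30],
    [120, 125, 250, 240],
    [40, 118, 245, 100],
]
mins, maxs = subarea_min_max(image, block=2)
```

Only two small matrices need to be stored, which fits the point that the greyscale image itself does not have to be kept for the threshold calculation.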
- A threshold value T_i is calculated for each subarea. This threshold value is stored in the threshold matrix T, as illustrated in Fig. 4A. The threshold value T_i is calculated as a function of the contrast (max - min) within the subarea:
  $$T_i = \text{max}_i - k_i\,(\text{max}_i - \text{min}_i)$$
- each subarea should advantageously contain at least one mark in its entirety, so that the actual luminance depth of the mark can be used in the calculation of the threshold value.
- The factor k_i determines at which contrast depth the threshold value is to be set. In order to reduce the effect of noise and lack of sharpness, the contrast depth factor k_i is set on the basis of a classification of each subarea into the classes "sharp", "lacking in contrast" and "noisy".
- The noise limit value is defined as the mean value of the greatest luminance values in all the subareas, which can be calculated in a simple way from said image statistical data S.
- The contrast limit value is defined as the mean value of the contrast in all the subareas in the current image, which can also be calculated in a simple way from said image statistical data S.
- Fig. 5A shows examples of the classification of the subareas in a diagram of the luminance as a function of pixels.
- Subarea I is "sharp", subarea II is "lacking in contrast" and subarea III is "noisy".
- The threshold values T_i for the respective subareas are also indicated by broken lines. It has been found advantageous to set the factor k_i to a value in the range 0.6-0.8 in noisy subareas, 0.3-0.5 in subareas lacking in contrast and 0.4-0.6 in sharp subareas. According to one embodiment, the central value in each of the above ranges is used.
- the threshold value is set relatively far from the background luminance (max) in noisy subareas, which are typically to be found at the periphery of the image. This reduces the risk of the threshold value being set at the level of the background noise, which could result in the thresholding generating a binary image with many small fictitious structures.
- the noisy subareas can undergo a supplementary contrast control, for example by the threshold value being set to zero in the noisy subareas that have a contrast similar to typical noise levels.
- In subareas lacking in contrast, the threshold value is set relatively close to the background luminance (max).
- This increases the probability of associated marks being identified as structures with a plurality of connected pixels, which in turn provides a better estimation of the position of the mark in the subsequent decoding in the position determination unit 22.
- this type of subarea is in fact overexposed, for which reason the luminance depth of the marks is only given by one or a few pixels, as indicated in Fig. 5A.
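The classification and the resulting thresholds can be sketched as follows. Two things here are assumptions rather than statements from the text: the decision rule (a subarea whose max lies below the noise limit is taken as "noisy", otherwise one whose contrast lies below the contrast limit is taken as "lacking in contrast", otherwise "sharp"), and the threshold formula T_i = max_i - k_i (max_i - min_i), which is consistent with the description of the factor k_i. The k values are the central values of the ranges quoted above; the function name and the sample data are hypothetical.

```python
# Central values of the k_i ranges given in the text.
K = {"noisy": 0.7, "lacking in contrast": 0.4, "sharp": 0.5}


def classify_and_threshold(mins, maxs):
    """Return (classes, thresholds) for per-subarea min/max matrices."""
    flat = [(lo, hi) for rmin, rmax in zip(mins, maxs)
            for lo, hi in zip(rmin, rmax)]
    # Limits as defined in the text: means over all subareas.
    noise_limit = sum(hi for _, hi in flat) / len(flat)
    contrast_limit = sum(hi - lo for lo, hi in flat) / len(flat)

    classes, thresholds = [], []
    for rmin, rmax in zip(mins, maxs):
        class_row, threshold_row = [], []
        for lo, hi in zip(rmin, rmax):
            # ASSUMED decision rule, see the lead-in.
            if hi < noise_limit:
                label = "noisy"
            elif hi - lo < contrast_limit:
                label = "lacking in contrast"
            else:
                label = "sharp"
            class_row.append(label)
            # ASSUMED formula: T = max - k * (max - min), rounded.
            threshold_row.append(round(hi - K[label] * (hi - lo)))
        classes.append(class_row)
        thresholds.append(threshold_row)
    return classes, thresholds


mins = [[20, 150], [100, 90]]
maxs = [[60, 200], [240, 250]]
classes, thresholds = classify_and_threshold(mins, maxs)
```

With these made-up statistics the noise limit is 187.5 and the contrast limit is 97.5, so the dark subarea is treated as noisy and receives a threshold far from its max.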
- The threshold matrix T contains a threshold value T_i per subarea I_s (cf. Fig. 4A).
- The classification of the subareas can be made more sophisticated. For example, histograms or standard deviations can be used to identify noise, mean values can be used to identify background luminance, etc.
- An advantage of the use of the minimum and the maximum for each subarea is, however, that these values can be extracted from a greyscale image in a computationally efficient way. In addition, the number of calculation steps is minimised in the production of the threshold matrix.
- Alternatively, the mean value of the luminance values of the pixels is calculated within the respective subarea, after which the threshold value is calculated on the basis of this mean value and of the contrast in the subarea in question, in accordance with:
  $$T_i = \text{max}_i - f(m_i)\,(\text{max}_i - \text{min}_i)$$
- f(m) is a function of the luminance mean value m_i in the subarea in question and has a value in the range 0 - 1.
- Fig. 5B shows an example of the appearance of the function f(m).
- f(m) is a continuous and monotonically decreasing function of the luminance mean value of the subarea.
- Low luminance mean values give both low contrast and low signal-to-noise ratio, as indicated by the luminance histogram (i) , for which reason the value of the function is set close to 1 (corresponding to the above class "noisy").
- As the luminance mean value increases, the signal-to-noise ratio and the contrast tend to increase gradually, as indicated by the luminance histograms (ii) and (iii), for which reason the value of the function is allowed to decrease to a corresponding extent.
- f(m) becomes approximately 0.5 (corresponding to the above class "sharp").
- At high luminance mean values, the contrast is again reduced (see the luminance histogram (iv)) as a result of overexposure of the subarea (corresponding to the above class "lacking in contrast").
- The function f(m) is here set to a value close to 0, that is, the threshold value becomes close to the estimated background luminance.
- The function shown in Fig. 5B can be modified.
- it can consist of more segments (classes) , with other breakpoints and inclinations.
- the function can be a curve, as given by a second degree equation or the like.
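One possible shape of such a function is sketched below as a piecewise-linear curve. The breakpoints in `BREAKPOINTS` are purely hypothetical, since the text gives no numeric values; only the general behaviour (close to 1 at low mean luminance, about 0.5 in the middle, close to 0 at high mean luminance, never increasing) follows the description. The threshold formula T = max - f(m) (max - min) is likewise an assumption that is consistent with the statement that a value of f(m) close to 0 places the threshold near the background luminance.

```python
# Hypothetical (mean luminance, f value) breakpoints for 8-bit images.
BREAKPOINTS = [(0, 0.9), (40, 0.9), (100, 0.5),
               (180, 0.5), (230, 0.1), (255, 0.1)]


def f(mean):
    """Piecewise-linear, non-increasing contrast depth function."""
    for (m0, f0), (m1, f1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if m0 <= mean <= m1:
            return f0 + (f1 - f0) * (mean - m0) / (m1 - m0)
    raise ValueError("mean luminance outside 0..255")


def threshold(lo, hi, mean):
    """ASSUMED formula: T = max - f(mean) * (max - min)."""
    return hi - f(mean) * (hi - lo)


t_sharp = threshold(lo=50, hi=200, mean=140)
```

Adding segments or replacing the interpolation with a second-degree curve only changes `f`; `threshold` stays the same.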
- In the first embodiment, the same subareas are used for estimating both background luminance and object luminance in the greyscale image.
- In a second embodiment, subareas of different sizes are instead used for estimating the background luminance and object luminance of the greyscale image.
- The threshold matrix is thus calculated based on image statistical data for two different sets of subareas, object subareas and background subareas.
- The object subareas and the background subareas overlap each other and cover all that part of the image that is to be binarised.
- The object subareas correspond in size to the subareas that are used in the first embodiment, that is they are so large that they contain with certainty at least a part of a mark.
- The background subareas can, however, be made smaller, as they only need to be large enough to contain with certainty pixels that are representative of the image's local background luminance, that is they should be larger than each mark in the image. Any enlargement as a result of the effects of perspective should be taken into account.
- Fig. 4B shows an example of the partitioning into object subareas and background subareas.
- This partitioning is suited to the same coding pattern as the first embodiment, however for greyscale images consisting of 96 x 96 pixels.
- Each greyscale image is divided into 64 (8 x 8) object subareas I_s,o, each of which contains 12 x 12 pixels, and into 256 (16 x 16) background subareas I_s,b, each of which contains 6 x 6 pixels.
- The object subareas I_s,o are dimensioned to comprise a whole number of background subareas I_s,b (in this case four), whereby the calculation of the threshold matrix is simplified.
- A threshold value can now be calculated for each background subarea, in accordance with:
  $$T_i = b_i - k_i\,(b_i - o_i)$$
  where b_i is the estimated background luminance and o_i the estimated object luminance.
- the background luminance is estimated as the greatest luminance value within the background subarea and the object luminance as the least luminance value within the object subarea.
- The threshold value can be calculated in alternative ways, as described in connection with the first embodiment above.
- The threshold matrix is calculated based on a background matrix, which contains the background luminances estimated for the background subareas I_s,b, and an object matrix, which contains the object luminances estimated for the object subareas I_s,o.
- The object subareas I_s,o overlap a whole number of background subareas I_s,b, as the data that requires intermediate storage in the background matrix and the object matrix is thereby minimised.
- The statistics module in the pre-processing unit 20 can be designed to generate separate image statistical data for the background subareas I_s,b and the object subareas I_s,o.
- Alternatively, the image statistical data for the object subareas I_s,o can be calculated from the image statistical data for the background subareas I_s,b. For instance, this is the case in the example above, with estimation based on minimum and maximum values and with adjustment of the relative sizes of the background subareas and the object subareas.
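The two-level scheme can be sketched as below, assuming the threshold for each background subarea is T = b - k (b - o), with b the greatest luminance in the 6 x 6 background subarea and o the least luminance in the 12 x 12 object subarea that covers it. The formula, the fixed factor `k=0.5`, the function names and the synthetic test image are assumptions for illustration. The object minima are derived from the minima of the smaller subareas, which is possible because each object subarea covers a whole number of background subareas.

```python
def block_stat(matrix, block, stat):
    """Apply stat (min or max) to each block x block tile of matrix."""
    rows, cols = len(matrix) // block, len(matrix[0]) // block
    return [
        [stat(matrix[y][x]
              for y in range(by * block, (by + 1) * block)
              for x in range(bx * block, (bx + 1) * block))
         for bx in range(cols)]
        for by in range(rows)
    ]


def threshold_matrix(image, bg_block=6, ratio=2, k=0.5):
    """One threshold per background subarea (ASSUMED formula, fixed k)."""
    background = block_stat(image, bg_block, max)  # 16 x 16 for 96 x 96
    local_min = block_stat(image, bg_block, min)   # minima per 6 x 6 tile
    objects = block_stat(local_min, ratio, min)    # 8 x 8 object matrix
    return [
        [b - k * (b - objects[y // ratio][x // ratio])
         for x, b in enumerate(row)]
        for y, row in enumerate(background)
    ]


# Synthetic image: background brightens left to right, dark dots on a grid.
image = [[20 if (x % 8 == 4 and y % 8 == 4) else 100 + x
          for x in range(96)] for y in range(96)]
T = threshold_matrix(image)
```

In this synthetic case every object subarea contains a dot, so the thresholds follow the background gradient from 62.5 on the left to 107.5 on the right.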
- Both embodiments described above result in a threshold matrix T containing a threshold value T_i per subarea I_s and I_s,b, respectively. It has, however, been found that the precision of the thresholding is improved if the threshold matrix is given additional threshold values by interpolation, between the threshold values calculated as defined above. Such additional threshold values can be created by linear interpolation of adjacent values in the threshold matrix. The linear interpolation is carried out in two steps, interpolation by rows and interpolation by columns. The threshold matrix interpolated in this way can then, if required, undergo a further interpolation. It should be noted that the relationship between the threshold values and the subareas is changed when the threshold matrix is given additional threshold values by means of interpolation.
- each threshold value is now applicable to pixels within smaller thresholding areas of each image.
- Each first calculated threshold value is suitably allocated to a thresholding area in the centre of its subarea, whereupon the new threshold values can be allocated to thresholding areas in between.
- After one or two such interpolations, each thresholding area has a size that is 1/4 or 1/16, respectively, of the size of the subarea.
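The two-step interpolation (rows first, then columns) can be sketched as follows. This version only inserts the midpoint between each pair of adjacent values, so an n x m matrix becomes (2n - 1) x (2m - 1); how the outer edge of the image is handled is not stated in the text and is left out here. The function names are hypothetical.

```python
def interpolate_row(values):
    """Insert the linear midpoint between each pair of adjacent values."""
    out = []
    for a, b in zip(values, values[1:]):
        out += [a, (a + b) / 2]
    return out + [values[-1]]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def interpolate(matrix):
    """Interpolate by rows, then by columns."""
    by_rows = [interpolate_row(row) for row in matrix]
    return transpose([interpolate_row(col) for col in transpose(by_rows)])


T = [[100, 120], [140, 160]]
T2 = interpolate(T)
```

Calling `interpolate` on its own result corresponds to the further interpolation mentioned above.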
- An alternative method for improving the precision of the thresholding is to have the threshold matrix calculated in accordance with the embodiment above undergo a low-pass filtering, for example by convolution of the threshold matrix with a suitable 3 x 3 matrix.
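A minimal sketch of such a low-pass filtering, assuming a 3 x 3 mean (box) kernel and replication of the border values. Both choices are assumptions, as the text only asks for "a suitable 3 x 3 matrix"; the function name `smooth` is hypothetical.

```python
def smooth(matrix):
    """Convolve with a 3 x 3 mean kernel, replicating border values."""
    rows, cols = len(matrix), len(matrix[0])

    def at(y, x):
        # Clamp coordinates so that border cells reuse their nearest value.
        return matrix[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    return [
        [sum(at(y + dy, x + dx)
             for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9
         for x in range(cols)]
        for y in range(rows)
    ]


spike = [[90, 90, 90], [90, 180, 90], [90, 90, 90]]
smoothed = smooth(spike)
```

A single outlier threshold is pulled towards its neighbours, while a uniform matrix is left unchanged.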
- Fig. 6 shows the luminance distribution along a line in a greyscale image, together with calculated threshold values T_i along this line.
- The threshold values in Fig. 6 are produced according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation. In spite of large variations in background luminance, signal-to-noise ratio and contrast, the calculated threshold values T_i accord well with the luminance values along the line.
- Fig. 7 shows a number of binary images B that have been generated by the thresholding of the greyscale images shown in Fig. 2.
- the thresholding is carried out according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation.
- A comparison of Figs 2 and 7 shows that the thresholding results in a satisfactory identification of the marks 2, even with variations in illumination, base properties and imaging conditions within the greyscale images.
- A threshold matrix can obviously be calculated for one particular greyscale image and then used for the thresholding of the same with high precision.
- The calculation of the threshold matrix can be carried out quickly, based on given image statistical data. It is estimated that it takes approximately 8000 clock cycles for the calculation of the threshold matrix according to the second embodiment, that is with a background matrix estimated for 16 x 16 subareas, an object matrix estimated for 8 x 8 subareas and a mean value matrix estimated for 8 x 8 subareas. For an 80 MHz processor, this corresponds to a calculation time of 100 µs.
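As a quick check of these figures, and of how they relate to the approximately 10 ms available per image at the 100 Hz read-out rate stated earlier in the description:

$$t = \frac{8000\ \text{cycles}}{80 \times 10^{6}\ \text{cycles/s}} = 1.0 \times 10^{-4}\ \text{s} = 100\ \mu\text{s}, \qquad \frac{100\ \mu\text{s}}{10\ \text{ms}} = 1\,\%$$

The threshold matrix calculation thus uses roughly one percent of the per-image time budget.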
- the algorithms described above can permit a further increase of the throughput of images I through the data processor 15, by carrying out the calculation of the threshold matrix in parallel with the actual thresholding.
- The threshold matrix is thus calculated on the basis of a preceding image, which is similar to the subsequent image or images to be thresholded by use of this calculated threshold matrix. This can be regarded as if the threshold matrix is periodically updated and then used in thresholding one or more subsequent images. It has been found fully possible to use one and the same threshold matrix for the thresholding of a plurality of, for example 5 - 10, consecutive greyscale images. According to this embodiment, a given greyscale image can be thresholded at the same time as it is being read in from the sensor by the data processor 15.
- This thresholding can thus be implemented in hardware and in this way relieve the processor (not shown) which carries out the calculations in the threshold calculation unit 21 and the position determination unit 22.
- This hardware can at the same time also generate the above-mentioned image statistical data S in order to relieve the processor still further.
- the need for intermediate storage of the greyscale image is avoided as this can be processed by direct comparison with an already-calculated threshold matrix.
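The reuse of a threshold matrix over several consecutive images can be sketched as a simple loop. Everything specific here is an assumption made for illustration: the function names, the fixed factor `k=0.5`, the formula T = max - k (max - min), and the `update_every` parameter (the text mentions 5 - 10 images as an example). In the described device the thresholding and the statistics are produced in hardware while the image is read in; this software loop only mirrors the data flow.

```python
def subarea_stats(image, block):
    """Return a matrix of (min, max) tuples, one per subarea."""
    rows, cols = len(image) // block, len(image[0]) // block
    stats = []
    for by in range(rows):
        stats.append([])
        for bx in range(cols):
            tile = [image[y][x]
                    for y in range(by * block, (by + 1) * block)
                    for x in range(bx * block, (bx + 1) * block)]
            stats[by].append((min(tile), max(tile)))
    return stats


def binarise(image, thresholds, block):
    return [[1 if p > thresholds[y // block][x // block] else 0
             for x, p in enumerate(row)] for y, row in enumerate(image)]


def process_stream(images, block, k=0.5, update_every=5):
    """Threshold each image with a matrix from an earlier image."""
    thresholds, results = None, []
    for index, image in enumerate(images):
        # The current image is compared with an already-calculated matrix.
        results.append(None if thresholds is None
                       else binarise(image, thresholds, block))
        # Periodically refresh the matrix from the current statistics.
        if index % update_every == 0:
            thresholds = [[hi - k * (hi - lo) for lo, hi in row]
                          for row in subarea_stats(image, block)]
    return results


frames = [[[200, 40], [190, 210]],
          [[205, 45], [195, 215]],
          [[205, 45], [195, 215]]]
results = process_stream(frames, block=2)
```

The first frame yields no binary image in this sketch because no matrix exists yet; later frames are thresholded without being stored.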
- the algorithms according to the invention have a sufficient tolerance to variations in luminance and/or sharpness from image to image.
- the threshold matrix T is calculated on the basis of image statistical data for given subareas in the greyscale images and thereby contains threshold values that are related to the overall luminance distribution in the images, both with regard to the background and to the object.
- the threshold matrix contains both global information which is relevant for several consecutive images, and local information, which allows for the thresholding of each object in relation to its local surroundings.
- each subarea, which contains a plurality of pixels is allocated a threshold value in the threshold matrix, the effect of local variations is limited.
- the size of the subareas is selected in such a way that the calculated threshold value is sufficiently insensitive to local variations in order to achieve a desired tolerance to variations in luminance and/or sharpness from image to image.
- The preprocessing unit 20 is designed to buffer, before, after or during the generation of said image statistical data S, one or more associated greyscale images, for instance in the memory 23.
- The threshold matrix T which is calculated by the threshold calculation unit 21 for a current greyscale image can thus be used by the preprocessing unit 20 for binarisation of the same current image, and optionally a subsequent image in the incoming sequence of images.
- Before the binary images are analysed further for position determination, they can undergo an area check, with the aim of eliminating fictitious marks on the basis of the number of connected pixels within each mark. Accordingly, marks consisting of one or a few pixels can be assumed to originate from noise and can therefore be removed. As the maximal size of the marks is known, an upper area threshold can also be set.
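Such an area check can be sketched with a flood fill over the dark pixels. The use of 4-connectivity, the function name `area_check` and the limits used in the example are assumptions; the text only states that marks with too few connected pixels can be removed and that an upper area threshold can also be set.

```python
def area_check(binary, min_area, max_area):
    """Remove dark (0) components whose pixel count is out of range."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * cols for _ in range(rows)]
    for y0 in range(rows):
        for x0 in range(cols):
            if binary[y0][x0] != 0 or seen[y0][x0]:
                continue
            # Collect one 4-connected component with an explicit stack.
            stack, component = [(y0, x0)], []
            seen[y0][x0] = True
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Too small (noise) or too large: overwrite with background.
            if not min_area <= len(component) <= max_area:
                for y, x in component:
                    out[y][x] = 1
    return out


binary = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
cleaned = area_check(binary, min_area=2, max_area=9)
```

The isolated dark pixel at the right edge is removed, while the 2 x 2 mark is kept.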
- The above-mentioned contrast depth factor can, instead of being set at a predetermined value or being calculated on the basis of the classification of associated subareas, be given by an external process, such as a control loop.
- The calculation of image statistical data can be carried out in the threshold calculation unit instead of in the pre-processing unit.
- The data processor can be realised completely in hardware or completely in software.
- The subareas can be of any shape, such as square, rectangular, triangular, rhombic, hexagonal, etc.
- The invention is in no way restricted to the described position-coding pattern, but can also be used for the identification and decoding of other position-coding patterns.
- the raster described above can have other shapes than orthogonal, such as a rhombic grid, for example with 60 degree angles, a triangular or hexagonal grid, etc.
- the marks can be displaced in other directions than along the raster lines.
- the pattern is optically readable and the sensor is thus optical. It is recognised, however, that the images that are processed according to the invention can be generated in another way, for example by detection of chemical, acoustic, electromagnetic, capacitive or inductive parameters. Similarly, it is recognised that the invention can also be used for identification of bright marks against a dark background.
- the invention can be used in general for identification of individual objects in a digital image in a quick and memory-efficient way, particularly when there are variations in luminance and/or sharpness within an image.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE0102254A SE0102254L (sv) | 2001-06-26 | 2001-06-26 | Behandling av digitala bilder |
SE0102254 | 2001-06-26 | ||
PCT/SE2002/001244 WO2003001450A1 (en) | 2001-06-26 | 2002-06-25 | Processing of digital images |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1421555A1 (de) | 2004-05-26 |
Family
ID=20284605
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP02741593A Withdrawn EP1421555A1 (de) | 2001-06-26 | 2002-06-25 | Verarbeitung digitaler bilder |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1421555A1 (de) |
SE (1) | SE0102254L (de) |
WO (1) | WO2003001450A1 (de) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7457476B2 (en) | 2001-10-03 | 2008-11-25 | Anoto Ab | Optical sensor device and a method of controlling its exposure time |
JP2009543181A (ja) | 2006-06-28 | 2009-12-03 | アノト アクティエボラーク | 電子ペンにおける動作制御およびデータ処理 |
US8311329B2 (en) | 2006-09-07 | 2012-11-13 | Lumex As | Relative threshold and use of edges in optical character recognition process |
CN116629289A (zh) * | 2023-05-23 | 2023-08-22 | 深圳市牛加技术有限公司 | 基于卷积神经网络的光学点阵二维坐标识别方法及设备 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5661506A (en) | 1994-11-10 | 1997-08-26 | Sia Technology Corporation | Pen and paper information recording system using an imaging pen |
US5852434A (en) | 1992-04-03 | 1998-12-22 | Sekendur; Oral F. | Absolute optical position determination |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4593325A (en) | 1984-08-20 | 1986-06-03 | The Mead Corporation | Adaptive threshold document duplication |
US5051736A (en) | 1989-06-28 | 1991-09-24 | International Business Machines Corporation | Optical stylus and passive digitizing tablet data input system |
US5477012A (en) | 1992-04-03 | 1995-12-19 | Sekendur; Oral F. | Optical position determination |
US5629780A (en) * | 1994-12-19 | 1997-05-13 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Image data compression having minimum perceptual error |
US5963676A (en) * | 1997-02-07 | 1999-10-05 | Siemens Corporate Research, Inc. | Multiscale adaptive system for enhancement of an image in X-ray angiography |
FR2786011B1 (fr) * | 1998-11-13 | 2001-01-19 | Centre Nat Etd Spatiales | Procede de comparaison d'images enregistrees formees de pixels representant des equipotentielles d'au moins une puce de circuit integre |
WO2000073983A1 (en) | 1999-05-28 | 2000-12-07 | Anoto Ab | Position determination |
- 2001-06-26: SE application SE0102254A (SE0102254L), status: Application Discontinuation
- 2002-06-25: WO application PCT/SE2002/001244 (WO2003001450A1), status: Application Discontinuation
- 2002-06-25: EP application EP02741593A (EP1421555A1), status: Withdrawn
Non-Patent Citations (2)
Title |
---|
DYMETMAN M; COPPERMAN M: "Intelligent paper", LECTURE NOTES IN COMPUTER SCIENCE, vol. 1375, March 1998 (1998-03-01), pages 392 - 406, XP002328425 |
See also references of WO03001450A1 |
Also Published As
Publication number | Publication date |
---|---|
WO2003001450A1 (en) | 2003-01-03 |
SE0102254D0 (sv) | 2001-06-26 |
SE0102254L (sv) | 2002-12-27 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20040126 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
AX | Request for extension of the european patent | Extension state: AL LT LV MK RO SI |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ANOTO IP LIC HB |
111L | Licence recorded | Free format text: 0100 LEAPFROG ENTERPRISES INC.; Effective date: 20050530 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ANOTO AB |
TPAC | Observations filed by third parties | Free format text: ORIGINAL CODE: EPIDOSNTIPA |
17Q | First examination report despatched | Effective date: 20090128 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ANOTO AB |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20150106 |