US20140226856A1 - Method and apparatus for semi-automatic finger extraction - Google Patents

Method and apparatus for semi-automatic finger extraction

Info

Publication number
US20140226856A1
Authority
US
United States
Prior art keywords
image
pixel
unit
distance
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/174,003
Other versions
US9311538B2
Inventor
Shufu XIE
Yuan He
Jun Sun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PFU Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Assigned to FUJITSU LIMITED reassignment FUJITSU LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HE, YUAN, SUN, JUN, Xie, Shufu
Publication of US20140226856A1 publication Critical patent/US20140226856A1/en
Application granted granted Critical
Publication of US9311538B2 publication Critical patent/US9311538B2/en
Assigned to PFU LIMITED reassignment PFU LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: FUJITSU LIMITED
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Classifications

    • G06K9/00624
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/155Removing patterns interfering with the pattern to be recognised, such as ruled lines or underlines
    • G06K9/4604
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Definitions

  • the present disclosure relates to the field of image processing, particularly to a device and a method for detecting the boundary of an object image such as a finger image.
  • the user may hold both sides of the book with his/her fingers to complete the scanning process.
  • the fingers may appear on the side boundaries of the book in the corrected scanned image of the book, making the corrected image less nice-looking. Therefore, it is necessary to remove the finger image in the corrected image.
  • an image processing device including: an inputting unit for performing a click on an object image contained in an image to obtain a clicked point; a calculating unit for calculating an edge map of the image; an estimating unit for estimating a color model of the object image based on the clicked point and the edge map; an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and a detecting unit for detecting a region containing the object image based on the binary image.
  • an image processing method including: performing a click on an object image contained in an image to obtain a clicked point; calculating an edge map of the image; estimating a color model of the object image based on the clicked point and the edge map; classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and detecting a region containing the object image based on the binary image.
  • a program product including machine-readable instruction code stored therein which, when read and executed by a computer, causes the computer to perform the image processing method according to the present disclosure.
  • the image processing device and method according to the present disclosure require user interaction to obtain information on the clicked point. Further, the image processing device and method according to the present disclosure use color information and edge information to detect the boundary of an object image such as a finger image. Accordingly, the image processing device and method according to the present disclosure can improve the accuracy of detecting the boundary of an object image, thus facilitating removal of the object image from the image and making the processed image more nice-looking.
  • FIGS. 1( a ) and 1 ( b ) are schematic diagrams illustrating an exemplary image to be dealt with by the technical solution of the present disclosure
  • FIG. 2 is a block diagram illustrating an image processing device according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram illustrating an exemplary application of the image processing device according to the embodiment of the present disclosure
  • FIG. 4 is a block diagram illustrating a calculating unit in the image processing device according to the embodiment of the present disclosure
  • FIG. 5 is a block diagram illustrating an estimating unit in the image processing device according to the embodiment of the present disclosure
  • FIGS. 6( a ) to 6 ( d ) are schematic diagrams illustrating an exemplary application of an extension region acquiring unit in the estimating unit in the image processing device according to the embodiment of the present disclosure
  • FIG. 7 is a block diagram illustrating a detecting unit in the image processing device according to the embodiment of the present disclosure.
  • FIGS. 8( a ) to 8 ( d ) are schematic diagrams illustrating an exemplary application of the detecting unit in the image processing device according to the embodiment of the present disclosure
  • FIG. 9 is a schematic diagram illustrating an exemplary application of an expanding unit in the detecting unit in the image processing device according to the embodiment of the present disclosure.
  • FIG. 10 is a flowchart of an image processing method according to an embodiment of the present disclosure.
  • FIG. 11 is a block diagram illustrating an exemplary structure of a general-purpose personal computer on which the image processing device and method according to the embodiments of the present disclosure can be implemented.
  • Example embodiments are provided so that this disclosure will be thorough, and will fully convey the scope to those who are skilled in the art. Numerous specific details are set forth such as examples of specific components, devices, and methods, to provide a thorough understanding of embodiments of the present disclosure. It will be apparent to those skilled in the art that specific details need not be employed, that example embodiments may be embodied in many different forms and that neither should be construed to limit the scope of the disclosure. In some example embodiments, well-known processes, well-known device structures, and well-known technologies are not described in detail.
  • FIGS. 1( a ) and 1 ( b ) are schematic diagrams illustrating an exemplary image to be dealt with by the technical solution of the present disclosure.
  • the user may hold both sides of the book with the fingers of his/her left hand LH and right hand RH to complete the scanning process, thus obtaining an image as shown in FIG. 1( a ).
  • Known methods in the prior art may be used to correct the obtained image. For example, the upper and lower boundaries of the image may be extracted, and then transformed from curved into flat so as to obtain the corrected image.
  • FIG. 1( b ) shows an example of the corrected image. As shown in FIG. 1( b ), in the corrected scanned image of the book, a finger image F may appear on the side boundaries of the book, and the finger image F may overlap the book content T, making the corrected image less nice-looking. Therefore, it is necessary to remove the finger image F from the corrected image.
  • the image processing device 200 may include an inputting unit 210 , a calculating unit 220 , an estimating unit 230 , an object classifying unit 240 , and a detecting unit 250 .
  • the inputting unit 210 may click on an object image contained in an image to obtain a clicked point. For example, as shown in the left of FIG. 3 , on an image I cropped from the corrected image containing the finger image F, the inputting unit 210 may perform a click on the finger image F to obtain a clicked point P. In this way, it is clear that the clicked point P is within the finger region.
  • the inputting unit 210 may be any device that can perform click function, e.g. a mouse, and the present disclosure has no particular limitations thereto.
  • the calculating unit 220 may calculate an edge map of the image I.
  • the edge map is a map in relation to edge information of the image I.
  • the edge information indicates whether a pixel in the image I is an edge pixel or not.
  • the calculating unit 220 may calculate the edge map based on pixel information of the image I and information of the clicked point P obtained by the inputting unit 210 , or calculate the edge map based on the pixel information of the image I only. This will be described later in detail.
  • the estimating unit 230 may estimate a color model of the finger image (object image) F based on the clicked point P obtained by the inputting unit 210 and the edge map calculated by the calculating unit 220 .
  • the object classifying unit 240 may classify each pixel in the image I, based on the edge map calculated by the calculating unit 220 and the color model estimated by the estimating unit 230 , so as to obtain a binary image of the image I.
  • each pixel in the image I is simply classified as a finger (object) pixel or a non-finger (non-object) pixel.
  • the detecting unit 250 may detect a region containing the finger image F based on the binary image obtained by the object classifying unit 240 . Ideally, as shown in the right of FIG. 3 , the finger region represented with a shaded background may be obtained.
  • both the color model of the finger image and the edge map of the image are used to obtain the binary image of the image. Further, both the information on the clicked point and the edge map of the image are used to estimate the color model of the finger image. Therefore, the accuracy of detecting the finger region can be greatly improved, thus facilitating removal of the finger image from the image and making the processed image more nice-looking.
  • FIG. 4 is a block diagram illustrating a calculating unit 400 in the image processing device according to the embodiment of the present disclosure.
  • the calculating unit 400 shown in FIG. 4 corresponds to the calculating unit 220 shown in FIG. 2 .
  • the calculating unit 400 may include a distance calculating unit 410 , a distance gradient calculating unit 420 , and an edge classifying unit 430 .
  • the distance calculating unit 410 may calculate the distance between the color of each pixel in the image I (see FIG. 3 ) and the color of the clicked point P to obtain a distance map.
  • the color of the clicked point P may be the color of the pixel at the clicked point P, or may be an average color of pixels within a predetermined region containing the clicked point P.
  • the distance calculating unit 410 may calculate the distance between the color of each pixel (x i , y i ) in the image I, color xi, yi , and the color of the clicked point P, color click , according to the equation (1):
  • dist i,j = |color xi,yi − color click |, 1 ≦ y i ≦ h 0 , 1 ≦ x i ≦ w 0   (1)
  • the distance gradient calculating unit 420 may apply a gradient operator (e.g., Sobel operator) to the distance map obtained by the distance calculating unit 410 to obtain a distance gradient image Grad click .
  • the methods for calculating a gradient image are well-known in the prior art, and thus omitted herein.
  • the edge classifying unit 430 may classify pixels having a distance gradient larger than a predetermined distance gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, thereby obtaining an edge map of the image I. Particularly, the edge classifying unit 430 may obtain the edge map of the image I according to the equation (2):
  • Edge click (x i , y i ) = 0 if Grad click (x i , y i ) > T click , and 255 otherwise   (2)
  • T click denotes the predetermined distance gradient threshold
  • Grad click (x i , y i ) denotes the distance gradient between a pixel (x i , y i ) and the clicked point P
  • Edge click (x i , y i ) denotes edge information on whether the pixel (x i , y i ) is an edge pixel or a non-edge pixel.
  • the edge pixels are assigned with the value 0, and the non-edge pixels are assigned with the value 255. In this way, the calculating unit 400 obtains the edge map of the image I.
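As a rough illustration of equations (1) and (2), the following minimal NumPy/SciPy sketch computes the distance map, its Sobel-based gradient magnitude, and the resulting edge map. The function name, the threshold value, and the use of the single clicked pixel's color are illustrative assumptions and not part of the disclosure.

```python
import numpy as np
from scipy import ndimage


def distance_edge_map(image, click_xy, t_click=40.0):
    """Edge map from the color-distance gradient, cf. equations (1) and (2).

    image    : H x W x 3 RGB array
    click_xy : (x, y) coordinates of the clicked point P
    t_click  : distance gradient threshold T_click (assumed value)
    Returns an H x W uint8 map: 0 for edge pixels, 255 for non-edge pixels.
    """
    img = image.astype(np.float64)
    x, y = click_xy
    color_click = img[y, x]                      # color of the pixel at the clicked point

    # Equation (1): distance between each pixel's color and the clicked point's color.
    dist = np.linalg.norm(img - color_click, axis=2)

    # Gradient magnitude of the distance map (Sobel operator, as mentioned above).
    grad_click = np.hypot(ndimage.sobel(dist, axis=1), ndimage.sobel(dist, axis=0))

    # Equation (2): edge pixels are assigned 0, non-edge pixels 255.
    return np.where(grad_click > t_click, 0, 255).astype(np.uint8)
```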
  • the calculating unit 400 may further include a gray converting unit 440 and an intensity gradient calculating unit 450 .
  • the gray converting unit 440 may convert the image I from a color image into a gray image.
  • the intensity gradient calculating unit 450 may apply a gradient operator (e.g., Sobel operator) to the gray image to obtain an intensity gradient image.
  • the edge classifying unit 430 may classify pixels having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, based on the distance gradient image obtained by the distance gradient calculating unit 420 and the intensity gradient image obtained by the intensity gradient calculating unit 450 , thus obtaining an enhanced edge map of the image I.
  • the edge classifying unit 430 may obtain the enhanced edge map of the image I according to the equation (3):
  • Edge enhance (x i , y i ) = 0 if Grad click (x i , y i ) > T click or Grad intensity (x i , y i ) > T intensity , and 255 otherwise   (3)
  • T intensity denotes the predetermined intensity gradient threshold
  • Grad intensity (x i , y i ) denotes the intensity gradient of a pixel (x i , y i )
  • Edge enhance (x i , y i ) denotes enhanced edge information on whether the pixel (x i , y i ) is an edge pixel or a non-edge pixel.
  • the edge pixels are assigned with the value 0, and the non-edge pixels are assigned with the value 255. In this way, the calculating unit 400 obtains the enhanced edge map of the image I.
  • the calculating unit 400 can detect the boundary of the finger image more completely by using the information of both the distance gradient image and the intensity gradient image.
  • the calculating unit 400 may alternatively include only the gray converting unit 440 , the intensity gradient calculating unit 450 , and the edge classifying unit 430 , without the distance calculating unit 410 and the distance gradient calculating unit 420 .
  • the edge classifying unit 430 may classify pixels having an intensity gradient larger than a predetermined intensity gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, based on the intensity gradient image obtained by the intensity gradient calculating unit 450 , thus obtaining the edge map of the image I.
  • the calculating unit 400 calculates the edge map based on pixel information of the image I only, without using the information on the clicked point P.
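A similarly hedged sketch of the enhanced edge map of equation (3), which adds a gray conversion and an intensity-gradient test; the gray-conversion weights and both thresholds are assumptions.

```python
import numpy as np
from scipy import ndimage


def enhanced_edge_map(image, click_xy, t_click=40.0, t_intensity=60.0):
    """Enhanced edge map combining the distance gradient and the intensity
    gradient, cf. equation (3). Returns 0 for edge pixels, 255 otherwise."""
    img = image.astype(np.float64)
    x, y = click_xy

    dist = np.linalg.norm(img - img[y, x], axis=2)       # distance map, equation (1)
    gray = img @ np.array([0.299, 0.587, 0.114])         # assumed RGB-to-gray weights

    def grad_mag(a):                                     # Sobel gradient magnitude
        return np.hypot(ndimage.sobel(a, axis=1), ndimage.sobel(a, axis=0))

    is_edge = (grad_mag(dist) > t_click) | (grad_mag(gray) > t_intensity)
    return np.where(is_edge, 0, 255).astype(np.uint8)
```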
  • the estimating unit 500 in the image processing device is described below in conjunction with FIG. 5 .
  • the estimating unit 500 shown in FIG. 5 corresponds to the estimating unit 230 shown in FIG. 2 .
  • the estimating unit 500 may include an extension region acquiring unit 510 and a color model acquiring unit 520 .
  • the extension region acquiring unit 510 may acquire an extension region containing the clicked point P based on the clicked point P and the edge map obtained by the calculating unit 220 ( 400 ), the extension region being within the finger image F.
  • the shaded part in FIG. 6( d ) represents the extension region.
  • the color model acquiring unit 520 may acquire the color model of the finger image F based on the color of each pixel within the extension region.
  • the extension region acquiring unit 510 acquires the extension region within the finger region F containing the clicked point P. Based on the color of each pixel within the extension region, instead of solely the color of the pixel at the clicked point P, the color model acquiring unit 520 can obtain a stable and effective color model of the finger image F.
  • the extension region acquiring unit 510 may include a setting unit 515 and searching units 511 - 514 .
  • the setting unit 515 may set a maximum extension region E containing the clicked point P, represented with dotted lines in FIG. 6( b ).
  • the searching unit 511 may search for the first one of boundary pixels leftward in a horizontal direction from the clicked point P, as a left boundary pixel of the extension region; and the searching unit 512 may search for the first one of the boundary pixels rightward in the horizontal direction from the clicked point P, as a right boundary pixel of the extension region.
  • the searching unit 513 may search for the first one of the boundary pixels upward in a vertical direction from the reference pixel, as an upper boundary pixel of the extension region; and the searching unit 514 may search for the first one of the boundary pixels downward in the vertical direction from the reference pixel, as a lower boundary pixel of the extension region.
  • the extension region acquiring unit 510 sets a sliding window by taking each pixel within the maximum extension region E as the center, counts the number of edge pixels in the sliding window, and defines a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold as a boundary pixel.
  • the horizontal scope [x 0-ext , x 1-ext ] of the extension region may be determined as follows. For a point (x click-r , y click ) on the right of the clicked point P(x click , y click ) in the horizontal direction from the clicked point within the maximum extension region E, where x click ≦ x click-r ≦ x 1 , a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted.
  • the searching unit 511 may detect from left to right the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the x coordinate of the detected pixel as x 1-ext .
  • if no such pixel is detected, the x coordinate of the right boundary pixel of the maximum extension region E may be designated as x 1-ext .
  • similarly, for each point on the left of the clicked point P in the horizontal direction within the maximum extension region E, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted.
  • the searching unit 512 may detect from right to left the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the x coordinate of the detected pixel as x 0-ext .
  • if no such pixel is detected, the x coordinate of the left boundary pixel of the maximum extension region E may be designated as x 0-ext .
  • the vertical scope [y 0-ext , y 1-ext ] may be determined as follows. For a point (x, y up ) on the upper side of the reference point (x, y click ) in the vertical direction from the reference point within the maximum extension region E, where y 0 ≦ y up ≦ y click , a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted.
  • the searching unit 513 may detect from bottom to top the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the y coordinate of the detected pixel as y 0-ext .
  • if no such pixel is detected, the y coordinate of the upper boundary pixel of the maximum extension region E may be designated as y 0-ext .
  • similarly, for each point on the lower side of the reference point in the vertical direction within the maximum extension region E, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted.
  • the searching unit 514 may detect from top to bottom the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the y coordinate of the detected pixel as y 1-ext .
  • if no such pixel is detected, the y coordinate of the lower boundary pixel of the maximum extension region E may be designated as y 1-ext . In this way, the extension region within the finger region F containing the clicked point P is obtained.
  • the horizontal scope [x 0-ext , x 1-ext ] of the extension region is determined first; and then the vertical scope [y 0-ext , y 1-ext ] of the extension region is determined.
  • the present disclosure is not limited to this.
  • the vertical scope [y 0-ext , y 1-ext ] of the extension region may be determined first; and then the horizontal scope [x 0-ext , x 1-ext ] of the extension region may be determined.
  • the determination method thereof is similar to those described above, and thus omitted herein.
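The search for the extension region might be sketched as below; the maximum extension size, window size, and count threshold are assumed parameters, and for brevity the vertical scope is determined only along the clicked column rather than for every reference pixel between the left and right boundary pixels.

```python
import numpy as np
from scipy import ndimage


def extension_region(edge_map, click_xy, max_half=80, win=7, count_thr=3):
    """Return (x0_ext, x1_ext, y0_ext, y1_ext) of an extension region around
    the clicked point, bounded by 'boundary pixels' whose win x win window
    contains more than count_thr edge pixels (edge pixels have value 0)."""
    h, w = edge_map.shape
    xc, yc = click_xy

    # Count of edge pixels in the sliding window centred on every pixel.
    counts = ndimage.uniform_filter((edge_map == 0).astype(np.float64), size=win) * win * win
    is_boundary = counts > count_thr

    # Maximum extension region E around the clicked point.
    x0, x1 = max(0, xc - max_half), min(w - 1, xc + max_half)
    y0, y1 = max(0, yc - max_half), min(h - 1, yc + max_half)

    def scan(points):
        """First boundary pixel along the scan, else the last point (edge of E)."""
        for x, y in points:
            if is_boundary[y, x]:
                return x, y
        return points[-1]

    x0_ext = scan([(x, yc) for x in range(xc, x0 - 1, -1)])[0]   # leftward search
    x1_ext = scan([(x, yc) for x in range(xc, x1 + 1)])[0]       # rightward search
    y0_ext = scan([(xc, y) for y in range(yc, y0 - 1, -1)])[1]   # upward search
    y1_ext = scan([(xc, y) for y in range(yc, y1 + 1)])[1]       # downward search
    return x0_ext, x1_ext, y0_ext, y1_ext
```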
  • the color model acquiring unit 520 may acquire the color model of the finger image.
  • the color model of the finger image may be obtained by means of Gaussian Mixture Model, skin color threshold, histogram model with Bayes classifiers, etc.
  • a specific exemplary method for obtaining the color model is given below. Those skilled in the art shall understand that other methods that are different from the specific exemplary method may also be used for obtaining the color model.
  • any point in the extension region is represented as (x i , y i ), where 0 ≦ i ≦ N − 1, and N denotes the number of pixels in the extension region.
  • r′ i and g′ i may be calculated by:
  • r′ i = r i / (r i + g i + b i )   (4)
  • g′ i = g i / (r i + g i + b i )   (5)
  • r i , g i and b i denote r, g and b values of the pixel (x i , y i ), respectively.
  • K-means Clustering algorithm may be used to obtain K clusters.
  • the K-means Clustering algorithm may be applied to the pixels in the extension region, so that the pixels in the extension region are clustered into K clusters (w i , C i ), where 0 ≦ i ≦ K − 1 and K is a natural number.
  • w i denotes the weight of a cluster C i and equals the ratio of the number of pixels in the cluster C i to the number of all the pixels in the extension region.
  • the pixels in each cluster are used to calculate a mean vector m̄ i and a covariance matrix S i of the color characteristics of the pixels in that cluster as follows:
  • m̄ i = (1 / Num i ) Σ i∈C i f i   (6)
  • S i = (1 / Num i ) Σ i∈C i (f i − m̄ i )(f i − m̄ i ) T   (7)
  • Num i denotes the number of the pixels in the cluster C i .
  • the Mahalanobis distance Ma_d(i, j, C k ) between the color characteristic of each pixel (i, j) in the extension region and any cluster C k may be calculated by:
  • a weighted Mahalanobis distance d(i, j) between the color characteristic of each pixel (i, j) in the extension region and K clusters may be calculated by:
  • a predetermined threshold which causes the ratio of the number of pixels having a weighted Mahalanobis distance smaller than the predetermined threshold to the number of all the pixels in the extension region to be equal to a setting ratio may be determined as the color threshold T color .
  • the distances d(i, j) of the pixels may be sorted from smallest to largest, and the color threshold may be selected according to a setting ratio (e.g., 0.98).
  • the color threshold is selected such that the ratio of the number of pixels having a distance smaller than the threshold to the number of all the pixels in the extension region is equal to the setting ratio.
  • the estimated color model includes K Gaussian models (w i , m̄ i , S i ), where 0 ≦ i ≦ K − 1, and the color threshold T color .
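Under several stated assumptions — the (r′, g′) characteristics of equations (4) and (5), scikit-learn's KMeans standing in for an unspecified K-means implementation, a weight-w_k-weighted sum of per-cluster Mahalanobis distances, and a quantile-based color threshold — the color-model estimation might look like this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans


def estimate_color_model(image, region, K=3, ratio=0.98):
    """Estimate K Gaussian clusters (w_i, m_i, S_i) and a color threshold T_color
    from the pixels of the extension region (x0, x1, y0, y1), inclusive bounds."""
    x0, x1, y0, y1 = region
    rgb = image[y0:y1 + 1, x0:x1 + 1].astype(np.float64).reshape(-1, 3)

    # Equations (4) and (5): normalized color characteristics (r', g').
    feats = rgb[:, :2] / (rgb.sum(axis=1, keepdims=True) + 1e-9)

    labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(feats)

    model = []
    for k in range(K):
        fk = feats[labels == k]
        w_k = len(fk) / len(feats)                        # cluster weight
        m_k = fk.mean(axis=0)                             # mean vector, equation (6)
        S_k = np.cov(fk.T, bias=True) + 1e-9 * np.eye(2)  # covariance, equation (7)
        model.append((w_k, m_k, np.linalg.inv(S_k)))

    # Weighted Mahalanobis distance of every region pixel to the K clusters
    # (the exact weighting scheme is an assumption).
    d = np.zeros(len(feats))
    for w_k, m_k, S_inv in model:
        diff = feats - m_k
        d += w_k * np.sqrt(np.einsum('ij,jk,ik->i', diff, S_inv, diff))

    t_color = np.quantile(d, ratio)   # threshold so that `ratio` of region pixels fall below it
    return model, t_color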
  • the object classifying unit 240 may classify each pixel in the image I, based on the edge map of the image I and the color model of the finger image, so as to obtain the binary image of the image I.
  • the object classifying unit 240 may classify pixels in the image I that are non-edge pixels in the edge map and have a distance from the color model smaller than the color threshold as finger (object) pixels, and the other pixels in the image I as non-finger (non-object) pixels.
  • the object classifying unit 240 may classify each pixel (i, j) in the image I as follows. First, the color characteristic vector of the pixel (i, j) is calculated according to equations (4) and (5). Then, the distance between the pixel (i, j) and the color model is calculated according to equations (8) and (9). Finally, the pixel (i, j) is classified according to the equation (10):
  • Edge enhance (i, j) can be calculated by equation (3), and d(i, j) can be calculated by equation (9).
  • the binary image of the image I including pixels with values being 0 or 255 only may be obtained. Specifically, value 0 shows that the pixel is closer to a finger pixel, while value 255 shows that the pixel is closer to a non-finger pixel.
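Combining the enhanced edge map and the color model, the per-pixel classification could be sketched as follows (0 marks finger pixels and 255 non-finger pixels, as in the text; the distance convention matches the assumption made in the previous sketch).

```python
import numpy as np


def classify_pixels(image, edge_map, model, t_color):
    """Binary image: 0 where a pixel is a non-edge pixel close to the color model
    (a likely finger pixel), 255 elsewhere."""
    img = image.astype(np.float64)
    feats = (img[..., :2] / (img.sum(axis=2, keepdims=True) + 1e-9)).reshape(-1, 2)

    d = np.zeros(feats.shape[0])
    for w_k, m_k, S_inv in model:                         # weighted Mahalanobis distance
        diff = feats - m_k
        d += w_k * np.sqrt(np.einsum('ij,jk,ik->i', diff, S_inv, diff))
    d = d.reshape(edge_map.shape)

    is_finger = (edge_map == 255) & (d < t_color)         # non-edge and close to the model
    return np.where(is_finger, 0, 255).astype(np.uint8)
```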
  • the detecting unit 700 of the image processing device is described below in conjunction with FIG. 7 .
  • the detecting unit 700 shown in FIG. 7 corresponds to the detecting unit 250 shown in FIG. 2 .
  • the detecting unit 700 may include a noise removing unit 710 for removing noise components in the binary image.
  • the noise removing unit 710 may set a sliding window in the binary image, and count the number of finger pixels (i.e., pixels whose pixel values are 0) in the sliding window. If the number of finger pixels in the sliding window is smaller than a predetermined threshold, it is determined that the finger pixels are actually noise pixels and the pixels are set as non-finger pixels, i.e., the values of the pixels are converted from 0 into 255.
  • the noise removing unit 710 may further perform connected component analysis on the binary image to obtain connected components in each of which every pixel is a finger pixel. Then, the pixel converting unit 712 may convert all the finger pixels in a connected component into non-finger pixels if the connected component satisfies any of the following conditions:
  • the area of the connected component is smaller than a predetermined area threshold;
  • the aspect ratio of the connected component is larger than a predetermined ratio;
  • the finger image is on the left side of the image, and the distance between a left boundary of the connected component and a left boundary of the image is larger than a predetermined threshold; or
  • the finger image is on the right side of the image, and the distance between a right boundary of the connected component and a right boundary of the image is larger than the predetermined threshold.
  • Regarding condition 1), the finger image takes up a certain area; when the area of the connected component is too small, the connected component is unlikely to be the finger image and is instead likely to be a noise component.
  • Regarding condition 2), the finger image has a certain aspect ratio. As shown in FIG. 8( a ), when the aspect ratio is too large, the connected component is more likely to be a noise component such as the book content T, and unlikely to be the finger image F 1 or F 2 .
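A sketch of this connected-component filtering with scipy.ndimage; the area, aspect-ratio and border-distance thresholds, and the simple left/right test, are illustrative assumptions consistent with the conditions above.

```python
import numpy as np
from scipy import ndimage


def remove_noise_components(binary, finger_on_left=True, min_area=200,
                            max_aspect=5.0, max_border_dist=40):
    """Convert connected components of finger pixels (value 0) that are unlikely
    to be a finger back into non-finger pixels (value 255)."""
    out = binary.copy()
    labels, num = ndimage.label(binary == 0)
    _, w = binary.shape

    for idx, box in enumerate(ndimage.find_objects(labels), start=1):
        if box is None:
            continue
        ys, xs = box
        mask = labels[box] == idx
        area = int(mask.sum())
        comp_h, comp_w = ys.stop - ys.start, xs.stop - xs.start
        aspect = max(comp_h, comp_w) / max(1, min(comp_h, comp_w))
        border_dist = xs.start if finger_on_left else w - xs.stop

        if area < min_area or aspect > max_aspect or border_dist > max_border_dist:
            out[box][mask] = 255          # treat the whole component as noise
    return out
```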
  • the detecting unit 700 may further include a connected component processing unit 720 and a filling unit 730 .
  • the connected component processing unit 720 may, according to the clicked point, acquire a connected component F 1 where the clicked point is located, and search for a nearby connected component F 2 in a vertical direction.
  • the filling unit 730 may perform filling operation on the connected component F 1 containing the clicked point and the found connected component F 2 (i.e., region F′), thereby obtaining a filled connected component F′′.
  • because the finger image may be divided into multiple separate parts (e.g., F 1 and F 2 shown in FIG. 8( a ) and FIG. 8( b )), the connected component F 1 containing the clicked point is combined with the nearby connected component F 2 found in the vertical direction from F 1 .
  • the filling operation may be used to fill the holes. Specifically, for each column of the image I, the uppermost and lowermost finger pixels (i.e., pixels whose pixel values are 0) are detected, and then all pixels between the two pixels are set as finger pixels. After the filling operation, the hole regions on the finger are filled, as shown in FIG. 8( d ).
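A short sketch of the column-wise filling operation described above (finger pixels have value 0).

```python
import numpy as np


def fill_columns(binary):
    """For each column, set every pixel between the uppermost and lowermost
    finger pixels (value 0) to a finger pixel, which closes holes in the region."""
    out = binary.copy()
    for x in range(binary.shape[1]):
        rows = np.flatnonzero(binary[:, x] == 0)
        if rows.size:
            out[rows[0]:rows[-1] + 1, x] = 0
    return out
```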
  • the detecting unit 700 may further include an expanding unit 740 for performing an expanding operation on the filled connected component in the binary image.
  • as shown in FIG. 9 , because the boundary of the finger image may not be entirely contained in the detected finger region A, it is necessary to perform an expanding operation so as to expand the finger region A into a region A′. Specific methods for the expanding operation are well-known in the prior art, and the present disclosure has no particular limitations thereto.
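One common way to realize such an expanding operation is morphological dilation of the finger mask; the structuring element and iteration count below are assumptions.

```python
import numpy as np
from scipy import ndimage


def expand_region(binary, iterations=3):
    """Dilate the finger mask (value 0) so that the detected region also covers
    the boundary of the finger image."""
    grown = ndimage.binary_dilation(binary == 0,
                                    structure=np.ones((3, 3), bool),
                                    iterations=iterations)
    return np.where(grown, 0, 255).astype(np.uint8)
```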
  • in step S 110 , a click is performed on an object image contained in an image to obtain a clicked point.
  • in step S 120 , an edge map of the image is calculated.
  • in step S 130 , a color model of the object image is estimated based on the clicked point and the edge map.
  • in step S 140 , each pixel in the image is classified based on the edge map and the color model, so as to obtain a binary image of the image.
  • in step S 150 , a region containing the object image is detected based on the binary image.
  • the distance between the color of each pixel in the image and the color of the clicked point may be calculated to obtain a distance map.
  • a gradient operator may be applied to the distance map to obtain a distance gradient image. If a pixel in the image has a distance gradient larger than a predetermined distance gradient threshold, the pixel is classified as an edge pixel; otherwise, the pixel is classified as a non-edge pixel.
  • the distance between the color of each pixel in the image and the color of the clicked point may be calculated to obtain a distance map.
  • a gradient operator may be applied to the distance map to obtain a distance gradient image.
  • the image may be converted from a color image into a gray image, and a gradient operator may be applied to the gray image to obtain an intensity gradient image. If a pixel in the image has a distance gradient larger than a predetermined distance gradient threshold or has an intensity gradient larger than a predetermined intensity gradient threshold, the pixel is classified as an edge pixel; otherwise, the pixel is classified as a non-edge pixel.
  • an extension region containing the clicked point may be acquired based on the clicked point and the edge map, the extension region being within the object image. Then, the color model of the object image may be acquired based on a color of each pixel within the extension region.
  • a maximum extension region containing the clicked point may be set. Then, the first one of boundary pixels leftward in a horizontal direction from the clicked point may be searched for, as a left boundary pixel of the extension region; and the first one of the boundary pixels rightward in the horizontal direction from the clicked point may be searched for, as a right boundary pixel of the extension region.
  • the first one of the boundary pixels upward in a vertical direction from the reference pixel may be searched for, as an upper boundary pixel of the extension region; and the first one of the boundary pixels downward in the vertical direction from the reference pixel may be searched for, as a lower boundary pixel of the extension region.
  • a sliding window is set taking the pixel as the center, the number of edge pixels in the sliding window is counted, and a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold is defined as a boundary pixel.
  • if a pixel in the image is a non-edge pixel in the edge map and has a distance from the color model less than a color threshold, the pixel is classified as an object pixel; otherwise, the pixel is classified as a non-object pixel.
  • a noise component in the binary image may be removed.
  • connected component analysis may be performed on the binary image, so as to obtain a connected component in the binary image, each of pixels in the connected component being an object pixel. And each object pixel in the connected component is converted into a non-object pixel if the connected component satisfies any of the conditions of:
  • an aspect ratio of the connected component being larger than a predetermined ratio;
  • the object image being on the left side of the image and the distance between a left boundary of the connected component and a left boundary of the image being larger than a predetermined threshold; or
  • the object image being on the right side of the image and the distance between a right boundary of the connected component and a right boundary of the image being larger than the predetermined threshold.
  • a connected component where the clicked point is located may be acquired according to the clicked point, and a nearby connected component is searched for in a vertical direction. Then, a filling operation may be performed on the connected component containing the clicked point and the found connected component, thereby obtaining a filled connected component.
  • an expanding operation may further be performed on the filled connected component in the binary image.
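Chaining steps S110 to S150, the following is a hypothetical end-to-end driver built from the sketch functions introduced earlier in this document; every function name and parameter here is illustrative and not part of the disclosure.

```python
def detect_finger_region(image, click_xy, finger_on_left=True):
    """Semi-automatic finger-region detection: clicked point -> edge map ->
    color model -> binary image -> cleaned, filled and expanded finger region."""
    edge_map = enhanced_edge_map(image, click_xy)               # step S120
    region = extension_region(edge_map, click_xy)               # part of step S130
    model, t_color = estimate_color_model(image, region)        # step S130
    binary = classify_pixels(image, edge_map, model, t_color)   # step S140
    binary = remove_noise_components(binary, finger_on_left)    # step S150: noise removal
    binary = fill_columns(binary)                               # step S150: filling
    return expand_region(binary)                                # step S150: expanding
```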
  • respective operating processes of the image processing method above according to the present disclosure can be implemented in a manner of a computer executable program stored on a machine-readable storage medium.
  • the object of the present disclosure can be implemented in a manner that the storage medium on which the computer executable program above is carried is provided directly or indirectly to a system or apparatus, a computer or a Central Processing Unit (CPU) of which reads out and executes the computer executable program.
  • the implementation of the present disclosure is not limited to a program as long as the system or apparatus has a function to execute the program, and the program can be in arbitrary forms such as an object program, a program executed by an interpreter, a script program provided to an operating system, etc.
  • the machine-readable storage medium mentioned above includes, but is not limited to, various memories and storage devices, a semiconductor device, a disk unit such as an optical disk, a magnetic disk and a magneto-optical disk, and other media suitable for storing information.
  • the present disclosure can also be implemented by connecting to a corresponding web site on the Internet through a computer, downloading and installing the computer executable program according to the invention into the computer, and then executing the program.
  • FIG. 11 is a block diagram illustrating an exemplary structure of a general-purpose personal computer on which the image processing device and method according to the embodiments of the present disclosure can be implemented.
  • a CPU 1301 executes various processing according to a program stored in a Read Only Memory (ROM) 1302 or a program loaded to a Random Access Memory (RAM) 1303 from a storage device 1308 .
  • the CPU 1301 , the ROM 1302 and the RAM 1303 are connected to each other via a bus 1304 .
  • An input/output interface 1305 is also connected to the bus 1304 .
  • the following components are connected to the input/output interface 1305 : an input device 1306 including a keyboard, a mouse and the like, an output device 1307 including a display such as a Cathode Ray Tube (CRT) and a Liquid Crystal Display (LCD), a speaker and the like, the storage device 1308 including a hard disk and the like, and a communication device 1309 including a network interface card such as a LAN card, a modem and the like.
  • the communication device 1309 performs communication processing via a network such as the Internet.
  • a drive 1310 can also be connected to the input/output interface 1305 .
  • a removable medium 1311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory and the like is mounted on the drive 1310 as necessary such that a computer program read out therefrom is installed in the storage device 1308 .
  • a program constituting the software is installed from the network such as the Internet or the storage medium such as the removable medium 1311 .
  • the storage medium is not limited to the removable medium 1311 shown in FIG. 11 in which the program is stored and which is distributed separately from the device so as to provide the program to the user.
  • examples of the removable medium 1311 include a magnetic disk including a Floppy Disk (registered trademark), an optical disk including a Compact Disk Read Only Memory (CD-ROM) and a Digital Versatile Disc (DVD), a magneto-optical disk including a MiniDisc (MD) (registered trademark), and a semiconductor memory.
  • the storage medium may be the ROM 1302 , the hard disk contained in the storage device 1308 or the like.
  • the program is stored in the storage medium, and the storage medium is distributed to the user together with the device containing the storage medium.
  • An image processing device comprising:
  • an inputting unit for performing a click on an object image contained in an image to obtain a clicked point
  • a calculating unit for calculating an edge map of the image
  • an estimating unit for estimating a color model of the object image based on the clicked point and the edge map
  • an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image
  • a detecting unit for detecting a region containing the object image based on the binary image.
  • a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map
  • a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image
  • an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map
  • a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image
  • a gray converting unit for converting the image from a color image to a gray image
  • an intensity gradient calculating unit for applying a gradient operator to the gray image to obtain an intensity gradient image
  • an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • the estimating unit comprises:
  • an extension region acquiring unit for acquiring an extension region containing the clicked point based on the clicked point and the edge map, the extension region being within the object image
  • a color model acquiring unit for acquiring the color model of the object image based on a color of each pixel within the extension region.
  • extension region acquiring unit comprises:
  • a setting unit for setting a maximum extension region containing the clicked point
  • a first searching unit for searching, as a left boundary pixel of the extension region, the first one of boundary pixels leftward in a horizontal direction from the clicked point;
  • a second searching unit for searching, as a right boundary pixel of the extension region, the first one of the boundary pixels rightward in the horizontal direction from the clicked point;
  • a third searching unit for searching, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, as an upper boundary pixel of the extension region, the first one of the boundary pixels upward in a vertical direction from the reference pixel;
  • a fourth searching unit for searching, as a lower boundary pixel of the extension region, the first one of the boundary pixels downward in the vertical direction from the reference pixel, wherein,
  • the extension region acquiring unit sets a sliding window taking each pixel within the maximum extension region as a center, counts the number of edge pixels in the sliding window, and defines a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold as the boundary pixel.
  • the object classifying unit classifies a pixel in the image which is a non-edge pixel in the edge map and whose distance from the color model is less than a color threshold into an object pixel, and the other pixel in the image into a non-object pixel.
  • the detecting unit comprises a noise removing unit for removing noise component in the binary image.
  • the noise removing unit comprises:
  • a connected component analyzing unit for performing connected component analysis algorithm on the binary image, so as to obtain a connected component in the binary image, each of pixels in the connected component being an object pixel;
  • a pixel converting unit for converting each object pixel in the connected component into a non-object pixel if the connected component satisfies any of conditions of:
  • an aspect ratio of the connected component being larger than a predetermined ratio
  • the object image being on the left side of the image and a distance between a left boundary of the connected component and a left boundary of the image being larger than a predetermined threshold;
  • the object image being on the right side of the image and a distance between a right boundary of the connected component and a right boundary of the image being larger than the predetermined threshold.
  • the detecting unit further comprises:
  • a connected component processing unit for acquiring a connected component where the clicked point is located according to the clicked point and searching for a nearby connected component in a vertical direction;
  • a filling unit for performing filling operation on the connected component containing the clicked point and the searched connected component, so as to obtain the filled connected component.
  • the detecting unit further comprises:
  • an expanding unit for performing an expanding operation on the filled connected component in the binary image.
  • the color of the clicked point is the color of a pixel at the clicked point, or an average color of pixels within a predetermined region containing the clicked point.
  • An image processing method comprising:
  • step of estimating a color model of the object image based on the clicked point and the edge map comprises:
  • step of acquiring an extension region containing the clicked point based on the clicked point and the edge map comprises:
  • a sliding window is set taking each pixel within the maximum extension region as a center, the number of edge pixels in the sliding window is counted, and a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold is defined as the boundary pixel.
  • a program product comprising a machine-readable instruction code stored therein, wherein the instruction code, when read and executed by a computer, enables the computer to execute the method according to any of Appendixes 13-18.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image processing device includes: an inputting unit for performing a click on an object image contained in an image to obtain a clicked point; a calculating unit for calculating an edge map of the image; an estimating unit for estimating a color model of the object image based on the clicked point and the edge map; an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and a detecting unit for detecting a region containing the object image based on the binary image. The image processing device and method according to the present disclosure can improve the accuracy of detecting the boundary of an object image such as a finger image, thus facilitating removal of the object image from the image and making the processed image more nice-looking.

Description

    FIELD OF THE INVENTION
  • The present disclosure relates to the field of image processing, particularly to a device and a method for detecting the boundary of an object image such as a finger image.
  • BACKGROUND OF THE INVENTION
  • This section provides background information relating to the present disclosure, which is not necessarily prior art.
  • When scanning a book using an overhead scanner, for example, the user may hold both sides of the book with his/her fingers to complete the scanning process. The fingers may appear on the side boundaries of the book in the corrected scanned image of the book, making the corrected image less nice-looking. Therefore, it is necessary to remove the finger image in the corrected image.
  • In order to remove the finger image, generally two steps are to be taken: first, detecting the finger region; and secondly, removing the finger region. Clearly, automatic finger region detection and removal are useful. However, considering the variety of types of book contents and the possibility that the fingers may overlap the book content, it is difficult to correctly detect the finger region.
  • SUMMARY OF THE INVENTION
  • This section provides a general summary of the present disclosure, and is not a comprehensive disclosure of its full scope or all of its features.
  • An object of the present disclosure is to provide an image processing device and an image processing method, which can improve the accuracy of detecting the boundary of an object image such as a finger image, thus facilitating removal of the object image from the image and making the processed image more nice-looking.
  • According to an aspect of the present disclosure, there is provided an image processing device including: an inputting unit for performing a click on an object image contained in an image to obtain a clicked point; a calculating unit for calculating an edge map of the image; an estimating unit for estimating a color model of the object image based on the clicked point and the edge map; an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and a detecting unit for detecting a region containing the object image based on the binary image.
  • According to another aspect of the present disclosure, there is provided an image processing method including: performing a click on an object image contained in an image to obtain a clicked point; calculating an edge map of the image; estimating a color model of the object image based on the clicked point and the edge map; classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and detecting a region containing the object image based on the binary image.
  • According to another aspect of the present disclosure, there is provided a program product including machine-readable instruction code stored therein which, when read and executed by a computer, causes the computer to perform the image processing method according to the present disclosure.
  • According to another aspect of the present disclosure, there is provided a machine-readable storage medium carrying the program product according to the present disclosure thereon.
  • The image processing device and method according to the present disclosure require user interaction to obtain information on the clicked point. Further, the image processing device and method according to the present disclosure use color information and edge information to detect the boundary of an object image such as a finger image. Accordingly, the image processing device and method according to the present disclosure can improve the accuracy of detecting the boundary of an object image, thus facilitating removal of the object image from the image and making the processed image more nice-looking.
  • Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure. In the drawings:
  • FIGS. 1( a) and 1(b) are schematic diagrams illustrating an exemplary image to be dealt with by the technical solution of the present disclosure;
  • FIG. 2 is a block diagram illustrating an image processing device according to an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram illustrating an exemplary application of the image processing device according to the embodiment of the present disclosure;
  • FIG. 4 is a block diagram illustrating a calculating unit in the image processing device according to the embodiment of the present disclosure;
  • FIG. 5 is a block diagram illustrating an estimating unit in the image processing device according to the embodiment of the present disclosure;
  • FIGS. 6( a) to 6(d) are schematic diagrams illustrating an exemplary application of an extension region acquiring unit in the estimating unit in the image processing device according to the embodiment of the present disclosure;
  • FIG. 7 is a block diagram illustrating a detecting unit in the image processing device according to the embodiment of the present disclosure;
  • FIGS. 8( a) to 8(d) are schematic diagrams illustrating an exemplary application of the detecting unit in the image processing device according to the embodiment of the present disclosure;
  • FIG. 9 is a schematic diagram illustrating an exemplary application of an expanding unit in the detecting unit in the image processing device according to the embodiment of the present disclosure;
  • FIG. 10 is a flowchart of an image processing method according to an embodiment of the present disclosure; and
  • FIG. 11 is a block diagram illustrating an exemplary structure of a general-purpose personal computer on which the image processing device and method according to the embodiments of the present disclosure can be implemented.
  • While the present disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific embodiments is not intended to limit the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure. Note that corresponding reference numerals indicate corresponding parts throughout the several views of the drawings.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • Examples of the present disclosure will now be described more fully with reference to the accompanying drawings. The following description is merely exemplary in nature and is not intended to limit the present disclosure, application, or uses.
  • Example embodiments are provided so that this disclosure will be thorough, and will fully convey the scope to those who are skilled in the art. Numerous specific details are set forth such as examples of specific components, devices, and methods, to provide a thorough understanding of embodiments of the present disclosure. It will be apparent to those skilled in the art that specific details need not be employed, that example embodiments may be embodied in many different forms and that neither should be construed to limit the scope of the disclosure. In some example embodiments, well-known processes, well-known device structures, and well-known technologies are not described in detail.
  • FIGS. 1( a) and 1(b) are schematic diagrams illustrating an exemplary image to be dealt with by the technical solution of the present disclosure. When scanning a book B using an overhead scanner, for example, the user may hold both sides of the book with the fingers of his/her left hand LH and right hand RH to complete the scanning process, thus obtaining an image as shown in FIG. 1( a). Known methods in the prior art may be used to correct the obtained image. For example, the upper and lower boundaries of the image may be extracted, and then transformed from curved into flat so as to obtain the corrected image. FIG. 1( b) shows an example of the corrected image. As shown in FIG. 1( b), in the corrected scanned image of the book, a finger image F may appear on the side boundaries of the book, and the finger image F may overlap the book content T, making the corrected image less nice-looking. Therefore, it is necessary to remove the finger image F from the corrected image.
  • In order to remove the finger image F, generally two steps are to be taken: first, detecting the finger region; and secondly, removing the finger region. By using the technical solution of the present disclosure, the accuracy of detecting the finger region as shown in FIG. 1( b) can be improved, thus facilitating removal of the finger region and making the corrected scanned image of the book more nice-looking.
  • As shown in FIG. 2, the image processing device 200 according to an embodiment of the present disclosure may include an inputting unit 210, a calculating unit 220, an estimating unit 230, an object classifying unit 240, and a detecting unit 250.
  • The inputting unit 210 may click on an object image contained in an image to obtain a clicked point. For example, as shown in the left of FIG. 3, on an image I cropped from the corrected image containing the finger image F, the inputting unit 210 may perform a click on the finger image F to obtain a clicked point P. In this way, it is clear that the clicked point P is within the finger region. The inputting unit 210 may be any device that can perform click function, e.g. a mouse, and the present disclosure has no particular limitations thereto.
  • The calculating unit 220 may calculate an edge map of the image I. The edge map is a map in relation to edge information of the image I. The edge information indicates whether a pixel in the image I is an edge pixel or not. The calculating unit 220 may calculate the edge map based on pixel information of the image I and information of the clicked point P obtained by the inputting unit 210, or calculate the edge map based on the pixel information of the image I only. This will be described later in detail.
  • The estimating unit 230 may estimate a color model of the finger image (object image) F based on the clicked point P obtained by the inputting unit 210 and the edge map calculated by the calculating unit 220.
  • Further, the object classifying unit 240 may classify each pixel in the image I, based on the edge map calculated by the calculating unit 220 and the color model estimated by the estimating unit 230, so as to obtain a binary image of the image I. In the binary image, each pixel in the image I is simply classified as a finger (object) pixel or a non-finger (non-object) pixel.
  • Further, the detecting unit 250 may detect a region containing the finger image F based on the binary image obtained by the object classifying unit 240. Ideally, as shown in the right of FIG. 3, the finger region represented with a shaded background may be obtained.
  • In the image processing device 200 according to the embodiment of the present disclosure, both the color model of the finger image and the edge map of the image are used to obtain the binary image of the image. Further, both the information on the clicked point and the edge map of the image are used to estimate the color model of the finger image. Therefore, the accuracy of detecting the finger region can be greatly improved, thus facilitating removal of the finger image from the image and making the processed image more nice-looking.
  • In order to provide a better understanding of the technical solution of the present disclosure, the components of the image processing device 200 shown in FIG. 2 are described below in more detail.
  • FIG. 4 is a block diagram illustrating a calculating unit 400 in the image processing device according to the embodiment of the present disclosure. The calculating unit 400 shown in FIG. 4 corresponds to the calculating unit 220 shown in FIG. 2.
  • The calculating unit 400 may include a distance calculating unit 410, a distance gradient calculating unit 420, and an edge classifying unit 430.
  • The distance calculating unit 410 may calculate the distance between the color of each pixel in the image I (see FIG. 3) and the color of the clicked point P to obtain a distance map. The color of the clicked point P may be the color of the pixel at the clicked point P, or may be an average color of pixels within a predetermined region containing the clicked point P.
  • Specifically, assume that the width and height of the image I are $w_0$ and $h_0$, respectively, that the coordinates of the clicked point P in the image I are $(x_{click}, y_{click})$, and that the color of the clicked point P is represented by $color_{click} = (r_{click}, g_{click}, b_{click})$, where $r_{click}$, $g_{click}$ and $b_{click}$ are the R, G and B values of the color of the clicked point P, respectively. The distance calculating unit 410 may calculate the distance between the color $color_{x_i,y_i}$ of each pixel $(x_i, y_i)$ in the image I and the color $color_{click}$ of the clicked point P according to equation (1):
  • $dist_{x_i,y_i} = \left| color_{x_i,y_i} - color_{click} \right|, \quad 1 \le x_i \le w_0, \; 1 \le y_i \le h_0 \qquad (1)$
  • In this way, the distance map of the image I can be obtained.
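  • By way of illustration only, the distance map computation of equation (1) may be sketched as follows in Python with NumPy; the array and function names are exemplary and do not appear in the original description, and the Euclidean RGB distance is one possible choice of color distance.

```python
import numpy as np

def distance_map(img, x_click, y_click):
    """Euclidean RGB distance between every pixel and the clicked point's color."""
    color_click = img[y_click, x_click].astype(np.float32)   # color of the clicked point P
    diff = img.astype(np.float32) - color_click              # per-channel difference
    return np.sqrt((diff ** 2).sum(axis=2))                  # h0 x w0 distance map
```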
  • Further, the distance gradient calculating unit 420 may apply a gradient operator (e.g., Sobel operator) to the distance map obtained by the distance calculating unit 410 to obtain a distance gradient image Gradclick. The methods for calculating a gradient image are well-known in the prior art, and thus omitted herein.
  • Further, based on the distance gradient image Gradclick obtained by the distance gradient calculating unit 420, the edge classifying unit 430 may classify pixels having a distance gradient larger than a predetermined distance gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, thereby obtaining an edge map of the image I. Particularly, the edge classifying unit 430 may obtain the edge map of the image I according to the equation (2):
  • $Edge_{click}(x_i, y_i) = \begin{cases} 0, & \text{if } Grad_{click}(x_i, y_i) > T_{click} \\ 255, & \text{else} \end{cases} \qquad (2)$
  • where $T_{click}$ denotes the predetermined distance gradient threshold, $Grad_{click}(x_i, y_i)$ denotes the distance gradient at the pixel $(x_i, y_i)$, and $Edge_{click}(x_i, y_i)$ denotes edge information on whether the pixel $(x_i, y_i)$ is an edge pixel or a non-edge pixel. Specifically, the edge pixels are assigned the value 0, and the non-edge pixels are assigned the value 255. In this way, the calculating unit 400 obtains the edge map of the image I.
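  • As a non-limiting sketch of equation (2), the Sobel operator of OpenCV may be applied to the distance map and the gradient magnitude thresholded; the Sobel kernel size and the threshold $T_{click}$ are exemplary assumptions.

```python
import cv2
import numpy as np

def edge_map_from_distance(dist_map, t_click):
    """Distance-gradient edge map: 0 marks edge pixels, 255 marks non-edge pixels."""
    gx = cv2.Sobel(dist_map, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(dist_map, cv2.CV_32F, 0, 1, ksize=3)
    grad_click = cv2.magnitude(gx, gy)                        # distance gradient image
    edge_click = np.where(grad_click > t_click, 0, 255).astype(np.uint8)
    return edge_click, grad_click
```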
  • According to a preferred embodiment of the present disclosure, the calculating unit 400 may further include a gray converting unit 440 and an intensity gradient calculating unit 450. The gray converting unit 440 may convert the image I from a color image into a gray image. The intensity gradient calculating unit 450 may apply a gradient operator (e.g., Sobel operator) to the gray image to obtain an intensity gradient image. The methods for converting a color image into a gray image and the methods for calculating an intensity gradient image are well-known in the prior art, and thus omitted herein.
  • In this case, the edge classifying unit 430 may classify pixels having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, based on the distance gradient image obtained by the distance gradient calculating unit 420 and the intensity gradient image obtained by the intensity gradient calculating unit 450, thus obtaining an enhanced edge map of the image I. Particularly, the edge classifying unit 430 may obtain the enhanced edge map of the image I according to the equation (3):
  • $Edge_{enhance}(x_i, y_i) = \begin{cases} 0, & \text{if } Grad_{click}(x_i, y_i) > T_{click} \text{ or } Grad_{intensity}(x_i, y_i) > T_{intensity} \\ 255, & \text{else} \end{cases} \qquad (3)$
  • where $T_{intensity}$ denotes the predetermined intensity gradient threshold, $Grad_{intensity}(x_i, y_i)$ denotes the intensity gradient at the pixel $(x_i, y_i)$, and $Edge_{enhance}(x_i, y_i)$ denotes enhanced edge information on whether the pixel $(x_i, y_i)$ is an edge pixel or a non-edge pixel. Specifically, the edge pixels are assigned the value 0, and the non-edge pixels are assigned the value 255. In this way, the calculating unit 400 obtains the enhanced edge map of the image I.
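  • A corresponding sketch of the enhanced edge map of equation (3) is given below; it assumes an OpenCV-style BGR color image, reuses the distance gradient image obtained above, and uses exemplary threshold values.

```python
import cv2
import numpy as np

def enhanced_edge_map(img, grad_click, t_click, t_intensity):
    """Combine distance-gradient and intensity-gradient edges into one edge map."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    grad_intensity = cv2.magnitude(gx, gy)                    # intensity gradient image
    is_edge = (grad_click > t_click) | (grad_intensity > t_intensity)
    return np.where(is_edge, 0, 255).astype(np.uint8)         # 0 = edge, 255 = non-edge
```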
  • Because the distance gradient image and intensity gradient image of the image I are complementary to a certain degree, the calculating unit 400 can detect the boundary of the finger image more completely by means of the information of both images.
  • It is noted that the calculating unit 400 may as well include only the gray converting unit 440, the intensity gradient calculating unit 450, and the edge classifying unit 430, without the distance calculating unit 410 and the distance gradient calculating unit 420. In this case, the edge classifying unit 430 may classify pixels having an intensity gradient larger than a predetermined intensity gradient threshold in the image I as edge pixels, and the other pixels in the image I as non-edge pixels, based on the intensity gradient image obtained by the intensity gradient calculating unit 450, thus obtaining the edge map of the image I. In this case, the calculating unit 400 calculates the edge map based on pixel information of the image I only, without using the information on the clicked point P.
  • The estimating unit 500 in the image processing device according to the embodiment of the present disclosure is described below in conjunction with FIG. 5. The estimating unit 500 shown in FIG. 5 corresponds to the estimating unit 230 shown in FIG. 2.
  • The estimating unit 500 may include an extension region acquiring unit 510 and a color model acquiring unit 520.
  • As shown in FIG. 6, for example, the extension region acquiring unit 510 may acquire an extension region containing the clicked point P based on the clicked point P and the edge map obtained by the calculating unit 220 (400), the extension region being within the finger image F. Specifically, the shaded part in FIG. 6( d) represents the extension region.
  • Further, the color model acquiring unit 520 may acquire the color model of the finger image F based on the color of each pixel within the extension region.
  • In order to obtain a stable and effective skin color model, many samples (i.e., pixels) are generally needed; but the user clicks only one point (i.e., the clicked point) in the finger image. In this case, more pixels need to be obtained for estimating the skin color model. Therefore, it is necessary that the extension region acquiring unit 510 acquires an extension region within the finger image F containing the clicked point P. Based on the color of each pixel within the extension region, instead of solely the color of the pixel at the clicked point P, the color model acquiring unit 520 can obtain a stable and effective color model of the finger image F.
  • Specifically, the extension region acquiring unit 510 may include a setting unit 515 and searching units 511-514.
  • The setting unit 515 may set a maximum extension region E containing the clicked point P, represented with dotted lines in FIG. 6( b). The searching unit 511 may search for the first one of the boundary pixels leftward in a horizontal direction from the clicked point P, as a left boundary pixel of the extension region; and the searching unit 512 may search for the first one of the boundary pixels rightward in the horizontal direction from the clicked point P, as a right boundary pixel of the extension region.
  • For each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, the searching unit 513 may search for the first one of the boundary pixels upward in a vertical direction from the reference pixel, as an upper boundary pixel of the extension region; and the searching unit 514 may search for the first one of the boundary pixels downward in the vertical direction from the reference pixel, as a lower boundary pixel of the extension region.
  • Specifically, the extension region acquiring unit 510 sets a sliding window by taking each pixel within the maximum extension region E as the center, counts the number of edge pixels in the sliding window, and defines a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold as a boundary pixel.
  • Description is given in conjunction with FIGS. 6( a) to 6(c). Assuming the ranges of x and y coordinates of the maximum extension region E are [x0, x1] and [y0, y1], as shown in FIG. 6( c), the horizontal scope [x0-ext, x1-ext] of the extension region may be determined as follows. For a point (xclick-r, yclick) on the right of the clicked point P(xclick, yclick) in the horizontal direction within the maximum extension region E, where xclick≦xclick-r≦x1, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted. Then, the searching unit 512 may detect from left to right the first pixel satisfying the condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designate the x coordinate of the detected pixel as x1-ext. As a matter of course, it is possible that no such pixel is found up to the right boundary of the maximum extension region E. In this case, the x coordinate of the right boundary pixel of the maximum extension region E may be designated as x1-ext.
  • Correspondingly, for a point (xclick-l, yclick) on the left of the clicked point P(xclick, yclick) in the horizontal direction within the maximum extension region E, where x0≦xclick-l≦xclick, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted. Then, the searching unit 511 may detect from right to left the first pixel satisfying the condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designate the x coordinate of the detected pixel as x0-ext. As a matter of course, it is possible that no such pixel is found up to the left boundary of the maximum extension region E. In this case, the x coordinate of the left boundary pixel of the maximum extension region E may be designated as x0-ext.
  • Upon determination of the horizontal scope [x0-ext, x1-ext] of the extension region, for each reference pixel (x, yclick) between the left boundary pixel and the right boundary pixel in the horizontal direction where x0-ext≦x≦x1-ext, the vertical scope [y0-ext, y1-ext] may be determined as follows. For a point (x, yup) on the upper side of the reference point (x, yclick) in the vertical direction from the reference point within the maximum extension region E where y0≦yup≦yclick, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted. Then, the searching unit 513 may detect from bottom to top the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the y coordinate of the detected pixel as y0-ext. As a matter of course, it is possible that no edge pixel is found up to the upper boundary pixel of the maximum extension region E. In this case, the y coordinate of the upper boundary pixel of the maximum extension region E may be designated as y0-ext.
  • Correspondingly, for a point (x, ydown) on the lower side of the reference point (x, yclick) in the vertical direction from the reference point within the maximum extension region E where yclick≦ydown≦y1, a sliding window is set by taking the point as the center, and the number of edge pixels in the sliding window is counted. Then, the searching unit 514 may detect from top to bottom the first pixel satisfying a condition where the number of the edge pixels in the sliding window is larger than a predetermined threshold, and designates the y coordinate of the detected pixel as y1-ext. As a matter of course, it is possible that no edge pixel is found up to the lower boundary pixel of the maximum extension region E. In this case, the y coordinate of the lower boundary pixel of the maximum extension region E may be designated as y1-ext. In this way, the extension region within the finger region F containing the clicked point P is obtained.
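  • A simplified sketch of the extension region search is given below; it assumes the edge map convention above (0 for edge pixels), and the sliding window half-size and edge-count threshold are exemplary parameters rather than values prescribed by the description.

```python
import numpy as np

def count_edges(edge, x, y, half):
    """Number of edge pixels (value 0) in a sliding window centered at (x, y)."""
    win = edge[max(y - half, 0):y + half + 1, max(x - half, 0):x + half + 1]
    return int((win == 0).sum())

def scan(edge, start, stop, step, fixed, axis, half, t_num):
    """Walk from `start` toward `stop`; return the first boundary coordinate found."""
    for c in range(start, stop + step, step):
        x, y = (c, fixed) if axis == 'x' else (fixed, c)
        if count_edges(edge, x, y, half) > t_num:
            return c
    return stop                                   # no boundary found: use the limit of E

def extension_region(edge, x_click, y_click, x0, x1, y0, y1, half=2, t_num=3):
    """Boolean mask of the extension region inside the maximum extension region E."""
    x1_ext = scan(edge, x_click, x1, +1, y_click, 'x', half, t_num)   # right boundary
    x0_ext = scan(edge, x_click, x0, -1, y_click, 'x', half, t_num)   # left boundary
    mask = np.zeros(edge.shape, dtype=bool)
    for x in range(x0_ext, x1_ext + 1):           # one column per reference pixel (x, y_click)
        y0_ext = scan(edge, y_click, y0, -1, x, 'y', half, t_num)     # upper boundary
        y1_ext = scan(edge, y_click, y1, +1, x, 'y', half, t_num)     # lower boundary
        mask[y0_ext:y1_ext + 1, x] = True
    return mask
```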
  • It is noted that in the technical solution described above, the horizontal scope [x0-ext, x1-ext] of the extension region is determined first; and then the vertical scope [y0-ext, y1-ext] of the extension region is determined. However, the present disclosure is not limited to this. For example, the vertical scope [y0-ext, y1-ext] of the extension region may be determined first; and then the horizontal scope [x0-ext, x1-ext] of the extension region may be determined. The determination method thereof is similar to those described above, and thus omitted herein.
  • Upon obtaining the extension region within the finger region F containing the clicked point P, the color model acquiring unit 520 may acquire the color model of the finger image. For example, the color model of the finger image may be obtained by means of Gaussian Mixture Model, skin color threshold, histogram model with Bayes classifiers, etc. A specific exemplary method for obtaining the color model is given below. Those skilled in the art shall understand that other methods that are different from the specific exemplary method may also be used for obtaining the color model.
  • Multiple Gaussian models are used here because the finger color may consist of multiple color centers. Assuming that any point in the extension region is represented as (xi, yi), where 0≦i≦N−1 and N denotes the number of pixels in the extension region, the color characteristic of each point (xi, yi) in the extension region may be represented as a two-dimensional vector fi=(r′i, g′i), where r′i and g′i may be calculated by:
  • $r'_i = \dfrac{r_i}{r_i + g_i + b_i} \qquad (4)$
  • $g'_i = \dfrac{g_i}{r_i + g_i + b_i} \qquad (5)$
  • where ri, gi and bi denote r, g and b values of the pixel (xi, yi), respectively.
  • In order to obtain multiple color centers, the K-means Clustering algorithm may be applied to the pixels in the extension region, so that the pixels in the extension region are clustered into K clusters (wi, Ci), where 0≦i≦K−1 and K is a natural number. Specifically, wi denotes the weight of a cluster Ci and equals the ratio of the number of pixels in the cluster Ci to the number of all the pixels in the extension region.
  • For each cluster Ci, the pixels in the cluster are used to calculate a mean vector mi and a covariance matrix Si of the color characteristics of the pixels clustered in the cluster as follows:
  • $\bar{m}_i = \dfrac{1}{Num_i} \sum_{j \in C_i} f_j \qquad (6)$
  • $S_i = \dfrac{1}{Num_i} \sum_{j \in C_i} (f_j - \bar{m}_i)(f_j - \bar{m}_i)^T \qquad (7)$
  • where Numi denotes the number of the pixels in the cluster Ci.
  • Then, based on the mean vector mk and the covariance matrix Sk of the color characteristics of the pixels in any cluster Ck, the Mahalanobis distance Ma−d(i,j,Ck) between the color characteristic of each pixel (i, j) in the extension region and any cluster Ck may be calculated by:

  • $Ma\text{-}d(i,j,C_k) = (f_{i,j} - \bar{m}_k)^T \, S_k^{-1} \, (f_{i,j} - \bar{m}_k) \qquad (8)$
  • Further, based on the weight wk of each cluster Ck in the extension region, a weighted Mahalanobis distance d(i, j) between the color characteristic of each pixel (i, j) in the extension region and K clusters may be calculated by:
  • $d(i,j) = \sum_{k=0}^{K-1} w_k \cdot Ma\text{-}d(i,j,C_k) \qquad (9)$
  • Furthermore, the color threshold Tcolor may be determined as the predetermined threshold for which the ratio of the number of pixels having a weighted Mahalanobis distance smaller than that threshold to the number of all the pixels in the extension region equals a setting ratio.
  • Specifically, the distances d(i, j) of the pixels may be sorted from smallest to largest, and the color threshold may be selected according to a setting ratio ζ (e.g., 0.98). For example, the color threshold is selected such that the ratio of the number of pixels whose distance is smaller than the color threshold to the number of all the pixels in the extension region is equal to the setting ratio ζ. Finally, the estimated color model includes the K Gaussian models $(w_i, \bar{m}_i, S_i)$ $(0 \le i \le K-1)$ and the color threshold Tcolor.
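  • Merely as an example, the color model estimation of equations (4) to (9) may be sketched as follows; the use of scikit-learn's KMeans, the number of clusters K, the covariance regularization term and the setting ratio are exemplary assumptions, and an (R, G, B) channel order is assumed.

```python
import numpy as np
from sklearn.cluster import KMeans

def chroma_features(img, mask):
    """Normalized (r', g') color characteristics of the pixels selected by `mask`."""
    rgb = img[mask].astype(np.float64)                 # assumes (R, G, B) channel order
    return rgb[:, :2] / (rgb.sum(axis=1, keepdims=True) + 1e-6)

def weighted_distance(feats, model):
    """Weighted Mahalanobis distance of each feature vector to the K clusters."""
    d = np.zeros(len(feats))
    for w_k, m_k, S_k_inv in model:
        diff = feats - m_k
        d += w_k * np.einsum('ij,jk,ik->i', diff, S_k_inv, diff)
    return d

def fit_color_model(feats, K=3, ratio=0.98):
    """Cluster the extension-region features into K Gaussian models and a color threshold."""
    labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(feats)
    model = []
    for k in range(K):
        f_k = feats[labels == k]
        w_k = len(f_k) / len(feats)                            # cluster weight
        m_k = f_k.mean(axis=0)                                 # mean vector, equation (6)
        S_k = np.cov(f_k.T, bias=True) + 1e-6 * np.eye(2)      # covariance, equation (7)
        model.append((w_k, m_k, np.linalg.inv(S_k)))
    t_color = np.quantile(weighted_distance(feats, model), ratio)   # color threshold
    return model, t_color
```

  • In this sketch, the extension region mask of the preceding step selects the sample pixels, e.g. feats = chroma_features(img, mask), and the resulting model is a list of (weight, mean vector, inverse covariance matrix) triples together with the color threshold Tcolor.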
  • As described above referring to FIG. 2, the object classifying unit 240 may classify each pixel in the image I, based on the edge map of the image I and the color model of the finger image, so as to obtain the binary image of the image I.
  • Specifically, the object classifying unit 240 may classify pixels in the image I that are non-edge pixels in the edge map and have a distance from the color model smaller than the color threshold as finger (object) pixels, and the other pixels in the image I as non-finger (non-object) pixels.
  • More specifically, for example, according to the estimated color model and the enhanced edge map described above, the object classifying unit 240 may classify each pixel (i, j) in the image I as follows. First, the color characteristic vector of the pixel (i, j) is calculated according to equations (4) and (5). Then, the distance between the pixel (i, j) and the color model is calculated according to equations (8) and (9). Finally, the pixel (i, j) is classified according to the equation (10):
  • $Label(i,j) = \begin{cases} 0, & \text{if } Edge_{enhance}(i,j) = 255 \text{ and } d(i,j) \le T_{color} \\ 255, & \text{if } Edge_{enhance}(i,j) = 0 \text{ or } d(i,j) > T_{color} \end{cases} \qquad (10)$
  • where $Edge_{enhance}(i, j)$ can be calculated by equation (3), and d(i, j) can be calculated by equation (9).
  • By these operations, the binary image of the image I, including pixels with values of 0 or 255 only, may be obtained. Specifically, the value 0 indicates that the pixel is classified as a finger pixel, while the value 255 indicates that the pixel is classified as a non-finger pixel.
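  • An exemplary sketch of the pixel classification of equation (10) is given below; the color model is assumed to be the list of (weight, mean, inverse covariance) triples of the preceding sketch, and the helper names are illustrative.

```python
import numpy as np

def classify_pixels(img, edge, model, t_color):
    """Binary image of equation (10): 0 = finger (object) pixel, 255 = non-finger pixel."""
    h, w = edge.shape
    rgb = img.reshape(-1, 3).astype(np.float64)
    feats = rgb[:, :2] / (rgb.sum(axis=1, keepdims=True) + 1e-6)   # (r', g') per pixel
    d = np.zeros(len(feats))
    for w_k, m_k, S_k_inv in model:
        diff = feats - m_k
        d += w_k * np.einsum('ij,jk,ik->i', diff, S_k_inv, diff)
    d = d.reshape(h, w)
    is_object = (edge == 255) & (d <= t_color)    # non-edge pixel close to the color model
    return np.where(is_object, 0, 255).astype(np.uint8)
```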
  • The detecting unit 700 of the image processing device according to an embodiment of the present disclosure is described below in conjunction with FIG. 7. The detecting unit 700 shown in FIG. 7 corresponds to the detecting unit 250 shown in FIG. 2.
  • As shown in FIG. 7, the detecting unit 700 may include a noise removing unit 710 for removing noise components in the binary image.
  • Due to the variety of types of book contents, in the binary image obtained by the object classifying unit, some non-finger pixels may be classified as finger pixels, resulting in noise pixels. Therefore, it is necessary to remove the noise pixels.
  • Specifically, the noise removing unit 710 may set a sliding window in the binary image, and count the number of finger pixels (i.e., pixels whose pixel values are 0) in the sliding window. If the number of finger pixels in the sliding window is smaller than a predetermined threshold, it is determined that the finger pixels are actually noise pixels and the pixels are set as non-finger pixels, i.e., the values of the pixels are converted from 0 into 255.
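  • A compact sketch of this sliding-window filtering is shown below; cv2.boxFilter is used here to count finger pixels around every position in one pass, and the window size and count threshold are exemplary.

```python
import cv2
import numpy as np

def remove_sparse_noise(binary, win=15, min_count=40):
    """Reset isolated finger pixels (value 0) whose neighborhood contains too few of them."""
    finger = (binary == 0).astype(np.float32)
    counts = cv2.boxFilter(finger, ddepth=-1, ksize=(win, win), normalize=False)
    cleaned = binary.copy()
    cleaned[(binary == 0) & (counts < min_count)] = 255
    return cleaned
```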
  • Alternatively or additionally, the noise removing unit 710 may include a connected component analyzing unit 711 and a pixel converting unit 712. The connected component analyzing unit 711 may perform connected component analysis (CCA) on the binary image, so as to obtain connected components in the binary image, each pixel in a connected component being a finger pixel. CCA algorithms are well-known in the prior art, and thus omitted herein.
  • For each obtained connected component, the pixel converting unit 712 may convert all the finger pixels in the connected component into non-finger pixels if the connected component satisfies any of the conditions:
  • 1) the area of the connected component is less than a predetermined area;
  • 2) the aspect ratio of the connected component is larger than a predetermined ratio;
  • 3) the finger image is on the left side of the image, and the distance between a left boundary of the connected component and a left boundary of the image is larger than a predetermined threshold; or
  • 4) the finger image is on the right side of the image, and the distance between a right boundary of the connected component and a right boundary of the image is larger than the predetermined threshold.
  • The four conditions above are explained below. First, regarding condition 1), the finger image occupies a certain area; when the area of the connected component is too small, the connected component is unlikely to be the finger image and is instead likely to be a noise component. Further, regarding condition 2), the finger image has a certain aspect ratio. As shown in FIG. 8( a), when the aspect ratio is too large, the connected component is more likely to be a noise component such as the book content T, and unlikely to be the finger image F1 or F2.
  • Further, regarding conditions 3) and 4), generally the finger is located on a vertical boundary of the image. When the connected component is away from the vertical boundary and close to the middle of the image, it is unlikely to be the finger image. Instead, it may be a noise component.
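  • The connected component filtering under conditions 1) to 4) may be sketched as follows; cv2.connectedComponentsWithStats supplies the bounding boxes and areas, while the area, aspect-ratio and distance thresholds, as well as the choice of the bounding-box height-to-width ratio as the aspect ratio, are exemplary assumptions.

```python
import cv2
import numpy as np

def filter_components(binary, finger_on_left, min_area=500, max_aspect=3.0, max_dist=80):
    """Convert connected components that cannot be a finger back to non-finger pixels."""
    finger = (binary == 0).astype(np.uint8)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(finger, connectivity=8)
    h, w = binary.shape
    out = binary.copy()
    for i in range(1, n):                                   # label 0 is the background
        x, y, cw, ch, area = stats[i]
        aspect = ch / max(cw, 1)                            # bounding-box height / width
        dist = x if finger_on_left else w - (x + cw)        # distance to the relevant side
        if area < min_area or aspect > max_aspect or dist > max_dist:
            out[labels == i] = 255                          # conditions 1), 2), 3)/4): noise
    return out
```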
  • In addition, as shown in FIG. 7, the detecting unit 700 may further include a connected component processing unit 720 and a filling unit 730. As shown in FIGS. 8( b), 8(c) and 8(d), the connected component processing unit 720 may, according to the clicked point, acquire a connected component F1 where the clicked point is located, and search for a nearby connected component F2 in a vertical direction. The filling unit 730 may perform filling operation on the connected component F1 containing the clicked point and the found connected component F2 (i.e., region F′), thereby obtaining a filled connected component F″.
  • Considering that, during pixel classification, the finger image may be classified into multiple separate parts (e.g., F1 and F2 shown in FIG. 8( a) and FIG. 8( b)), the connected component F1 containing the clicked point is combined with the nearby connected component F2 found in the vertical direction from F1. Moreover, due to possible holes in the detected finger image, the filling operation may be used to fill the holes. Specifically, for each column of the image I, the uppermost and lowermost finger pixels (i.e., pixels whose pixel values are 0) are detected, and then all pixels between the two are set as finger pixels. After the filling operation, the hole regions on the finger are filled, as shown in FIG. 8( d).
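  • A simplified sketch of the combining and filling operations is given below; sharing at least one image column with the clicked component is used here as an exemplary stand-in for "nearby in a vertical direction".

```python
import cv2
import numpy as np

def combine_and_fill(binary, x_click, y_click):
    """Merge the clicked component with vertically overlapping ones and fill each column."""
    n, labels = cv2.connectedComponents((binary == 0).astype(np.uint8), connectivity=8)
    target = labels[y_click, x_click]
    if target == 0:                                   # clicked point is not a finger pixel
        return binary
    mask = labels == target
    xs = np.nonzero(mask)[1]
    for i in range(1, n):
        if i == target:
            continue
        xs_i = np.nonzero(labels == i)[1]
        if xs_i.min() <= xs.max() and xs_i.max() >= xs.min():   # overlapping column range
            mask |= labels == i
    filled = np.full_like(binary, 255)
    for x in np.unique(np.nonzero(mask)[1]):
        ys = np.nonzero(mask[:, x])[0]
        filled[ys.min():ys.max() + 1, x] = 0          # fill between uppermost and lowermost
    return filled
```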
  • In addition, as shown in FIG. 7, the detecting unit 700 may further include an expanding unit 740 for performing an expanding operation on the filled connected component in the binary image. As shown in FIG. 9, because the boundary of the finger image may not be contained in the detected finger region A, it is necessary to perform an expanding operation so as to expand the finger region A into a region A′. Specific methods for the expanding operation are well-known in the prior art, and the present disclosure has no particular limitations thereto.
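  • Since finger pixels carry the value 0 in the binary image, the expanding operation corresponds to a morphological erosion of the 0/255 image; a brief sketch with an exemplary kernel size follows.

```python
import cv2

def expand_region(binary, ksize=7):
    """Grow the 0-valued (finger) region A into a slightly larger region A'."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (ksize, ksize))
    return cv2.erode(binary, kernel)   # eroding the 255 background enlarges the dark finger region
```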
  • Description is given above taking the finger image as an example. According to the embodiments of the present disclosure, both the color model of the finger image and the edge map of the image are used to obtain the binary image of the image. Further, both the information on the clicked point and the edge map of the image are used to estimate the color model of the finger image. Therefore, the accuracy of detecting the finger region can be greatly improved, thus facilitating removal of the finger image from the image and making the processed image more nice-looking.
  • The image processing method according to an embodiment of the present disclosure will be described hereinafter in conjunction with FIG. 10. As shown in FIG. 10, the image processing method according to the embodiment of the present disclosure starts at step S110. In step S110, a click is performed on an object image contained in an image to obtain a clicked point.
  • Next, in step S120, an edge map of the image is calculated.
  • Next, in step S130, a color model of the object image is estimated based on the clicked point and the edge map.
  • Next, in step S140, each pixel in the image is classified based on the edge map and the color model, so as to obtain a binary image of the image.
  • Finally, in step S150, a region containing the object image is detected based on the binary image.
  • According to an embodiment of the present invention, in calculating the edge map of the image in step S120, the distance between the color of each pixel in the image and the color of the clicked point may be calculated to obtain a distance map. Then, a gradient operator may be applied to the distance map to obtain a distance gradient image. If a pixel in the image has a distance gradient larger than a predetermined distance gradient threshold, the pixel is classified as an edge pixel; otherwise, the pixel is classified as a non-edge pixel.
  • According to an embodiment of the present invention, in calculating the edge map of the image in step S120, the distance between the color of each pixel in the image and the color of the clicked point may be calculated to obtain a distance map. Then, a gradient operator may be applied to the distance map to obtain a distance gradient image. Further, the image may be converted from a color image into a gray image, and a gradient operator may be applied to the gray image to obtain an intensity gradient image. If a pixel in the image has a distance gradient larger than a predetermined distance gradient threshold or has an intensity gradient larger than a predetermined intensity gradient threshold, the pixel is classified as an edge pixel; otherwise, the pixel is classified as a non-edge pixel.
  • According to an embodiment of the present invention, in estimating the color model of the object in step S130, an extension region containing the clicked point may be acquired based on the clicked point and the edge map, the extension region being within the object image. Then, the color model of the object image may be acquired based on a color of each pixel within the extension region.
  • Specifically, in acquiring the extension region containing the clicked point, a maximum extension region containing the clicked point may be set. Then, the first one of boundary pixels leftward in a horizontal direction from the clicked point may be searched for, as a left boundary pixel of the extension region; and the first one of the boundary pixels rightward in the horizontal direction from the clicked point may be searched for, as a right boundary pixel of the extension region.
  • Further, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, the first one of the boundary pixels upward in a vertical direction from the reference pixel may be searched for, as an upper boundary pixel of the extension region; and the first one of the boundary pixels downward in the vertical direction from the reference pixel may be searched for, as a lower boundary pixel of the extension region.
  • Specifically, for each pixel within the maximum extension region, a sliding window is set taking the pixel as the center, the number of edge pixels in the sliding window is counted, and a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold is defined as a boundary pixel.
  • According to an embodiment of the present disclosure, in classifying each pixel in the image in step S140, if a pixel in the image is a non-edge pixel in the edge map and has a distance from the color model less than a color threshold, the pixel is classified as an object pixel; otherwise, the pixel is classified as a non-object pixel.
  • According to an embodiment of the present disclosure, in detecting the region containing the object image in step S150, a noise component in the binary image may be removed.
  • Specifically, in removing the noise component in the binary image, connected component analysis may be performed on the binary image, so as to obtain a connected component in the binary image, each of the pixels in the connected component being an object pixel. Then, each object pixel in the connected component is converted into a non-object pixel if the connected component satisfies any of the following conditions:
  • 1) the area of the connected component being less than a predetermined area;
  • 2) the aspect ratio of the connected component being larger than a predetermined ratio;
  • 3) the object image being on the left side of the image and the distance between a left boundary of the connected component and a left boundary of the image being larger than a predetermined threshold; or
  • 4) the object image being on the right side of the image and the distance between a right boundary of the connected component and a right boundary of the image being larger than the predetermined threshold.
  • According to an embodiment of the present disclosure, in detecting the region containing the object image in step S150, a connected component where the clicked point is located may be acquired according to the clicked point, and a nearby connected component is searched for in a vertical direction. Then, a filling operation may be performed on the connected component containing the clicked point and the found connected component, thereby obtaining a filled connected component.
  • According to an embodiment of the present disclosure, in detecting the region containing the object image in step S150, an expanding operation may further be performed on the filled connected component in the binary image.
  • The various specific implementations of the respective steps above of the image processing method according to the embodiments of the present disclosure have been described in detail previously, and therefore the explanations thereof will not be repeated herein.
  • Apparently, the respective operating processes of the image processing method above according to the present disclosure can be implemented by means of a computer executable program stored on a machine-readable storage medium.
  • Moreover, the object of the present disclosure can also be achieved by providing, directly or indirectly, the storage medium carrying the computer executable program above to a system or apparatus, a computer or a Central Processing Unit (CPU) of which then reads out and executes the program. Here, the implementation of the present disclosure is not limited to a program, as long as the system or apparatus has a function to execute the program; the program can be in arbitrary forms, such as an object program, a program executed by an interpreter, or a script program provided to an operating system.
  • The machine-readable storage medium mentioned above includes, but is not limited to, various memories and storage devices, semiconductor devices, disk units such as optical disks, magnetic disks and magneto-optical disks, and other media suitable for storing information.
  • Additionally, the present disclosure can also be implemented by connecting to a corresponding web site on the Internet through a computer, downloading and installing the computer executable program according to the invention into the computer, and then executing the program.
  • FIG. 11 is a block diagram illustrating an exemplary structure of a general-purpose personal computer on which the image processing device and method according to the embodiments of the present disclosure can be implemented.
  • As shown in FIG. 11, a CPU 1301 executes various processing according to a program stored in a Read Only Memory (ROM) 1302 or a program loaded to a Random Access Memory (RAM) 1303 from a storage device 1308. In the RAM 1303, if necessary, data required for the CPU 1301 in executing various processing and the like is also stored. The CPU 1301, the ROM 1302 and the RAM 1303 are connected to each other via a bus 1304. An input/output interface 1305 is also connected to the bus 1304.
  • The following components are connected to the input/output interface 1305: an input device 1306 including a keyboard, a mouse and the like, an output device 1307 including a display such as a Cathode Ray Tube (CRT) and a Liquid Crystal Display (LCD), a speaker and the like, the storage device 1308 including a hard disk and the like, and a communication device 1309 including a network interface card such as a LAN card, a modem and the like. The communication device 1309 performs communication processing via a network such as the Internet. If necessary, a drive 1310 can also be connected to the input/output interface 1305. A removable medium 1311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory and the like is mounted on the drive 1310 as necessary such that a computer program read out therefrom is installed in the storage device 1308.
  • In a case that the series of processing above is implemented in software, a program constituting the software is installed from the network such as the Internet or the storage medium such as the removable medium 1311.
  • It is understood by those skilled in the art that the storage medium is not limited to the removable medium 1311 shown in FIG. 11 in which the program is stored and which is distributed separately from the device so as to provide the program to the user. Examples of the removable medium 1311 include a magnetic disk including a Floppy Disk (registered trademark), an optical disk including a Compact Disk Read Only Memory (CD-ROM) and a Digital Versatile Disc (DVD), a magneto-optical disk including a MiniDisc (MD) (registered trademark), and a semiconductor memory. Alternatively, the storage medium may be the ROM 1302, the hard disk contained in the storage device 1308 or the like. Herein, the program is stored in the storage medium, and the storage medium is distributed to the user together with the device containing the storage medium.
  • In the system and method of the present disclosure, it is obvious that the respective components or steps can be decomposed and/or recombined. Such decomposition and/or recombination should be considered as equivalent solutions of the present disclosure. Further, the steps performing the series of processing above can naturally be performed in the described order, but this is not necessary; some steps can be performed concurrently or independently of one another.
  • Although the embodiments of the present disclosure have been described in detail above in combination with the drawings, it should be understood that the embodiments described above are only used to explain the disclosure and are not to be construed as limiting the present disclosure. For those skilled in the art, various modifications and alterations can be made to the above embodiments without departing from the essence and scope of the present disclosure. Therefore, the scope of the present disclosure is defined only by the appended claims and the equivalents thereof.
  • The present disclosure discloses the embodiments described above as well as the following appendix:
  • APPENDIX 1
  • An image processing device comprising:
  • an inputting unit for performing a click on an object image contained in an image to obtain a clicked point;
  • a calculating unit for calculating an edge map of the image;
  • an estimating unit for estimating a color model of the object image based on the clicked point and the edge map;
  • an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and
  • a detecting unit for detecting a region containing the object image based on the binary image.
  • APPENDIX 2
  • The device according to Appendix 1, wherein the calculating unit comprises:
  • a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
  • a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image; and
  • an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • APPENDIX 3
  • The device according to Appendix 1, wherein the calculating unit comprises:
  • a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
  • a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image;
  • a gray converting unit for converting the image from a color image to a gray image;
  • an intensity gradient calculating unit for applying a gradient operator to the gray image to obtain an intensity gradient image; and
  • an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • APPENDIX 4
  • The device according to Appendix 1, wherein the estimating unit comprises:
  • an extension region acquiring unit for acquiring an extension region containing the clicked point based on the clicked point and the edge map, the extension region being within the object image; and
  • a color model acquiring unit for acquiring the color model of the object image based on a color of each pixel within the extension region.
  • APPENDIX 5
  • The device according to Appendix 4, wherein the extension region acquiring unit comprises:
  • a setting unit for setting a maximum extension region containing the clicked point;
  • a first searching unit for searching, as a left boundary pixel of the extension region, the first one of boundary pixels leftward in a horizontal direction from the clicked point;
  • a second searching unit for searching, as a right boundary pixel of the extension region, the first one of the boundary pixels rightward in the horizontal direction from the clicked point;
  • a third searching unit for searching, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, as an upper boundary pixel of the extension region, the first one of the boundary pixels upward in a vertical direction from the reference pixel; and
  • a fourth searching unit for searching, as a lower boundary pixel of the extension region, the first one of the boundary pixels downward in the vertical direction from the reference pixel, wherein,
  • the extension region acquiring unit sets a sliding window taking each pixel within the maximum extension region as a center, counts the number of edge pixels in the sliding window, and defines a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold as the boundary pixel.
  • APPENDIX 6
  • The device according to Appendix 1, wherein the object classifying unit classifies a pixel in the image which is a non-edge pixel in the edge map and a distance from the color model is less than a color threshold into an object pixel, and the other pixel in the image into a non-object pixel.
  • APPENDIX 7
  • The device according to Appendix 1, wherein the detecting unit comprises a noise removing unit for removing noise component in the binary image.
  • APPENDIX 8
  • The device according to Appendix 7, wherein the noise removing unit comprises:
  • a connected component analyzing unit for performing connected component analysis algorithm on the binary image, so as to obtain a connected component in the binary image, each of pixels in the connected component being an object pixel; and
  • a pixel converting unit for converting each object pixel in the connected component into a non-object pixel if the connected component satisfies any of conditions of:
  • an area of the connected component being less than a predetermined area;
  • an aspect ratio of the connected component being larger than a predetermined ratio;
  • the object image being on the left side of the image and a distance between a left boundary of the connected component and a left boundary of the image being larger than a predetermined threshold; or
  • the object image being on the right side of the image and a distance between a right boundary of the connected component and a right boundary of the image being larger than the predetermined threshold.
  • APPENDIX 9
  • The device according to Appendix 8, wherein the detecting unit further comprises:
  • a connected component processing unit for acquiring a connected component where the clicked point is located according to the clicked point and searching for a nearby connected component in a vertical direction; and
  • a filling unit for performing filling operation on the connected component containing the clicked point and the searched connected component, so as to obtain the filled connected component.
  • APPENDIX 10
  • The device according to Appendix 9, wherein the detecting unit further comprises:
  • an expanding unit for performing an expanding operation on the filled connected component in the binary image.
  • APPENDIX 11
  • The device according to Appendix 1, wherein the object image is a finger image.
  • APPENDIX 12
  • The device according to Appendix 1, wherein the color of the clicked point is the color of a pixel at the clicked point, or an average color of pixels within a predetermined region containing the clicked point.
  • APPENDIX 13
  • An image processing method comprising:
  • performing a click on an object image contained in an image to obtain a clicked point;
  • calculating an edge map of the image;
  • estimating a color model of the object image based on the clicked point and the edge map;
  • classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and
  • detecting a region containing the object image based on the binary image.
  • APPENDIX 14
  • The method according to Appendix 13, wherein the step of calculating an edge map of the image comprises:
  • calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
  • applying a gradient operator to the distance map to obtain a distance gradient image; and
  • classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • APPENDIX 15
  • The method according to Appendix 13, wherein the step of calculating an edge map of the image comprises:
  • calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
  • applying a gradient operator to the distance map to obtain a distance gradient image;
  • converting the image from a color image to a gray image;
  • applying a gradient operator to the gray image to obtain an intensity gradient image; and
  • classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
  • APPENDIX 16
  • The method according to Appendix 13, wherein the step of estimating a color model of the object image based on the clicked point and the edge map comprises:
  • acquiring an extension region containing the clicked point based on the clicked point and the edge map, the extension region being within the object image; and
  • acquiring the color model of the object image based on a color of each pixel within the extension region.
  • APPENDIX 17
  • The method according to Appendix 16, wherein the step of acquiring an extension region containing the clicked point based on the clicked point and the edge map comprises:
  • setting a maximum extension region containing the clicked point;
  • searching, as a left boundary pixel of the extension region, the first one of boundary pixels leftward in a horizontal direction from the clicked point;
  • searching, as a right boundary pixel of the extension region, the first one of the boundary pixels rightward in the horizontal direction from the clicked point; and
  • setting, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, an upper boundary pixel and a lower boundary pixel of the extension region by steps of:
  • searching, as the upper boundary pixel of the extension region, the first one of the boundary pixels upward in a vertical direction from the reference pixel; and
  • searching, as the lower boundary pixel of the extension region, the first one of the boundary pixels downward in the vertical direction from the reference pixel, wherein,
  • a sliding window is set taking each pixel within the maximum extension region as a center, the number of edge pixels in the sliding window is counted, and a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold is defined as the boundary pixel.
  • APPENDIX 18
  • The method according to Appendix 13, wherein the step of classifying each pixel in the image based on the edge map and the color model so as to obtain a binary image of the image comprises:
  • classifying a pixel in the image which is a non-edge pixel in the edge map and a distance from the color model is less than a color threshold into an object pixel, and the other pixel in the image into a non-object pixel.
  • APPENDIX 19
  • A program product comprising a machine-readable instruction code stored therein, wherein the instruction code, when read and executed by a computer, enables the computer to execute the method according to any of Appendixes 13-18.
  • APPENDIX 20
  • A machine-readable medium on which the program product according to Appendix 19 is carried.

Claims (19)

1. An image processing device comprising:
an inputting unit for performing a click on an object image contained in an image to obtain a clicked point;
a calculating unit for calculating an edge map of the image;
an estimating unit for estimating a color model of the object image based on the clicked point and the edge map;
an object classifying unit for classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and
a detecting unit for detecting a region containing the object image based on the binary image.
2. The device according to claim 1, wherein the calculating unit comprises:
a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image; and
an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
3. The device according to claim 1, wherein the calculating unit comprises:
a distance calculating unit for calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
a distance gradient calculating unit for applying a gradient operator to the distance map to obtain a distance gradient image;
a gray converting unit for converting the image from a color image to a gray image;
an intensity gradient calculating unit for applying a gradient operator to the gray image to obtain an intensity gradient image; and
an edge classifying unit for classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
4. The device according to claim 1, wherein the estimating unit comprises:
an extension region acquiring unit for acquiring an extension region containing the clicked point based on the clicked point and the edge map, the extension region being within the object image; and
a color model acquiring unit for acquiring the color model of the object image based on a color of each pixel within the extension region.
5. The device according to claim 4, wherein the extension region acquiring unit comprises:
a setting unit for setting a maximum extension region containing the clicked point;
a first searching unit for searching, as a left boundary pixel of the extension region, the first one of boundary pixels leftward in a horizontal direction from the clicked point;
a second searching unit for searching, as a right boundary pixel of the extension region, the first one of the boundary pixels rightward in the horizontal direction from the clicked point;
a third searching unit for searching, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, as an upper boundary pixel of the extension region, the first one of the boundary pixels upward in a vertical direction from the reference pixel; and
a fourth searching unit for searching, as a lower boundary pixel of the extension region, the first one of the boundary pixels downward in the vertical direction from the reference pixel, wherein,
the extension region acquiring unit sets a sliding window taking each pixel within the maximum extension region as a center, counts the number of edge pixels in the sliding window, and defines a pixel satisfying a condition that the number of the edge pixels in the sliding window is larger than a predetermined threshold as the boundary pixel.
6. The device according to claim 1, wherein the object classifying unit classifies a pixel in the image which is a non-edge pixel in the edge map and a distance from the color model is less than a color threshold into an object pixel, and the other pixel in the image into a non-object pixel.
7. The device according to claim 1, wherein the detecting unit comprises a noise removing unit for removing noise component in the binary image.
8. The device according to claim 7, wherein the noise removing unit comprises:
a connected component analyzing unit for performing connected component analysis algorithm on the binary image, so as to obtain a connected component in the binary image, each of pixels in the connected component being an object pixel; and
a pixel converting unit for converting each object pixel in the connected component into a non-object pixel if the connected component satisfies any of conditions of:
an area of the connected component being less than a predetermined area;
an aspect ratio of the connected component being larger than a predetermined ratio;
the object image being on the left side of the image and a distance between a left boundary of the connected component and a left boundary of the image being larger than a predetermined threshold; or
the object image being on the right side of the image and a distance between a right boundary of the connected component and a right boundary of the image being larger than the predetermined threshold.
9. The device according to claim 8, wherein the detecting unit further comprises:
a connected component processing unit for acquiring a connected component where the clicked point locates according to the clicked point and searching a nearby connected component in a vertical direction; and
a filling unit for performing filling operation on the connected component containing the clicked point and the searched connected component, so as to obtain the filled connected component.
10. The device according to claim 9, wherein the detecting unit further comprises:
an expanding unit for performing an expanding operation on the filled connected component in the binary image.
11. The device according to claim 1, wherein the object image is a finger image.
12. The device according to claim 1, wherein the color of the clicked point is the color of a pixel at the clicked point, or an average color of pixels within a predetermined region containing the clicked point.
13. An image processing method comprising:
performing a click on an object image contained in an image to obtain a clicked point;
calculating an edge map of the image;
estimating a color model of the object image based on the clicked point and the edge map;
classifying each pixel in the image, based on the edge map and the color model, so as to obtain a binary image of the image; and
detecting a region containing the object image based on the binary image.
14. The method according to claim 13, wherein the step of calculating an edge map of the image comprises:
calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
applying a gradient operator to the distance map to obtain a distance gradient image; and
classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
15. The method according to claim 13, wherein the step of calculating an edge map of the image comprises:
calculating a distance between a color of each pixel in the image and a color of the clicked point to obtain a distance map;
applying a gradient operator to the distance map to obtain a distance gradient image;
converting the image from a color image to a gray image;
applying a gradient operator to the gray image to obtain an intensity gradient image; and
classifying a pixel having a distance gradient larger than a predetermined distance gradient threshold or having an intensity gradient larger than a predetermined intensity gradient threshold in the image into an edge pixel, and the other pixel in the image into a non-edge pixel.
16. The method according to claim 13, wherein the step of estimating a color model of the object image based on the clicked point and the edge map comprises:
acquiring an extension region containing the clicked point based on the clicked point and the edge map, the extension region being within the object image; and
acquiring the color model of the object image based on a color of each pixel within the extension region.
17. The method according to claim 16, wherein the step of acquiring an extension region containing the clicked point based on the clicked point and the edge map comprises:
setting a maximum extension region containing the clicked point;
searching, as a left boundary pixel of the extension region, the first one of boundary pixels leftward in a horizontal direction from the clicked point;
searching, as a right boundary pixel of the extension region, the first one of the boundary pixels rightward in the horizontal direction from the clicked point; and
setting, for each reference pixel between the left boundary pixel and the right boundary pixel in the horizontal direction, an upper boundary pixel and a lower boundary pixel of the extension region by steps of:
searching, as the upper boundary pixel of the extension region, the first one of the boundary pixels upward in a vertical direction from the reference pixel; and
searching, as the lower boundary pixel of the extension region, the first one of the boundary pixels downward in the vertical direction from the reference pixel, wherein,
a sliding window is set centered on each pixel within the maximum extension region, the number of edge pixels in the sliding window is counted, and a pixel for which the number of edge pixels in its sliding window is larger than a predetermined threshold is defined as a boundary pixel.
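A sketch of the extension-region search of claim 17: edge pixels are counted in a sliding window around every pixel of the maximum extension region, pixels whose count exceeds a threshold become boundary pixels, and the region is grown from the clicked point leftward, rightward, upward and downward until such boundary pixels are met. Window size, maximum extent and count threshold are illustrative, and this sketch stops just inside the first boundary pixel in each direction.

```python
import numpy as np
import cv2

def extension_region(edge_map, click_xy, max_ext=60, win=5, count_thresh=3):
    h, w = edge_map.shape
    x, y = click_xy

    # Count edge pixels in a win x win sliding window around every pixel;
    # a pixel with more than count_thresh edge pixels in its window is
    # treated as a boundary pixel.
    counts = cv2.boxFilter(edge_map.astype(np.float32), -1, (win, win),
                           normalize=False)
    boundary = counts > count_thresh

    # Maximum extension region around the clicked point.
    x_lo, x_hi = max(0, x - max_ext), min(w - 1, x + max_ext)
    y_lo, y_hi = max(0, y - max_ext), min(h - 1, y + max_ext)

    # Left and right limits on the clicked point's row.
    left = x
    while left > x_lo and not boundary[y, left - 1]:
        left -= 1
    right = x
    while right < x_hi and not boundary[y, right + 1]:
        right += 1

    # For each reference pixel between them, search upward and downward.
    region = np.zeros((h, w), dtype=bool)
    for cx in range(left, right + 1):
        top = y
        while top > y_lo and not boundary[top - 1, cx]:
            top -= 1
        bottom = y
        while bottom < y_hi and not boundary[bottom + 1, cx]:
            bottom += 1
        region[top:bottom + 1, cx] = True
    return region
```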
18. The method according to claim 13, wherein the step of classifying each pixel in the image based on the edge map and the color model so as to obtain a binary image of the image comprises:
classifying a pixel in the image which is a non-edge pixel in the edge map and whose distance from the color model is less than a color threshold into an object pixel, and any other pixel in the image into a non-object pixel.
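Claim 18 reduces to a single boolean test per pixel. A minimal sketch, assuming the edge map is a boolean array and the distance from the color model is a per-pixel distance map such as the Mahalanobis distance sketched after claim 16; the threshold value is illustrative.

```python
import numpy as np

def classify_pixels(edge_map, dist_from_model, color_thresh=3.0):
    # Object pixel (1): non-edge pixel whose distance from the color
    # model is below the threshold; every other pixel is non-object (0).
    binary = (~edge_map) & (dist_from_model < color_thresh)
    return binary.astype(np.uint8)
```

Applied to the edge map of claim 14 or 15 and the distance map of claim 16, the returned array is the binary image from which the object region is then detected.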
19. A program product comprising a machine-readable instruction code stored therein, wherein the instruction code, when read and executed by a computer, enables the computer to execute the method according to claim 13.
US14/174,003 2013-02-06 2014-02-06 Method and apparatus for semi-automatic finger extraction Active 2034-04-13 US9311538B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201310048270 2013-02-06
CN201310048270.5A CN103971361B (en) 2013-02-06 2013-02-06 Image processing device and method
CN201310048270.5 2013-02-06

Publications (2)

Publication Number Publication Date
US20140226856A1 (en) 2014-08-14
US9311538B2 (en) 2016-04-12

Family

ID=51240810

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/174,003 Active 2034-04-13 US9311538B2 (en) 2013-02-06 2014-02-06 Method and apparatus for semi-automatic finger extraction

Country Status (3)

Country Link
US (1) US9311538B2 (en)
JP (1) JP6277750B2 (en)
CN (1) CN103971361B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109310306B (en) * 2016-06-28 2021-09-24 索尼公司 Image processing apparatus, image processing method, and medical imaging system
KR102300562B1 (en) * 2018-01-25 2021-09-13 삼성전자주식회사 Detecting method, detecting apparatus and computer readable storage medium
KR102078547B1 (en) * 2018-07-26 2020-02-19 오스템임플란트 주식회사 Method, Apparatus and Recording Medium For Automatic Setting Margin Line Of Providing Margin Line
CN111200699B (en) * 2018-11-19 2022-04-26 瑞昱半导体股份有限公司 Image adjusting method
KR102386930B1 (en) * 2019-11-28 2022-04-14 재단법인 경북아이티융합 산업기술원 Apparatus for detecting edge of road image and method thereof
CN113014846B (en) * 2019-12-19 2022-07-22 华为技术有限公司 Video acquisition control method, electronic equipment and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3384208B2 (en) * 1994-09-29 2003-03-10 ミノルタ株式会社 Image reading device
JP3436025B2 (en) * 1995-12-27 2003-08-11 ミノルタ株式会社 Correction method of read image and image reading device
JP3754786B2 (en) * 1997-02-19 2006-03-15 キヤノン株式会社 Image processing apparatus and method
TWI309026B (en) * 2005-04-12 2009-04-21 Newsoft Technology Corp Method for auto-cropping image objects and method for detecting image object contour
WO2011142313A1 (en) * 2010-05-11 2011-11-17 日本システムウエア株式会社 Object recognition device, method, program, and computer-readable medium upon which software is stored
JP5765026B2 (en) * 2011-04-06 2015-08-19 富士ゼロックス株式会社 Image processing apparatus and program
CN103377462B (en) * 2012-04-16 2016-05-04 富士通株式会社 The method and apparatus that scan image is processed
CN102682292B (en) * 2012-05-10 2014-01-29 清华大学 Method based on monocular vision for detecting and roughly positioning edge of road

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6624833B1 (en) * 2000-04-17 2003-09-23 Lucent Technologies Inc. Gesture-based input interface system with shadow detection
US20120268364A1 (en) * 2008-04-24 2012-10-25 Minnen David Fast fingertip detection for initializing a vision-based hand tracker

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160358317A1 (en) * 2015-06-04 2016-12-08 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
US10096090B2 (en) * 2015-06-04 2018-10-09 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium, relating to emphasizing a contour region
CN110321894A (en) * 2019-04-23 2019-10-11 浙江工业大学 Method for rapidly positioning library books based on deep-learning OCR
CN111784715A (en) * 2020-08-13 2020-10-16 北京英迈琪科技有限公司 Image separation method and system
CN113781482A (en) * 2021-11-11 2021-12-10 山东精良海纬机械有限公司 Method and system for detecting crack defects of mechanical parts in complex environment

Also Published As

Publication number Publication date
CN103971361A (en) 2014-08-06
US9311538B2 (en) 2016-04-12
JP6277750B2 (en) 2018-02-14
CN103971361B (en) 2017-05-10
JP2014154160A (en) 2014-08-25

Similar Documents

Publication Publication Date Title
US9311538B2 (en) Method and apparatus for semi-automatic finger extraction
Wagner et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM
US10430663B2 (en) Method, electronic device and non-transitory computer readable storage medium for image annotation
CN110334706B (en) Image target identification method and device
US8639042B2 (en) Hierarchical filtered motion field for action recognition
US20170228872A1 (en) Method and system for extracting a main subject of an image
US7840037B2 (en) Adaptive scanning for performance enhancement in image detection systems
US20150310302A1 (en) Image processing device and method
US9959466B2 (en) Object tracking apparatus and method and camera
JP2015032308A (en) Convolutional-neural-network-based classifier and classifying method and training methods for the same
US9524559B2 (en) Image processing device and method
KR100836740B1 (en) Video data processing method and system thereof
EP3664019A1 (en) Information processing device, information processing program, and information processing method
US20140112557A1 (en) Biological unit identification based on supervised shape ranking
US20090202145A1 (en) Learning appartus, learning method, recognition apparatus, recognition method, and program
US9501823B2 (en) Methods and systems for characterizing angle closure glaucoma for risk assessment or screening
CN107977658B (en) Image character area identification method, television and readable storage medium
US6738512B1 (en) Using shape suppression to identify areas of images that include particular shapes
TW200529093A (en) Face image detection method, face image detection system, and face image detection program
CN108509861B (en) Target tracking method and device based on combination of sample learning and target detection
CN110163103B (en) Live pig behavior identification method and device based on video image
CN109726621B (en) Pedestrian detection method, device and equipment
Chang Intelligent text detection and extraction from natural scene images
CN117058534A (en) Small sample remote sensing image target detection method based on meta-knowledge adaptive migration network
CN111461152B (en) Cargo detection method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
AS Assignment

Owner name: FUJITSU LIMITED, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:XIE, SHUFU;HE, YUAN;SUN, JUN;REEL/FRAME:032755/0037

Effective date: 20140311

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

AS Assignment

Owner name: PFU LIMITED, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:FUJITSU LIMITED;REEL/FRAME:053680/0172

Effective date: 20200615

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8