WO2010010926A1 - Feature point tracking method and feature point tracking device - Google Patents

Feature point tracking method and feature point tracking device

Info

Publication number
WO2010010926A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
coordinate system
part group
feature point
Prior art date
Application number
PCT/JP2009/063197
Other languages
English (en)
Japanese (ja)
Inventor
嘉伸 海老澤
Original Assignee
国立大学法人静岡大学 (National University Corporation Shizuoka University)
Priority date
Filing date
Publication date
Application filed by 国立大学法人静岡大学 (National University Corporation Shizuoka University)
Priority to JP2010521735A (JP5429885B2)
Publication of WO2010010926A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/167Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • H04N23/611Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30268Vehicle interior

Definitions

  • the present invention relates to a feature point tracking method and a feature point tracking apparatus for tracking feature points of a subject based on the subject's image.
  • the detection accuracy of the pupil may not be sufficient because the amount of movement of the nostril does not match the amount of movement of the pupil when the head is rotated.
  • corneal reflection does not appear if the line-of-sight direction is greatly deviated from the optical axis direction of the camera.
  • The present invention has been made in view of such problems, and its object is to provide a feature point tracking method and a feature point tracking device that improve the robustness of feature point tracking by accurately predicting the movement of feature points across time-series image frames.
  • A feature point tracking method of the present invention is a feature point tracking method for tracking the position of a feature point of a subject based on a head image of the subject.
  • It comprises: a position detection step of capturing, with an imaging unit, the two-dimensional position of a feature part group, which is a combination of feature points of the subject, and detecting the three-dimensional position of the feature part group in time series; a conversion coefficient calculation step of calculating, based on the three-dimensional position of the feature part group at a past imaging timing, the rotation angle and displacement, from a predetermined reference coordinate system, of a face coordinate system based on the feature part group; a conversion coefficient prediction step of predicting the rotation angle and displacement of the face coordinate system at the current imaging timing based on the calculated rotation angle and displacement of the face coordinate system from the reference coordinate system; and a predicted position calculation step of calculating the three-dimensional predicted position of the feature part group at the current imaging timing based on the predicted rotation angle and displacement. In the position detection step, when detecting the image of the feature part group, a window is set on the image frame based on the calculated three-dimensional predicted position.
  • The feature point tracking device of the present invention is a feature point tracking device that tracks the position of a feature point of a subject based on a head image of the subject. It comprises: position detection means for capturing, with imaging means, the two-dimensional position of a feature part group that is a combination of three feature points of the subject, and detecting the three-dimensional position of the feature part group in time series; conversion coefficient calculation means for calculating, based on the three-dimensional position of the feature part group at a past imaging timing, the rotation angle and displacement, from a predetermined reference coordinate system, of a face coordinate system based on the feature part group; conversion coefficient prediction means for predicting the rotation angle and displacement of the face coordinate system at the current imaging timing based on the calculated rotation angle and displacement of the face coordinate system from the reference coordinate system; and predicted position calculation means for calculating the three-dimensional predicted position of the feature part group at the current imaging timing based on the predicted rotation angle and displacement. The position detection means sets a window on the image frame, based on the calculated three-dimensional predicted position of the feature part group, when detecting the image of the feature part group.
  • As described above, a feature part group that is a combination of three feature points of the subject is imaged and its three-dimensional position is detected in time series. Based on the three-dimensional position of the feature part group at a past imaging timing, the rotation angle and displacement, from the reference coordinate system, of the face coordinate system based on the feature part group are calculated; from this rotation angle and displacement, the rotation angle and displacement of the face coordinate system at the current imaging timing are predicted, and the three-dimensional predicted position of the feature part group at the current imaging timing is calculated.
  • When the current image of the feature part group is detected, a window is set on the image frame based on this calculated three-dimensional predicted position.
  • In this way, the position of the feature part group in the time-series image frames can be predicted for any movement of the subject's head, and the feature part group can be tracked accurately across the time-series image frames. As a result, the robustness of feature point tracking processing targeting the subject's head can be improved.
  • According to the feature point tracking method and the feature point tracking device of the present invention, the robustness of feature point tracking can be improved by accurately predicting the movement of feature points in time-series image frames.
  • The feature point tracking device of the present invention can be used, for example, as a pointing device that moves a cursor on the monitor screen of a personal computer by detecting pupil movement, or as a drowsiness detection system that detects driver drowsiness from pupil movement.
  • FIG. 1 is a plan view showing a feature point tracking device 10 which is a preferred embodiment of the present invention.
  • The feature point tracking device 10 includes a single camera (imaging means) 2 that captures a face image of the subject A, a light source 3a provided in the vicinity of the imaging lens on the front surface 2a of the camera 2, a light source 3b provided at a position away from the front surface 2a of the camera 2, and an image processing apparatus 1 connected to the camera 2 and the light sources 3a and 3b.
  • the image processing apparatus 1 functions as a position detection unit, a conversion coefficient calculation unit, a conversion coefficient prediction unit, and a predicted position calculation unit in the feature point tracking process.
  • The camera 2 is not limited to a specific type as long as it is an imaging means capable of generating a face image of the subject A; here, a digital camera incorporating an image sensor such as a CCD or CMOS is used because its image data can be processed with high real-time performance.
  • the camera 2 is arranged so that the subject A is positioned on the optical axis L1 of the imaging lens (not shown) of the camera 2.
  • The light source 3a is configured to be able to irradiate illumination light containing a near-infrared component, along the optical axis L1 of the camera 2, toward a range covering the subject A located on the optical axis L1.
  • The light source 3b is fixed at a position farther from the optical axis L1 than the light source 3a and is likewise configured to be able to irradiate illumination light containing a near-infrared component, along the optical axis L1, toward a range covering the subject A.
  • Alternatively, the illumination light emitted from the two light sources 3a and 3b may be given different wavelength components that produce a luminance difference in the pupil portion (for example, center wavelengths of 850 nm and 950 nm), in which case the light source 3b may be fixed at a position whose distance from the optical axis L1 is equal to that of the light source 3a.
  • In this case, the configuration of the light sources can be simplified and made smaller while still producing a luminance difference in the pupil portion.
  • The camera 2 and the light sources 3a and 3b are arranged so that, when the subject A wears glasses, reflected light is prevented from appearing in the face image and the nostrils of the subject A are easier to detect.
  • The image processing apparatus 1 controls imaging by the camera 2 and irradiation of illumination light by the light sources 3a and 3b, and executes a process for tracking the pupils and nostrils of the subject A as feature points based on the head image of the subject A acquired by the camera 2 (details will be described later).
  • First, the distances between the four feature points (feature part group) of the subject A, namely the left and right pupil centers and the left and right nostril centers, are measured.
  • With the subject A positioned on the optical axis L1 of the camera 2, the image processing apparatus 1 controls the imaging timing and the illumination timing so that face images of the subject A, facing in arbitrary directions, are captured in time series.
  • The time-series image frames generated by the camera 2 are sent to the image processing apparatus 1, which detects on each frame image the two-dimensional coordinates of the left and right pupil centers and of the left and right nostril centers, as well as the two-dimensional coordinates of the midpoint between the nostril centers. These two-dimensional coordinates are then converted into three-dimensional coordinates by the image processing apparatus 1 (position detection step).
  • the image processing apparatus 1 alternately turns on the light sources 3a and 3b and alternately generates face images synchronized with the lighting, thereby obtaining a bright pupil image and a dark pupil image.
  • the bright pupil image is an image obtained with the irradiation of the light source 3a, and the luminance of the pupil portion is relatively bright.
  • the dark pupil image is an image obtained with the irradiation of the light source 3b, and the luminance of the pupil portion is relatively dark.
  • The light sources 3a and 3b are turned on in synchronization with the field signal of the camera 2, so that the bright pupil image and the dark pupil image can be separated into the odd and even fields.
  • The image processing apparatus 1 sets a window at a predetermined position in each of the bright pupil image and the dark pupil image, takes the difference between the bright pupil image and the dark pupil image within the windows, and determines the range of the pupil portion from the difference image.
  • The positions of these windows are set based on the three-dimensional positions of the pupil centers at past imaging timings, as will be described later. This difference processing makes highly robust pupil detection possible.
  • the image processing apparatus 1 specifies the detected outline of the pupil, calculates an ellipse that can be approximated to the outline, and determines the center of the ellipse as the center position of the pupil.
  • Alternatively, the position of the pupil center may be calculated by the centroid method after binarizing the difference-processed image.
  • the position of the pupil center may be calculated using a separability filter. That is, the center coordinate that maximizes the degree of separation is obtained using a pattern close to a circle.
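  • As an illustration of the pupil-center step just described, here is a minimal sketch assuming OpenCV 4.x and NumPy; the function name, threshold value, and centroid fallback are our own choices, not the patent's implementation.

```python
import cv2
import numpy as np

def pupil_center(bright_win: np.ndarray, dark_win: np.ndarray, thresh: int = 20):
    """Estimate the pupil centre (x, y) inside a window, or return None."""
    # The pupil is bright only under the on-axis light source, so the
    # bright/dark difference emphasises it against the rest of the face.
    diff = cv2.subtract(bright_win, dark_win)
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) >= 5:                       # fitEllipse needs at least 5 points
        (cx, cy), _, _ = cv2.fitEllipse(largest)
    else:                                       # fall back to the centroid method
        m = cv2.moments(largest)
        if m["m00"] == 0:
            return None
        cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    return cx, cy
```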
  • The image processing apparatus 1 detects the two-dimensional coordinates of the left and right nostril centers and of their midpoint with reference to the bright pupil image or the dark pupil image. That is, it obtains the midpoint of the left and right pupil centers, sets below it a large window whose center roughly coincides with the nostril position assumed when the subject A faces the front, and detects the nostrils within this window. The 0.8% of pixels with the lowest luminance in the large window are selected by the P-tile method and converted into a binary image composed of HIGH pixels and LOW pixels.
  • The resulting binarized image is repeatedly expanded and contracted (morphological processing) to sharpen the regions in the image, labeling is then performed, and the two largest regions are selected.
  • For each selected region, the center, aspect ratio, and area of the rectangle formed by its top, bottom, left, and right end points are calculated.
  • Here, the expansion process converts a target pixel into a HIGH pixel when at least one HIGH pixel is present among the eight pixels neighboring the target pixel in the binary image, and the contraction process converts a target pixel into a LOW pixel when at least one LOW pixel is present among its eight neighbors.
  • If the rectangle's area is smaller than 100 pixels or larger than 300 pixels, the region is judged not to represent a nostril image. Otherwise, a small window of 30 × 30 pixels is set around the center of the rectangle, and the 5% of pixels with the lowest brightness within the small window of the original image are extracted by the P-tile method. The morphological processing and labeling described above are then repeated, and the region with the maximum area is obtained.
  • If the area of this region is 130 pixels or more or 70 pixels or less, it is judged not to be a nostril image; otherwise it is judged to be a nostril image, and the center of the rectangle formed by the top, bottom, left, and right end points of the region is taken as the nostril center. When two nostril centers are detected in this way, the left/right correspondence of the nostrils is determined from the magnitude of their coordinate values.
  • In this way, an optimum threshold can be given for each of the two nostrils, whose imaging conditions differ, and the nostrils can be detected reliably.
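  • The large-window nostril search above can be sketched as follows; the parameter values follow the text, while the use of OpenCV connected components and the application of the area check to the component area (rather than to the bounding rectangle) are simplifications of ours.

```python
import cv2
import numpy as np

def ptile_binarize(gray: np.ndarray, ratio: float) -> np.ndarray:
    """Mark the darkest `ratio` fraction of pixels as HIGH (255)."""
    thresh = np.percentile(gray, ratio * 100.0)
    return (gray <= thresh).astype(np.uint8) * 255

def nostril_candidates(large_win: np.ndarray):
    binary = ptile_binarize(large_win, 0.008)            # darkest 0.8 % of pixels
    kernel = np.ones((3, 3), np.uint8)
    # Repeated expansion/contraction, here as a closing followed by an opening.
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    # Label 0 is the background; keep the two largest remaining regions.
    order = np.argsort(stats[1:, cv2.CC_STAT_AREA])[::-1] + 1
    candidates = []
    for lbl in order[:2]:
        area = stats[lbl, cv2.CC_STAT_AREA]
        if 100 <= area <= 300:                           # size check from the text
            candidates.append(tuple(centroids[lbl]))
    return candidates
```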
  • the camera optical system in the feature point tracking apparatus 10 that images the subject A can be assumed to be a pinhole model with a focal length f as shown in FIG.
  • The two-dimensional coordinates of the center points of the right pupil, left pupil, left nostril, and right nostril detected on the frame image by the image processing apparatus 1 are denoted Qn(xn, yn) (n = 1, ..., 4).
  • The unit vector corresponding to the position vector from the pinhole O toward each feature point is given by equation (1).
  • The image processing apparatus 1 can obtain the position vectors P1, P2, and P3 in the camera coordinate system by solving the simultaneous equations (4) for the three feature points P1, P2, and P3. The position vector P4, and the position vector P0 of the midpoint between the left and right nostril centers, can be calculated in the same way.
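  • The pinhole-model step can be illustrated with the following hedged sketch: given the 2-D image coordinates of three feature points, the focal length f, and the measured inter-point distances, the depths along the viewing rays are recovered by a nonlinear least-squares solve (SciPy is our choice here; the patent's own equations (1) to (4) are not reproduced in this record).

```python
import numpy as np
from scipy.optimize import least_squares

def unit_dirs(points_2d: np.ndarray, f: float) -> np.ndarray:
    """Unit vectors from the pinhole O towards each image point (x, y, f)."""
    dirs = np.column_stack([points_2d[:, 0], points_2d[:, 1],
                            np.full(len(points_2d), f)])
    return dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

def solve_positions(points_2d, f, d01, d02, d12, depth_guess=600.0):
    """Recover P0, P1, P2 in camera coordinates from known pairwise distances."""
    u = unit_dirs(np.asarray(points_2d, dtype=float), f)

    def residual(depths):
        p = depths[:, None] * u                  # candidate 3-D positions
        return [np.linalg.norm(p[0] - p[1]) - d01,
                np.linalg.norm(p[0] - p[2]) - d02,
                np.linalg.norm(p[1] - p[2]) - d12]

    sol = least_squares(residual, x0=np.full(3, depth_guess))
    return sol.x[:, None] * u                    # rows are P0, P1, P2
```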
  • This normal vector V F indicates the face direction of the subject A.
  • Next, the image processing apparatus 1 calculates the rotation angle and displacement of the face coordinate system from the camera coordinate system, based on the three-dimensional positions of the feature points P0, P1, P2 detected in past image frames (conversion coefficient calculation step).
  • Specifically, a face coordinate system xyz based on the feature points P0, P1, P2 and their centroid G is defined with respect to the camera coordinate system XYZ.
  • The x-axis, y-axis, and z-axis are set so that the origin of the face coordinate system is the centroid G.
  • The image processing apparatus 1 can obtain the face coordinate system corresponding to the feature points P0, P1, P2 at a given imaging timing as follows. First, as shown in FIG. 4, the rotation angle of the normal vector VF about the Y-axis, taken as positive from the Z-axis toward the X-axis, is obtained by formula (5).
  • The matrices Tx and Ty are defined by formulas (8).
  • The image processing apparatus 1 can thus calculate, for the posture of the subject A at an arbitrary timing, the rotation angle about the z-axis from the reference posture.
  • FIG. 7 shows the relationship between the feature point coordinates of the reference posture viewed from the positive direction of the z-axis of the face coordinate system and the feature point coordinates of the subject A.
  • This rotation angle is taken as positive in the direction from the x-axis toward the y-axis.
  • Using the transformation matrices given by equations (10) and (11), the image processing apparatus 1 can convert the position vector of an arbitrary point around the face centroid G expressed in the face coordinate system into a position vector in the camera coordinate system. Conversely, using formula (12), it can convert the position vector of an arbitrary point in the camera coordinate system into a position vector in the face coordinate system.
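  • Since equations (10) to (12) themselves are not reproduced here, the following sketch shows one common way to realise the conversion the text describes; the Z-Y-X composition of the rotation angles (α, β, γ) is an assumption of ours, not necessarily the patent's convention.

```python
import numpy as np

def rotation_matrix(alpha: float, beta: float, gamma: float) -> np.ndarray:
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return rz @ ry @ rx

def face_to_camera(p_face, angles, centroid):
    """Point expressed around the face centroid G -> camera coordinates."""
    return rotation_matrix(*angles) @ np.asarray(p_face) + np.asarray(centroid)

def camera_to_face(p_cam, angles, centroid):
    """Inverse mapping: camera coordinates -> face coordinates."""
    r = rotation_matrix(*angles)
    return r.T @ (np.asarray(p_cam) - np.asarray(centroid))
```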
  • Next, based on the rotation angles (α, β, γ) and the centroid position G calculated for image frames at past timings, the image processing apparatus 1 predicts the rotation angles of the face coordinate system and the displacement of its origin at the imaging timing currently being processed (conversion coefficient prediction step).
  • That is, the rotation angles of the face posture and the face centroid for the image frame to be processed can be predicted from past field images.
  • Specifically, the image processing apparatus 1 predicts the rotation angles (αp, βp, γp) and the centroid position Gp of the (m+1)-th field image from the rotation angles (α, β, γ) and centroid positions G of the m-th and earlier field images, using a prediction method such as a Kalman filter.
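  • The text leaves the predictor open ("a prediction method such as a Kalman filter"). The sketch below uses the simplest consistent choice, a constant-velocity extrapolation of each angle and of the centroid from the two most recent fields; a full Kalman filter would add process and measurement noise models on top of this. The numeric values are made up for illustration.

```python
import numpy as np

def predict_next(prev, curr):
    """Constant-velocity prediction of field m+1 from fields m-1 and m."""
    prev, curr = np.asarray(prev, dtype=float), np.asarray(curr, dtype=float)
    return curr + (curr - prev)

# Example: predict the rotation angles (alpha_p, beta_p, gamma_p) and centroid Gp.
angles_prev, angles_curr = np.radians([2.0, -1.0, 0.5]), np.radians([3.0, -1.5, 0.7])
g_prev, g_curr = np.array([10.0, 5.0, 600.0]), np.array([11.0, 5.5, 598.0])
angles_pred = predict_next(angles_prev, angles_curr)
g_pred = predict_next(g_prev, g_curr)
```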
  • Then, referring to the predicted rotation angles (αp, βp, γp) and the predicted centroid position Gp of the field image to be processed, obtained in the conversion coefficient prediction step, the image processing apparatus 1 calculates, relative to the reference posture of the face coordinate system, the coordinates of the three-dimensional predicted positions of the left and right pupil centers and the left and right nostril centers.
  • It then applies the transformation matrix given by equation (10) to these coordinates and shifts them by the centroid G, thereby calculating the predicted three-dimensional coordinates Pn1 of the feature points in the camera coordinate system.
  • When taking the difference between the bright pupil image and the dark pupil image of the consecutive m-th and (m+1)-th fields, the image processing apparatus 1 corrects the position of the window on the (m+1)-th field image according to the amount of movement of the predicted three-dimensional coordinates P11 and P21 of the left and right pupil centers.
  • Even when detection of a pupil image fails, the image processing apparatus 1 does not immediately increase the window size; assuming that the face posture in the (m+1)-th field image has been predicted correctly, it uses the predicted pupil coordinates as the window position in that field image. However, if detection of one pupil image fails twice in succession, the size of the corresponding pupil window is gradually increased and detection is attempted in the next frame image. For example, the window size is increased by one pixel each time one field is processed, and when detection has failed ten times in succession, the pupil is searched for with a medium-sized window of, for example, 150 × 60 pixels.
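  • The window-size policy in the preceding paragraph can be sketched as follows; the growth-by-one-pixel rule and the 150 × 60 medium window come from the text, while the nominal small-window size and the class structure are our own assumptions.

```python
class PupilWindow:
    SMALL = (70, 70)          # assumed nominal small-window size (cf. the experiments)
    MEDIUM = (150, 60)        # medium window named in the text

    def __init__(self):
        self.size = list(self.SMALL)
        self.failures = 0

    def update(self, detected: bool) -> None:
        if detected:
            self.failures = 0
            self.size = list(self.SMALL)
            return
        self.failures += 1
        if self.failures >= 10:                  # switch to the medium window
            self.size = list(self.MEDIUM)
        elif self.failures >= 2:                 # start growing after two misses
            self.size = [s + 1 for s in self.size]
```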
  • As described above, in the feature point tracking device 10, a feature part group that is a combination of the three feature points P0, P1, P2 of the subject A is imaged, and its three-dimensional positions are detected in time series. Based on the three-dimensional positions P0, P1, P2 of the feature part group at past imaging timings, the rotation angles (α, β, γ) and displacement, from the camera coordinate system XYZ, of the face coordinate system xyz referenced to the feature part group are calculated.
  • From these, the predicted rotation angles (αp, βp, γp) and the origin Gp of the face coordinate system xyz at the current imaging timing are determined, and the predicted three-dimensional coordinates Pn1 of the feature part group at the current imaging timing are calculated.
  • When the current image of the feature part group is detected, a window is set on the image frame based on the calculated predicted three-dimensional coordinates Pn1.
  • position correction between different shooting timings functions even when the subject A is facing a direction greatly deviated from the direction of the camera 2 and corneal reflection is difficult to detect. As a result, accuracy in tracking feature points by image differences is improved.
  • Because the positions of the pupils and nostrils are accurately predicted, even if the small windows used for feature point detection are made smaller, the probability that part of a pupil protrudes from its window is reduced; the robustness of detecting the center and area of each feature point is improved, and the processing speed is also improved.
  • Since the window of each feature point is constrained by the windows of the remaining three feature points, the phenomenon in which individual windows drift apart is reduced. As a result, even when noise such as reflected light from a spectacle frame passes near a pupil, the window is placed close to the actual pupil without being dragged toward the noise, and erroneous pupil detection is greatly reduced.
  • Furthermore, when detection fails, the window size is increased, so that the feature point detection rate can be effectively improved.
  • Note that the inter-feature-point distances of the subject A given at the outset need not be measured strictly. If inaccurate distances between the feature points are given, the three-dimensional position of each feature point is determined incorrectly, so the face centroid and the face direction are detected incorrectly; nevertheless, the window positions on the two-dimensional image derived from these incorrect values are still given correctly. Specifically, for a subject A whose actual inter-feature-point distances are all shorter than the given distances by a certain ratio, the estimated face direction hardly changes, and only the face centroid is recognized at a position farther from the camera than it actually is.
  • the center of gravity of the face is recognized to be far away in this way, the distance between the feature points is also converted to be shorter on the two-dimensional image, and as a result, the window position is correctly given on the two-dimensional image.
  • Otherwise, the face is simply recognized as facing more upward than its actual direction, and the window positions are still given correctly.
  • the error in the distance between the feature points is absorbed by the shift in the center of gravity of the face and the face direction, and does not cause an error in the window position.
  • the proposed method works without problems.
  • FIG. 8 is a graph showing the detection results of the left and right pupils by the feature point tracking method of the present embodiment
  • FIG. 9 is a graph showing the detection results of the left and right pupils by the conventional pupil detection method.
  • In this conventional method, as disclosed in Japanese Patent Application Laid-Open No. 2008-029702, differential position correction using the amount of nostril movement and differential position correction using the amount of corneal reflection movement are used.
  • The small windows were fixed squares with sides of 70, 66, 56, 46, and 36 pixels.
  • the test subject shakes his / her face 3 times in the left / right direction, 3 times in the up / down direction, and 3 times in the diagonal direction for 26 seconds.
  • the “positive detection rate” in the measurement result is the ratio of the number of fields in which the pupil image is within the window and the pupil is correctly detected with respect to the total number of fields.
  • The “out-of-window rate” is the ratio of fields in which the pupil was correctly detected but the pupil image was not within the window, and the “false detection rate” is the ratio of fields in which something other than a pupil (such as a glasses reflection or a reflection from the white of the eye) was detected as a pupil.
  • Compared with the conventional method, the positive detection rate of the present method remains high for all window sizes. It can also be seen that the false detection rate is greatly reduced even when the window size is increased, and the out-of-window rate is greatly reduced even when the window size is reduced.
  • the present invention is not limited to the embodiment described above.
  • For example, in the embodiment described above the face coordinate system is set with reference to the left and right pupil centers and the midpoint between the left and right nostril centers, but the centers of the nostrils themselves may be used as the reference instead.
  • In that case, small windows are set around the left and right nostril centers using the predicted three-dimensional coordinates P31 and P41 of the right and left nostril centers at the current imaging timing, and the nostril images may then be detected within these windows.
  • In the position detection step, the window of an image frame is converted into a binarized image using a predetermined threshold, for example by a p-tile method, and the left and right nostril images are detected from the binarized image. The threshold used at that time may be determined automatically from its relationship to the threshold at which the derivative of the maximum connected-region area in the binarized image is maximized.
  • a threshold most suitable for the subject A can be determined. In other words, there are individual differences in the shape of the nostrils for each subject, and detection may become unstable when the threshold value is determined so that all subjects have the same ratio of pixels.
  • Moreover, the appearance of the nostrils changes as the face moves, so the shape and area of the nostrils in the image fluctuate; if a fixed pixel ratio is used, nostril detection may become unstable once the face angle deviates from the reference posture. By determining the threshold as described above, nostril detection can be kept close to optimal despite differences between subjects and facial movements.
  • the image processing apparatus 1 compares the threshold value with the pixel value while decreasing the threshold value by 1 from 255 to 1 for the image in the window set in the bright pupil image or dark pupil image related to the subject A.
  • The binarized image obtained for each threshold value is subjected to isolated-point removal, expansion, contraction, and labeling. The image processing apparatus 1 then calculates, for each threshold value, the maximum area of the connected components (connected regions) of identical pixel values in the binarized image, and identifies the threshold Thmax at which the derivative of this maximum area with respect to the threshold takes its maximum value.
  • It also identifies the threshold Thmin at which the maximum area first becomes 0 when the threshold is raised from Thmax.
  • Th = (Thmax + Thmin) / 2    (14)
  • The automatic threshold determination process described above takes a long time, because binarization, isolated-point removal, expansion/contraction, and labeling are performed for 255 threshold values; if it were performed for every frame, the processing time would be long and real-time performance would be lost. It is therefore preferable to perform it only once at the beginning, for example when acquiring the nostril reference image described later.
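  • A hedged reconstruction of this automatic threshold determination is sketched below: the threshold is swept over all 255 values, the window is binarized and cleaned at each value, the largest connected-region area is recorded, Thmax is taken where that area changes fastest, Thmin where it first vanishes, and the final threshold is their mean as in formula (14). The binarization polarity (dark pixels HIGH) and the morphological clean-up used here are interpretations of ours.

```python
import cv2
import numpy as np

def auto_threshold(window_gray: np.ndarray) -> int:
    kernel = np.ones((3, 3), np.uint8)
    max_area = np.zeros(256, dtype=np.int64)
    for t in range(1, 256):
        binary = (window_gray < t).astype(np.uint8) * 255
        # Isolated-point removal / expansion / contraction, approximated here
        # by a morphological opening followed by a closing.
        binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
        binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
        n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
        if n > 1:
            max_area[t] = stats[1:, cv2.CC_STAT_AREA].max()
    # Thmax: threshold where the largest connected area changes fastest.
    th_max = int(np.argmax(np.diff(max_area))) + 1
    # Thmin: nearest threshold at which the largest area is zero (the text
    # phrases this as "raising" the threshold; the direction depends on polarity).
    th_min = th_max
    for t in range(th_max, 0, -1):
        if max_area[t] == 0:
            th_min = t
            break
    return (th_max + th_min) // 2                # formula (14)
```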
  • As another modification, in the position detection step the window of an image frame is binarized by the p-tile method using a predetermined pixel ratio (P-tile value), and the nostril images are detected from the binarized image. In this case, an estimated nostril image on the image frame may be predicted based on the predicted rotation angles (αp, βp, γp) of the face coordinate system obtained in the conversion coefficient prediction step, and the P-tile value used in the p-tile method may be determined from this estimated nostril image.
  • Here, the plane in which the nostrils lie is defined as the nostril plane PL1.
  • the positional relationship between the pupil and the nostril is constant with respect to the subject A, and the angle of the nostril plane PL1 with respect to the face plane created by them is also constant.
  • The nostril plane PL1 can be regarded as the face plane rotated by a predetermined angle about the x-axis of the face coordinate system so as to pass through the nostril midpoint P0.
  • The normal of the nostril plane PL1 passing through the nostril midpoint P0 is defined as the nostril direction vector kF.
  • In accordance with the relationship between the face plane and the nostril plane PL1, the nostril direction vector kF is obtained by translating the face direction vector VF so that it passes through the nostril midpoint P0 and rotating it about the x-axis by the same angle (FIG. 13).
  • In the camera coordinate system, the nostril direction vector kC can be calculated using the transformation matrices of equations (10) and (11), which reflect the rotation angles (α, β, γ) of the face coordinate system. The horizontal angle, vertical angle, and rotation angle of the nostril direction vector kC can then be obtained from formula (16).
  • The image processing apparatus 1 stores a nostril image of the subject A during the automatic binarization-threshold determination described above. At this time, the image processing apparatus 1 deforms that nostril image so that the nostril direction vector becomes parallel to the Z-axis of the camera coordinate system, that is, so that the nostrils appear with their maximum area, and acquires the result as a nostril image. Specifically, when the automatic binarization-threshold determination is performed, the subject A is assumed to face the camera 2 while turned upward from the optical axis L1 of the camera 2 by an angle of about 60 degrees.
  • Letting (X1, Y1) be the coordinates of a pixel belonging to the nostril label before deformation (a nostril pixel), the image processing apparatus 1 calculates the coordinates (X2, Y2) of the corresponding deformed nostril pixel by formula (17). That is, the image processing apparatus 1 enlarges the y component of the nostril image by a factor of 1/cos of the above angle.
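  • The y-stretch just described can be sketched as follows (formula (17) itself is not reproduced in this record; the use of cv2.resize for the stretch and the default 60-degree angle are our own choices).

```python
import cv2
import numpy as np

def deform_nostril_image(nostril_img: np.ndarray, angle_deg: float = 60.0) -> np.ndarray:
    """Stretch the y component by 1/cos(angle) to build the nostril reference image."""
    scale_y = 1.0 / np.cos(np.radians(angle_deg))
    h, w = nostril_img.shape[:2]
    return cv2.resize(nostril_img, (w, int(round(h * scale_y))),
                      interpolation=cv2.INTER_LINEAR)
```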
  • The nostril image deformed in this way is used as the nostril reference image, and can be regarded as the nostril image obtained when the horizontal, vertical, and rotation angles of the nostril direction vector are all 0 degrees. The image processing apparatus 1 therefore calculates the horizontal, vertical, and rotation angles of the nostril direction vector from the predicted rotation angles (αp, βp, γp) of the face coordinate system obtained in the conversion coefficient prediction step, and can estimate the nostril image of the next frame by rotating the nostril reference image in three-dimensional space.
  • That is, the image processing apparatus 1 rotates the nostril reference image in three-dimensional space according to the horizontal, vertical, and rotation angles of the nostril direction vector to obtain the estimated nostril image.
  • The image processing apparatus 1 then calculates the number of pixels in the nostril region of the estimated nostril image and determines the P-tile value used when actually detecting the nostrils from the relationship between this pixel count and the window size. For example, when the P-tile value is P [%], the P [%] of pixels with the lowest brightness in the window are selected, which sets the binarization threshold for the image in the window.
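  • A minimal sketch of this P-tile value determination, with names of our own, is given below: the pixel count of the estimated nostril image fixes the fraction P of darkest pixels kept when binarizing the actual search window.

```python
import numpy as np

def ptile_value(estimated_nostril_pixels: int, window_shape) -> float:
    """P [%]: estimated nostril pixel count relative to the window size."""
    return 100.0 * estimated_nostril_pixels / float(np.prod(window_shape))

def binarize_darkest(window_gray: np.ndarray, p_percent: float) -> np.ndarray:
    """Keep the darkest p_percent of pixels as HIGH (255)."""
    thresh = np.percentile(window_gray, p_percent)
    return (window_gray <= thresh).astype(np.uint8) * 255
```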
  • In the position detection step, it is preferable to adjust the size of the window for the feature part group according to the size of the image of the feature part group detected at past imaging timings.
  • Setting the window according to the past image size of the feature part group allows more efficient, higher-speed tracking processing.
  • In the position detection step, it is also preferable to increase the size of the feature part group window when detection of the feature part group image has failed at past imaging timings. In this way, the detection rate of the feature part group can be effectively improved.
  • The present invention provides a feature point tracking method and a feature point tracking device that track feature points of a subject based on an image of the subject; by predicting the movement of feature points in time-series image frames with high accuracy, the robustness of feature point tracking can be improved.
  • 10: feature point tracking device; 1: image processing apparatus; 2: camera (imaging means); A: subject; P0, P1, P2: feature points (feature part group); Qn: two-dimensional position; α, β, γ: rotation angles; XYZ: camera coordinate system (reference coordinate system); xyz: face coordinate system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention aims to improve the reliability of feature point tracking by predicting the movement of feature points in time-series image frames with high accuracy. A feature point tracking device (10) detects the three-dimensional positions (P0, P1, P2) of three feature points of a subject (A) in time series, calculates the rotation angles (α, β, γ) and the displacement of a face coordinate system relative to a camera coordinate system based on the three-dimensional positions (P0, P1, P2) at the preceding imaging timing, predicts the rotation angles (αp, βp, γp) and the displacement of the face coordinate system at the current imaging timing based on the rotation angles (α, β, γ) and the displacement of the face coordinate system relative to the camera coordinate system, and calculates the predicted three-dimensional position (Pn1) of a feature point at the current imaging timing based on the rotation angles (αp, βp, γp) and the displacement. When detecting the feature points, the feature point tracking device (10) sets windows in the target image frames based on the predicted three-dimensional position (Pn1) and detects the feature points within the respective windows.
PCT/JP2009/063197 2008-07-24 2009-07-23 Procédé de suivi de points caractéristiques et dispositif de suivi de points caractéristiques WO2010010926A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2010521735A JP5429885B2 (ja) 2008-07-24 2009-07-23 特徴点追跡方法及び特徴点追跡装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008191136 2008-07-24
JP2008-191136 2008-07-24

Publications (1)

Publication Number Publication Date
WO2010010926A1 (fr)

Family

ID=41570386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/063197 WO2010010926A1 (fr) 2008-07-24 2009-07-23 Procédé de suivi de points caractéristiques et dispositif de suivi de points caractéristiques

Country Status (2)

Country Link
JP (1) JP5429885B2 (fr)
WO (1) WO2010010926A1 (fr)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002007095A1 (fr) * 2000-07-17 2002-01-24 Mitsubishi Denki Kabushiki Kaisha Dispositif de representation en 3d du visage et dispositif de reconnaissance peripherique comprenant ce dernier
JP2003015816A (ja) * 2001-06-29 2003-01-17 Honda Motor Co Ltd ステレオカメラを使用した顔・視線認識装置
JP2005309992A (ja) * 2004-04-23 2005-11-04 Toyota Motor Corp 画像処理装置および画像処理方法
JP2007026073A (ja) * 2005-07-15 2007-02-01 National Univ Corp Shizuoka Univ 顔姿勢検出システム
JP2007172237A (ja) * 2005-12-21 2007-07-05 Denso Corp 推定装置
JP2007268026A (ja) * 2006-03-31 2007-10-18 National Univ Corp Shizuoka Univ 瞳孔を検出する方法及び装置

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542797A (zh) * 2010-12-09 2012-07-04 财团法人工业技术研究院 图像式的交通参数检测系统与方法及计算机程序产品
CN102542797B (zh) * 2010-12-09 2014-07-09 财团法人工业技术研究院 图像式的交通参数检测系统与方法
US9058744B2 (en) 2010-12-09 2015-06-16 Industrial Technology Research Institute Image based detecting system and method for traffic parameters and computer program product thereof
JP2012125373A (ja) * 2010-12-15 2012-07-05 Hitachi Aloka Medical Ltd 超音波画像処理装置
KR101320337B1 (ko) 2012-05-02 2013-10-29 한국항공우주연구원 위치 및 자세 추정시스템
EP2857939A4 (fr) * 2012-05-25 2016-09-14 Univ Shizuoka Nat Univ Corp Procédé de détection de pupille, procédé de détection de réflexe cornéen, procédé de détection de position du visage, et procédé de suivi de pupille
US9514538B2 (en) 2012-05-25 2016-12-06 National University Corporation Shizuoka University Pupil detection method, corneal reflex detection method, facial posture detection method, and pupil tracking method
JP2017097554A (ja) * 2015-11-20 2017-06-01 カシオ計算機株式会社 特徴点追跡装置、特徴点追跡方法及びプログラム
JP2022140386A (ja) * 2021-03-10 2022-09-26 キヤノン株式会社 顔の姿勢を検出する装置及び方法、画像処理システム、並びに記憶媒体
JP7371154B2 (ja) 2021-03-10 2023-10-30 キヤノン株式会社 顔の姿勢を検出する装置及び方法、画像処理システム、並びに記憶媒体

Also Published As

Publication number Publication date
JP5429885B2 (ja) 2014-02-26
JPWO2010010926A1 (ja) 2012-01-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09800440

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010521735

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09800440

Country of ref document: EP

Kind code of ref document: A1