WO2017101292A1 - 自动对焦的方法、装置和系统 - Google Patents

Method, apparatus and system for autofocus

Info

Publication number
WO2017101292A1
WO2017101292A1 (PCT application PCT/CN2016/087587)
Authority
WO
WIPO (PCT)
Prior art keywords
image
pupil
pupil image
camera
gradient
Application number
PCT/CN2016/087587
Other languages
English (en)
French (fr)
Inventor
崔剑 (CUI, Jian)
王浩雷 (WANG, Haolei)
Original Assignee
Shenzhen Goodix Technology Co., Ltd. (深圳市汇顶科技股份有限公司)
Application filed by Shenzhen Goodix Technology Co., Ltd. (深圳市汇顶科技股份有限公司)
Publication of WO2017101292A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/67 Focus control based on electronic image sensor signals
    • H04N23/673 Focus control based on electronic image sensor signals based on contrast or high frequency components of image signals, e.g. hill climbing method
    • H04N23/61 Control of cameras or camera modules based on recognised objects
    • H04N23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body

Definitions

  • the present invention relates to the field of human-computer interaction, and more particularly to a method, apparatus and system for auto-focusing.
  • As a popular technology in the field of human-computer interaction, human eye tracking has attracted many researchers and industrial manufacturers to its research and application. Using visual information from the human eye for operational control offers a convenience that other limbs or auxiliary devices do not.
  • the premise of human eye tracking is to capture video information of human eye movements.
  • the imaging effect is affected by the environment; for example, when the exposure value of the high-speed camera cannot be adjusted, the image may be dark, its gray values low, or its signal-to-noise ratio poor. The quality of the image therefore directly affects the effect of human eye tracking.
  • Autofocus technology is an important prerequisite and guarantee for the system to obtain clear images.
  • the quality of the image quality evaluation index has a direct impact on the system's autofocus.
  • image quality evaluation can be divided into full-reference, reduced-reference and no-reference methods, depending on whether a reference image is available. In practice, no-reference evaluation is better suited to engineering applications; for example, images captured by a high-speed camera have a low exposure value and a poor signal-to-noise ratio, so no reference image is available.
  • commonly used image quality evaluation methods fall into two categories: spatial domain and frequency domain. Although frequency-domain evaluation has a certain noise robustness in practice, it requires a frequency-domain transformation, which is complicated and computationally expensive.
  • Embodiments of the present invention provide a method, device, and system for autofocus that can control a camera to achieve a good focusing effect.
  • a method for autofocusing includes: acquiring a pupil image of a pupil of a human eye; performing image degradation processing on the pupil image to obtain a degraded image; determining a relative reference image from the pupil image and the degraded image, the relative reference image being the convolution of the pupil image and the degraded image; determining an image quality evaluation index from the normalized value of the gradient of the pupil image and the image structure similarity, the image structure similarity being the structural similarity between the pupil image and the relative reference image; and controlling the first camera to focus according to the image quality evaluation index.
  • the image quality evaluation index is determined from the normalized value of the maximum gradient of the pupil image and the image structure similarity, and the camera is controlled according to this index, so the camera can be made to achieve a good focusing effect.
  • the method further includes: dividing the pupil image into N block regions of equal size, where N is a positive integer; selecting K block regions from the N block regions as K pupil image block regions, where K is a positive integer and K ≤ N; selecting from the relative reference image the K relative reference image block regions corresponding to the K pupil image block regions; determining the block region structural similarity, i.e. the structural similarity between the K pupil image block regions and the K relative reference image block regions; and using the block region structural similarity as the image structure similarity.
  • the value of K may be preset or an empirical value, and may also be determined based on the current pupil image.
  • the method further includes: determining the contrast sensitivity of the pupil image, and determining K according to N and the contrast sensitivity of the pupil image.
  • when the K value is determined from the pupil image, the image quality evaluation index is directly correlated with the current pupil image, which makes the index more useful for the controller to control the camera's autofocus.
  • determining the contrast sensitivity of the pupil image comprises: determining the spatial frequency of each pixel according to the pixel width of each block region in the pupil image, the distance from the human eye to the camera, and the position of each pixel in the pupil image; determining the normalized spatial frequency of the pupil image according to the spatial frequency of each pixel; and determining the contrast sensitivity of the pupil image according to its normalized spatial frequency.
  • the spatial frequency of each pixel is:
  • the normalized spatial frequency of the pupil image is:
  • the contrast sensitivity of the pupil image is:
  • a is the human eye angle of view
  • L is the width of the image
  • D is the distance from the human eye to the camera
  • u, v are the horizontal and vertical coordinates of the position in the frequency domain after each pixel point undergoes frequency domain transformation
  • x', y' is the horizontal and vertical coordinates of the center position of the frequency domain image after the offset
  • f min represents the minimum value of the spatial frequency f
  • f max represents the maximum value of the spatial frequency f.
  • the method further includes: determining the gradient of the pupil image from the pupil image, and determining the normalized value of the gradient from that gradient.
  • the image structural similarity is adopted as one of the factors of the image quality evaluation index.
  • if structural similarity is used alone, the peak of the image structure similarity of the pupil image may not be unique, so the effect of the controller's autofocus control is unsatisfactory.
  • the normalized value of the maximum gradient of the pupil image is used as the weight of the image structure similarity; local peak values are suppressed within a certain range so that the global peak of the index over the whole image becomes more prominent.
  • the ideal image quality evaluation index is a curve that first increases and then decreases, with a unique peak; when the index reaches its peak, the camera position is optimal.
  • the normalized value of the gradient of the pupil image is a normalized value of a maximum gradient of the pupil image;
  • the method further includes determining a normalized value of a maximum gradient of the pupil image based on a maximum value of a gradient of the pupil image.
  • the pupil image is represented by Rect
  • the gradient of the pupil image is:
  • Rb denotes the gradient kernel that is applied to the pupil image by convolution
  • Max represents the maximum gradient of the pupil image, and its expression is as follows:
  • Maximum represents the theoretical maximum gradient of the pupil image.
  • acquiring the pupil image of the human eye pupil comprises: controlling the second camera to capture a person target; determining the person's face position according to the person target; adjusting the pan/tilt of the first camera according to the face position so that the first camera captures a face image; binarizing the face image to obtain a processed image; acquiring the contour of the luminance region of the processed image; and determining the pupil image according to the area of the contour.
  • an apparatus for autofocusing, comprising: an acquisition unit, configured to acquire a pupil image of a human eye pupil; a processing unit, configured to perform image degradation processing on the pupil image acquired by the acquisition unit to obtain a degraded image; a first determining unit, configured to determine a relative reference image from the pupil image and the degraded image, the relative reference image being the convolution of the pupil image and the degraded image; a second determining unit, configured to determine an image quality evaluation index from the normalized value of the gradient of the pupil image and the image structure similarity, the image structure similarity being the structural similarity between the pupil image and the relative reference image obtained by the first determining unit; and a focusing unit, configured to control the first camera to focus according to the image quality evaluation index obtained by the second determining unit.
  • the device further includes: a dividing unit, configured to divide the pupil image into N block regions of equal size, where N is a positive integer; a first selecting unit, configured to select K block regions from the N block regions as K pupil image block regions, where K is a positive integer and K ≤ N; a second selecting unit, configured to select from the relative reference image the K relative reference image block regions corresponding to the K pupil image block regions; a third determining unit, configured to determine the block region structural similarity, i.e. the structural similarity between the K pupil image block regions and the K relative reference image block regions; and a fourth determining unit, configured to use the block region structural similarity as the image structure similarity.
  • the device further includes: a fifth determining unit, configured to determine a contrast sensitivity of the pupil image; a determining unit for determining K according to N and contrast sensitivity of the pupil image.
  • the fifth determining unit is specifically configured to: determine the spatial frequency of each pixel according to the pixel width of each block region in the pupil image, the distance from the human eye to the first camera, and the position of each pixel of each block region in the pupil image; determine the normalized spatial frequency of the pupil image according to the spatial frequency of each pixel; and determine the contrast sensitivity of the pupil image according to the normalized spatial frequency.
  • the spatial frequency of each pixel is:
  • the normalized spatial frequency of the pupil image is:
  • the contrast sensitivity of the pupil image is:
  • a is the human eye angle of view
  • L is the width of the image
  • D is the distance from the human eye to the first camera
  • u and v are the horizontal and vertical coordinates of the position in the frequency domain after each pixel point undergoes frequency domain transformation
  • x ', y' is the horizontal and vertical coordinates of the center position of the frequency domain image after the offset
  • f min represents the minimum value of the spatial frequency f
  • f max represents the maximum value of the spatial frequency f.
  • the normalized value of the gradient of the pupil image is a normalized value of a maximum gradient of the pupil image;
  • the apparatus further includes a normalization unit for determining a normalized value of a maximum gradient of the pupil image based on a maximum value of a gradient of the pupil image.
  • the pupil image is represented by Rect
  • the gradient of the pupil image is:
  • Rb denotes the gradient kernel that is applied to the pupil image by convolution
  • Max represents the maximum gradient of the pupil image, and its expression is as follows:
  • Maximum represents the theoretical maximum gradient of the pupil image.
  • the acquiring unit is specifically configured to: control the second camera to capture the person target; determine the face position of the person according to the person target; adjust the pan/tilt of the first camera according to the face position so that the first camera captures a face image; binarize the face image to obtain a processed image; acquire the contour of the luminance region of the processed image; and determine the pupil image according to the area of the contour.
  • the respective operations of the corresponding modules and/or devices of the device for controlling the autofocus of the camera in the embodiment of the present invention may refer to the respective steps of the method in the first aspect, and are not repeated here.
  • a system for autofocusing, comprising: a first camera, a second camera, and the device for controlling autofocus of the first camera in any one of the foregoing second aspects, wherein the device is connected to the first camera and to the second camera.
  • the above system may be a human-computer interaction system or a video surveillance system.
  • the first camera may be a high speed camera
  • the second camera may be a wide angle camera.
  • the first camera and the second camera are not specifically limited in the embodiment of the present invention.
  • the first camera is a high-speed camera
  • since the image obtained by the high-speed camera has a low exposure value and a poor signal-to-noise ratio, it is difficult to control focusing without a reference image; with the method of the embodiment of the invention, the high-speed camera can be controlled to achieve a good focusing effect.
  • FIG. 1 is a schematic diagram of a scenario of a human-machine interaction system to which an embodiment of the present invention is applicable.
  • FIG. 2 is a schematic flow chart of a method of autofocusing according to an embodiment of the present invention.
  • FIG. 3 is a block diagram of an apparatus for autofocusing in accordance with an embodiment of the present invention.
  • FIG. 4 is a block diagram of an apparatus for autofocusing according to another embodiment of the present invention.
  • FIG. 1 is a schematic diagram of a scenario of a human-machine interaction system to which an embodiment of the present invention is applicable.
  • the human-computer interaction system shown in FIG. 1 includes a first camera 11, a second camera 12, and a controller 13.
  • the controller 13 can be used to control the autofocus of the first camera 11; in other words, the means for controlling the autofocus of the first camera 11 can be the controller 13 of FIG. 1.
  • the controller 13 can be connected to the first camera 11, and the controller 13 can also be connected to the second camera 12.
  • the first camera 11 and the second camera 12 can be used to capture images, such as an image of the human eye pupil 14.
  • the first camera may be a high speed camera
  • the second camera may be a wide angle camera, which is exemplarily illustrated in the following embodiments of the present invention. It should be understood that the high speed camera and the wide angle camera are merely illustrative of the first camera and the second camera in the present invention, and do not limit the scope of protection of the present application.
  • a wide-angle camera can be used to capture a person's target, and a high-speed camera can be used to focus the human eye area to capture the pupil of the human eye. That is, the wide-angle camera is used to roughly search and locate the target, and then the high-speed camera is used to further accurately locate the desired pupil image.
  • combining the wide-angle camera with the high-speed camera acquires the pupil image faster and more accurately, which improves the focusing efficiency of the camera.
  • the controller can process the pupil image captured by the camera to obtain an image quality evaluation index, and control the first camera auto focus according to the image quality evaluation index.
  • the embodiment of the invention can be used for video monitoring: after the controller has focused the first camera, the images captured by the first camera are used for tracking and monitoring.
  • the method of autofocusing of the present invention will be described in detail below with reference to FIG. 2 and taking the first camera as a high speed camera and the second camera as a wide angle camera as an example.
  • FIG. 2 is a schematic flow chart of a method of autofocusing according to an embodiment of the present invention.
  • the method of Figure 2 can be used in a video surveillance system that can include a high speed camera, a wide angle camera, and a controller.
  • the method of FIG. 2 can be performed by a controller.
  • a device for controlling auto focus of a high speed camera is taken as an example of a controller.
  • the method for controlling the auto focus of the high speed camera by the controller will be described in detail below with reference to the specific embodiments.
  • the pupil image acquired by the controller may be captured by the high-speed camera or by another camera.
  • the controller can obtain the pupil image of the human eye pupil captured by the high-speed camera as follows: control the wide-angle camera to capture the person target; determine the person's face position according to the person target; adjust the pan/tilt of the high-speed camera according to the face position so that the high-speed camera captures the face image; binarize the face image to obtain the processed image; and finally acquire the contour of the luminance region of the processed image and determine the pupil image according to the area of the contour.
  • the image quality evaluation index is determined from the pupil image captured by the high-speed camera itself and is then used to control the high-speed camera's autofocus; because the index is computed from the camera's own images, it benefits focusing accuracy and gives the high-speed camera a better focusing effect.
  • the controller can control the wide angle camera to search for and locate the person target.
  • the wide-angle camera captures the moving person target and finds the face area so that the high-speed camera can subsequently determine the pupil image. This implementation is unaffected by movement or posture changes of the detected person, so the image quality evaluation index subsequently derived from the pupil image is likewise unaffected.
  • the controller can select a frame image from the video stream of the face image and draw a gray histogram hist of the image.
  • the controller may determine a threshold for binarizing the image based on the gray histogram of the image.
  • the image size of the video acquisition is denoted as R ⁇ C, for example, 2048 ⁇ 1088, R represents the width of the image, C represents the height of the image, and the units of R and C are pixels.
  • the gray value at which the cumulative area under the gray histogram of the image reaches 95% of the total area is selected as the threshold T for image binarization, where i denotes the gray value of the image and ranges from 0 to 255.
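The 95% threshold rule can be expressed directly on the histogram. A small NumPy sketch (function and variable names are ours, not the patent's):

```python
import numpy as np

def binarization_threshold(gray, fraction=0.95):
    # Gray-level histogram of the image (256 bins, one per gray value)
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    cum = np.cumsum(hist)
    # Smallest gray value i at which the cumulative area under the
    # histogram reaches `fraction` (95%) of the total area
    return int(np.searchsorted(cum, fraction * cum[-1]))

# toy image whose gray values are 0..99, one pixel each
img = np.arange(100, dtype=np.uint8).reshape(10, 10)
T = binarization_threshold(img)        # 95 of 100 pixels are <= 94, so T == 94
binary = (img > T).astype(np.uint8)    # binarized image
```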
  • the face image IM(x, y) is binarized according to the threshold T of the image binarization process obtained as described above.
  • IM represents the acquired grayscale image
  • (x, y) is the corresponding coordinate point position.
  • the frame rate of a high-speed camera is generally large, for example, the frame rate is 300 fps.
  • the exposure value of the image is relatively low, and the overall gray value of the image is not high, and the signal-to-noise ratio is poor.
  • after the image is binarized, many discrete interference points remain due to noise, so a corresponding morphological opening operation needs to be performed on the image.
  • the controller can find the contours of the processed image, determine the position of the pupil image from the contour areas, and then extract the pupil image from the pupil region. For example, the findContours function of the Open Source Computer Vision library (OpenCV) can be applied to obtain the contours, and the area of each contour is then evaluated. Because the contour areas are all small when the image contains the face region including the human eye, the position of the pupil image can be determined from the size of the contour areas of the face image.
  • if the contour area is within the preset range, the contour is considered to contain the pupil image.
  • the position at which the contour is located can be determined as the position of the pupil image, and the image at that position can be regarded as a pupil image.
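The binarize / open / pick-contour-by-area chain might look as follows in pure NumPy; connected components stand in for OpenCV's findContours, and the area range is an assumed parameter:

```python
import numpy as np
from collections import deque

def morphological_open(binary, k=3):
    """Opening (erosion then dilation) with a k x k square element,
    removing the discrete noise points left after binarization."""
    pad = k // 2
    p = np.pad(binary, pad)
    w = np.lib.stride_tricks.sliding_window_view(p, (k, k))
    eroded = w.min(axis=(2, 3))
    p = np.pad(eroded, pad)
    w = np.lib.stride_tricks.sliding_window_view(p, (k, k))
    return w.max(axis=(2, 3))

def bright_regions(binary):
    """4-connected bright components: a pure-NumPy stand-in for
    OpenCV's findContours + contourArea."""
    seen = np.zeros(binary.shape, dtype=bool)
    H, W = binary.shape
    out = []
    for sy in range(H):
        for sx in range(W):
            if binary[sy, sx] and not seen[sy, sx]:
                seen[sy, sx] = True
                queue, pts = deque([(sy, sx)]), []
                while queue:
                    y, x = queue.popleft()
                    pts.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < H and 0 <= nx < W and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                out.append(pts)
    return out

def locate_pupil(gray, T, area_range=(4, 50)):
    """Binarize with threshold T, clean with an opening, then keep the
    component whose area falls inside the preset range (the pupil)."""
    binary = morphological_open((gray > T).astype(np.uint8))
    for pts in bright_regions(binary):
        if area_range[0] <= len(pts) <= area_range[1]:
            return pts
    return []

# toy frame: a 3x3 bright "pupil" plus one isolated noise pixel
gray = np.zeros((12, 12), dtype=np.uint8)
gray[5:8, 5:8] = 250
gray[1, 1] = 250
pupil_pts = locate_pupil(gray, T=100)
```

The opening removes the lone noise pixel, so only the 3x3 pupil-like region survives the area test.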
  • the controller in the embodiment of the invention combines the wide-angle camera and the high-speed camera to obtain the pupil image of the human eye, so the obtained pupil image is more accurate and better suited for determining the image quality evaluation index, which in turn makes the controller's focusing of the high-speed camera more precise.
  • the pupil image is represented by F(x, y), and the pupil image is degraded to obtain a degraded image S(x, y).
  • M(x, y) is the out-of-focus (defocus) image, N(x, y) is the noise image, and ⊗ denotes the convolution operation, i.e. S(x, y) = F(x, y) ⊗ M(x, y) + N(x, y)
  • Degraded images can be simulated empirically using the following Gaussian models:
  • the currently acquired pupil image may be degraded according to the blurring principle of the image defocusing, for example, Gaussian low-pass filtering is performed on the pupil image to obtain a degraded image.
  • the controller may use the convolution of the pupil image F(x, y) and the degraded image S(x, y) as the relative reference image G(x, y): G(x, y) = F(x, y) ⊗ S(x, y).
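Under these definitions, the degraded image and the relative reference can be sketched with FFT-based circular convolution; normalising S to unit sum before the second convolution is a choice of this sketch, not stated in the text:

```python
import numpy as np

def fft_conv2(a, b):
    """Circular 2-D convolution of equal-support arrays via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b, a.shape)))

def gaussian_kernel(size=5, sigma=1.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

rng = np.random.default_rng(0)
F = rng.random((32, 32))                 # pupil image F(x, y)
S = fft_conv2(F, gaussian_kernel())      # degraded image: Gaussian low-pass of F
# relative reference G = F (*) S; S is scaled to unit sum here so that
# G keeps F's intensity scale (a choice of this sketch, not the patent's)
G = fft_conv2(F, S / S.sum())
```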
  • the controller can obtain a normalized value of the gradient of the pupil image by the following method. For example, the controller may determine a gradient of the pupil image based on the pupil image and determine a normalized value of the gradient of the pupil image based on the gradient of the pupil image.
  • the controller may determine a normalized value of the maximum gradient of the pupil image based on the maximum value of the gradient of the pupil image.
  • the normalized value of the maximum gradient of the pupil image is determined from the maximum of the image gradient; with this normalization the peak of the resulting image quality evaluation index is as unique as possible and the index curve is more sharply peaked, which helps the high-speed camera achieve a better focus.
  • the pupil image is represented by Rect, and the gradient of the pupil image is:
  • Rb can be composed of the following:
  • the normalized value of the maximum gradient of the pupil image is:
  • Max represents the maximum gradient of the pupil image, and its expression is as follows:
  • Maximum represents the theoretical maximum gradient of the pupil image.
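Reading Rb as the Roberts cross kernel (an assumption; the text does not show the kernel), the normalized maximum gradient W = Max / Maximum can be sketched as:

```python
import numpy as np

def roberts_magnitude(img):
    """Gradient magnitude from the Roberts cross differences; reading
    'Rb' in the text as the Roberts kernel is our assumption."""
    f = img.astype(float)
    gx = f[:-1, :-1] - f[1:, 1:]
    gy = f[1:, :-1] - f[:-1, 1:]
    return np.hypot(gx, gy)

def normalized_max_gradient(img, theoretical_max=255 * np.sqrt(2)):
    """W = Max / Maximum: the observed maximum gradient over the largest
    gradient theoretically possible (for 8-bit data each Roberts
    difference is at most 255, so the magnitude is at most 255*sqrt(2))."""
    return roberts_magnitude(img).max() / theoretical_max

img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255                      # hard vertical edge
W = normalized_max_gradient(img)      # the hardest possible edge gives W == 1
```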
  • the controller can obtain the above image structure similarity in the following manner.
  • the pupil image is divided into N block regions of equal size, and N is a positive integer.
  • K block regions are selected from the N block regions as K pupil image block regions, where K is a positive integer and K ≤ N.
  • K relative reference image block regions corresponding to the K pupil image block regions are selected from the relative reference image, and the block region structural similarity is determined, i.e. the structural similarity between the K pupil image block regions and the K relative reference image block regions.
  • K can be a preset value, an empirical value, or a value determined from the pupil image.
  • the block region structural similarity is calculated over K selected pupil image block regions and K relative reference image block regions, where K may be preset or an empirical value; this avoids computing the structural similarity over all block regions of the entire image and so reduces the computational complexity.
  • the controller can determine the value of K from the pupil image in the following manner. For example, the controller can determine the contrast sensitivity of the pupil image and determine K based on the contrast sensitivity of the N and pupil images.
  • K is determined from N and the contrast sensitivity of the pupil image; an appropriate K value can thus be selected, reducing the complexity of computing the block region structural similarity while keeping it as accurate as possible.
  • the controller can determine the contrast sensitivity of the pupil image in the following manner. For example, the controller may determine the spatial frequency of each pixel point based on the pixel width of each block region in the pupil image, the distance of the human eye to the high speed camera, and the position of each pixel point of each block region in the pupil image. The normalized spatial frequency of the pupil image is determined based on the spatial frequency of each pixel. The contrast sensitivity of the pupil image is determined according to the normalized spatial frequency of the pupil image.
  • the image structure similarity is directly related to the pupil image at this time.
  • the image quality evaluation index obtained by using the similarity of the image structure is also directly related to the image, so that the high-speed camera autofocus can be better controlled according to the pupil image, that is, the focusing effect is better.
  • a normal human eye can only resolve a finite number of gratings within a certain range of viewing angles.
  • the formula for calculating the human eye angle of view a is:
  • L represents the width of the image in centimeters.
  • D represents the distance from the human eye to the high speed camera.
  • after the frequency domain transformation, the position of each point of the image in the frequency domain is (u, v), and the center coordinate of the shifted frequency domain image is (x', y'); the spatial frequency corresponding to each point is:
  • f s represents the spatial frequency of each point in the calculated pupil image.
  • the controller can calculate the normalized spatial frequency ff of the pupil image according to the spatial frequency of each point in the pupil image:
  • the spatial frequency f is calculated as the square root of the sum of the squared x- and y-direction frequency components over the whole image
  • fmin represents the minimum of the spatial frequency
  • fmax represents the maximum of the spatial frequency
  • the controller can calculate the contrast sensitivity of the pupil image based on the normalized spatial frequency ff of the pupil image:
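The spatial-frequency normalisation and the contrast sensitivity can be sketched as follows. The exact formulas are not reproduced in the text, so this uses standard viewing-angle geometry and the Mannos-Sakrison CSF as stand-ins; the pixels-per-degree mapping and the frequency-band rescaling are likewise assumptions:

```python
import numpy as np

def viewing_angle_deg(L_cm, D_cm):
    """Angle subtended by an image of width L at distance D (standard
    geometry; the patent's exact expression is not shown)."""
    return 2 * np.degrees(np.arctan(L_cm / (2 * D_cm)))

def normalized_spatial_frequency(img_shape, L_cm, D_cm):
    """Per-pixel spatial frequency about the shifted frequency-domain
    centre (x', y'), then min-max normalised:
    ff = (f - f_min) / (f_max - f_min)."""
    H, W = img_shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    xc, yc = W / 2, H / 2                 # shifted centre (x', y')
    a = viewing_angle_deg(L_cm, D_cm)
    pixels_per_degree = W / a             # mapping assumed by this sketch
    f = np.hypot(u - xc, v - yc) / pixels_per_degree
    return (f - f.min()) / (f.max() - f.min())

def contrast_sensitivity(ff):
    """Mannos-Sakrison CSF, a common choice used here as a stand-in for
    the patent's (unshown) contrast-sensitivity formula."""
    f = 0.1 + 30 * ff                     # map [0, 1] to a plausible band, cycles/degree
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

ff = normalized_spatial_frequency((32, 32), L_cm=30, D_cm=60)
csf = contrast_sensitivity(ff)
```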
  • the controller can calculate K, the number of block regions to select from the Sobel gradient magnitude image, from the contrast sensitivity of the pupil image and the number N of block regions:
  • K block regions are selected from the pupil image F(x, y), the K corresponding block regions are selected from the relative reference image G(x, y), and the block region structural similarity between the K block regions of the current image F(x, y) and the K regions of G(x, y) is calculated.
  • the structural similarity of each block region is represented by SSIM, and the block region structural similarity is the sum of the structural similarities of each of the K block regions.
  • the structural similarity SSIM of each block region can be obtained by the following formula:
  • l, m and n represent the comparison measures for gray value (luminance), contrast and structural information, respectively
  • μF and μG respectively represent the means of the block regions corresponding to F(x, y) and G(x, y)
  • σF and σG represent the standard deviations of the block regions corresponding to F(x, y) and G(x, y), respectively
  • σFG represents the covariance of the block regions corresponding to F(x, y) and G(x, y)
  • α, β, γ represent the weights of the respective terms in the SSIM result, and α, β, γ can be set to empirical values.
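A per-block SSIM with the l, m, n terms and empirical weights α, β, γ can be sketched as follows; the stabilising constants C1 to C3 are the usual 8-bit choices, an assumption:

```python
import numpy as np

def block_ssim(F_blk, G_blk, alpha=1.0, beta=1.0, gamma=1.0):
    """SSIM of one block pair: l (luminance), m (contrast) and n
    (structure) terms combined with empirical weights alpha, beta, gamma."""
    C1, C2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2
    C3 = C2 / 2
    muF, muG = F_blk.mean(), G_blk.mean()
    sF, sG = F_blk.std(), G_blk.std()
    cov = ((F_blk - muF) * (G_blk - muG)).mean()
    l = (2 * muF * muG + C1) / (muF ** 2 + muG ** 2 + C1)   # gray-value term
    m = (2 * sF * sG + C2) / (sF ** 2 + sG ** 2 + C2)       # contrast term
    n = (cov + C3) / (sF * sG + C3)                         # structure term
    return (l ** alpha) * (m ** beta) * (n ** gamma)

# in the method, the block region structural similarity is the sum of
# block_ssim over the K selected block pairs
blk = np.arange(64, dtype=float).reshape(8, 8)
```

Identical blocks score exactly 1; any structural mismatch pulls the score below 1.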
  • the gradient of the image F(x, y) can be calculated with the Sobel operator in the following way.
  • the Sobel operator can be divided into a horizontal direction operator hx and a vertical direction operator vy, for example:
  • the horizontal gradient, vertical gradient and gradient amplitude can be obtained from the image F(x, y) and the operators hx and vy:
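A minimal 'valid' filtering sketch for the Sobel gradients, shown on a horizontal brightness ramp:

```python
import numpy as np

# Sobel horizontal operator hx and vertical operator vy
hx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
vy = hx.T

def filter2_valid(img, kern):
    """'Valid' 2-D correlation (sign conventions do not matter for the
    gradient amplitude)."""
    win = np.lib.stride_tricks.sliding_window_view(img, kern.shape)
    return np.einsum("ijkl,kl->ij", win, kern)

img = np.tile(np.arange(8, dtype=float), (8, 1))  # brightness ramp along x
Gx = filter2_valid(img, hx)       # horizontal gradient
Gy = filter2_valid(img, vy)       # vertical gradient
amp = np.hypot(Gx, Gy)            # gradient amplitude
```

On the ramp the horizontal response is a constant 8 (slope 1 times the kernel weight sum) and the vertical response is zero.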
  • after determining K, the controller can select K regions in F(x, y). As an embodiment of the present invention, the controller may determine the specific locations of the K regions from the gradient magnitude of F(x, y); for example, it may select the K regions with the largest gradient magnitudes as the K block regions of the image F(x, y).
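A minimal sketch of the Sobel gradient magnitude and of selecting the K strongest block regions might look as follows. The specific 3×3 kernels hx, vy and the split into an equal rows × cols grid of N blocks are common textbook choices and are assumptions of this sketch:

```python
import numpy as np

HX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # horizontal Sobel kernel
VY = HX.T                                                          # vertical Sobel kernel

def sobel_magnitude(img):
    """Gradient magnitude sqrt(Gx^2 + Gy^2) over the interior pixels."""
    img = img.astype(float)
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):                       # accumulate the 3x3 correlation
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += HX[i, j] * patch
            gy += VY[i, j] * patch
    return np.hypot(gx, gy)

def top_k_blocks(mag, rows, cols, k):
    """Split the magnitude map into rows*cols equal blocks and return the
    flat indices of the k blocks with the largest mean gradient magnitude."""
    bh, bw = mag.shape[0] // rows, mag.shape[1] // cols
    means = np.array([[mag[r*bh:(r+1)*bh, c*bw:(c+1)*bw].mean()
                       for c in range(cols)] for r in range(rows)])
    return np.argsort(means.ravel())[::-1][:k]  # strongest blocks first
```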
  • the controller can use the block region structure similarity as the image structure similarity FSSIM of the whole pupil image:
  • the controller may determine the image quality evaluation index LSSIM from the normalized value W of the maximum gradient of the pupil image and the image structure similarity FSSIM:
  • LSSIM = W × FSSIM.
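The final combination can be sketched as below. `theoretical_max` stands in for the patent's Maxmium (the largest gradient the operator can theoretically produce); it depends on the gradient operator and the image bit depth, so it is left to the caller here:

```python
import numpy as np

def quality_index(grad_mag, fssim, theoretical_max):
    """Image quality evaluation index LSSIM = W * FSSIM,
    with W = Max / Maxmium normalizing the peak gradient of the pupil image."""
    W = float(np.max(grad_mag)) / theoretical_max
    return W * fssim
```

The extra weight W suppresses secondary local peaks of FSSIM so that, as the description notes, the index curve first rises then falls with a single, more prominent maximum.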
  • the method for controlling high-speed camera autofocus of this embodiment of the invention has a certain anti-interference capability, and selects an appropriate K value from the pupil image so as to preserve that capability while minimizing the amount of computation.
  • the controller can control the high-speed camera to focus according to the image quality evaluation index.
  • before autofocus starts, the following are set: the initial position of the high-speed camera, its current position L, the minimum moving step Smin of the camera, the currently set moving step S, and the initial movement direction (the positive direction).
  • the controller can move the high-speed camera to the initial position above, ready to start autofocus.
  • the position of the high-speed camera is then adjusted along the current direction in steps of S, and at each interval of +S the image quality evaluation index computed during the move and the corresponding camera position are recorded.
  • the controller can plot the image quality evaluation function with the camera position as the abscissa and the image quality evaluation index as the ordinate.
  • when the index decreases several times in succession, the captured image has begun to defocus, so adjustment of the high-speed camera stops.
  • alternatively, the controller can directly read off, from the recorded indices and camera positions, the position at which the index is optimal.
  • within a certain range, the image quality evaluation index may first increase, then decrease, then increase again as the camera position changes.
  • when the index shows only one peak within a step range of several pixels, the controller sets the camera position corresponding to that peak as the in-focus position.
  • when several peaks appear within that range, the controller recalculates the image quality evaluation index and controls the high-speed camera to focus again.
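The coarse search loop described above can be sketched as follows. `move_to` and `capture_index` are hypothetical callbacks standing in for the pan-tilt/focus drive and the LSSIM pipeline, and the stop rule of three consecutive drops is an assumption standing in for "the index decreases successively":

```python
def autofocus(move_to, capture_index, start, step, stop, max_drops=3):
    """Scan camera positions from `start` in increments of `step`, recording
    the quality index at each stop; halt after the index has fallen
    `max_drops` times in a row (defocus onset), then return to and report
    the best recorded position."""
    history = []
    drops, last = 0, None
    pos = start
    while pos <= stop and drops < max_drops:
        move_to(pos)
        idx = capture_index()
        history.append((idx, pos))
        drops = drops + 1 if (last is not None and idx < last) else 0
        last = idx
        pos += step
    best_idx, best_pos = max(history)   # highest recorded index
    move_to(best_pos)                   # return to the in-focus position
    return best_pos
```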
  • the image quality evaluation index is determined from the normalized value of the maximum gradient of the pupil image and the image structure similarity, and the high-speed camera is focused according to this index; this focusing technique gives the camera a good focusing effect. It works especially well for infrared images with low exposure values or low signal-to-noise ratios.
  • the image quality evaluation index in the embodiment of the present invention is dependent on the pupil image and is not affected by other factors in the environment. Therefore, the method for controlling the high-speed camera autofocus of the embodiment of the present invention has good anti-interference ability.
  • the method for controlling high-speed camera autofocus of this embodiment can be used in a video monitoring system comprising only a high-speed camera, a wide-angle camera and a controller.
  • the hardware requirements are therefore simple, and the solution is easy to implement.
  • when the human-eye pupil is used for image tracking, tracking the movement of the pupil alone is sufficient to track the image.
  • the controller can locate the face through the wide-angle camera and then focus the eye region through the high-speed camera; the source image of the image quality evaluation index (here, the pupil image) is not affected by the movement or posture of the detection target.
  • FIG. 3 is a block diagram of an apparatus for autofocusing in accordance with an embodiment of the present invention.
  • the apparatus of Figure 3 can perform the method of the flow chart of Figure 2.
  • the apparatus 10 of FIG. 3 includes an acquisition unit 11, a processing unit 12, a first determination unit 13, a second determination unit 14, and a focusing unit 15.
  • the apparatus 10 for controlling high speed camera autofocus of FIG. 3 may be the controller of FIGS. 1 and 2.
  • the acquisition unit 11 is configured to acquire a pupil image of a pupil of a human eye.
  • the processing unit 12 is configured to perform image degradation processing on the pupil image acquired by the acquiring unit to obtain a degraded image.
  • the first determining unit 13 is configured to determine a relative reference image according to the pupil image acquired by the acquiring unit and the degraded image obtained by the processing unit, and the relative reference image is a convolution of the pupil image and the degraded image.
  • the second determining unit 14 is configured to determine an image quality evaluation index from the normalized value of the maximum gradient of the pupil image and the image structure similarity, where the image structure similarity is the structural similarity between the pupil image obtained by the acquiring unit and the relative reference image obtained by the first determining unit.
  • the focusing unit 15 is configured to control the first camera to perform focusing according to the image quality evaluation index obtained by the second determining unit.
  • this embodiment determines the image quality evaluation index from the normalized value of the maximum gradient of the pupil image and the image structure similarity, and controls the high-speed camera to focus according to that index.
  • this focusing technique gives the camera a good focusing effect.
  • the apparatus 10 for autofocusing according to an embodiment of the present invention corresponds to the autofocusing method of the embodiment, and the units/modules in the apparatus 10 and the other operations and/or functions described above implement the corresponding flow of the method executed by the controller in FIG. 2; for brevity, details are not repeated here.
  • FIG. 4 is a block diagram of an apparatus for autofocusing according to another embodiment of the present invention.
  • the apparatus 20 for autofocusing in FIG. 4 may be the controller of FIGS. 1 and 2, and the controller may be used to control high speed camera autofocus.
  • the controller 20 can include a processor 21 and a memory 22.
  • the various components of device 20 are coupled together by a bus system 23, which, in addition to a data bus, includes a power bus, a control bus, and a status signal bus.
  • the memory 22 can include read-only memory and random-access memory, and provides instructions and data to the processor 21.
  • a portion of the memory 22 may also include a non-volatile random access memory.
  • the processor 21 can be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logical block diagrams disclosed in the embodiments of the invention.
  • a general purpose processor can be a microprocessor or any conventional processor or the like.
  • the method disclosed in the foregoing embodiment of the present invention may be applied to the processor 21 or implemented by the processor 21.
  • the steps performed by the controller of FIG. 2 in the foregoing method embodiment may be completed by an integrated hardware logic circuit in the processor 21 or by instructions in software form.
  • the processor 21 can read the information in the memory 22 and complete the steps of the method embodiments in conjunction with its hardware.
  • the processor 21 can be used to acquire a pupil image of a pupil of a human eye.
  • the processor 21 can also be configured to perform image degradation processing on the acquired pupil image to obtain a degraded image.
  • the processor 21 is further configured to determine a relative reference image according to the acquired pupil image and the degraded image obtained by the image degradation processing, and the relative reference image is a convolution of the pupil image and the degraded image.
  • the processor 21 is further configured to determine an image quality evaluation index according to a normalized value of the maximum gradient of the pupil image and an image structure similarity, wherein the image structure similarity is a structural similarity between the pupil image and the relative reference image.
  • the processor 21 can also be configured to control the first camera to perform focusing according to the image quality evaluation index.
  • the image quality evaluation index is determined from the normalized value of the maximum gradient of the pupil image and the image structure similarity, and the high-speed camera is focused according to this index; this focusing technique gives the camera a good focusing effect.
  • the apparatus 20 for autofocusing according to an embodiment of the present invention corresponds to the autofocusing method of the embodiment, and the units/modules in the apparatus 20 and the other operations and/or functions described above implement the corresponding flow of the method executed by the controller in FIG. 2.
  • for example, the processor 21 can perform the corresponding processes of the method in FIG. 2 of the foregoing method embodiment; for brevity, no further details are provided herein.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a division by logical function.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the part of the technical solution of the present invention that is essential, or that contributes beyond the prior art, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Automatic Focus Adjustment (AREA)

Abstract

提供了一种自动对焦的方法、装置和系统。其方法包括:获取人眼瞳孔的瞳孔图像,对瞳孔图像进行图像退化处理,得到退化图像,并根据瞳孔图像和退化图像确定相对参考图像,相对参考图像为瞳孔图像和退化图像的卷积。根据瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标。最后,根据图像质量评价指标控制摄像头进行对焦。其中,图像结构相似度为所述瞳孔图像和所述相对参考图像之间的结构相似度。该方法可以控制摄像头具有良好的对焦效果。

Description

自动对焦的方法、装置和系统
本申请要求于2015年12月16日提交中国专利局、申请号为201510951729.1、发明名称为“自动对焦的方法、装置和系统”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本发明涉及人机交互领域,并且更具体地,涉及自动对焦的方法、装置和系统。
背景技术
人眼追踪作为人机交互领域的一项热门技术,吸引了很多科研学者以及工业厂商参与到其中的研究与应用。利用人眼相关视觉信息进行相应的操作控制相比于通过其他肢体或者辅助设备具有一定的便利性。进行人眼追踪的前提是捕捉人眼运动的视频信息。图像的成像效果受到各种环境的影响,例如,高速摄像头下曝光值无法调高导致图像偏暗、图像灰度值偏低或者图像信噪比低。因此,图像的质量直接影响了人眼追踪的效果。自动对焦技术是系统获取清晰图像的重要前提和保障。图像质量评价指标的优劣又对系统的自动对焦技术产生直接影响。
在确定图像质量评价指标时,根据是否拥有参考图像可以分为全参考、半参考和无参考图像质量评价。结合实际情况,无参考图像质量评价的方式更加适合实际工程应用,例如,通过高速摄像头拍摄视频得到的图像曝光值低、信噪比差,导致没有参考图像。目前常用的图像质量评价方法可以分为空域和频域两类。在实际应用过程中进行频域评价虽然具有一定的抗噪性,但需要进行相应的频域变换,计算复杂,会消耗较大的计算量。采用空域评价的方法虽然计算量小,但重用的空间梯度、方差等图像质量评价函数容易受到噪声的影响,抗噪性较差。如何根据系统的实际性能选择合理的图像质量评价指标是实现自动对焦技术的关键。
发明内容
本发明实施例提供一种自动对焦的方法、装置和系统，可以控制摄像头具有良好的对焦效果。
第一方面,提供了一种自动对焦的方法,包括:获取人眼瞳孔的瞳孔图像;对所述瞳孔图像进行图像退化处理,得到退化图像;根据所述瞳孔图像和所述退化图像确定相对参考图像,所述相对参考图像为所述瞳孔图像和所述退化图像的卷积;根据所述瞳孔图像的梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,所述图像结构相似度为所述瞳孔图像和所述相对参考图像之间的结构相似度;根据所述图像质量评价指标控制第一摄像头进行对焦。
本发明实施例通过瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标,并根据图像质量评价指标控制摄像头进行对焦,这种对焦技术可以控制摄像头具有良好的对焦效果。
结合第一方面,在第一方面的一种实现方式中,所述方法还包括:将所述瞳孔图像划分为大小相等的N个块区域,N为正整数;从所述N个块区域中选择K个块区域作为K个瞳孔图像块区域,K为正整数,K≤N;从所述相对参考图像中选择与所述K个瞳孔图像块区域相对应的K个相对参考图像块区域;确定块区域结构相似度,所述块区域结构相似度为所述K个瞳孔图像块区域和所述K个参考图像块区域之间的结构相似度;将所述块区域结构相似度作为所述图像结构相似度。
作为本发明的一个实施例,K的数值可以预先设定,也可以是经验值,还可以根据当前的瞳孔图像来确定。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,所述方法还包括:确定所述瞳孔图像的对比敏感度;根据N和所述瞳孔图像的对比敏感度确定K。
本发明实施例中,当由瞳孔图像确定K值时,可以使得图像质量评价指标与瞳孔图像直接相关联,使得图像质量评价指标更有利于控制器控制摄像头自动对焦。
结合第一方面及其上述实现方式，在第一方面的另一种实现方式中，所述确定所述瞳孔图像的对比敏感度包括：根据所述瞳孔图像中每个块区域的像素宽度、人眼到所述摄像头的距离、所述瞳孔图像中每个块区域的每个像素点的位置确定每个像素点的空间频率；根据所述每个像素点的空间频率确定所述瞳孔图像的归一化空间频率；根据所述瞳孔图像的归一化空间频率确定所述瞳孔图像的对比敏感度。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,所述每个像素点的空间频率为:
Figure PCTCN2016087587-appb-000001
其中,
Figure PCTCN2016087587-appb-000002
所述瞳孔图像的归一化空间频率为:
Figure PCTCN2016087587-appb-000003
所述瞳孔图像的对比敏感度为:
Figure PCTCN2016087587-appb-000004
选取的块区域的数目为:K=N×P;
a为人眼视角,L表示图像的宽度,D表示人眼到所述摄像头的距离,u,v分别为每个像素点经过频域变换后在频域中的位置的横纵坐标,x′,y′分别为频域图像经过偏移之后的中心位置的横纵坐标,fmin表示空间频率f的最小值,fmax表示空间频率f的最大值。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,所述方法还包括:根据所述瞳孔图像确定所述瞳孔图像的梯度;根据所述瞳孔图像的梯度确定所述瞳孔图像的梯度的归一化值。
本发明实施例中采用图像结构相似度作为图像质量评价指标的因素之一。当仅仅用图像结构相似度作为图像质量评价指标时,瞳孔图像的图像结构相似度的峰值可能不唯一,导致控制器控制摄像头进行自动对焦的效果不理想。本发明实施例中采用瞳孔图像的最大梯度的归一化值作为图像结构相似度的权重,将局部图像的峰值在一定范围内下降,而使得整个图像的峰值更突出。理想的图像质量评价指标为先增后减的曲线,峰值唯一,当图像质量评价指标取峰值时,摄像头所处的位置对焦效果最佳。
本发明实施例中还可以采用其它量作为图像结构相似度的权重。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,所述瞳孔图像的梯度的归一化值为所述瞳孔图像的最大梯度的归一化值;其中,所述方法还包括:根据所述瞳孔图像的梯度的最大值确定所述瞳孔图像的最大梯度的归一化值。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,以Rect表示所述瞳孔图像,则所述瞳孔图像的梯度为:
Figure PCTCN2016087587-appb-000005
所述瞳孔图像的最大梯度的归一化值为:W=Max/Maxmium,
其中,
Figure PCTCN2016087587-appb-000006
表示卷积运算,Rb由以下组成:
Figure PCTCN2016087587-appb-000007
Max表示所述瞳孔图像的最大梯度,其表达式如下:
Figure PCTCN2016087587-appb-000008
Maxmium表示所述瞳孔图像的最大理论梯度。
结合第一方面及其上述实现方式,在第一方面的另一种实现方式中,述获取人眼瞳孔的瞳孔图像包括:控制所述第二摄像头捕捉人物目标;根据所述人物目标确定人的脸部位置;根据人的脸部位置调节所述第一摄像头的云台,使得所述第一摄像头拍摄到人脸图像;对所述人脸图像进行二值化处理,得到处理图像;获取所述处理图像的亮度区域的轮廓;根据所述轮廓的面积确定所述瞳孔图像。
第二方面,提供了一种自动对焦的装置,所述装置包括:获取单元,用于获取人眼瞳孔的瞳孔图像;处理单元,用于对所述获取单元获取的所述瞳孔图像进行图像退化处理,得到退化图像;第一确定单元,用于根据所述获取单元获取的所述瞳孔图像和所述处理单元得到的所述退化图像确定相对参考图像,所述相对参考图像为所述瞳孔图像和所述退化图像的卷积;第二确定单元,用于根据所述瞳孔图像的梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,所述图像结构相似度为所述获取单元得到的瞳孔图像和所述第一确定单元得到的所述相对参考图像之间的结构相似度;对焦单元,用于根据所述第二确定单元得到的所述图像质量评价指标控制第一摄像头进行对焦。
结合第二方面,在第二方面的一种实现方式中,所述装置还包括:划分单元,用于将所述瞳孔图像划分为大小相等的N个块区域,N为正整数;第一选取单元,用于从所述N个块区域中选择K个块区域作为K个瞳孔图像块区域,K为正整数,K≤N;第二选取单元,用于从所述相对参考图像中选择与所述K个瞳孔图像块区域相对应的K个相对参考图像块区域;第三确定单元,用于确定块区域结构相似度,所述块区域结构相似度为所述K个瞳孔图像块区域和所述K个参考图像块区域之间的结构相似度;第四确定单元,用于将所述块区域结构相似度作为所述图像结构相似度。
结合第二方面及其上述实现方式，在第二方面的另一种实现方式中，所述装置还包括：第五确定单元，用于确定所述瞳孔图像的对比敏感度；第六确定单元，用于根据N和所述瞳孔图像的对比敏感度确定K。
结合第二方面及其上述实现方式,在第二方面的另一种实现方式中,所述第五确定单元具体用于根据所述瞳孔图像中每个块区域的像素宽度、人眼到所述第一摄像头的距离、所述瞳孔图像中每个块区域的每个像素点的位置确定每个像素点的空间频率,根据所述每个像素点的空间频率确定所述瞳孔图像的归一化空间频率,并根据所述瞳孔图像的归一化空间频率确定所述瞳孔图像的对比敏感度。
结合第二方面及其上述实现方式,在第二方面的另一种实现方式中,所述每个像素点的空间频率为:
Figure PCTCN2016087587-appb-000009
其中,
Figure PCTCN2016087587-appb-000010
所述瞳孔图像的归一化空间频率为:
Figure PCTCN2016087587-appb-000011
所述瞳孔图像的对比敏感度为:
Figure PCTCN2016087587-appb-000012
选取的块区域的数目为:K=N×P;
a为人眼视角,L表示图像的宽度,D表示人眼到所述第一摄像头的距离,u,v分别为每个像素点经过频域变换后在频域中的位置的横纵坐标,x′,y′分别为频域图像经过偏移之后的中心位置的横纵坐标,fmin表示空间频率f的最小值,fmax表示空间频率f的最大值。
结合第二方面及其上述实现方式,在第二方面的另一种实现方式中,所述瞳孔图像的梯度的归一化值为所述瞳孔图像的最大梯度的归一化值;其中,所述装置还包括归一化单元,所述归一化单元用于根据所述瞳孔图像的梯度的最大值确定所述瞳孔图像的最大梯度的归一化值。
结合第二方面及其上述实现方式,在第二方面的另一种实现方式中,以Rect表示所述瞳孔图像,则所述瞳孔图像的梯度为:
Figure PCTCN2016087587-appb-000013
所述瞳孔图像的最大梯度的归一化值为:W=Max/Maxmium,
其中,
Figure PCTCN2016087587-appb-000014
表示卷积运算,Rb由以下组成:
Figure PCTCN2016087587-appb-000015
Max表示所述瞳孔图像的最大梯度,其表达式如下:
Figure PCTCN2016087587-appb-000016
Maxmium表示所述瞳孔图像的最大理论梯度。
结合第二方面及其上述实现方式，在第二方面的另一种实现方式中，所述获取单元具体用于控制所述第二摄像头捕捉人物目标，并根据所述人物目标确定人的脸部位置，根据人的脸部位置调节所述第一摄像头的云台，使得所述第一摄像头拍摄到人脸图像，对所述人脸图像进行二值化处理，得到处理图像，获取所述处理图像的亮度区域的轮廓，并根据所述轮廓的面积确定所述瞳孔图像。
本发明实施例中控制摄像头自动对焦的装置的相应模块和/或器件的各个操作可以参照第一方面中的方法的各个步骤,在此不再重复。
第三方面,提供了一种自动对焦的系统,包括第一摄像头、第二摄像头和上述第二方面的任一种实现方式中的控制第一摄像头自动对焦的装置,其中,所述装置与所述第一摄像头连接,所述装置与所述第二摄像头连接。
在本发明的一个实施例中,上述系统可以为人机交互系统或视频监控系统。
上述具体实现方式中,第一摄像头可以为高速摄像头,第二摄像头可以为广角摄像头。本发明实施例对第一摄像头、第二摄像头不进行具体限定。当第一摄像头为高速摄像头时,由于高速摄像头拍摄得到的图像曝光值低、信噪比差,导致没有参考图像而很难控制其对焦,通过本发明实施例的方法可以控制高速摄像头具有良好的对焦效果。
附图说明
为了更清楚地说明本发明实施例的技术方案,下面将对本发明实施例中所需要使用的附图作简单地介绍,显而易见地,下面所描述的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是可应用本发明实施例的人机交互系统的场景的示意图。
图2是本发明一个实施例的自动对焦的方法的示意性流程图。
图3是本发明一个实施例的自动对焦的装置的框图。
图4是本发明另一实施例的自动对焦的装置的框图。
具体实施方式
下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例是本发明的一部分实施例，而不是全部实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例，都应属于本发明保护的范围。
图1是可应用本发明实施例的人机交互系统的场景的示意图。
图1所示的人机交互系统包括第一摄像头11、第二摄像头12和控制器13。控制器13可以用于控制第一摄像头11的自动对焦,换句话说,控制第一摄像头11自动对焦的装置可以为图1中的控制器。其中,控制器13可以与第一摄像头11连接,控制器13还可以与广角摄像头12连接。第一摄像头11和广角摄像头12可以用于拍摄图像,例如拍摄人眼瞳孔14的图像。
在本发明的一个实施例中,第一摄像头可以为高速摄像头,第二摄像头可以为广角摄像头,本发明后面的实施例中以此为例进行示例性说明。应理解,高速摄像头和广角摄像头仅作为本发明中第一摄像头和第二摄像头的一个举例说明,并不对本申请的保护范围构成限定。
在本发明的一个实施例中,广角摄像头可以用于捕捉人物目标,高速摄像头可以用于聚焦人眼区域,拍摄人眼的瞳孔。即,使用广角摄像头对拍摄目标进行粗略搜索定位,再使用高速摄像头进一步精确定位所需的瞳孔图像,这种配合使用广角摄像头和高速摄像头可以更快更准地获取瞳孔图像,能够提高摄像头对焦的效率。
控制器可以对摄像头拍摄到的瞳孔图像进行处理,得到图像质量评价指标,并根据图像质量评价指标控制第一摄像头自动对焦。
本发明实施例可以用于视频监控,通过控制器控制第一摄像头对焦之后,对第一摄像头拍摄的图像进行跟踪监控等。
下面结合图2并以第一摄像头为高速摄像头、第二摄像头为广角摄像头为例对本发明自动对焦的方法进行详细说明。
图2是本发明一个实施例的自动对焦的方法的示意性流程图。图2的方法可以用于视频监控系统,视频监控系统可以包括高速摄像头、广角摄像头和控制器。图2的方法可以由控制器执行,本发明实施例中以控制高速摄像头自动对焦的装置为控制器为例进行示例性说明。下面结合具体实施例详细介绍控制器控制高速摄像头自动对焦的方法。
201,获取人眼瞳孔的瞳孔图像。
控制器获取的瞳孔图像，可以是高速摄像头拍摄的，也可以是其它摄像头拍摄得到的。
例如,控制器可以通过下列方法获取高速摄像头拍摄的人眼瞳孔的瞳孔图像:控制广角摄像头捕捉人物目标,并根据人物目标确定人的脸部位置,再根据人的脸部位置调节高速摄像头的云台,使得高速摄像头拍摄到人脸图像,对人脸图像进行二值化处理,得到处理图像,最后获取处理图像的亮度区域的轮廓,根据轮廓的面积确定瞳孔图像。
在本发明的一个实施例中,通过获取高速摄像头拍摄的人眼瞳孔的瞳孔图像确定图像质量评价指标,进而控制高速摄像头自动对焦,这样通过使用高速摄像头自身拍摄的图像计算图像质量评价指标,更有利于对焦的准确性,可以使得高速摄像头具有更好的对焦效果。
在本发明的一个实施例中，控制器可以控制广角摄像头搜索并定位人物目标。当人物目标移动时，广角摄像头可以捕捉到移动中的人物目标，找出人脸区域，以使得后续高速摄像头确定瞳孔图像。这种实现方式不受检测的人物目标移动或姿势改变的影响，从而使得后续根据瞳孔图像得到的图像质量评价指标不受人物目标移动或姿势改变的影响。
控制器可以从人脸图像的视频流中选出一帧图像,并画出图像的灰度直方图hist。控制器可以根据图像的灰度直方图确定对图像进行二值化处理的阈值。
例如,视频获取的图像大小记为R×C,例如,2048×1088,R表示图像的宽度,C表示图像的高度,R和C的单位为像素。根据图像大小的实际情况选取上述图像的灰度直方图与坐标轴之间构成的面积总和的95%处对应的图像的灰度值作为图像二值化处理的阈值T,
Figure PCTCN2016087587-appb-000017
T=N,
上式中i表示图像的灰度值,例如,处理无符号8位灰度图像时,i的取值范围为从0到255。
根据上述得到的图像二值化处理的阈值T,对人脸图像IM(x,y)进行二值化处理。
Figure PCTCN2016087587-appb-000018
其中,IM表示获取得到的灰度图像,(x,y)为相应的坐标点位置。
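仅作示意（并非本专利的参考实现），上述“按灰度直方图累计面积95%处的灰度值取阈值T，并对灰度图像IM(x,y)做二值化”的步骤可用Python/NumPy大致写成如下形式；其中函数名为示意性假设，阈值方向取“大于T置255”为常见约定：

```python
import numpy as np

def binarize_by_histogram(gray, ratio=0.95):
    """按灰度直方图累计占比达到 ratio 处的灰度值作为阈值 T,
    再对图像做二值化(大于 T 置 255, 否则置 0)。"""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    cdf = np.cumsum(hist) / hist.sum()            # 直方图累计面积(归一化)
    T = int(np.searchsorted(cdf, ratio))          # 累计面积首次达到 ratio 的灰度值
    binary = np.where(gray > T, 255, 0).astype(np.uint8)
    return T, binary
```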
高速摄像头的帧率一般比较大,例如帧率为300fps,此时,图像的曝光值比较低,图像整体的灰度值不高、信噪比差。在对图像进行二值化处理后由于噪声产生的离散干扰点较多,因此需要对图像进行相应的形态学开运算处理。
由于噪声影响可能比较大,经过形态学开运算处理后的人脸图像可能仍然存在一定的干扰点。控制器可以查找处理后图像的轮廓,并根据轮廓的面积的大小确定瞳孔图像的位置,进而根据瞳孔面积确定瞳孔图像。例如,可以应用开源计算机视觉库(Open computer vision,Opencv)中的轮廓检测(findcontours)函数用于获取相应的轮廓。对得到的轮廓进行相应的面积判定,如果所有的轮廓面积均很小,而且可以判断得到图像中包括人眼区域的人脸图像,那么可以通过人脸图像的轮廓的面积的大小确定瞳孔图像的位置。如果所有的轮廓面积均很小,且可以判断得到图像中不包括人眼区域的人脸图像,此时返回到视频流中,从视频流中重新选择图像,或者,根据广角摄像头重新定位人脸区域,直至获取瞳孔图像。当判定得到人脸图像的轮廓面积在预设范围内时,认为该轮廓包括瞳孔图像。例如,可以将该轮廓所在的位置确定为瞳孔图像的位置,该位置处的图像即可以视为瞳孔图像。
本发明实施例中的控制器结合广角摄像头、高速摄像头获取人眼瞳孔的瞳孔图像,这样获取的瞳孔图像更为准确,更有利于后续根据瞳孔图像确定图像质量评价指标,从而使得控制器控制高速摄像头的对焦更为精准。
202,对瞳孔图像进行图像退化处理,得到退化图像。
用F(x,y)表示瞳孔图像,对瞳孔图像进行退化处理,得到退化图像S(x,y)。
根据高速摄像头离焦时,图像的模糊原理可知,
Figure PCTCN2016087587-appb-000019
其中,M(x,y)为离焦图像,N(x,y)为噪声图像,
Figure PCTCN2016087587-appb-000020
表示卷积运算,
∫∫S(x,y)dxdy=1
退化图像可以根据经验使用下列高斯模型来模拟:
Figure PCTCN2016087587-appb-000021
203,根据瞳孔图像和退化图像确定相对参考图像。
在实际人眼追踪过程中,由于高速摄像头拍摄到的图像质量较差,无法在进行图像质量评价之前确定出任意一帧清晰的图像作为聚焦与离焦的参考图像,此时采用无参考图像质量评价的方式。
在本发明的一个实施例中,可以根据上述图像离焦的模糊原理,对当前采集到的瞳孔图像进行退化处理,例如,对瞳孔图像进行高斯低通滤波,得到退化图像。控制器可以将瞳孔图像F(x,y)和退化图像S(x,y)的卷积所得的图像作为相对参考图像G(x,y):
Figure PCTCN2016087587-appb-000022
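仅作示意（并非本专利的参考实现），上述“用高斯模型模拟退化图像S(x,y)，再将瞳孔图像F(x,y)与S(x,y)卷积得到相对参考图像G(x,y)”的步骤可大致写成如下形式；高斯核大小、σ取值以及零填充边界处理均为示意性假设：

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """归一化的二维高斯核, 模拟退化图像 S(x, y)。"""
    ax = np.arange(size) - size // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()                # 满足 ∬S(x,y)dxdy = 1

def relative_reference(F, size=5, sigma=1.0):
    """相对参考图像 G = F 与 S 的卷积(高斯低通滤波), 边界按零填充处理。"""
    S = gaussian_kernel(size, sigma)
    H, W = F.shape
    pad = size // 2
    Fp = np.pad(F.astype(float), pad)
    G = np.zeros((H, W))
    for i in range(size):             # 逐项累加完成二维卷积
        for j in range(size):
            G += S[i, j] * Fp[i:i + H, j:j + W]
    return G
```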
204,根据瞳孔图像的梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,图像结构相似度为瞳孔图像和相对参考图像之间的结构相似度。
作为本发明的一个实施例,控制器可以通过下列方式得到瞳孔图像的梯度的归一化值。例如,控制器可以根据瞳孔图像确定瞳孔图像的梯度,并根据瞳孔图像的梯度确定瞳孔图像的梯度的归一化值。
优选地,控制器可以根据瞳孔图像的梯度的最大值确定瞳孔图像的最大梯度的归一化值。
在本发明的一个实施例中,可以通过瞳孔图像的梯度的最大值确定瞳孔图像的最大图像的归一化值,通过这样的归一化值得到的图像质量评价指标峰值尽可能唯一,图像质量评价指标的函数图像曲线升降更明显,有利于高速摄像头更好的实现对焦。
具体地,以Rect表示所述瞳孔图像,则所述瞳孔图像的梯度为:
Figure PCTCN2016087587-appb-000023
其中,Rb可以由以下组成:
Figure PCTCN2016087587-appb-000024
瞳孔图像的最大梯度的归一化值为:
W=Max/Maxmium,
Max表示瞳孔图像的最大梯度,其表达式如下:
Figure PCTCN2016087587-appb-000025
Maxmium表示瞳孔图像的最大理论梯度。
作为本发明的一个实施例，控制器可以通过下列方式得到上述图像结构相似度。例如，将瞳孔图像划分为大小相等的N个块区域，N为正整数。从N个块区域中选择K个块区域作为K个瞳孔图像块区域，K为正整数，K≤N。从相对参考图像中选择与K个瞳孔图像块区域相对应的K个相对参考图像块区域确定上述块区域结构相似度，其中，块区域结构相似度为K个瞳孔图像块区域和K个参考图像块区域之间的结构相似度。K可以为预设值，也可以为经验值，还可以是根据瞳孔图像确定的数值。
本发明实施例中,通过选择K个瞳孔图像块区域和K个相对参考图像块区域来计算上述块区域结构相似度,K的数值可以预先设定或取经验值,这样可以避免利用整个图像的所有块区域计算区域结构相似度,能够减少计算区域结构相似度的复杂性。
作为本发明的一个实施例,控制器可以通过下列方式根据瞳孔图像确定K的数值。例如,控制器可以确定瞳孔图像的对比敏感度,并根据N和瞳孔图像的对比敏感度确定K。
本发明实施例中通过N和瞳孔图像的对比敏感度确定K,可以尽可能选择合适的K值,这样能够在减少计算区域结构相似度的复杂性的同时保证区域结构相似度尽可能准确。
作为本发明的一个实施例,控制器可以通过下列方式确定瞳孔图像的对比敏感度。例如,控制器可以根据瞳孔图像中每个块区域的像素宽度、人眼到所述高速摄像头的距离、瞳孔图像中每个块区域的每个像素点的位置确定每个像素点的空间频率。根据每个像素点的空间频率确定瞳孔图像的归一化空间频率。并根据瞳孔图像的归一化空间频率确定瞳孔图像的对比敏感度。
当通过瞳孔图像确定得到K值时,图像结构相似度与此时的瞳孔图像直接相关。利用该图像结构相似度得到的图像质量评价指标也与图像直接相关,这样能够根据瞳孔图像更好地控制高速摄像头自动对焦,即对焦效果更好。
具体地,正常的人眼视角在一定的角度范围内只能识别有限周数的光栅。人眼视角a计算的公式为:
Figure PCTCN2016087587-appb-000026
上式中L表示图像的宽度,单位为厘米。D表示人眼到高速摄像头的距离。
图像中每个点的经过频域变换之后在频域中的位置为(u,v),频域图像经过偏移之后的中心坐标为(x′,y′),则对应每个点的空间频率为:
Figure PCTCN2016087587-appb-000027
Figure PCTCN2016087587-appb-000028
Figure PCTCN2016087587-appb-000029
其中,fs表示计算得到的瞳孔图像中每个点的空间频率。
控制器可以根据瞳孔图像中每个点的空间频率计算得到瞳孔图像的归一化空间频率ff:
Figure PCTCN2016087587-appb-000030
Figure PCTCN2016087587-appb-000031
其中,Δf的计算是利用整个图像的x与y方向的空间频率和的平方根,fmin表示空间频率的最小值,fmax表示空间频率的最大值。
控制器可以根据瞳孔图像的归一化空间频率ff计算得到评价瞳孔图像的对比敏感度为:
Figure PCTCN2016087587-appb-000032
控制器可以由瞳孔图像的对比敏感度和瞳孔区域的块区域的个数N,计算得出选取的Sobel梯度幅值图像的块区域的K值的数目:
K=N×P。
控制器得到K值之后,可以从瞳孔图像F(x,y)中选出K个块区域,并从相对参考图像G(x,y)中选出与上述K个块区域相对应的K个块区域,并计算当前图像F(x,y)的K个块区域与G(x,y)的K个区域的块区域结构相似度。以SSIM表示每个块区域的结构相似度,上述块区域结构相似度为K个块区域中每个块区域的结构相似度的和。每个块区域的结构相似度SSIM可以由下列公式得到:
Figure PCTCN2016087587-appb-000033
Figure PCTCN2016087587-appb-000034
Figure PCTCN2016087587-appb-000035
SSIM=lαmβnγ
上式中l、m和n分别代表灰度值、对比度和结构信息对比度的衡量参数,μF、μG分别表示F(x,y)和G(x,y)对应块区域的均值,σF、σG分别表示F(x,y)和G(x,y)对应块区域的标准差,σFG表示二值对应块区域的标准协方差。α、β、γ表示每个参数在相似度SSIM结果中的权重大小,α、β、γ可以根据经验得到相应的数值。
在本发明的一个实施例中,可以通过下列方式计算图像F(x,y)基于索贝尔(Sobel)算子的梯度。Sobel算子可以分为水平方向算子hx和垂直方向算子vy。例如:
Figure PCTCN2016087587-appb-000036
由图像F(x,y)、hx和vy可以得到水平梯度、垂直梯度和梯度幅值分别为:
Figure PCTCN2016087587-appb-000037
Figure PCTCN2016087587-appb-000038
Figure PCTCN2016087587-appb-000039
控制器在确定K值之后,可以选取F(x,y)中的K个区域。作为本发明的一个实施例,控制器可以根据F(x,y)的梯度幅值确定K个区域的具体位置。例如,控制器可以选择梯度幅值较大的K个区域作为所选择的图像F(x,y)的K个块区域。
控制器在得到块区域结构相似度SSIM之后,可以将块区域结构相似度作为整幅瞳孔图像的图像结构相似度FSSIM:
Figure PCTCN2016087587-appb-000040
作为本发明的一个实施例，在得到瞳孔图像的最大梯度的归一化值和图像结构相似度之后，控制器可以根据瞳孔图像的最大梯度的归一化值W和图像结构相似度FSSIM确定图像质量评价指标LSSIM。例如，
LSSIM=W×FSSIM。
本发明实施例的控制高速摄像头自动对焦的方法具有一定的抗干扰能力,并根据瞳孔图像选择合适的K值,使得在保证一定的抗干扰能力的同时,尽量减小计算量。
205,根据图像质量评价指标控制高速摄像头进行对焦。
控制器在得到图像质量评价指标之后,可以根据图像质量评价指标控制高速摄像头进行对焦。
例如,设定控制高速摄像头自动对焦前的初始位置,高速摄像头当前所处的位置L,摄像头移动步长的最小值Smin,当前设定的移动步长S,初始移动的方向为正方向。
控制器可以调节高速摄像头到上述自动对焦前的初始位置,准备开始自动对焦。沿当前方向以步长S调节高速摄像头的位置,并间隔步长+S记录移动高速摄像头时计算得到的图像质量评价指标和对应的高速摄像头所处的位置。
在本发明第一个实施例中,控制器可以高速摄像头所处的位置为横坐标,以图像质量评价指标为纵坐标时,画出图像质量评价函数。当图像质量评价函数出现图像质量评价指标依次递减,则证明所得到的图像开始离焦,因此停止调节高速摄像头。控制器也可以直接根据记录的图像质量评价指标随高速摄像头所处的位置的变化,得到图像质量评价指标最优时高速摄像头所处的位置。
在本发明的一个实施例中,基于图像质量评价指标控制高速摄像头对焦时,在一定范围内可能出现图像质量评价指标随着高速摄像头所处的位置先增大后减小再增大的情况,控制器可以设定在以若干像素的步长范围内图像质量评价指标仅出现一个峰值时,将该峰值对应的高速摄像头位置确认为控制高速摄像头对焦的位置。当在以若干像素的步长范围内图像质量评价指标出现若干个峰值时,控制器可以重新计算图像质量评价指标,并控制高速摄像头进行对焦。
在高速摄像头移动结束之后立即返回到之前遍历时记录的图像质量评价指标最大值对应的高速摄像头的位置处。此时认为对焦效果最好,对焦结束。
本发明实施例通过瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标，并根据图像质量评价指标控制高速摄像头进行对焦，这种对焦技术可以控制摄像头具有良好的对焦效果。尤其是对于曝光值低或信噪比低的红外图像，本发明实施例具有更好的对焦效果。
本发明实施例中的图像质量评价指标依赖于瞳孔图像,不受环境中其它因素的影响,因此,本发明实施例的控制高速摄像头自动对焦的方法具有良好的抗干扰能力。
本发明实施例的控制高速摄像头自动对焦的方法,可以用于视频监控系统,该视频监控系统可以包括高速摄像头、广角摄像头和控制器即可实现高速摄像头的自动对焦。本发明实施例的设备需求简单,方案简单易行。当利用人眼瞳孔进行图像跟踪时,仅通过跟踪瞳孔的移动即可实现对图像的跟踪,控制器可以通过广角摄像头定位人脸位置后,通过高速摄像头聚焦人眼区域,图像质量评价指标的源图像(例如这里的瞳孔图像)不受检测目标的移动与姿势等的影响。
上文结合图2详细说明用于本发明实施例的自动对焦的方法及具体流程,下面结合图3和图4详细说明用于本发明实施例的自动对焦的装置。
图3是本发明一个实施例的自动对焦的装置的框图。
图3的装置可执行图2流程图中的方法。图3的装置10包括获取单元11、第一确定单元12、第二确定单元13和对焦单元14。图3的控制高速摄像头自动对焦的装置10可以为图1和图2中的控制器。
获取单元11用于获取人眼瞳孔的瞳孔图像。
处理单元12用于对获取单元获取的瞳孔图像进行图像退化处理,得到退化图像。
第一确定单元13用于根据获取单元获取的瞳孔图像和处理单元得到的退化图像确定相对参考图像,相对参考图像为瞳孔图像和退化图像的卷积。
第二确定单元14用于根据瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,图像结构相似度为获取单元得到的瞳孔图像和第一确定单元得到的相对参考图像之间的结构相似度。
对焦单元15用于根据第二确定单元得到的图像质量评价指标控制第一摄像头进行对焦。
本发明实施例通过瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标，并根据图像质量评价指标控制高速摄像头进行对焦，这种对焦技术可以控制摄像头具有良好的对焦效果。
根据本发明实施例的自动对焦的装置10可对应于本发明实施例自动对焦的方法,并且,该装置10中的各个单元/模块和上述其他操作和/或功能分别为了实现图2中控制器执行的所示方法的相应流程,为了简洁,在此不再赘述。
图4是本发明另一实施例的自动对焦的装置的框图。
图4中自动对焦的装置20可以为图1和图2中的控制器,控制器可以用于控制高速摄像头自动对焦。控制器20可以包括处理器21和存储器22。装置20的各个组件通过总线系统23耦合在一起,其中总线系统23除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图中将各种总线都标为总线系统23。存储器22可以包括只读存储器和随机存取存储器,并向处理器21提供指令和数据。存储器22的一部分还可以包括非易失性随机存取存储器。处理器21可以是通用处理器、数字信号处理器、专用集成电路、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本发明实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。
上述本发明实施例揭示的方法可以应用于处理器21中,或者由处理器21实现。在实现过程中,上述方法实施例图2中控制器执行的各步骤可以通过处理器21中的硬件的集成逻辑电路或者软件形式的指令完成。处理器21可以读取存储器22中的信息,结合其硬件完成方法实施例的步骤。
具体地,处理器21可以用于获取人眼瞳孔的瞳孔图像。
处理器21还可以用于对获取的瞳孔图像进行图像退化处理,得到退化图像。
处理器21还可以用于根据获取的瞳孔图像和图像退化处理得到的退化图像确定相对参考图像,相对参考图像为瞳孔图像和退化图像的卷积。
处理器21还可以用于根据瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,图像结构相似度为瞳孔图像和相对参考图像之间的结构相似度。
处理器21还可以用于根据图像质量评价指标控制第一摄像头进行对焦。
本发明实施例通过瞳孔图像的最大梯度的归一化值和图像结构相似度确定图像质量评价指标,并根据图像质量评价指标控制高速摄像头进行对焦,这种对焦技术可以控制摄像头具有良好的对焦效果。
根据本发明实施例的自动对焦的装置20可对应于本发明实施例自动对焦的方法,并且,该装置20中的各个单元/模块和上述其他操作和/或功能分别为了实现图2中控制器执行的所示方法的相应流程,例如,处理器21可以执行上述方法实施例图2中相应方法的相应流程,为了简洁,在此不再赘述。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本发明的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。
在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器, 或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以所述权利要求的保护范围为准。

Claims (17)

  1. 一种自动对焦的方法,其特征在于,包括:
    获取人眼瞳孔的瞳孔图像;
    对所述瞳孔图像进行图像退化处理,得到退化图像;
    根据所述瞳孔图像和所述退化图像确定相对参考图像,所述相对参考图像为所述瞳孔图像和所述退化图像的卷积;
    根据所述瞳孔图像的梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,所述图像结构相似度为所述瞳孔图像和所述相对参考图像之间的结构相似度;
    根据所述图像质量评价指标控制第一摄像头进行对焦。
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    将所述瞳孔图像划分为大小相等的N个块区域,N为正整数;
    从所述N个块区域中选择K个块区域作为K个瞳孔图像块区域,K为正整数,K≤N;
    从所述相对参考图像中选择与所述K个瞳孔图像块区域相对应的K个相对参考图像块区域;
    确定块区域结构相似度,所述块区域结构相似度为所述K个瞳孔图像块区域和所述K个参考图像块区域之间的结构相似度;
    将所述块区域结构相似度作为所述图像结构相似度。
  3. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    确定所述瞳孔图像的对比敏感度;
    根据N和所述瞳孔图像的对比敏感度确定K。
  4. 根据权利要求3所述的方法,其特征在于,所述确定所述瞳孔图像的对比敏感度包括:
    根据所述瞳孔图像中每个块区域的像素宽度、人眼到所述第一摄像头的距离、所述瞳孔图像中每个块区域的每个像素点的位置确定每个像素点的空间频率;
    根据所述每个像素点的空间频率确定所述瞳孔图像的归一化空间频率;
    根据所述瞳孔图像的归一化空间频率确定所述瞳孔图像的对比敏感度。
  5. 根据权利要求4所述的方法,其特征在于,
    所述每个像素点的空间频率为:
    Figure PCTCN2016087587-appb-100001
    其中,
    Figure PCTCN2016087587-appb-100002
    所述瞳孔图像的归一化空间频率为:
    Figure PCTCN2016087587-appb-100003
    所述瞳孔图像的对比敏感度为:
    Figure PCTCN2016087587-appb-100004
    选取的块区域的数目为:K=N×P;
    a为人眼视角,L表示图像的宽度,D表示人眼到所述第一摄像头的距离,u,v分别为每个像素点经过频域变换后在频域中的位置的横纵坐标,x′,y′分别为频域图像经过偏移之后的中心位置的横纵坐标,fmin表示空间频率f的最小值,fmax表示空间频率f的最大值。
  6. 根据权利要求1-5任一项所述的方法,其特征在于,所述瞳孔图像的梯度的归一化值为所述瞳孔图像的最大梯度的归一化值;
    其中,所述方法还包括:
    根据所述瞳孔图像的梯度的最大值确定所述瞳孔图像的最大梯度的归一化值。
  7. 根据权利要求6所述的方法,其特征在于,
    以Rect表示所述瞳孔图像,则所述瞳孔图像的梯度为:
    Figure PCTCN2016087587-appb-100005
    所述瞳孔图像的最大梯度的归一化值为:W=Max/Maxmium,
    其中,
    Figure PCTCN2016087587-appb-100006
    表示卷积运算,Rb由以下组成:
    Figure PCTCN2016087587-appb-100007
    Figure PCTCN2016087587-appb-100008
    Max表示所述瞳孔图像的最大梯度,其表达式如下:
    Figure PCTCN2016087587-appb-100009
    Maxmium表示所述瞳孔图像的最大理论梯度。
  8. 根据权利要求1-7任一项所述的方法,其特征在于,所述获取人眼瞳孔的瞳孔图像包括:
    控制所述第二摄像头捕捉人物目标;
    根据所述人物目标确定人的脸部位置;
    根据人的脸部位置调节所述第一摄像头的云台,使得所述第一摄像头拍摄到人脸图像;
    对所述人脸图像进行二值化处理,得到处理图像;
    获取所述处理图像的亮度区域的轮廓;
    根据所述轮廓的面积确定所述瞳孔图像。
  9. 一种自动对焦的装置,其特征在于,所述装置包括:
    获取单元,用于获取人眼瞳孔的瞳孔图像;
    处理单元,用于对所述获取单元获取的所述瞳孔图像进行图像退化处理,得到退化图像;
    第一确定单元,用于根据所述获取单元获取的所述瞳孔图像和所述处理单元得到的所述退化图像确定相对参考图像,所述相对参考图像为所述瞳孔图像和所述退化图像的卷积;
    第二确定单元,用于根据所述瞳孔图像的梯度的归一化值和图像结构相似度确定图像质量评价指标,其中,所述图像结构相似度为所述获取单元得到的瞳孔图像和所述第一确定单元得到的所述相对参考图像之间的结构相似度;
    对焦单元,用于根据所述第二确定单元得到的所述图像质量评价指标控制第一摄像头进行对焦。
  10. 根据权利要求9所述的装置,其特征在于,所述装置还包括:
    划分单元,用于将所述瞳孔图像划分为大小相等的N个块区域,N为正整数;
    第一选取单元,用于从所述N个块区域中选择K个块区域作为K个瞳孔图像块区域,K为正整数,K≤N;
    第二选取单元,用于从所述相对参考图像中选择与所述K个瞳孔图像块区域相对应的K个相对参考图像块区域;
    第三确定单元,用于确定块区域结构相似度,所述块区域结构相似度为所述K个瞳孔图像块区域和所述K个参考图像块区域之间的结构相似度;
    第四确定单元,用于将所述块区域结构相似度作为所述图像结构相似度。
  11. 根据权利要求10所述的装置,其特征在于,所述装置还包括:
    第五确定单元,用于确定所述瞳孔图像的对比敏感度;
    第六确定单元,用于根据N和所述瞳孔图像的对比敏感度确定K。
  12. 根据权利要求11所述的装置，其特征在于，所述第五确定单元具体用于根据所述瞳孔图像中每个块区域的像素宽度、人眼到所述第一摄像头的距离、所述瞳孔图像中每个块区域的每个像素点的位置确定每个像素点的空间频率，根据所述每个像素点的空间频率确定所述瞳孔图像的归一化空间频率，并根据所述瞳孔图像的归一化空间频率确定所述瞳孔图像的对比敏感度。
  13. 根据权利要求12所述的装置,其特征在于,
    所述每个像素点的空间频率为:
    Figure PCTCN2016087587-appb-100010
    其中,
    Figure PCTCN2016087587-appb-100011
    所述瞳孔图像的归一化空间频率为:
    Figure PCTCN2016087587-appb-100012
    所述瞳孔图像的对比敏感度为:
    Figure PCTCN2016087587-appb-100013
    选取的块区域的数目为:K=N×P;
    a为人眼视角,L表示图像的宽度,D表示人眼到所述第一摄像头的距离,u,v分别为每个像素点经过频域变换后在频域中的位置的横纵坐标,x′,y′分别为频域图像经过偏移之后的中心位置的横纵坐标,fmin表示空间频率f的最小值,fmax表示空间频率f的最大值。
  14. 根据权利要求9-13任一项所述的装置,其特征在于,所述瞳孔图像的梯度的归一化值为所述瞳孔图像的最大梯度的归一化值;
    其中,所述装置还包括归一化单元,所述归一化单元用于根据所述瞳孔图像的梯度的最大值确定所述瞳孔图像的最大梯度的归一化值。
  15. 根据权利要求14所述的装置,其特征在于,
    以Rect表示所述瞳孔图像,则所述瞳孔图像的梯度为:
    Figure PCTCN2016087587-appb-100014
    所述瞳孔图像的最大梯度的归一化值为:W=Max/Maxmium,
    其中,
    Figure PCTCN2016087587-appb-100015
    表示卷积运算,Rb由以下组成:
    Figure PCTCN2016087587-appb-100016
    Figure PCTCN2016087587-appb-100017
    Max表示所述瞳孔图像的最大梯度,其表达式如下:
    Figure PCTCN2016087587-appb-100018
    Maxmium表示所述瞳孔图像的最大理论梯度。
  16. 根据权利要求9-15任一项所述的装置，其特征在于，所述获取单元具体用于控制所述第二摄像头捕捉人物目标，并根据所述人物目标确定人的脸部位置，根据人的脸部位置调节所述第一摄像头的云台，使得所述第一摄像头拍摄到人脸图像，对所述人脸图像进行二值化处理，得到处理图像，获取所述处理图像的亮度区域的轮廓，并根据所述轮廓的面积确定所述瞳孔图像。
  17. 一种自动对焦的系统,其特征在于,所述系统包括第一摄像头、第二摄像头和如权利要求9-16任一项所述的装置,其中,所述装置与所述第一摄像头连接,所述装置与所述第二摄像头连接。
PCT/CN2016/087587 2015-12-16 2016-06-29 自动对焦的方法、装置和系统 WO2017101292A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510951729.1 2015-12-16
CN201510951729.1A CN106791353B (zh) 2015-12-16 2015-12-16 自动对焦的方法、装置和系统

Publications (1)

Publication Number Publication Date
WO2017101292A1 true WO2017101292A1 (zh) 2017-06-22

Family

ID=58965355

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/087587 WO2017101292A1 (zh) 2015-12-16 2016-06-29 自动对焦的方法、装置和系统

Country Status (2)

Country Link
CN (1) CN106791353B (zh)
WO (1) WO2017101292A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797993A (zh) * 2020-06-16 2020-10-20 东软睿驰汽车技术(沈阳)有限公司 深度学习模型的评价方法、装置、电子设备及存储介质

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN107422749B (zh) * 2017-07-06 2021-03-12 深圳Tcl数字技术有限公司 Television orientation adjustment method and device, television, and computer-readable storage medium
CN109448037B (zh) * 2018-11-14 2020-11-03 北京奇艺世纪科技有限公司 Image quality evaluation method and device
CN111010507B (zh) * 2019-11-26 2021-08-03 迈克医疗电子有限公司 Camera auto-focusing method and device, analytical instrument, and storage medium
CN114373216B (zh) * 2021-12-07 2024-07-02 图湃(北京)医疗科技有限公司 Eye tracking method, device, apparatus and storage medium for anterior segment OCTA

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100231504A1 (en) * 2006-03-23 2010-09-16 Koninklijke Philips Electronics N.V. Hotspots for eye track control of image manipulation
CN101976444A (zh) * 2010-11-11 2011-02-16 浙江大学 Objective image quality assessment method based on pixel-type structural similarity
CN202602795U (zh) * 2012-06-04 2012-12-12 深圳市强华科技发展有限公司 Automatic focusing system for linear-array CCD
CN103067662A (zh) * 2013-01-21 2013-04-24 天津师范大学 Adaptive gaze tracking system
US20130187773A1 (en) * 2012-01-19 2013-07-25 Utechzone Co., Ltd. Gaze tracking password input method and device utilizing the same
CN104834446A (zh) * 2015-05-04 2015-08-12 惠州Tcl移动通信有限公司 Multi-screen display control method and system based on eye-tracking technology

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002035452A1 (en) * 2000-10-24 2002-05-02 Alpha Engineering Co., Ltd. Eye image obtaining method, iris recognizing method, and system using the same
CN1180368C (zh) * 2003-05-22 2004-12-15 上海交通大学 Image quality evaluation method for iris recognition system
CN102421007B (zh) * 2011-11-28 2013-09-04 浙江大学 Image quality assessment method based on multi-scale structural-similarity weighted synthesis
CN102740114B (zh) * 2012-07-16 2016-12-21 公安部第三研究所 No-reference assessment method for subjective video quality
JP2014098835A (ja) * 2012-11-15 2014-05-29 Canon Inc Illumination optical system for microscope and microscope using the same


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797993A (zh) * 2020-06-16 2020-10-20 东软睿驰汽车技术(沈阳)有限公司 Evaluation method and apparatus for deep learning model, electronic device, and storage medium
CN111797993B (zh) * 2020-06-16 2024-02-27 东软睿驰汽车技术(沈阳)有限公司 Evaluation method and apparatus for deep learning model, electronic device, and storage medium

Also Published As

Publication number Publication date
CN106791353B (zh) 2019-06-14
CN106791353A (zh) 2017-05-31

Similar Documents

Publication Publication Date Title
CN108496350B (zh) Focusing processing method and device
WO2017101292A1 (zh) Method, device and system for automatic focusing
CN110691193B (zh) Camera switching method and apparatus, storage medium, and electronic device
US7912252B2 Time-of-flight sensor-assisted iris capture system and method
US8203602B2 Depth-aware blur kernel estimation method for iris deblurring
US9373023B2 Method and apparatus for robustly collecting facial, ocular, and iris images using a single sensor
US10659676B2 Method and apparatus for tracking a moving subject image based on reliability of the tracking state
WO2021057652A1 (zh) Focusing method and apparatus, electronic device, and computer-readable storage medium
WO2017043031A1 (en) Image processing apparatus, solid-state imaging device, and electronic apparatus
CN109376729B (zh) Iris image acquisition method and device
CN111080542B (zh) Image processing method and apparatus, electronic device, and storage medium
CN109981972B (zh) Target tracking method for a robot, robot, and storage medium
US10594939B2 Control device, apparatus, and control method for tracking correction based on multiple calculated control gains
CN111246093B (zh) Image processing method and apparatus, storage medium, and electronic device
US20200221005A1 Method and device for tracking photographing
CN112800966B (zh) Gaze tracking method and electronic device
WO2022021093A1 (zh) Photographing method, photographing apparatus, and storage medium
TWI641265B (zh) Mobile target position tracking system
CN106842496B (зh) Automatic focus adjustment method based on frequency-domain comparison
CN109598195B (зh) Clear face image processing method and device based on surveillance video
KR20080079506A (ко) Photographing apparatus and object tracking method thereof
Hui et al. An improved focusing algorithm based on image definition evaluation
CN107959767B (зh) Focusing and dimming method guided by television tracking results
KR101070448B1 (ко) Target tracking method and apparatus
Liu et al. Real time auto-focus algorithm for eye gaze tracking system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16874356

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16874356

Country of ref document: EP

Kind code of ref document: A1