CN112232332B - Non-contact palm detection method based on video sequence - Google Patents

Non-contact palm detection method based on video sequence

Info

Publication number
CN112232332B
CN112232332B (application CN202011499311.9A)
Authority
CN
China
Prior art keywords
image
palm
binary image
skin color
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011499311.9A
Other languages
Chinese (zh)
Other versions
CN112232332A (en)
Inventor
赵国栋
杨爽
李学双
张烜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Shengdian Century Technology Co ltd
Original Assignee
Sichuan Shengdian Century Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Shengdian Century Technology Co ltd
Priority to CN202011499311.9A
Publication of CN112232332A
Application granted
Publication of CN112232332B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/117 Biometrics derived from hands
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/56 Extraction of image or video features relating to colour
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G06T7/168 Segmentation; Edge detection involving transform domain methods
    • G06T7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20 Special algorithmic details
    • G06T2207/20048 Transform domain processing
    • G06T2207/20061 Hough transform

Abstract

The invention relates to a non-contact palm detection method based on a video sequence, which comprises the following steps: 1) acquiring a palm video image under non-contact imaging; 2) performing skin color detection on each frame image to obtain a skin color region binary image; 3) detecting the moving target to obtain a motion region binary image; 4) counting the number of white pixels in the motion region binary image at time t and comparing it with a threshold; if the number is greater than the threshold, continuing with step 5); 5) fusing the skin color region binary image and the motion region binary image; 6) removing the wrist region; 7) extracting the HOG feature vector and making the final palm region judgment. For the palm detection problem under non-contact imaging, the invention improves the elliptical skin color model by applying adaptive illumination compensation based on the gamma transform and by selecting ellipses of different sizes according to the luminance value; the region features thus obtained improve the accuracy of subsequent palm detection.

Description

Non-contact palm detection method based on video sequence
Technical Field
The invention belongs to the technical field of palm vein recognition and information security under non-contact imaging, and particularly relates to a non-contact palm detection method based on a video sequence.
Background
Non-contact palm vein recognition is a new biometric recognition technology. Because palm vein recognition acquires information from inside the human body, it is secure and difficult to counterfeit. Contact-based palm vein acquisition requires a large contact area, which easily provokes hygiene concerns in users and hinders the development of palm vein recognition. Non-contact palm vein recognition, by contrast, is not constrained by a fixture: the user need not touch the device directly, hygiene risks are eliminated, and the user is offered a new biometric recognition experience, giving the technology broad prospects for application and development.
Compared with contact-based palm vein recognition, non-contact palm vein recognition adds the steps of palm detection and palm region-of-interest extraction, and palm detection, as the first step of the whole recognition system, plays an important role.
Chinese patent CN104102347A discloses a fingertip positioning method that can be used to extract the palm contour. It comprises the following steps: acquiring a gesture image containing a gesture; extracting the palm contour from the gesture image; performing polygonal approximation on the palm contour to obtain a contour convex hull; and judging in turn whether each vertex of the convex hull is a fingertip point. The method obtains the convex hull by polygonal approximation of the palm contour, and the fingertips of a gesture usually correspond to vertices of the convex hull. Because only the convex-hull vertices need to be tested as fingertip candidates, the number of points to judge is reduced, the amount of computation decreases, the running speed improves, and real-time requirements can be met; at the same time, no curvature computation over the contour edge points is needed, so a certain error is tolerated in the extracted palm contour and the required contour extraction precision is lower.
Chinese patent CN109350018A discloses an image-based palm detection method for a palm herpes detection system, comprising the following steps: acquiring an RGB color image; converting the RGB color image into the YCrCb color space, extracting the Cr component image, and binarizing it; performing skin color detection to obtain the skin color contour and detecting convex contour features; judging whether the number of convex contour features is greater than zero and, if so, screening out candidate palm regions; and judging whether a candidate palm region is a palm contour region. The patent performs skin color detection by extracting the Cr component of the YCrCb color space from the palm region to obtain the palm contour, and performs palm detection according to its convex contour features.
Because non-contact palm vein acquisition provides no fixed positioning guide, the palm is placed freely during collection, so the images are prone to translation, rotation and other deformations and are easily affected by the surrounding environment; detecting the palm region against a complex background therefore presents a certain difficulty.
Disclosure of Invention
The invention aims to solve the problem that non-contact palm vein recognition is easily influenced by the surrounding environment during image acquisition, and provides a non-contact palm detection method based on a video sequence which can effectively detect the palm region and prepare for the subsequent recognition process.
To achieve this purpose, the technical scheme provided by the invention is as follows:
The invention relates to a non-contact palm detection method based on a video sequence, which comprises the following steps:
1) acquiring a palm video image under non-contact imaging, wherein the first frame image is a background without a palm and the palm is placed from the second frame image onward;
2) performing skin color detection on each frame image to obtain the skin color region binary image I1 at frame t;
3) detecting the moving target to obtain the motion region binary image I2 at frame t;
4) counting the number of white pixels in the motion region binary image I2 at frame t and comparing it with a threshold; if the number is greater than the threshold, continuing with step 5);
5) fusing the skin color region binary image I1 and the motion region binary image I2 at frame t to obtain the fused binary image I3, performing morphological processing on it, and keeping the largest connected region;
6) removing the wrist region from the fused binary image I3 to obtain the palm region binary image I4;
7) extracting the HOG feature of the palm region binary image I4 and making the final palm region judgment.
Preferably, step 2) performs skin color detection on the current frame image according to the improved elliptical skin color model, with the following specific steps:
2.1) performing adaptive illumination compensation based on the gamma transform on the current frame image and converting it from the RGB color space to the YCbCr color space:

s = c · r^γ (1)

Y = 0.299R + 0.587G + 0.114B; Cb = 0.564(B − Y) + 128; Cr = 0.713(R − Y) + 128 (2)

where s and r are the compensated and input pixel values of each channel; Y is the luminance component in the YCbCr color space; Cb and Cr are the blue and red chrominance offset components in the YCbCr color space; R, G and B are the red, green and blue channels in the RGB color space; c is a constant; γ is the gamma conversion coefficient, determined adaptively from the average pixel value of each component of the image and the pixel threshold th0, with 0 ≤ th0 ≤ 255;
2.2) clustering skin points according to the elliptical skin color model:

(x − ecx)² / a² + (y − ecy)² / b² ≤ 1 (3)

x = cos θ · (Cb − cx) + sin θ · (Cr − cy); y = −sin θ · (Cb − cx) + cos θ · (Cr − cy) (4)

where x and y in formula (3) are the abscissa and ordinate in the ellipse coordinate system, ecx and ecy are the abscissa and ordinate of the ellipse center, a is the major axis of the ellipse and b is the minor axis; θ in formula (4) is the rotation angle of the ellipse, Cb and Cr are the blue and red chrominance offset components in the YCbCr color space, and cx and cy are the abscissa and ordinate of the skin color model center in the CbCr coordinate system;
2.3) selecting elliptical models of different sizes according to the Y-channel luminance value and judging whether a pixel is a skin point according to whether it falls inside the ellipse corresponding to its luminance:

I1(i, j) = 255 when the point (x, y) computed from (Cb, Cr) by formula (4) satisfies formula (3) for the ellipse selected by comparing Y(i, j) with Ȳ, and I1(i, j) = 0 otherwise (5)

where I1(i, j) is the pixel value of the skin color region binary image, Y(i, j) is the Y-channel pixel value, i and j are the row and column indices of the image, Ȳ is the average luminance of the Y channel, ecx and ecy are the abscissa and ordinate of the ellipse center, a is the major axis of the ellipse and b is the minor axis.
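As an illustration of step 2), the following Python/OpenCV sketch applies gamma-based illumination compensation and then the elliptical skin color test. It is a sketch rather than the patented implementation: the log-ratio formula used for γ and the enlargement of the ellipse for bright pixels are assumptions made here, and the model parameters default to the values quoted in Example 1 below.

```python
import cv2
import numpy as np

def skin_mask(frame_bgr, th0=150, cx=113.0, cy=155.5, theta=2.51,
              ecx=1.63, ecy=2.44, a=23.20, b=15.2):
    # Adaptive gamma compensation: push the mean brightness towards th0.
    # The patent's exact gamma formula is not reproduced; this log-ratio
    # form is a common choice and is an assumption here.
    mean = min(max(float(frame_bgr.mean()), 1.0), 254.0)
    gamma = np.log(th0 / 255.0) / np.log(mean / 255.0)
    comp = (255.0 * (frame_bgr / 255.0) ** gamma).astype(np.uint8)

    # RGB -> YCbCr (OpenCV stores the result in Y, Cr, Cb order).
    ycrcb = cv2.cvtColor(comp, cv2.COLOR_BGR2YCrCb).astype(np.float32)
    Y, Cr, Cb = ycrcb[..., 0], ycrcb[..., 1], ycrcb[..., 2]

    # Rotate (Cb, Cr) about the model centre by theta: formula (4).
    ct, st = np.cos(theta), np.sin(theta)
    x = ct * (Cb - cx) + st * (Cr - cy)
    y = -st * (Cb - cx) + ct * (Cr - cy)

    # Formula (3), with the ellipse slightly enlarged for brighter pixels
    # (this per-luminance scaling is an assumed stand-in for formula (5)).
    scale = np.where(Y > Y.mean(), 1.1, 1.0)
    inside = ((x - ecx) / (a * scale)) ** 2 + ((y - ecy) / (b * scale)) ** 2 <= 1.0
    return np.where(inside, 255, 0).astype(np.uint8)
```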
Preferably, step 3) detects the moving target according to the ViBe algorithm, with the following specific steps:
3.1) initializing the background model with the first frame image, taking NG(x) as the set of neighborhood points around point x and constructing the background model:

M0(x) = { v0(y) | y ∈ NG(x) } (6)

where M0(x) is the set of pixels in the initial background model, v0(y) is the pixel value in the original image, NG(x) is the set of neighborhood points around point x, and y is a point in NG(x);
3.2) updating the background model.
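A minimal ViBe-style background model for step 3) could look as follows; the sample count, matching radius, match threshold and subsampling factor are common ViBe defaults rather than values taken from the patent, and only a conservative in-place update is sketched for step 3.2).

```python
import numpy as np

class ViBe:
    """Minimal ViBe background model: per-pixel sample sets (formula (6))."""

    def __init__(self, first_gray, n_samples=20, radius=20, min_matches=2, subsample=16):
        self.n, self.r, self.k, self.phi = n_samples, radius, min_matches, subsample
        h, w = first_gray.shape
        # Fill each pixel's sample set with random 8-neighbourhood values of
        # the first frame: M0(x) = { v0(y) | y in NG(x) }.
        pad = np.pad(first_gray, 1, mode='edge')
        ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
        di = np.random.randint(0, 3, (self.n, h, w))
        dj = np.random.randint(0, 3, (self.n, h, w))
        self.samples = pad[ii + di, jj + dj].astype(np.int16)

    def apply(self, gray):
        # A pixel is background if enough samples lie within the radius.
        diff = np.abs(self.samples - gray.astype(np.int16))
        bg = (diff < self.r).sum(axis=0) >= self.k
        # Step 3.2: randomly refresh one sample per background pixel.
        upd = bg & (np.random.randint(0, self.phi, gray.shape) == 0)
        ys, xs = np.nonzero(upd)
        idx = np.random.randint(0, self.n, ys.shape)
        self.samples[idx, ys, xs] = gray[ys, xs]
        # The moving target (foreground) is returned as white.
        return np.where(bg, 0, 255).astype(np.uint8)
```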
Preferably, the specific steps in step 4) include:
4.1) counting the number of white pixels in the motion region binary image I2:

num = Σ_{i=1}^{M} Σ_{j=1}^{N} I2(i, j) / 255 (7)

where num is the number of white pixels in the motion region binary image I2, M and N respectively denote the numbers of rows and columns of the current frame image, i is the row index of the image and j is the column index of the image;
4.2) judging from the number of white pixels in I2 whether a moving target exists, with th1 as the threshold:

a moving target exists if num > th1, and no moving target exists otherwise (8)

where I2 is the motion region binary image, num is the number of white pixels in I2, th1 is the quantity threshold and th1 is an integer;
4.3) if a moving target exists in the current frame image, performing step 5); if no moving target exists, judging whether the frame t−1 image contains a rectangular box mark: if so, marking the same position in the frame t image; if not, returning to step 1) to detect the frame t+1 image.
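The step 4) decision reduces to a pixel count; a sketch follows, assuming white pixels in I2 carry the value 255 and using the Example 1 threshold th1 = 1000.

```python
import numpy as np

def has_moving_target(i2, th1=1000):
    num = int(np.count_nonzero(i2 == 255))  # formula (7): white-pixel count
    return num > th1                        # formula (8): moving target present?
```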
Preferably, the specific steps in step 5) include:
5.1) fusing I1 and I2 with the AND operation:

I3(i, j) = I1(i, j) AND I2(i, j) (9)

where I3 is the fused binary image, I1 is the skin color region binary image, I2 is the motion region binary image, and i and j are the row and column indices of the image;
5.2) performing morphological opening and then closing operations on I3;
5.3) searching the connected regions of the binary image resulting from the morphological processing and keeping the largest connected region.
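Steps 5.1) to 5.3) map directly onto OpenCV primitives. A sketch follows; the 5×5 elliptical structuring element is an assumption, since the patent does not specify the kernel.

```python
import cv2
import numpy as np

def fuse_and_clean(i1, i2, ksize=5):
    fused = cv2.bitwise_and(i1, i2)  # formula (9): AND fusion of the two masks
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (ksize, ksize))
    fused = cv2.morphologyEx(fused, cv2.MORPH_OPEN, kernel)   # opening, then
    fused = cv2.morphologyEx(fused, cv2.MORPH_CLOSE, kernel)  # closing (step 5.2)
    # Step 5.3: keep only the largest connected component.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(fused, connectivity=8)
    if n <= 1:
        return np.zeros_like(fused)
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return np.where(labels == largest, 255, 0).astype(np.uint8)
```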
Preferably, step 6) removes the wrist region from I3 according to the Hough transform, with the following specific steps:
6.1) performing edge detection on the fused binary image I3 to obtain its contour L3;
6.2) performing the Hough transform on L3 to obtain the set L of straight line segments on L3, which contains the starting point and end point of each segment;
6.3) calculating the center of gravity of I3:

x̄ = Σᵢ Σⱼ xᵢ I3(i, j) / Σᵢ Σⱼ I3(i, j), ȳ = Σᵢ Σⱼ yⱼ I3(i, j) / Σᵢ Σⱼ I3(i, j) (10)

where x̄ is the abscissa of the center of gravity, ȳ is the ordinate of the center of gravity, (xᵢ, yⱼ) are the pixel coordinates of the image, and I3(i, j) is the pixel value at that point;
6.4) detecting whether the straight line set L contains a parallel straight line segment below the center of gravity; if so, setting the pixels of I3 below the starting point of the corresponding straight line to 0:

I4(i, j) = 0 for i ≥ x0, and I4(i, j) = I3(i, j) otherwise (11)

where I4 is the palm region binary image, x0 is the x coordinate of the starting point of the parallel straight line segment below the center of gravity, i.e. x0 > x̄, i is the row index of the image and j is the column index of the image.
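A sketch of step 6) using Canny edges and the probabilistic Hough transform; the Canny thresholds, Hough parameters and the 10° parallelism tolerance are assumptions, as the patent does not state them.

```python
import cv2
import numpy as np

def remove_wrist(i3):
    edges = cv2.Canny(i3, 50, 150)                          # step 6.1: contour L3
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                           minLineLength=30, maxLineGap=5)  # step 6.2: segment set L
    m = cv2.moments(i3, binaryImage=True)
    cy = m['m01'] / max(m['m00'], 1e-6)                     # centroid row, formula (10)
    out = i3.copy()
    if segs is None:
        return out
    # Step 6.4: look for a near-parallel pair of segments below the centroid
    # (taken here to be the two wrist edges) and clear everything below it.
    below = [s[0] for s in segs if min(s[0][1], s[0][3]) > cy]
    for i in range(len(below)):
        for j in range(i + 1, len(below)):
            a1 = np.arctan2(below[i][3] - below[i][1], below[i][2] - below[i][0])
            a2 = np.arctan2(below[j][3] - below[j][1], below[j][2] - below[j][0])
            if abs(a1 - a2) < np.deg2rad(10):
                row0 = min(below[i][1], below[i][3], below[j][1], below[j][3])
                out[row0:, :] = 0                           # formula (11)
                return out
    return out
```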
Preferably, the specific steps in step 7) include:
7.1) extracting the HOG feature vector H of the palm region binary image I4;
7.2) computing the cosine similarity between H and the HOG feature vector U of the preset template palm image:

cos(H, U) = Σ_{i=1}^{n} Hᵢ Uᵢ / ( √(Σ_{i=1}^{n} Hᵢ²) · √(Σ_{i=1}^{n} Uᵢ²) ) (12)

where H is the HOG feature vector of the palm region binary image I4, U is the HOG feature vector of the preset template, Hᵢ is the i-th component of H, Uᵢ is the i-th component of U, n is the number of feature components, and i is an integer with 1 ≤ i ≤ n;
7.3) comparing the cosine similarity with the similarity threshold and judging the palm:

I4 is judged to be a palm region if cos(H, U) ≥ th2, and not a palm region otherwise (13)

where I4 is the palm region binary image, th2 is the similarity threshold, th2 is a constant and −1 ≤ th2 ≤ 1;
7.4) marking the palm region with a rectangular box on the corresponding current frame image.
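A sketch of step 7) with OpenCV's stock HOG descriptor; resizing the candidate region to the descriptor's default 64×128 window is a choice made here for illustration, and th2 = 0.8 is the value used in Example 1.

```python
import cv2
import numpy as np

hog = cv2.HOGDescriptor()  # default 64x128 window, 9 orientation bins

def is_palm(i4, template_vec, th2=0.8):
    patch = cv2.resize(i4, (64, 128))
    h = hog.compute(patch).ravel()      # step 7.1: HOG feature vector H
    # Formula (12): cosine similarity between H and the template vector U.
    cos = float(h @ template_vec) / (np.linalg.norm(h) * np.linalg.norm(template_vec) + 1e-12)
    return cos >= th2                   # formula (13): palm / not palm
```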
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
1. For the palm detection problem under non-contact imaging, the invention improves the elliptical skin color model by applying adaptive illumination compensation based on the gamma transform and by selecting ellipses of different sizes according to the luminance value, and the region features thus obtained improve the accuracy of subsequent palm detection;
2. the invention combines skin color information with motion information, eliminating interference from the human face; it judges whether a moving target exists according to the binary image produced by the motion model, solving the problem that a palm which stays still cannot be detected;
3. the method removes a possibly present wrist region by straight-line detection and makes the final palm region judgment from the shape features of the palm, reaching an average detection rate of 99.5%;
4. the method accommodates the various conditions that arise during palm acquisition, detects the palm region more accurately, and lays a good foundation for subsequent feature extraction and recognition.
Drawings
FIG. 1 is a flow chart of a non-contact palm detection method based on video sequences according to the present invention;
FIG. 2 is a frame of acquired palm image;
FIG. 3 is a binary image after skin color detection;
FIG. 4 is a binary image after moving object detection;
FIG. 5 is a fused binary image;
FIG. 6 is a binary image after morphological processing;
FIG. 7 is a binary image with the wrist area removed;
fig. 8 is a final palm detection result image.
Detailed Description
For a further understanding of the present invention, it is described in detail below with reference to an example, which is provided to illustrate the invention but not to limit its scope.
Example 1
Referring to fig. 1, the present embodiment relates to a non-contact palm detection method based on a video sequence, which includes the following steps:
1) acquiring a palm video image under non-contact imaging, wherein the first frame image is a background without a palm and the palm is placed from the second frame onward; each frame image has size M × N, where M and N respectively denote the numbers of rows and columns of the image; in this embodiment M × N is 640 × 480, as shown in FIG. 2.
2) using the improved elliptical skin color model to perform skin color detection on each frame image and obtaining the skin color region binary image I1 at frame t, as shown in FIG. 3. The specific steps include:
2.1) performing adaptive illumination compensation based on the gamma transform on the current frame image and converting it from the RGB color space to the YCbCr color space:

s = c · r^γ (1)

Y = 0.299R + 0.587G + 0.114B; Cb = 0.564(B − Y) + 128; Cr = 0.713(R − Y) + 128 (2)

where s and r are the compensated and input pixel values of each channel; Y is the luminance component in the YCbCr color space; Cb and Cr are the blue and red chrominance offset components in the YCbCr color space; R, G and B are the red, green and blue channels in the RGB color space; c is a constant; γ is the gamma conversion coefficient, determined adaptively from the average pixel value of each component of the image and the pixel threshold th0, with 0 ≤ th0 ≤ 255; th0 is 150 in this embodiment;
2.2) clustering skin points according to the elliptical skin color model:

(x − ecx)² / a² + (y − ecy)² / b² ≤ 1 (3)

x = cos θ · (Cb − cx) + sin θ · (Cr − cy); y = −sin θ · (Cb − cx) + cos θ · (Cr − cy) (4)

where x and y in formula (3) are the abscissa and ordinate in the ellipse coordinate system, ecx and ecy are the abscissa and ordinate of the ellipse center, a is the major axis of the ellipse and b is the minor axis; θ in formula (4) is the rotation angle of the ellipse, Cb and Cr are the blue and red chrominance offset components in the YCbCr color space, and cx and cy are the abscissa and ordinate of the skin color model center in the CbCr coordinate system;
in this embodiment cx = 113, cy = 155.5, θ = 2.51, ecx = 1.63, ecy = 2.44, a = 23.20, b = 15.2;
2.3) selecting elliptical models of different sizes according to the Y-channel luminance value and judging whether a pixel is a skin point according to whether it falls inside the ellipse corresponding to its luminance:

I1(i, j) = 255 when the point (x, y) computed from (Cb, Cr) by formula (4) satisfies formula (3) for the ellipse selected by comparing Y(i, j) with Ȳ, and I1(i, j) = 0 otherwise (5)

where I1(i, j) is the pixel value of the skin color region binary image, Y(i, j) is the Y-channel pixel value, i and j are the row and column indices of the image, Ȳ is the average luminance of the Y channel, ecx and ecy are the abscissa and ordinate of the ellipse center, a is the major axis of the ellipse and b is the minor axis.
3) detecting the moving target with the ViBe algorithm and obtaining the motion region binary image I2 at frame t, as shown in FIG. 4. The specific steps include:
3.1) initializing the background model with the first frame image, taking NG(x) as the set of neighborhood points around point x and constructing the background model:

M0(x) = { v0(y) | y ∈ NG(x) } (6)

where M0(x) is the set of pixels in the initial background model, v0(y) is the pixel value in the original image, NG(x) is the set of neighborhood points around point x, and y is a point in NG(x);
3.2) updating the background model.
4) counting the number of white pixels in the motion region binary image I2 at frame t and comparing it with the threshold; if the number is greater than the threshold, continuing with step 5). The specific steps include:
4.1) counting the number of white pixels in the binary image I2:

num = Σ_{i=1}^{M} Σ_{j=1}^{N} I2(i, j) / 255 (7)

where num is the number of white pixels in the motion region binary image I2, and M and N respectively denote the numbers of rows and columns of the current frame image, 640 and 480 in this embodiment; i is the row index of the image and j is the column index of the image;
4.2) judging from the number of white pixels in I2 whether a moving target exists, with th1 as the threshold:

a moving target exists if num > th1, and no moving target exists otherwise (8)

where I2 is the motion region binary image, num is the number of white pixels in I2, th1 is the quantity threshold, th1 is an integer, and th1 is 1000 in this embodiment;
4.3) if a moving target exists in the current frame image, performing step 5); if no moving target exists, judging whether the frame t−1 image contains a rectangular box mark: if so, marking the same position in the frame t image; if not, returning to step 1) to detect the frame t+1 image.
5) fusing I1 and I2 at frame t to obtain the binary image I3, performing morphological processing on it and keeping the largest connected region. The specific steps include:
5.1) fusing I1 and I2 with the AND operation, as shown in FIG. 5:

I3(i, j) = I1(i, j) AND I2(i, j) (9)

where I3 is the fused binary image, I1 is the skin color region binary image, I2 is the motion region binary image, i is the row index of the image and j is the column index of the image;
5.2) performing morphological opening and then closing operations on I3;
5.3) searching the connected regions of the binary image resulting from the morphological processing and keeping the largest connected region; the result is shown in FIG. 6.
6) removing the wrist region from I3 according to the Hough transform to obtain the palm region binary image I4, as shown in FIG. 7. The specific steps include:
6.1) performing edge detection on the fused binary image I3 to obtain its contour L3;
6.2) performing the Hough transform on L3 to obtain the set L of straight line segments on L3, which contains the starting point and end point of each segment;
6.3) calculating the center of gravity of I3:

x̄ = Σᵢ Σⱼ xᵢ I3(i, j) / Σᵢ Σⱼ I3(i, j), ȳ = Σᵢ Σⱼ yⱼ I3(i, j) / Σᵢ Σⱼ I3(i, j) (10)

where x̄ is the abscissa of the center of gravity, ȳ is the ordinate of the center of gravity, (xᵢ, yⱼ) are the pixel coordinates of the image, and I3(i, j) is the pixel value at that point;
6.4) detecting whether the straight line set L contains a parallel straight line segment below the center of gravity; if so, setting the pixels of I3 below the starting point of the corresponding straight line to 0:

I4(i, j) = 0 for i ≥ x0, and I4(i, j) = I3(i, j) otherwise (11)

where I4 is the palm region binary image, x0 is the x coordinate of the starting point of the parallel straight line segment below the center of gravity, i.e. x0 > x̄, i is the row index of the image and j is the column index of the image.
7) extracting the HOG feature vector of the palm region binary image I4 and making the final palm region judgment. The specific steps include:
7.1) extracting the HOG feature vector H of the palm region binary image I4;
7.2) computing the cosine similarity between H and the HOG feature vector U of the preset template palm image:

cos(H, U) = Σ_{i=1}^{n} Hᵢ Uᵢ / ( √(Σ_{i=1}^{n} Hᵢ²) · √(Σ_{i=1}^{n} Uᵢ²) ) (12)

where H is the HOG feature vector of the palm region binary image I4, U is the HOG feature vector of the preset template, Hᵢ is the i-th component of H, Uᵢ is the i-th component of U, n is the number of feature components, and i is an integer with 1 ≤ i ≤ n;
7.3) comparing the cosine similarity with the similarity threshold and judging the palm:

I4 is judged to be a palm region if cos(H, U) ≥ th2, and not a palm region otherwise (13)

where I4 is the palm region binary image, th2 is the similarity threshold, th2 is a constant with −1 ≤ th2 ≤ 1, and th2 is 0.8 in this embodiment;
7.4) marking the palm region with a rectangular box on the corresponding current frame image, as shown in FIG. 8.
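To make the flow of Example 1 concrete, the following sketch chains the illustrative helpers defined in the earlier sections (skin_mask, ViBe, has_moving_target, fuse_and_clean, remove_wrist, is_palm); the video filename and the HOG template file are placeholders, not artifacts of the patent.

```python
import cv2
import numpy as np

template_vec = np.load('palm_hog_template.npy')  # assumed precomputed HOG template U

cap = cv2.VideoCapture('palm_sequence.avi')      # placeholder input video
ok, first = cap.read()                           # frame 1: background without palm
vibe = ViBe(cv2.cvtColor(first, cv2.COLOR_BGR2GRAY))
last_box = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    i1 = skin_mask(frame)                                     # step 2: skin mask I1
    i2 = vibe.apply(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))  # step 3: motion mask I2
    if not has_moving_target(i2, th1=1000):                   # step 4
        box = last_box                                        # reuse the frame t-1 mark
    else:
        i3 = fuse_and_clean(i1, i2)                           # step 5: fuse + morphology
        i4 = remove_wrist(i3)                                 # step 6: drop wrist region
        box = cv2.boundingRect(i4) if is_palm(i4, template_vec) else None  # step 7
    if box is not None:
        x, y, w, h = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    last_box = box
cap.release()
```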
The following are experimental results and analysis of the detection method of the invention on different video sequences.
To verify the non-contact palm detection method based on a video sequence, four groups of video sequences were collected, namely placing the palm, keeping the palm still, making a fist, and not placing the palm, with 1000 palm videos per group, each comprising 500 frames, forming the experimental database. The implementation is based on OpenCV 4.1.0 and compiled with Visual Studio 2019; the computer runs 64-bit Windows 10 with 8 GB of memory and a 2.30 GHz CPU. The palm region of each group of video sequences was detected with the detection method of the invention and with the fingertip positioning method disclosed in Chinese patent CN104102347A; the detection results are shown in Table 1.
TABLE 1
[Table 1 appears only as an image in the source; it lists, for each of the four groups of video sequences, the detection results of the method of the invention and of the fingertip positioning method of CN104102347A.]
As the data in Table 1 show, for the first and second groups of video images, in which a palm is placed, the average detection rate of the method of the invention reaches 99.5%, while that of the fingertip positioning method is 87.4%; for the third and fourth groups, in which no palm is placed, the method of the invention produces no false detections, while the fingertip positioning method produces more false detections. The non-contact palm detection method based on a video sequence provided by the invention can therefore effectively detect the palm region acquired under non-contact imaging and provides a good guarantee for the subsequent recognition process.
The above-described embodiment is an illustrative example of the present invention, provided for a better understanding of the invention and not to be construed as limiting its scope. All equivalent changes and modifications made within the scope of the present invention shall fall within its protection scope.

Claims (4)

1. A non-contact palm detection method based on a video sequence, characterized in that it comprises the following steps:
1) acquiring a palm video image under non-contact imaging, wherein the first frame image is a background without a palm and the palm is placed from the second frame image onward;
2) performing skin color detection on each frame image to obtain the skin color region binary image I1 at frame t;
3) detecting the moving target to obtain the motion region binary image I2 at frame t;
4) counting the number of white pixels in the motion region binary image I2 at frame t and comparing it with a threshold; if the number is greater than the threshold, continuing with step 5);
5) fusing the skin color region binary image I1 and the motion region binary image I2 at frame t to obtain the fused binary image I3, performing morphological processing on it, and keeping the largest connected region;
6) removing the wrist region from the fused binary image I3 to obtain the palm region binary image I4;
7) extracting the HOG feature of the palm region binary image I4 and making the final palm region judgment;
the step 2) of performing skin color detection on the current frame image according to the improved elliptical skin color model comprises the following specific steps:
2.1) carrying out adaptive illumination compensation based on gamma transformation on the current frame image, and converting from an RGB color space to a YCbCr color space:
Figure 435265DEST_PATH_IMAGE001
Figure 463001DEST_PATH_IMAGE002
wherein Y is a luminance component of a color in the YCbCr color space, Cb, Cr are density offset components of blue and red in the YCbCr color space, respectively, R, G, B are colors of three channels of red, green, and blue in the RGB color space, respectively, c is a constant,
Figure 626129DEST_PATH_IMAGE003
in order to be the gamma conversion coefficient(s),
Figure 12111DEST_PATH_IMAGE004
th0 is the pixel threshold value and is 0 ≦ th0 ≦ 255;
2.2) clustering the skin points according to the elliptical skin color model, wherein the elliptical skin color model is as follows:
Figure 526269DEST_PATH_IMAGE005
Figure 593582DEST_PATH_IMAGE006
wherein, x in the formula (3) is an abscissa in the ellipse, y is an ordinate in the ellipse, ecx and ecy are an abscissa and an ordinate of the center point of the ellipse, a is the major axis of the ellipse, and b is the minor axis of the ellipse; theta in the formula (4) is the rotation angle of the ellipse, Cb and Cr are the concentration offset components of blue and red colors in the YCbCr color space, respectively,
Figure 611217DEST_PATH_IMAGE007
is the horizontal coordinate of the center point of the skin color model in the CbCr coordinate system,
Figure 168100DEST_PATH_IMAGE008
the longitudinal coordinate of the skin color model at the central point of the CbCr coordinate system;
2.3) dividing elliptical models with different sizes according to the Y-channel brightness value, and judging whether the pixels are skin points according to whether the pixels are positioned in an ellipse with corresponding brightness:
Figure DEST_PATH_IMAGE009
wherein, I1(i, j) is the skin color region binary image pixel value, Y (i, j) is the Y channel pixel value, i is the row number index of the image, j is the column number index of the image,
Figure 331404DEST_PATH_IMAGE010
the average brightness of a Y channel is obtained, ecx and ecy are the abscissa and ordinate of the central point of the ellipse, a is the major axis of the ellipse, and b is the minor axis of the ellipse;
the step 6) is carried out according to Hough transform pair I3The method for removing the wrist area comprises the following specific steps:
6.1) Pair fusion binary image I3Performing edge detection to obtain I3Is (d) profile L3
6.2) pairs of L3Carrying out Hough transform to obtain L3The upper straight line set L comprises a starting point and an end point of each line segment;
6.3) calculation of I3The center of gravity of (1):
Figure 999146DEST_PATH_IMAGE011
wherein the content of the first and second substances,
Figure 871287DEST_PATH_IMAGE012
is the abscissa of the point of gravity,
Figure 864651DEST_PATH_IMAGE013
is the ordinate of the center of gravity point, ((
Figure 524040DEST_PATH_IMAGE014
,
Figure 401997DEST_PATH_IMAGE015
) Is the coordinates of the pixel points of the image,
Figure 659803DEST_PATH_IMAGE016
is the pixel value of the point;
6.4) detecting whether a parallel straight line section below the gravity center exists in the straight line set L, and if so, detecting I3And setting the pixel points below the straight line starting point of the middle corresponding position as 0:
Figure 558489DEST_PATH_IMAGE017
wherein, I4For a palm region binary map, x0 represents the x coordinate of the start of a parallel straight line segment located below the center of gravity, i.e., the x coordinate
Figure DEST_PATH_IMAGE018
I is the index of the number of rows of the image, and j is the index of the number of columns of the image;
the step 7) comprises the following specific steps:
7.1) extracting a palm region binary image I4HOG feature vector H;
7.2) carrying out cosine similarity calculation on the H and the HOG characteristic vector U of the palm image of the preset template:
Figure 176946DEST_PATH_IMAGE019
wherein H is a palm area binary image I4U is the HOG feature vector of the preset template, HiIs a palm area binary image I4Of the ith HOG feature vector, UiIs the ith HOG feature vector of the template, n is the number of feature vectors, i is an integer and
Figure 452069DEST_PATH_IMAGE020
7.3) comparing the cosine similarity with the similarity threshold value, and judging the palm:
Figure DEST_PATH_IMAGE021
wherein, I4Is a palm area binary image, th2 is a similarity threshold, th2 is a constant, and-1 is more than or equal to th2 is less than or equal to 1;
7.4) Using rectangular frame pairs I4And marking the current frame image corresponding to the palm area.
2. The non-contact palm detection method based on a video sequence according to claim 1, characterized in that step 3) detects the moving target according to the ViBe algorithm, with the following specific steps:
3.1) initializing the background model with the first frame image, taking NG(x) as the set of neighborhood points around point x and constructing the background model:

M0(x) = { v0(y) | y ∈ NG(x) } (6)

where M0(x) is the set of pixels in the initial background model, v0(y) is the pixel value in the original image, NG(x) is the set of neighborhood points around point x, and y is a point in NG(x);
3.2) updating the background model.
3. The non-contact palm detection method based on a video sequence according to claim 1, characterized in that the specific steps in step 4) include:
4.1) counting the number of white pixels in the motion region binary image I2:

num = Σ_{i=1}^{M} Σ_{j=1}^{N} I2(i, j) / 255 (7)

where num is the number of white pixels in the motion region binary image I2, M and N respectively denote the numbers of rows and columns of the current frame image, i is the row index of the image and j is the column index of the image;
4.2) judging from the number of white pixels in I2 whether a moving target exists:

a moving target exists if num > th1, and no moving target exists otherwise (8)

where I2 is the motion region binary image, num is the number of white pixels in I2, th1 is the quantity threshold and th1 is an integer;
4.3) if a moving target exists in the current frame image, performing step 5); if no moving target exists, judging whether the frame t−1 image contains a rectangular box mark: if so, marking the same position in the frame t image; if not, returning to step 1) to detect the frame t+1 image.
4. The non-contact palm detection method based on a video sequence according to claim 1, characterized in that the specific steps in step 5) include:
5.1) fusing I1 and I2 with the AND operation:

I3(i, j) = I1(i, j) AND I2(i, j) (9)

where I3 is the fused binary image, I1 is the skin color region binary image, I2 is the motion region binary image, i is the row index of the image and j is the column index of the image;
5.2) performing morphological opening and then closing operations on I3;
5.3) searching the connected regions of the binary image resulting from the morphological processing and keeping the largest connected region.
CN202011499311.9A 2020-12-17 2020-12-17 Non-contact palm detection method based on video sequence Active CN112232332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011499311.9A CN112232332B (en) 2020-12-17 2020-12-17 Non-contact palm detection method based on video sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011499311.9A CN112232332B (en) 2020-12-17 2020-12-17 Non-contact palm detection method based on video sequence

Publications (2)

Publication Number Publication Date
CN112232332A CN112232332A (en) 2021-01-15
CN112232332B true CN112232332B (en) 2021-04-13

Family

ID=74124909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011499311.9A Active CN112232332B (en) 2020-12-17 2020-12-17 Non-contact palm detection method based on video sequence

Country Status (1)

Country Link
CN (1) CN112232332B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801062B (en) * 2021-04-07 2021-07-06 平安科技(深圳)有限公司 Live video identification method, device, equipment and medium
CN113128435B (en) * 2021-04-27 2022-11-22 南昌虚拟现实研究院股份有限公司 Hand region segmentation method, device, medium and computer equipment in image
CN113362390B (en) * 2021-06-21 2023-03-21 武汉理工大学 Rapid circular target positioning video processing method based on ellipse detection
CN114694233B (en) * 2022-06-01 2022-08-23 成都信息工程大学 Multi-feature-based method for positioning human face in examination room monitoring video image
CN115376167B (en) * 2022-10-26 2023-02-24 山东圣点世纪科技有限公司 Palm detection method and system under complex background

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521567A (en) * 2011-11-29 2012-06-27 Tcl集团股份有限公司 Human-computer interaction fingertip detection method, device and television
CN103714328A (en) * 2013-12-31 2014-04-09 江苏图云智能科技发展有限公司 Adaboost algorithm for human face detection
CN104167006A (en) * 2014-07-10 2014-11-26 华南理工大学 Gesture tracking method of any hand shape
CN109034016A (en) * 2018-07-12 2018-12-18 辽宁工业大学 A kind of hand back vein image-recognizing method based on S-CNN model of universality
CN110008824A (en) * 2019-02-20 2019-07-12 平安科技(深圳)有限公司 Palm grain identification method, device, computer equipment and storage medium
CN111860424A (en) * 2020-07-30 2020-10-30 厦门熵基科技有限公司 Training method and device for visible light palm recognition model

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110007806A (en) * 2009-07-17 2011-01-25 삼성전자주식회사 Apparatus and method for detecting hand motion using a camera
CN101916370B (en) * 2010-08-31 2012-04-25 上海交通大学 Method for processing non-feature regional images in face detection
KR101868903B1 (en) * 2012-01-11 2018-06-20 한국전자통신연구원 Apparatus and method for tracking human hand by using color features
CN103810490B (en) * 2014-02-14 2017-11-17 海信集团有限公司 A kind of method and apparatus for the attribute for determining facial image
KR101614798B1 (en) * 2014-08-22 2016-04-22 군산대학교 산학협력단 Non-contact multi touch recognition method and system using color image analysis
CN105975903A (en) * 2016-04-22 2016-09-28 安徽大学 Automobile feature identification method
CN109903275B (en) * 2019-02-13 2021-05-18 湖北工业大学 Fermented grain mildewing area detection method based on self-adaptive multi-scale filtering and histogram comparison
CN111104870B (en) * 2019-11-27 2024-02-13 珠海欧比特卫星大数据有限公司 Motion detection method, device, equipment and storage medium based on satellite video
CN111259866B (en) * 2020-03-06 2023-07-28 大连科技学院 Marine ship target detection method based on improved background difference method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521567A (en) * 2011-11-29 2012-06-27 Tcl集团股份有限公司 Human-computer interaction fingertip detection method, device and television
CN103714328A (en) * 2013-12-31 2014-04-09 江苏图云智能科技发展有限公司 Adaboost algorithm for human face detection
CN104167006A (en) * 2014-07-10 2014-11-26 华南理工大学 Gesture tracking method of any hand shape
CN109034016A (en) * 2018-07-12 2018-12-18 辽宁工业大学 A kind of hand back vein image-recognizing method based on S-CNN model of universality
CN110008824A (en) * 2019-02-20 2019-07-12 平安科技(深圳)有限公司 Palm grain identification method, device, computer equipment and storage medium
CN111860424A (en) * 2020-07-30 2020-10-30 厦门熵基科技有限公司 Training method and device for visible light palm recognition model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Arm removal algorithm based on straight-line detection; Wang Fengyan; Electronic Technology; 2017-12-31; Vol. 46, No. 7; abstract on p. 32, section 2 and FIG. 2 on p. 33, FIG. 5 on p. 34 *
Video human body detection based on skin color segmentation; Wang Hongmei et al.; Computer Engineering and Design; 2011-12-31; Vol. 32, No. 3; pp. 980-982, section 3.1 *
Gesture segmentation method based on the skin color elliptical boundary model and an improved frame difference algorithm; Zhang Zhongjun; Fujian Computer; 2018-12-31; Vol. 34, No. 5; section 0 on p. 8, sections 2-4 on p. 9, and FIGS. 3-4 *

Also Published As

Publication number Publication date
CN112232332A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN112232332B (en) Non-contact palm detection method based on video sequence
Pan et al. A robust system to detect and localize texts in natural scene images
CN104732200B (en) A kind of recognition methods of skin type and skin problem
CN104680127A (en) Gesture identification method and gesture identification system
CN111126240B (en) Three-channel feature fusion face recognition method
CN105844242A (en) Method for detecting skin color in image
Monwar et al. Pain recognition using artificial neural network
CN101996317B (en) Method and device for identifying markers in human body
CN108256518A (en) Detection method and detection device for character region
Youlian et al. Face detection method using template feature and skin color feature in rgb color space
CN109753912B (en) Multispectral palm print matching method based on tensor
CN116665258A (en) Palm image finger seam segmentation method
CN109800771B (en) Spontaneous micro-expression positioning method of local binary pattern of mixed space-time plane
Işikdoğan et al. Automatic recognition of Turkish fingerspelling
CN108629780B (en) Tongue image segmentation method based on color decomposition and threshold technology
CN103020631A (en) Human movement identification method based on star model
CN106066887A (en) A kind of sequence of advertisements image quick-searching and the method for analysis
Yi et al. Face detection method based on skin color segmentation and facial component localization
Shemshaki et al. Lip segmentation using geometrical model of color distribution
Ghimire et al. A lighting insensitive face detection method on color images
CN110956095A (en) Multi-scale face detection method based on corner skin color detection
Lakshmi et al. Plant leaf image detection method using a midpoint circle algorithm for shape-based feature extraction
CN104866825B (en) A kind of sign language video frame sequence classification method based on Hu square
Zhifan et al. Research and implementation of an improved license plate recognition algorithm
CN117037049B (en) Image content detection method and system based on YOLOv5 deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210910

Address after: 030032 room 0906, floor 9, building C, qingkong innovation base, No. 529, South Central Street, Taiyuan Xuefu Park, comprehensive reform demonstration zone, Taiyuan, Shanxi Province

Patentee after: Holy Point Century Technology Co.,Ltd.

Address before: 9 / F, unit 1, building 2, no.41-5, Jinsha North 2nd Road, Jinniu District, Chengdu, Sichuan 610000

Patentee before: Sichuan ShengDian Century Technology Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230330

Address after: Room 2309, 23rd Floor, Qidi Building, No. 99, South Tiancheng Road, High Speed Rail New Town, Xiangcheng District, Suzhou City, Jiangsu Province, 215000 - Work Station A029 (cluster registration)

Patentee after: Jiangsu Shengdian Century Technology Co.,Ltd.

Address before: 030032 room 0906, floor 9, building C, qingkong innovation base, No. 529, South Central Street, Taiyuan Xuefu Park, comprehensive reform demonstration zone, Taiyuan, Shanxi Province

Patentee before: Holy Point Century Technology Co.,Ltd.

TR01 Transfer of patent right