Palm key point positioning method based on convolutional neural network
Technical Field
The invention relates to the technical field of palm key point positioning, in particular to a palm key point positioning method based on a convolutional neural network.
Background
Palm print and palm vein feature recognition is generally realized by capturing an image of the palm under visible or near-infrared light with a camera and then performing preprocessing, identification area positioning, feature extraction, and comparison and matching on the palm image. Positioning the identification area is a basic link of palm print and palm vein recognition; how to position the identification area quickly, accurately and with high quality is a critical step that directly influences the performance of the whole recognition system. Generally, locating the identification area requires locating key points of the palm image and intercepting the identification area from the located key points. Typically, the outline of the palm can be described by the edge information between the palm and the collected background, and the key points can then be positioned.
Chinese patent application publication No. CN102542242A discloses a biological characteristic region positioning method comprising the steps of binarizing a biological characteristic image, removing the image background, denoising, obtaining edge point information, positioning key points, and determining a biological characteristic region according to the key points. Patent application publication CN104361339A discloses a method for extracting a palm-shaped image by performing image segmentation on the palm-shaped area according to a posterior probability map and palm-shaped edge information of a foreground image. Patent application publication CN106991380A discloses a method for extracting the image contour of a binarized palm vein image by using the Canny algorithm, positioning the finger root points by a search method, and taking the midpoint of the line connecting the finger root points as the key point, so as to obtain an ROI (region of interest) image.
Although positioning key points from edge information and contour features can determine a relatively fixed identification area, such methods place high requirements on the completeness and clarity of the edge information and contour features, and high-quality key point positioning and identification areas are often difficult to obtain when factors such as lighting, viewing angle, background and distance change.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a palm key point positioning method based on a convolutional neural network. The method delimits the finger knuckle line segments, locates the midpoint of the joint line segment at the lower end of each lower knuckle, connects the midpoints of the lower-end joint line segments of two adjacent fingers, and takes the midpoint of the connecting line as a palm key point, so that 3 key points are obtained among the four fingers from the index finger to the little finger.
Obtaining palm key points by positioning the midpoints of knuckle line segments still yields stable midpoint positioning even when the edge information and contour features change to a certain extent. By exploiting the advantages of convolutional neural networks in image processing, a better key point positioning model can be obtained through extensive training and learning, realizing fast and accurate palm key point positioning on large data volumes.
In order to achieve the purpose, the invention provides the following technical scheme: a palm key point positioning method based on a convolutional neural network comprises the following specific steps:
s1, collecting palm images, marking key point information, inputting the key point information as a training sample set into a convolutional neural network, and training the network;
s2, detecting a palm image by a first layer of the convolutional neural network, dividing the palm image into a finger area and a palm area, and collecting the finger area image as a data set;
s3, the second layer carries out key point positioning on the finger area image data set collected by the first layer of convolutional neural network, positions 6 key points of each finger, and cuts out 4 finger images as a data set;
s4, positioning, within the corresponding finger range, the midpoint of the joint line segment at the lower end of each finger's lower knuckle and the point at the fingertip end farthest from this midpoint, the two points serving as the 2 key points of the finger;
s5, connecting the midpoints of the joint line segments at the lower ends of the lower knuckles of the two adjacent fingers by the convolutional neural network, taking the midpoints of connecting lines as palm key points, and defining 3 palm key points among the four fingers as GapB, GapC and GapD respectively.
Preferably, the palm image in step S1 is acquired by a shooting device and preprocessed with image enhancement techniques so that it meets the format requirements; the palm image is then labeled with key points and input as a sample set for training the convolutional neural network.
Preferably, the convolutional neural network in step S1 includes convolutional layers and pooling layers; the convolutional layers are mainly used to compute feature maps, and the pooling layers are mainly used to reduce the size of the feature maps while maintaining their rotation and translation characteristics. The details are as follows:
when the feature maps meet the designed size and number requirements, the two-dimensional feature maps are arranged in sequence and converted into a one-dimensional feature vector, which is finally connected and output through the full-connection layer; the operation of the convolutional layer can be expressed as:
$$X^{(l,k)} = f\Big(\sum_{p=1}^{n_{l-1}} W^{(l,k,p)} \otimes X^{(l-1,p)} + b^{(l,k)}\Big)$$
wherein $X^{(l,k)}$ represents the k-th group of feature maps output by the l-th layer, $n_l$ represents the number of feature-map groups of the l-th layer, and $W^{(l,k,p)}$ represents the filter required for mapping the p-th group of feature maps in layer l-1 to the k-th group of feature maps in layer l; generating each group of feature maps of the l-th layer requires $n_{l-1}$ filters and one offset;
the pooling layer adopts a maximum pooling method, the size of the feature image after maximum pooling is reduced to 1/step according to the step length step, and the form of maximum pooling can be expressed as:
$$X^{(l+1,k)}(m,n) = \max_{0 \le i,\, j < s} X^{(l,k)}(m \cdot step + i,\; n \cdot step + j)$$
wherein $X^{(l+1,k)}(m,n)$ is the value at coordinates (m, n) of the k-th group of feature maps output by layer l+1, s is the size of the pooling kernel, and step is the stride of the pooling kernel; both s and step are set to 2 in the invention.
Preferably, the finger key points in step S3 are marked by taking the two end points of each joint line segment at the lower end of a knuckle on the image as key points; each finger has 3 such lower-end joint line segments, so each finger yields 6 finger region key points.
Preferably, the rotation angle of each finger region is estimated from the output of the second-layer convolutional neural network, each finger is corrected according to the estimated rotation angle, and the corrected images are collected as new training samples.
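The patent does not fix how the rotation angle is estimated or applied; a minimal sketch, assuming the angle is taken from the orientation of a located knuckle-segment and that keypoints are corrected with a plain 2D rotation (function names are illustrative, not from the patent):

```python
import math

def estimate_rotation(p1, p2):
    """Estimate a finger's in-plane rotation from the two end points of a
    knuckle line segment (an assumption; the patent leaves the estimator
    unspecified). Returns the segment's angle to the horizontal, in radians."""
    return math.atan2(p2[1] - p1[1], p2[0] - p1[0])

def rotate_point(p, angle, center=(0.0, 0.0)):
    """Rotate point p by `angle` radians about `center`; with the negated
    estimated angle this corrects a keypoint back to the upright pose."""
    x, y = p[0] - center[0], p[1] - center[1]
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + center[0], s * x + c * y + center[1])
```

In practice the whole finger image would be rotated (e.g. with an affine warp) rather than single points; the point form is shown only to make the correction explicit.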
Preferably, the key points in step S4 are the midpoint of the located joint line segment at the lower end of the finger's lower knuckle, and the point at the fingertip end farthest from that midpoint within the corresponding finger range; these two points serve as the 2 key points of the finger. The knuckles of a finger are defined, in sequence from the fingertip, as the upper knuckle, the middle knuckle and the lower knuckle.
Preferably, in step S4, each finger image in the output of the third-layer convolutional neural network is rotated back according to the rotation angle recorded in the image correction step, and the rotated finger images are combined into a finger region image and collected as a new training sample.
Preferably, the palm key points in step S5 are defined as GapB, GapC and GapD, respectively, GapB being the key point between the index finger and the middle finger, GapC being the key point between the middle finger and the ring finger, and GapD being the key point between the ring finger and the little finger.
Compared with the prior art, the invention has the following beneficial effects: the palm key point positioning method quickly and accurately locates the palm key points; by combining the fixed features of the finger joint lines with the self-learning advantages of a convolutional neural network for locating the relevant feature points, it avoids the variability of key point positioning that relies only on edge information and contour features, and the positioning is more accurate.
Drawings
FIG. 1 is a diagram of the convolutional neural network architecture of the present invention;
FIG. 2 is a schematic diagram of the positioning of the end points of the finger joint segments according to the present invention;
FIG. 3 is a schematic diagram of the positioning of 2 key points of the finger according to the present invention;
FIG. 4 is a schematic diagram of a palm key location network according to the present invention;
FIG. 5 is a schematic diagram of the positioning and marking of the palm key points according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
Referring to fig. 1-5, the present invention provides a technical solution: a palm key point positioning method based on a convolutional neural network. The method comprises 4 convolutional networks, each consisting of convolutional layers, pooling layers and a full-connection layer; the convolutional neural network performs convolution and pooling on the input image multiple times and finally outputs the palm image with the located key points through the full-connection layer. The implementation steps are as follows:
step S1, collecting palm images, marking key point information, inputting the key point information as a training sample set into a convolutional neural network, and training the network;
further, step S1 is to acquire a palm image through the palm image acquisition device, perform key point positioning and labeling on the acquired image, input the palm image labeled with key point information as a training sample image into the constructed convolutional neural network for training, and acquire a convolutional neural network model for positioning the palm key point.
Step S2, detecting a palm image by a first layer of the convolutional neural network, dividing the palm image into a finger area and a palm area, and collecting the finger area image as a data set;
further, in the first layer of the convolutional network described in step S2, the input palm image is divided into four finger regions, i.e., an index finger region, a middle finger region, a ring finger region and a little finger region, and the finger region image is intercepted as a data set for performing key point positioning.
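The patent does not specify how the four finger regions are intercepted; a minimal sketch, assuming the first-layer network outputs one axis-aligned bounding box per finger (the box format and the function name are assumptions for illustration):

```python
import numpy as np

def crop_finger_regions(palm_image, boxes):
    """Crop the four finger regions (index, middle, ring, little) from a
    palm image, given predicted axis-aligned bounding boxes.

    boxes: dict mapping finger name -> (top, left, bottom, right),
    assumed to come from the first-layer network's output.
    """
    regions = {}
    for finger, (top, left, bottom, right) in boxes.items():
        # numpy slicing intercepts the rectangular sub-image
        regions[finger] = palm_image[top:bottom, left:right]
    return regions

# Toy usage: a synthetic 8x8 "palm image" and made-up boxes.
palm = np.arange(64).reshape(8, 8)
boxes = {"index": (0, 0, 4, 2), "middle": (0, 2, 4, 4)}
regions = crop_finger_regions(palm, boxes)
```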
The convolutional neural network has a structure as shown in fig. 1, in the convolutional neural network, convolutional layers are mainly used for calculating a feature map, and pooling layers are mainly used for reducing the size of the feature map, and simultaneously, the rotation and translation characteristics of the feature map are maintained. And when the characteristic diagrams meet the designed size and layer number requirements, the two-dimensional characteristic diagrams are arranged in sequence and converted into one-dimensional characteristic vectors, and finally, the one-dimensional characteristic vectors are connected and output through the full-connection layer. The convolution layer operation can be expressed as:
$$X^{(l,k)} = f\Big(\sum_{p=1}^{n_{l-1}} W^{(l,k,p)} \otimes X^{(l-1,p)} + b^{(l,k)}\Big)$$
wherein $X^{(l,k)}$ represents the k-th group of feature maps output by the l-th layer, $n_l$ represents the number of feature-map groups of the l-th layer, and $W^{(l,k,p)}$ represents the filter required when the p-th group of feature maps in layer l-1 is mapped to the k-th group of feature maps in layer l. Generating each group of feature maps of the l-th layer requires $n_{l-1}$ filters and one offset.
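The convolutional layer operation described above can be sketched directly in numpy. This is a naive reference implementation of one layer, with ReLU assumed for the nonlinearity f (the patent leaves f unspecified):

```python
import numpy as np

def conv_layer(X_prev, W, b, f=lambda z: np.maximum(z, 0.0)):
    """One convolutional layer: X^(l,k) = f(sum_p W^(l,k,p) * X^(l-1,p) + b^(l,k)).

    X_prev: (n_prev, H, W)           feature maps of layer l-1
    W:      (n_cur, n_prev, kh, kw)  one filter per (output, input) pair
    b:      (n_cur,)                 one offset per output feature-map group
    f:      nonlinearity (ReLU assumed here)
    """
    n_cur, n_prev, kh, kw = W.shape
    H, Wd = X_prev.shape[1], X_prev.shape[2]
    out = np.zeros((n_cur, H - kh + 1, Wd - kw + 1))  # "valid" correlation
    for k in range(n_cur):
        acc = np.zeros((H - kh + 1, Wd - kw + 1))
        for p in range(n_prev):  # sum over the n_{l-1} input groups
            for i in range(H - kh + 1):
                for j in range(Wd - kw + 1):
                    acc[i, j] += np.sum(X_prev[p, i:i+kh, j:j+kw] * W[k, p])
        out[k] = f(acc + b[k])
    return out
```

A real network would use an optimized convolution routine; the triple loop is kept only to mirror the equation term by term.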
Typical pooling methods include maximum pooling, mean pooling, and the like, and the convolutional neural network in the present invention uses maximum pooling. The size of the feature image after the maximum value pooling is reduced to 1/step according to the step size step. The form of maximum pooling can be expressed as:
$$X^{(l+1,k)}(m,n) = \max_{0 \le i,\, j < s} X^{(l,k)}(m \cdot step + i,\; n \cdot step + j)$$
wherein $X^{(l+1,k)}(m,n)$ is the value at coordinates (m, n) of the k-th group of feature maps output by layer l+1, s is the size of the pooling kernel, and step is the stride of the pooling kernel; both s and step are set to 2 in the invention.
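The maximum pooling described above, with s = step = 2 as in the invention, can be sketched as:

```python
import numpy as np

def max_pool(X, s=2, step=2):
    """Max pooling: each output value is the maximum over an s x s window
    moved with stride `step` (both 2 in the invention), so each side of the
    feature map shrinks to 1/step of its size."""
    n, H, W = X.shape
    out_h, out_w = (H - s) // step + 1, (W - s) // step + 1
    out = np.zeros((n, out_h, out_w))
    for k in range(n):
        for m in range(out_h):
            for j in range(out_w):
                out[k, m, j] = X[k, m*step:m*step+s, j*step:j*step+s].max()
    return out

X = np.arange(16, dtype=float).reshape(1, 4, 4)
Y = max_pool(X)  # 4x4 feature map -> 2x2
```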
Step S3, the second layer carries out key point positioning on the finger area image data set collected by the first layer of convolutional neural network, positions 6 key points of each finger, and cuts out 4 finger images as a data set;
further, the second layer of the convolutional neural network performs key point positioning on the finger region images collected by the first layer, locating the two end points of the joint line segment at the lower end of each knuckle; with 3 such segments per finger, 6 end points in total can be located per finger, and the images of the four fingers are cut out as a data set according to the located end points and contour information.
As shown in fig. 2, the two end points of each knuckle line segment of the palm image are located. Taking the Index Finger as an example, the two end points of the joint line segment at the lower end of the upper knuckle (Tip Segment) are denoted TI1 (Tip Segment of Index finger 1) and TI2 (Tip Segment of Index finger 2); similarly, the two end points of the joint line segment at the lower end of the middle knuckle (Middle Segment) are denoted MI1 (Middle Segment of Index finger 1) and MI2 (Middle Segment of Index finger 2); the two end points of the joint line segment at the lower end of the lower knuckle (Base Segment) are denoted BI1 (Base Segment of Index finger 1) and BI2 (Base Segment of Index finger 2). In this way, each finger can be located with 6 key points.
S4, positioning, within the corresponding finger range, the midpoint of the joint line segment at the lower end of each finger's lower knuckle and the point at the fingertip end farthest from this midpoint, the two points serving as the 2 key points of the finger;
further, the third layer of the convolutional neural network locates the two end points of the joint line segment at the lower end of the lower knuckle of each of the four fingers in the finger region of the palm image, takes the midpoint of this segment as a key point, and locates the point at the fingertip end farthest from the midpoint within the corresponding finger range; the segment midpoint and the farthest fingertip-end point are the 2 key points of the finger.
As shown in fig. 3, a schematic diagram of positioning the 2 key points of a finger: take the joint line segment BI1-BI2 at the lower end of the lower knuckle of the index finger and its midpoint MIB (Middle point of Index finger Base knuckle); within the corresponding finger range, locate the point at the fingertip end farthest from the midpoint MIB, denoted TopI (Top point of Index finger); the midpoint MIB and the farthest point TopI are then the 2 located key points of the finger.
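The geometry of this step is simple to state in code. A minimal sketch, assuming the segment end points (e.g. BI1, BI2) and a set of candidate contour points within the finger's range are already available from the network:

```python
import math

def locate_finger_keypoints(b1, b2, contour):
    """Locate a finger's 2 key points: the midpoint of the lower-knuckle
    joint line segment (e.g. MIB for BI1-BI2) and the contour point
    farthest from that midpoint toward the fingertip (e.g. TopI).

    b1, b2:  the segment's two end points, (x, y)
    contour: candidate points within the finger's range (assumed given)
    """
    mid = ((b1[0] + b2[0]) / 2.0, (b1[1] + b2[1]) / 2.0)   # e.g. MIB
    tip = max(contour, key=lambda p: math.dist(p, mid))    # e.g. TopI
    return mid, tip
```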
And S5, the convolutional neural network connects the midpoints of the joint line segments at the lower ends of the lower knuckles of two adjacent fingers, takes the midpoint of each connecting line as a palm key point, and defines the 3 palm key points among the four fingers as GapB, GapC and GapD respectively.
Further, on the palm image obtained by the third layer, the fourth layer of the convolutional neural network connects the lower-knuckle segment midpoints of two adjacent fingers as the two ends of a connecting line and locates the midpoint of that line; this midpoint serves as a palm key point, i.e. the located finger gap point between the fingers, and the three such points among the four fingers are respectively marked GapB, GapC and GapD. As shown in fig. 3, the midpoint MIB of the joint line segment at the lower end of the lower knuckle of the index finger and the midpoint MMB (Middle point of Middle finger Base knuckle) of the joint line segment at the lower end of the lower knuckle of the middle finger can be connected as two endpoints to obtain the connecting line MIB-MMB; the midpoint of this line is located and marked GapB as a palm key point. As shown in fig. 5, a schematic diagram of the positioning and marking of the palm key points, the above technical method yields the 3 gap points among the four fingers, namely the palm key points GapB, GapC and GapD.
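Step S5 reduces to taking midpoints of midpoints. A minimal sketch, assuming the four lower-knuckle segment midpoints (such as MIB, MMB) are already located:

```python
def gap_keypoints(base_midpoints):
    """Compute the 3 palm key points GapB, GapC, GapD as the midpoints of
    the connecting lines joining the lower-knuckle segment midpoints of
    adjacent fingers (index-middle, middle-ring, ring-little), per step S5.

    base_midpoints: dict with keys 'index', 'middle', 'ring', 'little',
    each an (x, y) midpoint such as MIB or MMB.
    """
    def midpoint(a, b):
        return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

    return {
        "GapB": midpoint(base_midpoints["index"],  base_midpoints["middle"]),
        "GapC": midpoint(base_midpoints["middle"], base_midpoints["ring"]),
        "GapD": midpoint(base_midpoints["ring"],   base_midpoints["little"]),
    }
```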
The palm key point positioning method based on the convolutional neural network can obtain stable palm key points, is favorable for quickly and accurately obtaining a high-quality palm recognition area, and improves the system performance of a palm print or palm vein recognition technology.
The above description covers only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto. Any equivalent substitution or change that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention, according to its technical solutions and inventive concept, shall fall within the protection scope of the present invention.