CN109583331B - Deep learning-based method for accurately locating the wrist pulse point of a person - Google Patents

Publication number
CN109583331B
Authority
CN
China
Prior art keywords
color
hand
picture
training
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811356765.3A
Other languages
Chinese (zh)
Other versions
CN109583331A (en)
Inventor
路红
汪子健
甘中学
罗静静
印文杰
商慧亮
张文强
杨友钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201811356765.3A
Publication of CN109583331A
Application granted
Publication of CN109583331B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention belongs to the technical field of computer image processing, and in particular relates to a deep learning-based method for accurately locating the pulse point on a person's wrist. The basic steps are as follows. Hand pictures of multiple people are sampled, and the sample pictures are processed to obtain hand binary images and pulse-point coordinates, which serve as training data for a deep learning model. Supervised learning on the training set then yields a deep learning model that generalizes. A picture to be predicted is white-balanced, converted to the HSV color space, and color-clustered with the Mean Shift algorithm; a binary image is extracted according to skin color and further processed to obtain the hand contour. The processed binary image is fed into the pre-trained deep learning model to obtain the coordinates of the wrist pulse point in the image. The method can locate the wrist pulse point with high precision and provides real-time visual positioning for a robot performing traditional Chinese medicine pulse diagnosis.

Description

Deep learning-based method for accurately locating the wrist pulse point of a person
Technical Field
The invention belongs to the technical field of computer image processing, and particularly relates to a method for accurately positioning pulse points by deep learning.
Background
Pulse diagnosis is a method in which a physician palpates the patient's pulse with the fingers and diagnoses disease from the perceived pulse pattern. The pulse theory of traditional Chinese medicine is profound, yet the pulse-taking procedure itself is simple, which makes it a diagnostic method unique to traditional Chinese medicine. Traditional pulse-taking relies on the sense of touch of the physician's fingers. With the rapid development of science and technology, interdisciplinary work is becoming increasingly important, and the combination of traditional Chinese medicine and artificial intelligence is attracting growing attention.
A robot with artificial intelligence can also perform the four diagnostic methods of traditional Chinese medicine (inspection, listening and smelling, inquiry, and palpation) on a patient. For pulse diagnosis, the robot generally relies on a "hand-eye" system, namely a mechanical arm, a camera and a pressure sensor, so accurately locating the pulse point is very important.
As deep learning matures, the precision of deep learning-based object detection has far exceeded that of traditional methods. With hardware updates and algorithm optimization, detection is also becoming faster, and many algorithms now run in real time. Accurate positioning of the pulse point can therefore be accomplished entirely with deep learning.
Disclosure of Invention
The invention aims to provide a method capable of accurately locating the pulse point on a human wrist.
The method for accurately locating the wrist pulse point is based on deep learning. The basic steps are as follows. Hand pictures of multiple people are sampled, and the sample pictures are processed to obtain hand binary images and pulse-point coordinates, which serve as training data for a deep learning model. Supervised learning on the training set then yields a deep learning model that generalizes. A picture to be predicted is white-balanced, converted to the HSV color space, and color-clustered with the Mean Shift algorithm; a binary image is extracted according to skin color and further processed to obtain the hand contour. The processed binary image is fed into the pre-trained deep learning model to obtain the coordinates of the wrist pulse point in the image.
The invention provides a deep learning-based method for accurately locating the wrist pulse point of a person, with the following specific steps:
(1) Acquiring training data for deep learning;
(2) Training a deep learning model;
(3) Predicting the pulse position in an acquired hand picture using the trained model.
The training data acquisition process in step (1) is as follows:
(11) Hand pictures of a plurality of people are sampled, and the pictures are cropped and scaled to the input shape required by the deep learning model; no fewer than 20 pictures are taken per person, with the hand in a variety of poses, rotations and heights, and the whole hand and wrist must be contained in the picture; a marker whose color differs from skin color is attached at the wrist pulse point; the background color must also be distinguishable from both the skin color and the color of the marker card;
(12) A binary image of the hand is extracted. The picture is first white-balanced to restore it to a picture close to the true color temperature of the original scene; it is then converted to the HSV color space, where H denotes hue, S saturation, and V value (brightness);
(13) After conversion to the HSV color space, the picture is first blurred with a suitable Gaussian kernel to remove noise and then color-clustered with the Mean Shift algorithm; a binary image of the skin color is obtained by restricting the range in HSV color space; contour detection is performed on the binary image to eliminate possible interfering blobs, and the contour with the largest area is found and stored as the final hand contour;
(14) The contour of the marker, i.e. the position of the pulse point, is then extracted in the same way as the hand contour, except that the H range is changed to the color of the marker; after the marker contour is extracted, it is enclosed in a rectangular box, and the diagonal corner coordinates of the box are returned as the label for training.
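The colour-based part of steps (12) to (14) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the text does not name a white-balance algorithm, so gray-world balancing is assumed here, the Gaussian blur and Mean Shift clustering are omitted, and the helper names (`gray_world_white_balance`, `to_hsv`, `hsv_mask`, `bounding_box`) and the hue thresholds are hypothetical. Images are represented as lists of rows of 8-bit RGB tuples.

```python
import colorsys

def gray_world_white_balance(image):
    """Gray-world white balance (an assumed method): scale each RGB channel
    so that the three channel means become equal."""
    pixels = [px for row in image for px in row]
    means = [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [[tuple(min(255, round(px[c] * gains[c])) for c in range(3)) for px in row]
            for row in image]

def to_hsv(pixel):
    """Convert an 8-bit RGB pixel to (H in degrees, S in [0, 1], V in [0, 1])."""
    r, g, b = (v / 255.0 for v in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0, s, v

def hsv_mask(image, h_range, s_min=0.0, v_min=0.0):
    """Binary image: 1 where the pixel's HSV value lies inside the given range."""
    h_lo, h_hi = h_range
    mask = []
    for row in image:
        out = []
        for px in row:
            h, s, v = to_hsv(px)
            out.append(1 if h_lo <= h <= h_hi and s >= s_min and v >= v_min else 0)
        mask.append(out)
    return mask

def bounding_box(mask):
    """Diagonal corners (x_min, y_min, x_max, y_max) of the foreground, or None."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)
```

Calling `hsv_mask` once with a skin hue range and once with the marker's hue range mirrors the description: only the H range changes between the hand mask and the marker mask, and `bounding_box` of the marker mask gives the diagonal coordinates used as the training label.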
In step (2), the deep learning model is trained with a YOLOv3 model (Redmon J, Farhadi A. YOLOv3: An Incremental Improvement [J]. 2018.):
(21) YOLOv3 uses Darknet-53 as its backbone network;
(22) When the data set is small, data enhancement is applied (increasing the amount of data through rotation, cropping, scaling and color jitter); the acquired data are then divided into a training set, a test set and a validation set in a ratio of 7:2:1, and the model's preset hyperparameters are adjusted for training.
The main hyperparameters of the model are:
(221) Batch size: a larger batch generally makes the network converge more easily, but too large a batch can exhaust memory; when GPU memory is limited, it can be set to 4 or 8;
(222) Momentum: this value affects the speed at which gradient descent approaches the optimum;
(223) Weight decay: the larger the weight-decay regularization term, the stronger the suppression of overfitting;
(224) Learning rate: the learning rate determines how fast the weights are updated; if it is set too large, the result overshoots the optimum, and if it is too small, descent is too slow;
(225) Number of iterations (steps): the total number of training iterations, generally set to more than 50,000;
(23) Once the hyperparameters are set, training can begin. The training loss and validation loss are monitored during training: while both decrease, the predictive ability of the network is improving; when neither decreases any further, the network has converged and training can be ended;
(24) The hyperparameters (221) to (225) are adjusted several times (for example, the momentum within 0.9 to 0.99 and the weight decay within $10^{-5}$ to $10^{-3}$, with the learning rate initialized to $10^{-3}$ and then gradually decreased as the number of iterations increases) to obtain the model with the best generalization performance.
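The 7:2:1 division in step (22) can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say whether the data are shuffled or how remainders are handled, so the shuffle, the fixed seed, and the function name `split_dataset` are hypothetical choices.

```python
import random

def split_dataset(samples, ratios=(7, 2, 1), seed=0):
    """Shuffle the samples and split them into (training, test, validation)
    subsets in the given ratio; any remainder goes to the validation subset."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_test = len(items) * ratios[1] // total
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]
    return train, test, val
```

With 100 labelled pictures this yields 70 training, 20 test and 10 validation samples; augmented copies of one picture would normally be kept in the same subset to avoid leakage, which the sketch does not enforce.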
In step (3), the trained model is applied to new data:
(31) The picture in which the pulse position is to be predicted is first preprocessed: it is white-balanced, converted to the HSV color space, color-clustered with the mean shift algorithm, and thresholded on skin color to obtain a binary image, from which the contour with the largest area is extracted as the hand contour;
(32) The binary image of the hand is fed into the model as input, and the resulting prediction is the final pulse position.
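As a rough illustration of the "largest-area contour" step in (31), the sketch below keeps the largest 4-connected foreground region of a binary image. Connected-component search is used here as a simple stand-in for contour detection, which is an assumption rather than the patent's stated procedure. The helper `box_center`, which reduces a predicted rectangle to a single point, is likewise hypothetical: the text only says that the prediction is the pulse position.

```python
from collections import deque

def largest_region(mask):
    """Return a binary image containing only the largest 4-connected
    foreground region of a non-empty binary image (list of rows of 0/1)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                region = []
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:  # breadth-first flood fill of one region
                    cy, cx = queue.popleft()
                    region.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(region) > len(best):
                    best = region
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out

def box_center(box):
    """Center (x, y) of a box given by its diagonal corners (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0
```

Small skin-coloured blobs in the background are discarded this way, so only the hand silhouette reaches the model.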
In step (13), the color clustering uses the mean shift algorithm, with the following specific flow:
(131) An unlabeled pixel in the picture is selected at random as the center;
(132) All pixels within a given radius of the center are found and recorded as a set M; these points are considered to belong to cluster c, and the probability that each point in M belongs to this cluster is set to 1;
(133) The vectors from the center to each element of M are computed and summed to obtain the drift vector, according to the formula:
[Formula image not reproduced in the source text]
(134) The center is updated by adding the drift vector, i.e. it moves along the direction of the drift vector by a distance equal to the magnitude of the drift vector;
(135) Steps (132), (133) and (134) are repeated until the iteration converges, and the center at that moment is recorded; all points encountered during the iteration are assigned to cluster c;
(136) If, at convergence, the distance between the center of the current cluster c and the center of an existing cluster is smaller than a threshold, the two clusters are merged; otherwise c is kept as a new cluster;
(137) Steps (131) to (136) are repeated until every point has been visited and labeled.
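Steps (131) to (137) can be sketched in pure Python as below. Because the drift-vector formula is only available as an image, the sketch assumes the standard flat-kernel form in which the summed offsets are divided by the number of points in M, so that the center moves to the centroid of M. Assigning each point to the cluster that visited it most often is also an assumption, as are the function name `mean_shift` and its parameters. Points are tuples of any dimension (for example HSV triples).

```python
import math
import random

def mean_shift(points, radius, merge_threshold, tol=1e-3, max_iter=100, seed=0):
    """Flat-kernel mean shift. Returns (cluster centers, label per point)."""
    rng = random.Random(seed)
    dim = len(points[0])
    centers = []   # converged cluster centers
    votes = []     # votes[k][i]: how often point i fell inside the window of cluster k
    visited = [False] * len(points)
    while not all(visited):
        # (131) pick a random unvisited point as the initial center
        start = rng.choice([i for i, v in enumerate(visited) if not v])
        center = list(points[start])
        count = [0] * len(points)
        for _ in range(max_iter):
            # (132) set M: all points within the radius of the current center
            members = [i for i, p in enumerate(points) if math.dist(p, center) <= radius]
            for i in members:
                visited[i] = True
                count[i] += 1
            # (133) drift vector: mean offset from the center to the points in M (assumed form)
            shift = [sum(points[i][d] - center[d] for i in members) / len(members)
                     for d in range(dim)]
            # (134) move the center along the drift vector
            center = [c + s for c, s in zip(center, shift)]
            # (135) stop once the center no longer moves
            if math.hypot(*shift) < tol:
                break
        # (136) merge with an existing cluster if the centers are close enough
        for k, existing in enumerate(centers):
            if math.dist(existing, center) < merge_threshold:
                votes[k] = [a + b for a, b in zip(votes[k], count)]
                break
        else:
            centers.append(center)
            votes.append(count)
    labels = [max(range(len(centers)), key=lambda k: votes[k][i])
              for i in range(len(points))]
    return centers, labels
```

On a real picture this per-pixel loop would be far too slow; an optimized library routine would be used instead, and the sketch only makes the control flow of the listed steps concrete.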
Compared with the prior art, the invention has the beneficial effects that:
1. The user only needs to provide a picture of the hand for the position of the pulse point to be found accurately;
2. It provides a visual positioning method for robots performing traditional Chinese medicine pulse diagnosis.
Drawings
Fig. 1 is a general flow diagram of the present invention.
Fig. 2 illustrates the white balance and conversion to HSV color space of step (12) in Fig. 1: (a) the white-balanced picture; (b) the picture converted to HSV color space.
Fig. 3 illustrates step (13) in Fig. 1, in which the HSV picture is color-clustered with the mean shift algorithm and the hand binary image is then extracted according to skin color: (a) the HSV picture after blur denoising and mean shift color clustering; (b) the binary image extracted according to skin color; (c) the hand binary image obtained by keeping the contour with the largest area.
Fig. 4 illustrates step (14) in Fig. 1, in which a binary image is extracted according to the color of the card and the minimum upright bounding rectangle of the card contour, i.e. the position of the pulse point, is obtained: (a) the binary image extracted according to the card color from the color-clustered HSV image; (b) the card contour obtained by screening the contours; (c) the minimum upright bounding rectangle of the card contour.
Fig. 5 illustrates the preprocessing of the picture to be predicted, as described in step (31) in Fig. 1, before it is fed into the pre-trained model to obtain the predicted pulse coordinates: (a) the white-balanced picture; (b) the picture converted to HSV color space; (c) the HSV picture after blur denoising and mean shift color clustering; (d) the binary image obtained according to skin color; (e) the hand binary image obtained by keeping the contour with the largest area.
Fig. 6 shows the pulse coordinates predicted by regression when the processed hand binary image is fed into the pre-trained model, as described in step (31) in Fig. 1; the predicted pulse position is marked.
Fig. 7 is a diagram of various gestures of a hand at the time of data sampling.
Fig. 8 shows how the validation loss and the training loss change during training.
Fig. 9 is a graphical representation of a result test.
Detailed Description
1. Data acquisition and data enhancement
Hand pictures of a plurality of people are sampled, with no fewer than 20 samples per person. The hand takes a variety of poses, with different rotations and heights, and the whole hand and wrist must be contained in each picture (see Fig. 7). When the number of samples is insufficient, data enhancement can be applied: the samples are flipped, rotated, color-jittered, scaled and so on, to improve the generalization ability of the network.
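When a picture is flipped or rotated, the rectangle label of the pulse point has to be transformed with it. The following sketch shows this for a horizontal flip and a 90-degree clockwise rotation, plus a simple brightness scaling as one possible form of color jitter. The function names and the choice of these particular transforms are illustrative assumptions; images are lists of rows, and boxes are diagonal corners (x1, y1, x2, y2) in pixel indices.

```python
def hflip(image, box):
    """Mirror the image left-right and mirror the box accordingly."""
    width = len(image[0])
    flipped = [list(reversed(row)) for row in image]
    x1, y1, x2, y2 = box
    return flipped, (width - 1 - x2, y1, width - 1 - x1, y2)

def rotate90_clockwise(image, box):
    """Rotate the image 90 degrees clockwise; pixel (x, y) moves to (height-1-y, x)."""
    height = len(image)
    rotated = [list(row) for row in zip(*image[::-1])]
    x1, y1, x2, y2 = box
    return rotated, (height - 1 - y2, x1, height - 1 - y1, x2)

def scale_brightness(image, factor):
    """Multiply every RGB channel by a factor, clamped to the 8-bit range."""
    return [[tuple(min(255, round(c * factor)) for c in px) for px in row]
            for row in image]
```

Applying a few such transforms to each sample multiplies the size of a small data set while keeping the labels consistent with the pictures.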
2. Parameter setting
The parameters are adjusted over different training runs to obtain the model with the best prediction performance. The batch size is set to 4, the momentum to 0.95, the number of iterations to 100,000, the weight decay to $2\times10^{-4}$, and the initial learning rate to $10^{-3}$.
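For reference, the settings quoted above can be collected in a small configuration, together with one possible learning-rate schedule. The patent only states that the learning rate starts at $10^{-3}$ and gradually decreases with the iteration count; the step-decay form, the milestones at 80,000 and 90,000 iterations, and the factor 0.1 below are hypothetical values chosen for illustration.

```python
# Values taken from the parameter settings in the description.
HYPERPARAMS = {
    "batch_size": 4,
    "momentum": 0.95,
    "weight_decay": 2e-4,
    "initial_learning_rate": 1e-3,
    "max_iterations": 100_000,
}

def learning_rate(step, base_lr=1e-3, milestones=(80_000, 90_000), gamma=0.1):
    """Step decay (assumed schedule): multiply the rate by gamma at each milestone."""
    lr = base_lr
    for milestone in milestones:
        if step >= milestone:
            lr *= gamma
    return lr
```

Under these assumptions the rate stays at 0.001 for the first 80,000 iterations, then drops to about 0.0001 and finally to about 0.00001.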
3. Training process
Referring to Fig. 8, the ordinate is the loss value and the abscissa is the number of iterations; the upper curve (blue) is the validation loss and the lower curve (orange) is the training loss. A simultaneous decrease of both indicates that the predictive ability of the network is improving; when neither decreases any further, the network has converged and training can be ended. The loss hardly decreases once the number of iterations exceeds 50,000.
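The stopping rule described here, ending training once neither loss decreases any further, can be approximated by comparing moving averages of the recorded losses. The window size, the threshold `min_delta` and the function name `has_converged` are hypothetical; the patent does not specify a numerical criterion.

```python
def has_converged(train_losses, val_losses, window=5, min_delta=1e-3):
    """True when both loss curves have plateaued: the mean of the last `window`
    values improved on the mean of the preceding `window` values by less than
    `min_delta`."""
    def plateaued(losses):
        if len(losses) < 2 * window:
            return False  # not enough history to judge
        recent = sum(losses[-window:]) / window
        previous = sum(losses[-2 * window:-window]) / window
        return previous - recent < min_delta
    return plateaued(train_losses) and plateaued(val_losses)
```

Checking both curves matters: a training loss that keeps falling while the validation loss has flattened points to overfitting rather than convergence, and this sketch would keep training in that case, so a separate check on the validation loss alone may be preferable in practice.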
4. Results testing
Referring to Fig. 9, after 50,000 training iterations the predicted result is already close to the true value.

Claims (3)

1. A deep learning-based method for accurately locating the wrist pulse point of a person, characterized by comprising the following specific steps:
(1) Acquiring training data for deep learning;
(2) Training a deep learning model;
(3) Predicting the pulse position in an acquired hand picture using the trained model;
the training data acquisition process in step (1) is as follows:
(11) Hand pictures of a plurality of people are sampled, and the pictures are cropped and scaled to the input shape required by the deep learning model; no fewer than 20 pictures are taken per person, with the hand in a variety of poses, rotations and heights, and the whole hand and wrist must be contained in the picture; a marker whose color differs from skin color is attached at the wrist pulse point; the background color must also be distinguishable from both the skin color and the color of the marker card;
(12) A binary image of the hand is extracted; the picture is first white-balanced to restore it to a picture close to the true color temperature of the original scene; it is then converted to the HSV color space, where H denotes hue, S saturation, and V value (brightness);
(13) After conversion to the HSV color space, the picture is first blurred with a suitable Gaussian kernel to remove noise and then color-clustered with the Mean Shift algorithm, and a binary image of the skin color is obtained by restricting the range in HSV color space; contour detection is performed on the binary image, and the contour with the largest area is found and stored as the final hand contour;
(14) The contour of the marker, i.e. the position of the pulse point, is then extracted in the same way as the hand contour, except that the H range is changed to the color of the marker; after the marker contour is extracted, it is enclosed in a rectangular box, and the diagonal corner coordinates of the box are returned as the label for training;
in the training of the deep learning model in step (2), a YOLOv3 model is used to train on the data, and the process is as follows:
(21) YOLOv3 uses Darknet-53;
(22) When the amount of data is small, data enhancement is applied, including increasing the amount of data through rotation, cropping, scaling and color jitter; the acquired data are divided into a training set, a test set and a validation set in a ratio of 7:2:1, and the model's preset hyperparameters are adjusted for training;
(23) Once the hyperparameters are set, training begins, and the training loss and validation loss are monitored during training; while both decrease, the predictive ability of the network is improving, and when neither decreases any further, the network has converged and training is ended;
(24) The hyperparameters are adjusted several times to obtain the model with the best generalization performance;
in step (3), the pulse position in the acquired hand picture is predicted using the trained model, with the following specific flow:
(31) The picture in which the pulse position is to be predicted is first preprocessed: it is white-balanced, converted to the HSV color space, color-clustered with the mean shift algorithm, and thresholded on skin color to obtain a binary image, from which the contour with the largest area is extracted as the hand contour;
(32) The binary image of the hand is fed into the model as input, and the resulting prediction is the final pulse position.
2. The deep learning-based method for accurately locating the wrist pulse point of a person according to claim 1, wherein in step (31) the color clustering uses the mean shift algorithm, with the following specific flow:
(131) An unlabeled pixel in the picture is selected at random as the center;
(132) All pixels within a given radius of the center are found and recorded as a set M; these points are considered to belong to cluster c, and the probability that each point in M belongs to this cluster is set to 1;
(133) The vectors from the center to each element of M are computed and summed to obtain the drift vector, according to the formula:
[Formula image not reproduced in the source text]
(134) The center is updated by adding the drift vector, i.e. it moves along the direction of the drift vector by a distance equal to the magnitude of the drift vector;
(135) Steps (132), (133) and (134) are repeated until the iteration converges, and the center at that moment is recorded; all points encountered during the iteration are assigned to cluster c;
(136) If, at convergence, the distance between the center of the current cluster c and the center of an existing cluster is smaller than a threshold, the two clusters are merged; otherwise c is kept as a new cluster;
(137) Steps (131) to (136) are repeated until every point has been visited and labeled.
3. The deep learning-based method for accurately locating the wrist pulse point of a person according to claim 2, wherein the hyperparameters preset by the model in step (2) are:
(221) Batch size: set to 4 or 8;
(222) Momentum: this value affects the speed at which gradient descent approaches the optimum;
(223) Weight decay: the larger the weight-decay regularization term, the stronger the suppression of overfitting;
(224) Learning rate: the learning rate determines how fast the weights are updated;
(225) Number of iterations: the total number of training iterations is set to more than 50,000;
adjusting the hyperparameters comprises: adjusting the momentum within the range 0.9 to 0.99; adjusting the weight decay within $10^{-5}$ to $10^{-3}$; and initializing the learning rate to $10^{-3}$ and then gradually decreasing it as the number of iterations increases.
CN201811356765.3A 2018-11-15 2018-11-15 Deep learning-based method for accurately locating the wrist pulse point of a person Active CN109583331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811356765.3A CN109583331B (en) 2018-11-15 2018-11-15 Deep learning-based method for accurately locating the wrist pulse point of a person

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811356765.3A CN109583331B (en) 2018-11-15 2018-11-15 Deep learning-based method for accurately locating the wrist pulse point of a person

Publications (2)

Publication Number Publication Date
CN109583331A CN109583331A (en) 2019-04-05
CN109583331B true CN109583331B (en) 2023-05-02

Family

ID=65922636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811356765.3A Active CN109583331B (en) 2018-11-15 2018-11-15 Deep learning-based method for accurately locating the wrist pulse point of a person

Country Status (1)

Country Link
CN (1) CN109583331B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349224B (en) * 2019-06-14 2022-01-25 众安信息技术服务有限公司 Tooth color value judgment method and system based on deep learning
CN112336318B (en) * 2019-08-09 2022-02-18 复旦大学 Pulse position accurate positioning method for self-adaptive multi-mode fusion
CN110895818B (en) * 2019-10-16 2023-07-18 南京大学 Deep learning-based knee joint contour feature extraction method and device
CN112263217B (en) * 2020-08-27 2023-07-18 上海大学 Improved convolutional neural network-based non-melanoma skin cancer pathological image lesion area detection method
CN113892919B (en) * 2021-12-09 2022-04-22 季华实验室 Pulse feeling data acquisition method and device, electronic equipment and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017092431A1 (en) * 2015-12-01 2017-06-08 乐视控股(北京)有限公司 Human hand detection method and device based on skin colour
CN107403200A (en) * 2017-08-10 2017-11-28 北京亚鸿世纪科技发展有限公司 Improve the multiple imperfect picture sorting technique of image segmentation algorithm combination deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
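Liu Ji (刘吉); Sun Rencheng (孙仁诚); Qiao Songlin (乔松林). Research on the application of deep learning in medical image recognition. Journal of Qingdao University (Natural Science Edition). 2018, (No. 01), full text. *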

Also Published As

Publication number Publication date
CN109583331A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109583331B (en) Deep learning-based method for accurately locating the wrist pulse point of a person
CN107316307B (en) Automatic segmentation method of traditional Chinese medicine tongue image based on deep convolutional neural network
Fan et al. A hierarchical image matting model for blood vessel segmentation in fundus images
US10489909B2 (en) Method of automatically detecting microaneurysm based on multi-sieving convolutional neural network
CN110490850B (en) Lump region detection method and device and medical image processing equipment
CN109215013B (en) Automatic bone age prediction method, system, computer device and storage medium
Ali et al. Optic disk and cup segmentation through fuzzy broad learning system for glaucoma screening
CN109376636A (en) Eye ground image classification method based on capsule network
CN111340819A (en) Image segmentation method, device and storage medium
CN105426929B (en) Object shapes alignment device, object handles devices and methods therefor
Wu et al. U-GAN: Generative adversarial networks with U-Net for retinal vessel segmentation
Naga Srinivasu et al. A comparative review of optimisation techniques in segmentation of brain MR images
CN109685037A (en) A kind of real-time action recognition methods, device and electronic equipment
CN110503063A (en) Fall detection method based on hourglass convolution autocoding neural network
CN109034012A (en) First person gesture identification method based on dynamic image and video sequence
CN112465905A (en) Characteristic brain region positioning method of magnetic resonance imaging data based on deep learning
Wu et al. DA-U-Net: densely connected convolutional networks and decoder with attention gate for retinal vessel segmentation
CN114822823B (en) Tumor fine classification system based on cloud computing and artificial intelligence fusion multi-dimensional medical data
Niu et al. Automatic localization of optic disc based on deep learning in fundus images
Ninh et al. Skin lesion segmentation based on modification of SegNet neural networks
CN110472673B (en) Parameter adjustment method, fundus image processing device, fundus image processing medium and fundus image processing apparatus
CN115346272A (en) Real-time tumble detection method based on depth image sequence
Revathi et al. Particle Swarm Optimization based Detection of Diabetic Retinopathy using a Novel Deep CNN
CN112419335B (en) Shape loss calculation method of cell nucleus segmentation network
CN110163103A (en) A kind of live pig Activity recognition method and apparatus based on video image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant