CN111339941A - Head posture detection method

Head posture detection method

Info

Publication number
CN111339941A
Authority
CN
China
Prior art keywords
axis
head
detection method
deep learning
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010119229.2A
Other languages
Chinese (zh)
Inventor
Lin Shiran (林士然)
Jiang Lei (蒋磊)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Lingtu Intelligent Technology Co ltd
Original Assignee
Suzhou Lingtu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Lingtu Intelligent Technology Co ltd
Priority to CN202010119229.2A
Publication of CN111339941A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/166 - Detection; Localisation; Normalisation using acquisition arrangements
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/24 - Aligning, centring, orientation detection or correction of the image
    • G06V10/242 - Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a head posture detection method comprising the following steps: (a) selecting a data set; (b) preprocessing the face pictures in the data set and resizing them to obtain pictures of a set size; (c) constructing a deep learning model with MobileNetV2 as the backbone; (d) feeding the pictures of the set size into the neural network for classification; (e) applying softmax to the fully-connected layer outputs to map them to probability values; (f) mapping the probability values to a regression value and computing the regression loss with an MSE loss function; (g) computing a weighted sum of the losses and performing gradient descent on the final loss to complete the training of the deep learning model; (h) testing the child's head with the trained deep learning model. The method detects quickly and runs in real time.

Description

Head posture detection method
Technical Field
The invention relates to a head posture detection method, in particular to a method that uses a deep-learning model trained with computer-vision techniques to detect the head posture of children with mental disorders.
Background
Head pose conveys rich information: for example, people point with their head to indicate whom they are addressing and what they intend. In conversation, head direction is a non-verbal cue that signals to the listener when to take a turn and begin speaking; in such exchanges, head pose direction plays a role as important as gesture.
For children with autism, hyperactivity or tic disorders, head orientation reflects what the child is attending to in the current environment, which helps the therapist or doctor understand the child's thoughts. Several head-pose detection methods exist today. Early work used detector arrays: a large number of head detectors are trained, each tuned to a particular pose, discrete poses are assigned to the detectors, and the head pose is predicted accordingly. Later work used nonlinear regression or random-forest algorithms from machine learning. Some recent algorithms extract facial key points and predict head pose through deep-learning training.
However, the above methods share a drawback: they depend heavily on the environment. If the background changes substantially or the subjects' ages differ widely (people with mental disorders such as autism, hyperactivity or tic disorders are usually children, and detecting a child's head pose differs slightly from detecting an adult's), the detection results are prone to inaccuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a head posture detection method suitable for children with mental disorders such as autism, hyperactivity or tic disorder.
To achieve this aim, the invention adopts the following technical scheme: a head posture detection method comprising the following steps:
(a) selecting a data set;
(b) preprocessing the face pictures in the data set, using a multi-task cascaded convolutional network to detect and crop the faces in the pictures, and then resizing the crops to obtain pictures of a set size;
(c) constructing a neural network for the deep learning model, with MobileNetV2 as the backbone connected to three fully-connected layers;
(d) feeding the pictures of the set size into the neural network for classification;
(e) applying softmax to the fully-connected layer outputs to map them to probability values;
(f) mapping the probability values to a regression value and computing the regression loss with the MSE loss function;
(g) computing a weighted sum of the losses and performing gradient descent on the final loss to complete the training of the deep learning model;
(h) taking the nose as the base point, the horizontal direction is set as the x-axis, the vertical direction as the y-axis, and the z-axis is perpendicular to the plane formed by the x-axis and the y-axis; the angles of clockwise rotation about the x-, y- and z-axes are defined as the offset angles of the head posture in the pitch, yaw and roll directions, and the deep learning model is used to test the child's head to obtain the posture of the child's head.
Preferably, in step (a), the data sets are the BIWI, 300W-LP and AFLW2000 data sets.
Preferably, in step (b), the preprocessing excludes unneeded background or other objects from the face pictures.
Further, in step (b), the multi-task cascaded convolutional network is composed of three cascaded lightweight CNNs: PNet, RNet and ONet.
Preferably, in step (d), the classification results are mapped into a set range.
Because of the above technical scheme, the invention has the following advantages over the prior art: the head posture detection method uses the deep learning model to compute a composite loss over the three angles, so detection is fast and runs in real time; it has a unified evaluation standard and high accuracy; it saves the time therapists or doctors spend observing the child, letting them devote more attention to other aspects of treatment; and the data can be stored and displayed in the form of video.
Drawings
FIG. 1 is a flow chart of an MSE loss function in the head pose detection method of the present invention;
FIG. 2 is a diagram illustrating a first effect of the head pose detection method according to the present invention;
FIG. 3 is a diagram illustrating a second effect of the head pose detection method according to the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
The head posture detection method comprises the following steps:
(a) selecting a data set; in this embodiment, the data sets are primarily BIWI, 300W-LP and AFLW2000 (i.e., the model is primarily trained and tested on the BIWI, 300W-LP and AFLW2000 data sets). The BIWI corpus, released in 2010, contains over 1,000 sequences of high-quality dynamic 3D face scans acquired at 25 frames per second with a real-time 3D scanner, together with audio recorded by professional microphones. 300W-LP is a 3D data set synthesised from the 300W data set with the 3DMM model; it is the most widely used synthetic data set in the 3D field and includes labels for 68 key points, camera parameters and 3DMM model coefficients. AFLW is a large-scale face database covering multi-pose, multi-view faces, typically used to evaluate facial key-point detection; its pictures were crawled from Flickr, totalling 21,997 pictures and 25,993 faces, with 21 key points labelled per face (about 380,000 key points in total).
(b) preprocessing the face pictures in the data set: MTCNN, a multi-task cascaded convolutional network (a classical and fast face detection technique composed of three cascaded lightweight CNNs: PNet, RNet and ONet), is used to detect and crop the faces, excluding unneeded background and other objects so that no overfitting data appear during training; the crops are then resized to the set size (a preprocessing sketch is given after step (h) below);
(c) constructing a neural network for the deep learning model with MobileNetV2 as the backbone connected to three fully-connected layers (that is, the deep network uses MobileNetV2 as its basic backbone and attaches three fully-connected heads, each making an independent prediction; see the model sketch after step (h));
(d) the pictures of the set size are fed into the neural network for classification, and the classification results are then mapped into a set range, which greatly improves the accuracy of the method (this step yields the classification loss);
(e) applying softmax to the fully-connected layer outputs to map them to probability values;
(f) the probability values are mapped to obtain a regression value (that is, a continuous angle is recovered from the probability distribution), and the regression loss is computed with the MSE loss function (as shown in FIG. 1; MSE, the mean squared error commonly used as a loss function in machine learning, is the expected value of the squared difference between an estimate and the true value, MSE = (1/n) Σ (estimate − true value)²);
(g) the losses are combined by weighted summation and gradient descent is performed on the final loss to complete the training of the deep learning model (see the loss sketch after step (h));
(h) taking the nose as the base point, the horizontal direction is set as the x-axis, the vertical direction as the y-axis, and the z-axis is perpendicular to the plane formed by the x-axis and the y-axis; the angles of clockwise rotation about the x-, y- and z-axes are defined as the offset angles of the head posture in the pitch, yaw and roll directions, and the deep learning model is used to test the child's head to obtain the posture of the child's head (for the specific application see FIG. 2 and FIG. 3; an axis-projection sketch is given below).
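As an illustration of step (b), the following is a minimal preprocessing sketch. The patent names MTCNN (PNet/RNet/ONet) but no implementation; the facenet-pytorch MTCNN wrapper and the 224x224 target size used here are assumptions.

    # Preprocessing sketch for step (b): MTCNN detects the face, crops it,
    # and resizes the crop to a set size in one call. The library choice
    # (facenet-pytorch) and the 224x224 size are assumptions.
    from PIL import Image
    from facenet_pytorch import MTCNN

    mtcnn = MTCNN(image_size=224, margin=20)  # detector built from the PNet/RNet/ONet cascade

    def preprocess(path):
        img = Image.open(path).convert("RGB")
        face = mtcnn(img)  # tensor of shape (3, 224, 224), or None if no face is found
        return face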
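The network of step (c) might look as follows in PyTorch. The 66-bin discretisation (3-degree bins covering roughly plus or minus 99 degrees) is an assumption borrowed from common head-pose practice, not something the patent states.

    # Model sketch for step (c): a MobileNetV2 backbone feeding three
    # independent fully-connected heads, one per angle (yaw, pitch, roll).
    import torch
    import torch.nn as nn
    from torchvision import models

    class HeadPoseNet(nn.Module):
        def __init__(self, num_bins=66):
            super().__init__()
            backbone = models.mobilenet_v2(weights="DEFAULT")
            self.features = backbone.features          # MobileNetV2 feature extractor
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc_yaw = nn.Linear(1280, num_bins)    # each head predicts independently
            self.fc_pitch = nn.Linear(1280, num_bins)
            self.fc_roll = nn.Linear(1280, num_bins)

        def forward(self, x):
            x = self.pool(self.features(x)).flatten(1)  # (N, 1280)
            return self.fc_yaw(x), self.fc_pitch(x), self.fc_roll(x)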
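Steps (d) to (g) describe a composite classification-plus-regression loss; below is a sketch under the same binning assumption as above. The bin width, angle range and the weight alpha are illustrative choices, not values from the patent.

    # Loss sketch for steps (d)-(g): cross-entropy over angle bins gives the
    # classification loss (d); softmax maps the head outputs to probabilities
    # (e); the expectation over bin indices recovers a continuous angle for
    # the MSE regression loss (f); the losses are summed with weights and
    # gradient descent is run on the total (g).
    import torch
    import torch.nn.functional as F

    def angle_loss(logits, bin_label, cont_label, alpha=0.5):
        cls_loss = F.cross_entropy(logits, bin_label)          # classification loss
        probs = F.softmax(logits, dim=1)                       # per-bin probabilities
        idx = torch.arange(logits.size(1), dtype=torch.float32,
                           device=logits.device)
        pred = torch.sum(probs * idx, dim=1) * 3 - 99          # bin index -> degrees
        reg_loss = F.mse_loss(pred, cont_label)                # MSE regression loss
        return cls_loss + alpha * reg_loss                     # weighted sum

    # One training step: sum the three angle losses and backpropagate.
    # yaw_logits, pitch_logits, roll_logits = model(images)
    # loss = (angle_loss(yaw_logits, yaw_bin, yaw_deg)
    #         + angle_loss(pitch_logits, pitch_bin, pitch_deg)
    #         + angle_loss(roll_logits, roll_bin, roll_deg))
    # loss.backward(); optimizer.step()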
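For step (h), the predicted pitch, yaw and roll offsets can be projected back onto the image, with the nose as base point, to draw the three pose axes as in the effect diagrams of FIG. 2 and FIG. 3. The projection below follows the usual Euler-angle convention; the axis length and sign handling are assumptions.

    # Visualisation sketch for step (h): endpoints of the x-, y- and z-axes
    # (nose as base point) after rotating by pitch, yaw and roll, projected
    # onto the image plane.
    import numpy as np

    def axis_endpoints(pitch, yaw, roll, nose_x, nose_y, size=100.0):
        p = np.radians(pitch)
        y = -np.radians(yaw)
        r = np.radians(roll)
        # x-axis (horizontal direction)
        x1 = size * (np.cos(y) * np.cos(r)) + nose_x
        y1 = size * (np.cos(p) * np.sin(r) + np.cos(r) * np.sin(p) * np.sin(y)) + nose_y
        # y-axis (vertical direction)
        x2 = size * (-np.cos(y) * np.sin(r)) + nose_x
        y2 = size * (np.cos(p) * np.cos(r) - np.sin(p) * np.sin(y) * np.sin(r)) + nose_y
        # z-axis (out of the image plane)
        x3 = size * np.sin(y) + nose_x
        y3 = size * (-np.cos(y) * np.sin(p)) + nose_y
        return (x1, y1), (x2, y2), (x3, y3)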
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (5)

1. A head posture detection method, characterized by comprising the following steps:
(a) selecting a data set;
(b) preprocessing the face pictures in the data set, using a multi-task cascaded convolutional network to detect and crop the faces in the pictures, and then resizing the crops to obtain pictures of a set size;
(c) constructing a neural network for the deep learning model, with MobileNetV2 as the backbone connected to three fully-connected layers;
(d) feeding the pictures of the set size into the neural network for classification;
(e) applying softmax to the fully-connected layer outputs to map them to probability values;
(f) mapping the probability values to a regression value and computing the regression loss with the MSE loss function;
(g) computing a weighted sum of the losses and performing gradient descent on the final loss to complete the training of the deep learning model;
(h) taking the nose as the base point, the horizontal direction is set as the x-axis, the vertical direction as the y-axis, and the z-axis is perpendicular to the plane formed by the x-axis and the y-axis; the angles of clockwise rotation about the x-, y- and z-axes are defined as the offset angles of the head posture in the pitch, yaw and roll directions, and the deep learning model is used to test the child's head to obtain the posture of the child's head.
2. The head posture detection method according to claim 1, characterized in that: in step (a), the data sets are the BIWI, 300W-LP and AFLW2000 data sets.
3. The head posture detection method according to claim 1, characterized in that: in step (b), the preprocessing excludes unneeded background or other objects from the face pictures.
4. The head posture detection method according to claim 1 or 3, characterized in that: in step (b), the multi-task cascaded convolutional network is composed of three cascaded lightweight CNNs: PNet, RNet and ONet.
5. The head posture detection method according to claim 1, characterized in that: in step (d), the classification results are mapped into a set range.
Application CN202010119229.2A, filed 2020-02-26 (priority date 2020-02-26): Head posture detection method. Published as CN111339941A; status pending.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010119229.2A CN111339941A (en) 2020-02-26 2020-02-26 Head posture detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010119229.2A CN111339941A (en) 2020-02-26 2020-02-26 Head posture detection method

Publications (1)

Publication Number Publication Date
CN111339941A true CN111339941A (en) 2020-06-26

Family

ID=71183659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010119229.2A Pending CN111339941A (en) 2020-02-26 2020-02-26 Head posture detection method

Country Status (1)

Country Link
CN (1) CN111339941A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241761A (en) * 2020-10-15 2021-01-19 北京字跳网络技术有限公司 Model training method and device and electronic equipment
CN112634363A (en) * 2020-12-10 2021-04-09 上海零眸智能科技有限公司 Shelf attitude estimation method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748858A (en) * 2017-06-15 2018-03-02 华南理工大学 A kind of multi-pose eye locating method based on concatenated convolutional neutral net

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748858A (en) * 2017-06-15 2018-03-02 华南理工大学 A kind of multi-pose eye locating method based on concatenated convolutional neutral net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOJIE GUO et al.: "PFLD: A Practical Facial Landmark Detector" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241761A (en) * 2020-10-15 2021-01-19 北京字跳网络技术有限公司 Model training method and device and electronic equipment
CN112241761B (en) * 2020-10-15 2024-03-26 北京字跳网络技术有限公司 Model training method and device and electronic equipment
CN112634363A (en) * 2020-12-10 2021-04-09 上海零眸智能科技有限公司 Shelf attitude estimation method
CN112634363B (en) * 2020-12-10 2023-10-03 上海零眸智能科技有限公司 Goods shelf posture estimating method

Similar Documents

Publication Publication Date Title
CN109325437B (en) Image processing method, device and system
CN110287880A (en) A kind of attitude robust face identification method based on deep learning
CN111028319B (en) Three-dimensional non-photorealistic expression generation method based on facial motion unit
Li et al. Sign language recognition based on computer vision
CN110889343A (en) Crowd density estimation method and device based on attention type deep neural network
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
CN112949622A (en) Bimodal character classification method and device fusing text and image
CN111339941A (en) Head posture detection method
Depuru et al. Convolutional neural network based human emotion recognition system: A deep learning approach
CN112906520A (en) Gesture coding-based action recognition method and device
Yanmin et al. Research on ear recognition based on SSD_MobileNet_v1 network
Zhao et al. Rapid offline detection and 3D annotation of assembly elements in the augmented assembly
CN111914595A (en) Human hand three-dimensional attitude estimation method and device based on color image
CN109753922A (en) Anthropomorphic robot expression recognition method based on dense convolutional neural networks
CN110490165B (en) Dynamic gesture tracking method based on convolutional neural network
Wang et al. Swimmer’s posture recognition and correction method based on embedded depth image skeleton tracking
CN116823983A (en) One-to-many style handwriting picture generation method based on style collection mechanism
CN115496859A (en) Three-dimensional scene motion trend estimation method based on scattered point cloud cross attention learning
Chen et al. Intelligent Recognition of Physical Education Teachers' Behaviors Using Kinect Sensors and Machine Learning.
CN113536926A (en) Human body action recognition method based on distance vector and multi-angle self-adaptive network
Zhou et al. Motion balance ability detection based on video analysis in virtual reality environment
Gai et al. Digital Art Creation and Visual Communication Design Driven by Internet of Things Algorithm
Liang et al. Interactive Experience Design of Traditional Dance in New Media Era Based on Action Detection
CN112989952B (en) Crowd density estimation method and device based on mask guidance
Xu et al. Research on computer graphics and image design and visual communication design

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200626