CN114999648B - Early screening system, equipment and storage medium for cerebral palsy based on baby dynamic posture estimation - Google Patents

Publication number
CN114999648B
Authority
CN
China
Prior art keywords
baby
posture
infant
dynamic
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210622793.5A
Other languages
Chinese (zh)
Other versions
CN114999648A (en
Inventor
舒强
李海峰
王慧
阮雯聪
陈雯聪
肖俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Childrens Hospital of Zhejiang University School of Medicine
Original Assignee
Childrens Hospital of Zhejiang University School of Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Childrens Hospital of Zhejiang University School of Medicine
Publication of CN114999648A
Application granted
Publication of CN114999648B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an early screening system, equipment and storage medium for cerebral palsy based on infant dynamic posture estimation. The system consists of a healthy infant feature space acquisition module, an infant video acquisition module, a static posture feature extraction module, a dynamic posture feature extraction module and an abnormality detection module. From an infant motion video clip, the static posture feature extraction module and the dynamic posture feature extraction module extract an infant dynamic posture feature sequence that integrates temporal information; the abnormality detection module then performs anomaly detection on the infant dynamic posture feature sequence of the video clip to be detected in the feature space and outputs a detection result indicating whether the infant in the video is at risk of cerebral palsy. In practical application scenarios, the invention can thus evaluate an infant's risk of cerebral palsy from video clips using a density-based anomaly detection method.

Description

Early screening system, equipment and storage medium for cerebral palsy based on baby dynamic posture estimation
Technical Field
The invention belongs to the field of image data processing, and particularly relates to human posture estimation in computer vision and anomaly detection in statistical learning.
Background
With the development of many promising tools, such as the General Movements Assessment (GMA), early diagnosis of cerebral palsy has become an active research area. Cerebral palsy (CP) is generally defined as a group of permanent disorders of the development of movement and posture that cause activity limitation and are attributed to non-progressive disturbances occurring in the developing fetal or infant brain. These disorders are often accompanied by sensory, cognitive, communication and behavioral disturbances. In severe cases, epilepsy and secondary musculoskeletal abnormalities may occur. Early diagnosis and rehabilitation training are therefore particularly important for children with cerebral palsy.
General movements (GMs) reflect the spontaneous motor behaviour of the infant brain at different developmental stages and can be assessed by the variability and complexity of movement. Normal general movements involve continuous, irregular movement of the whole body. Abnormal movement and posture are the core characteristics of infants with cerebral palsy. Infants with severe cerebral palsy often have abnormal postures, and their bodies may be very floppy or very stiff. Research shows that the absence of fidgety movements is highly predictive of cerebral palsy. Fidgety movements are small movements of the infant's neck, trunk and limbs in all directions with variable acceleration. The spontaneous movements of normal infants are markedly variable and complex, whereas infants lacking fidgety movements show a more monotonous movement pattern with more repetitive movements.
Human posture estimation aims to locate human body parts, generally including the head, shoulders, elbows, knees and so on, in images or videos. The technology is widely applied in human-computer interaction, games, virtual reality, video surveillance, motion analysis and medical assistance. In recent years, with the development of deep learning and the appearance of large-scale human posture estimation data sets, a large number of neural-network-based human posture estimation models have emerged in academia and industry. These models are usually trained end to end on large amounts of annotated data with the back propagation algorithm and achieve good recognition performance.
In the field of data analysis, anomaly detection aims to identify samples that deviate significantly from the majority of the data. In practice, the vast majority of samples are normal, while abnormal samples make up only a small proportion. This inherent class imbalance makes supervised classification poorly suited to anomaly detection problems. One practical class of anomaly detection methods is density-based. Such methods contain no parameters that need to be trained and require only a small number of normal samples as input to estimate the distribution of normal samples. For a new sample to be evaluated, if the density of the estimated normal-sample distribution at the sample's location is lower than a threshold, the sample is considered abnormal. Anomaly detection has applications in many areas, including network security, medicine, machine vision, statistics, neuroscience, law enforcement and financial fraud detection.
Disclosure of Invention
The invention provides an early screening system, equipment and storage medium for cerebral palsy based on infant dynamic posture estimation. The screening system of the present invention can assess an infant's risk of cerebral palsy by analyzing video clips of the infant's autonomous movements in the supine position.
To achieve the above purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides an early screening system for cerebral palsy based on infant dynamic posture estimation, comprising:
the healthy infant feature space acquisition module, used for acquiring a feature space constructed from the infant dynamic posture feature sequences of healthy infants without cerebral palsy;
the infant video acquisition module, used for acquiring a video clip to be detected in which the infant to be screened lies unoccluded in the supine position and moves autonomously;
the static posture feature extraction module, used for extracting the infant posture information in the video clip to be detected frame by frame with a pre-trained infant supine position posture estimation model and encoding the infant posture information of each frame to obtain the corresponding infant static posture feature; the infant supine position posture estimation model is obtained by fine-tuning a pre-trained human posture estimation model on an infant supine position posture estimation data set, and the infant supine position posture estimation data set consists of key frames, annotated with joint points, taken from motion videos of infants in the supine position;
the dynamic posture feature extraction module, used for taking the infant static posture feature of each frame in the video clip to be detected as input, encoding the whole video clip to be detected with the pre-trained infant dynamic posture feature extraction network, and obtaining the infant dynamic posture feature sequence that integrates the temporal information of the video clip to be detected;
and the abnormality detection module, used for performing anomaly detection on the infant dynamic posture feature sequence of the video clip to be detected in the feature space and outputting a detection result indicating whether the infant in the video is at risk of cerebral palsy.
As a preferred aspect of the first aspect, in the healthy infant feature space acquisition module, the feature space acquisition method includes:
S1: acquiring a healthy infant video data set, wherein the infant in each video clip sample in the data set falls within a preset age interval, is a healthy infant without cerebral palsy, and lies unoccluded in the supine position in the video while moving autonomously;
S2: performing frame extraction on each video clip sample in the healthy infant video data set, randomly selecting some key frames from each sample, and labeling the infant joint points in each key frame to form labeled frames carrying infant posture information; constructing the infant supine position posture estimation data set from the labeled frames of all video clip samples; fine-tuning a pre-trained human posture estimation model on the infant supine position posture estimation data set to obtain the infant supine position posture estimation model;
S3: extracting the infant posture information in each video clip sample in the healthy infant video data set frame by frame with the infant supine position posture estimation model, and encoding the infant posture information of each frame to obtain the infant static posture feature of that frame;
S4: for each video clip sample in the healthy infant video data set, taking the infant static posture feature of each frame in the sample as input and encoding the whole video clip sample with the infant dynamic posture feature extraction network to construct an infant dynamic posture feature sequence that integrates temporal information; training the infant dynamic posture feature extraction network in a self-supervised manner through a mask reconstruction task to obtain the pre-trained infant dynamic posture feature extraction network;
S5: encoding all video clip samples in the healthy infant video data set one by one with the pre-trained infant dynamic posture feature extraction network to obtain the infant dynamic posture feature sequence corresponding to each sample, thereby forming the feature space used for screening infants at risk of cerebral palsy through anomaly detection.
As a preferred aspect of the first aspect, the infant supine position posture estimation model takes the human posture estimation model OpenPose, pre-trained on a human posture estimation data set, as its training starting point, and the OpenPose model parameters are fine-tuned with the back propagation algorithm on the manually labeled infant supine position posture estimation data set, yielding a neural network model dedicated to estimating the supine position posture of infants; for an image of an infant in the supine position, the infant supine position posture estimation model extracts the position information of 8 body parts as posture information, the extracted body parts being the eyes, neck, shoulders, elbows, wrists, hip, knees and ankles, with both sides of the body included for the paired parts.
As a preferred aspect of the first aspect, the method for encoding the infant posture information of each frame is as follows:
first, a two-dimensional coordinate plane is established with the center point between the infant's neck and hip as the origin, and the body part positions are normalized by the infant's neck-to-hip distance to eliminate the influence of body size; next, the position offset vectors between body parts are calculated, including: neck to hip, neck to shoulder on both sides, shoulder to elbow on both sides, elbow to wrist on both sides, neck to the center of the two eyes, hip to knee on both sides, and knee to ankle on both sides; then, the included angles between adjacent offset vectors are calculated, including: the angle at the shoulder between the vectors pointing to the neck and to the elbow, the angle at the elbow between the vectors pointing to the shoulder and to the wrist, the angle at the hip between the vectors pointing to the neck and to the knee, and the angle at the knee between the vectors pointing to the hip and to the ankle, each computed for both sides of the body; finally, the body part positions, the position offset vectors and the included angles between adjacent offset vectors are combined and encoded into a multi-dimensional vector, which serves as the infant static posture feature.
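The encoding steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the keypoint names, the ordering of the output vector and the use of radians are assumptions made for illustration, and the origin is taken as the midpoint between neck and hip.

```python
import math

# Hypothetical keypoint names for the body parts listed in the text.
KEYPOINTS = ["l_eye", "r_eye", "neck", "l_shoulder", "r_shoulder",
             "l_elbow", "r_elbow", "l_wrist", "r_wrist", "hip",
             "l_knee", "r_knee", "l_ankle", "r_ankle"]

def _sub(a, b):
    """Vector pointing from point b to point a."""
    return (a[0] - b[0], a[1] - b[1])

def _angle(u, v):
    """Angle in radians between two 2-D vectors (0.0 if either has zero length)."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))

def encode_static_posture(kp):
    """Encode one frame's keypoints {name: (x, y)} into a flat feature list."""
    neck, hip = kp["neck"], kp["hip"]
    scale = math.hypot(*_sub(hip, neck))          # neck-to-hip distance
    if scale == 0.0:
        raise ValueError("neck and hip coincide; cannot normalize")
    ox, oy = (neck[0] + hip[0]) / 2, (neck[1] + hip[1]) / 2   # origin
    p = {k: ((kp[k][0] - ox) / scale, (kp[k][1] - oy) / scale) for k in KEYPOINTS}
    eyes = ((p["l_eye"][0] + p["r_eye"][0]) / 2, (p["l_eye"][1] + p["r_eye"][1]) / 2)

    offsets = [_sub(p["hip"], p["neck"]), _sub(eyes, p["neck"])]
    angles = []
    for side in ("l", "r"):
        sh, el, wr = p[side + "_shoulder"], p[side + "_elbow"], p[side + "_wrist"]
        kn, an = p[side + "_knee"], p[side + "_ankle"]
        offsets += [_sub(sh, p["neck"]), _sub(el, sh), _sub(wr, el),
                    _sub(kn, p["hip"]), _sub(an, kn)]
        angles += [
            _angle(_sub(p["neck"], sh), _sub(el, sh)),        # at the shoulder
            _angle(_sub(sh, el), _sub(wr, el)),               # at the elbow
            _angle(_sub(p["neck"], p["hip"]), _sub(kn, p["hip"])),  # at the hip
            _angle(_sub(p["hip"], kn), _sub(an, kn)),         # at the knee
        ]

    feature = [c for k in KEYPOINTS for c in p[k]]   # 14 positions -> 28 values
    feature += [c for v in offsets for c in v]       # 12 offsets   -> 24 values
    feature += angles                                # 8 angles
    return feature
```

Under these assumptions each frame yields a 60-dimensional vector. Because every position is expressed relative to the neck-hip midpoint and divided by the neck-to-hip distance, the feature does not change when the infant appears larger or shifted in the image.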
Preferably, in the first aspect, the infant dynamic posture feature extraction network uses a Transformer encoder as its network structure, and its input consists of the infant static posture feature of each frame together with the time interval between that frame and the start of the video clip.
As a preferred aspect of the first aspect, when the infant dynamic posture feature extraction network is pre-trained in a self-supervised manner, mask reconstruction is used as the proxy task to generate the training signal, so as to strengthen the temporal information in the encoded features; the specific training steps are as follows:
S41: sampling training samples from the healthy infant video data set, randomly choosing one frame of a sampled video clip sample as the replaced position, replacing the infant static posture feature at the replaced position with a fixed random code, and keeping the infant static posture features of the remaining frames of the sample unchanged to form the replaced input sequence;
S42: encoding the replaced input sequence with the infant dynamic posture feature extraction network to obtain the encoded dynamic posture feature corresponding to the replaced position;
S43: using a multilayer perceptron network, with the encoded dynamic posture feature corresponding to the replaced position as input, to predict the infant static posture feature that was replaced at that position, thereby completing the feature reconstruction;
S44: evaluating the feature reconstruction quality with the L2-norm loss, optimizing the parameters of the infant dynamic posture feature extraction network with the back propagation algorithm using the reconstruction quality as the training objective, and, after training to convergence, obtaining the pre-trained infant dynamic posture feature extraction network used for the actual infant dynamic posture feature extraction task.
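The data flow of one mask-reconstruction step can be sketched as follows. This shows only the masking, prediction and loss bookkeeping. The Transformer encoder and the multilayer perceptron are replaced by a hypothetical placeholder, `placeholder_predict`, that averages neighbouring frames, and no gradient update is performed. The function names, the `fps` parameter and the choice to keep the time offset of the masked frame are assumptions.

```python
import math
import random

def build_inputs(static_feats, fps):
    """Pair each frame's static posture feature with its time offset (seconds)
    from the start of the clip, as the network input described in the text."""
    return [(feat, i / fps) for i, feat in enumerate(static_feats)]

def mask_one_frame(inputs, mask_code, rng):
    """Replace the static feature of one randomly chosen frame with the fixed
    mask code; all other frames stay unchanged (assumption: time is kept)."""
    pos = rng.randrange(len(inputs))
    target = inputs[pos][0]
    masked = list(inputs)
    masked[pos] = (mask_code, inputs[pos][1])
    return masked, pos, target

def placeholder_predict(masked, pos):
    """Stand-in for encoder + MLP: predicts the masked feature as the mean of
    the neighbouring frames. NOT the Transformer described in the text."""
    neighbours = [masked[j][0] for j in (pos - 1, pos + 1) if 0 <= j < len(masked)]
    dim = len(neighbours[0])
    return [sum(n[d] for n in neighbours) / len(neighbours) for d in range(dim)]

def reconstruction_loss(pred, target):
    """Two-norm (Euclidean) distance between predicted and original feature."""
    return math.dist(pred, target)

# Toy usage: 5 frames of 2-D features, with a fixed random mask code.
rng = random.Random(0)
feats = [[float(i), float(2 * i)] for i in range(5)]
mask_code = [rng.uniform(-1, 1) for _ in range(2)]   # generated once, then reused
inputs = build_inputs(feats, fps=25.0)
masked, pos, target = mask_one_frame(inputs, mask_code, rng)
loss = reconstruction_loss(placeholder_predict(masked, pos), target)
```

In a real training loop, the placeholder would be a trainable encoder followed by a perceptron head, and `loss` would be back-propagated to update the encoder parameters.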
Preferably, in the abnormality detection module, the specific steps of performing anomaly detection include:
first, fitting the distribution of the infant dynamic posture feature sequences of all healthy infants without cerebral palsy in the feature space with a high-dimensional Gaussian distribution p(x; μ, Σ):
$$p(x;\mu,\Sigma)=\frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
where x ∈ R^n is an infant dynamic posture feature sequence, μ ∈ R^n is the mean vector of the n-dimensional Gaussian distribution, and Σ ∈ R^(n×n) is the covariance matrix of the n-dimensional Gaussian distribution; the estimates of the parameters μ and Σ obtained by fitting are:
$$\mu=\frac{1}{m}\sum_{i=1}^{m}x^{(i)}$$
$$\Sigma=\frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^{T}$$
where x^(i) is the infant dynamic posture feature sequence of the i-th healthy infant, and m is the total number of infant dynamic posture feature sequences in the feature space;
then, for the infant dynamic posture feature sequence x* of the video clip to be detected, calculating its probability density value under the healthy infant dynamic posture distribution:
$$p(x^{*})=\frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x^{*}-\mu)^{T}\Sigma^{-1}(x^{*}-\mu)\right)$$
finally, making the risk judgment according to the calculated probability density value p(x*): if p(x*) is less than the threshold ε, the infant in the video clip to be detected is considered to be at risk of cerebral palsy; otherwise, the infant in the video clip to be detected is considered a normal infant without cerebral palsy.
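The Gaussian fitting and thresholding steps can be sketched in plain Python as below. Two choices here are assumptions rather than part of the source: the density is evaluated in log space (comparing log p(x*) with log ε) to avoid numerical underflow in high dimensions, and the covariance is factorized with a hand-written Cholesky decomposition. The function names and the toy data are hypothetical.

```python
import math

def fit_gaussian(xs):
    """Estimate the mean vector and covariance matrix (1/m normalization)
    from a list of m feature vectors, each of length n."""
    m, n = len(xs), len(xs[0])
    mu = [sum(x[j] for x in xs) / m for j in range(n)]
    sigma = [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in xs) / m
              for b in range(n)] for a in range(n)]
    return mu, sigma

def cholesky(a):
    """Lower-triangular L with L L^T = a; raises if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def log_density(x, mu, sigma):
    """Log of the n-dimensional Gaussian density at x."""
    n = len(mu)
    low = cholesky(sigma)
    d = [xi - mi for xi, mi in zip(x, mu)]
    z = []                                   # solve low @ z = d (forward substitution)
    for i in range(n):
        z.append((d[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    mahalanobis_sq = sum(v * v for v in z)   # (x-mu)^T Sigma^-1 (x-mu)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + mahalanobis_sq)

def at_risk(x, mu, sigma, log_eps):
    """True if the density at x falls below the threshold (in log space)."""
    return log_density(x, mu, sigma) < log_eps

# Toy usage with made-up 2-D "healthy" features and an arbitrary threshold.
healthy = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, sigma = fit_gaussian(healthy)
flag_near = at_risk([1.0, 1.0], mu, sigma, log_eps=-5.0)
flag_far = at_risk([6.0, 6.0], mu, sigma, log_eps=-5.0)
```

The `ValueError` branch signals a singular covariance estimate, which can arise, for example, when there are fewer healthy samples than feature dimensions; the source does not say how that case is handled.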
In a second aspect, the invention provides a computer electronic device comprising a memory and a processor;
the memory for storing a computer program;
the processor, when executing the computer program, is configured to output the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation according to any one of the aspects of the first aspect.
In a third aspect, the present invention provides a computer-readable storage medium, which has a computer program stored thereon, and when the computer program is executed by a processor, the computer program can output the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation according to any one of the aspects of the first aspect.
In a fourth aspect, the present invention provides an early screening apparatus for cerebral palsy, which includes a video capturing apparatus and a detecting apparatus;
the video acquisition equipment is used for shooting a video clip of the infant to be screened lying unoccluded in the supine position and moving autonomously, and for storing the shot video clip for the detection equipment to read;
the detection device is used for reading the video segment shot by the video acquisition device as a video segment to be detected, detecting by using the early screening system for cerebral palsy based on baby dynamic posture estimation according to any one of the first aspect, and outputting the detection result.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides an early screening system, equipment and storage medium for cerebral palsy based on infant dynamic posture estimation, which can evaluate an infant's risk of cerebral palsy by analyzing a video clip of the infant moving autonomously in the supine position. By fine-tuning a human posture estimation model pre-trained on a large-scale data set, the invention achieves more accurate recognition of infant postures in the supine position. The invention uses a Transformer encoder to encode the frame-by-frame infant static posture information, which better captures the association between static posture features and differential motion features such as the direction, frequency, speed and acceleration of movement. The invention trains the infant dynamic posture feature extraction network with a self-supervised representation learning method, using a mask reconstruction task to guide the network to mine the temporal posture information. In addition, the invention fits the dynamic posture feature distribution of healthy infants from the collected healthy infant video data set and, in practical application scenarios, evaluates an infant's risk of cerebral palsy from video clips using a density-based anomaly detection method.
Drawings
Fig. 1 is a schematic diagram of the components of an early screening system for cerebral palsy based on the estimation of the dynamic posture of an infant.
Fig. 2 is a schematic flow chart of the construction process of the early screening system for cerebral palsy based on the estimation of the dynamic posture of the infant.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. The technical characteristics in the embodiments of the present invention can be combined correspondingly without mutual conflict.
In a preferred embodiment of the present invention, as shown in fig. 1, there is provided an early screening system for cerebral palsy based on the estimation of the dynamic posture of an infant, which can evaluate the risk of cerebral palsy of an infant by inputting a video segment of the infant performing an autonomous movement in a supine position. The whole system consists of a healthy baby feature space acquisition module, a baby video acquisition module, a static posture feature extraction module, a dynamic posture feature extraction module and an abnormality detection module, and the specific functions and implementation forms of the modules are described in detail below.
The healthy infant feature space acquisition module is used for acquiring a feature space constructed from the infant dynamic posture feature sequences of healthy infants without cerebral palsy.
The infant video acquisition module is used for acquiring a video clip to be detected in which the infant to be screened lies unoccluded in the supine position and moves autonomously.
The static posture feature extraction module is used for extracting the infant posture information in the video clip to be detected frame by frame with a pre-trained infant supine position posture estimation model and encoding the infant posture information of each frame to obtain the corresponding infant static posture feature; the infant supine position posture estimation model is obtained by fine-tuning a pre-trained human posture estimation model on an infant supine position posture estimation data set, and the infant supine position posture estimation data set consists of key frames, annotated with joint points, taken from motion videos of infants in the supine position.
The dynamic posture feature extraction module is used for taking the infant static posture feature of each frame in the video clip to be detected as input, encoding the whole video clip to be detected with the pre-trained infant dynamic posture feature extraction network, and obtaining the infant dynamic posture feature sequence that integrates the temporal information of the video clip to be detected.
The abnormality detection module is used for performing anomaly detection on the infant dynamic posture feature sequence of the video clip to be detected in the feature space and outputting a detection result indicating whether the infant in the video is at risk of cerebral palsy.
It should be noted that, in the above infant video acquisition module, the video clip to be detected may be captured online by calling a video acquisition device, or may be read through a data reading interface from a previously shot video that has been uploaded or stored externally; this is not limited herein. The video clip to be detected should meet the basic requirements of detection, such as a clear image, a length of more than 10 seconds, and no occlusion of the infant.
It should be noted that, in the above healthy infant feature space acquisition module, the feature space may be newly constructed by a dynamic construction method, or an already constructed feature space may be acquired directly; this is not limited herein. In addition, the infant supine position posture estimation model in the static posture feature extraction module and the infant dynamic posture feature extraction network used in the dynamic posture feature extraction module both need to be trained in advance on the relevant data sets, and can be put into practical use only after their performance meets the requirements. Therefore, the feature space, the infant supine position posture estimation model and the infant dynamic posture feature extraction network can be constructed and trained in advance and then stored in the corresponding modules, so that they can be conveniently called in practical application. Of course, the feature space, the infant supine position posture estimation model and the infant dynamic posture feature extraction network in the above modules may also be continuously updated online to maintain optimal detection performance.
As a preferred aspect of the embodiment of the present invention, when the feature space, the infant supine position posture estimation model and the infant dynamic posture feature extraction network are newly constructed or updated online, steps S1 to S5 may be performed as shown in fig. 2; the specific process is as follows:
s1: the method comprises the steps of obtaining a video data set of a healthy baby, enabling the age of the baby in each section of video clip sample in the data set to meet a preset age interval, belonging to the healthy baby without cerebral palsy, and enabling the baby to be in a supine position without shielding in a video and do autonomous movement.
It is noted that the samples in the healthy infant video data set may be obtained by collecting, from the internet, videos of healthy infants aged one to two months. The collected videos need to be screened to keep only clips in which the image is clear, the infant is unoccluded and the length is greater than 10 seconds, and the infant needs to be in the supine position and moving autonomously in the video. Each video must be assessed by two medical experts to identify the approximate age and health status of the infant, and only video clips in which the infant's age meets the requirement and the infant is assessed as healthy are retained as video clip samples.
S2: Perform frame extraction on each video clip sample in the healthy infant video data set, randomly select some key frames from the video clip samples, and label the infant joint points in each key frame to form labeled frames carrying infant posture information; construct the infant supine position posture estimation data set from the labeled frames of all video clip samples; and fine-tune a pre-trained human body posture estimation model on the infant supine position posture estimation data set to obtain the infant supine position posture estimation model.
It should be noted that the infant joint points in the key frames may be labeled manually or with the assistance of a labeling tool, and this is not limited here. The human body posture estimation model may be any model capable of detecting human body posture, such as OpenPose. Before fine-tuning, the human body posture estimation model needs to be pre-trained on a relevant large-scale public human body posture estimation data set; it is then fine-tuned on the small, manually labeled infant supine position posture estimation data set to achieve good infant supine position posture estimation performance.
S3: Using the infant supine position posture estimation model, extract the baby posture information frame by frame from each video clip sample in the healthy infant video data set, and encode the baby posture information of each frame to obtain the baby static posture feature of that frame.
S4: Establish a baby dynamic posture feature extraction network. For each video clip sample in the healthy infant video data set, take the baby static posture feature of each frame in the sample as input and encode the whole video clip sample through the baby dynamic posture feature extraction network to construct a baby dynamic posture feature sequence that integrates time-sequence information. Train the baby dynamic posture feature extraction network in a self-supervised manner through a mask reconstruction task to obtain the pre-trained baby dynamic posture feature extraction network.
S5: Using the pre-trained baby dynamic posture feature extraction network, encode all video clip samples in the healthy infant video data set one by one to obtain the baby dynamic posture feature sequence corresponding to each video clip sample, thereby forming the feature space used for screening infants at risk of cerebral palsy through anomaly detection.
In the application stage, the constructed feature space, the trained infant supine position posture estimation model, and the trained baby dynamic posture feature extraction network can each be embedded into the corresponding module to carry out inference. During inference, a user submits a video of an infant in the supine position as the input of the baby video acquisition module. In the static posture feature extraction module, the infant supine position posture estimation model trained in S2 extracts the baby posture information frame by frame, and this information is encoded to obtain the corresponding baby static posture features. The baby dynamic posture feature extraction network trained in S4 then extracts the baby dynamic posture feature sequence. The risk of abnormal posture for the current infant is then evaluated against the distribution of dynamic posture features of healthy infants in the feature space. If the risk value is larger than a certain threshold, that is, if the probability density under the healthy-infant distribution falls below a threshold, the current infant is determined to be at risk of cerebral palsy.
As a preferred mode of the embodiment of the present invention, in the static posture feature extraction module and step S2, the infant supine position posture estimation model may use the human body posture estimation model OpenPose, pre-trained on a human body posture estimation data set, as the training starting point; the OpenPose model parameters are fine-tuned with a back-propagation algorithm on the manually labeled infant supine position posture estimation data set to obtain a neural network model dedicated to infant supine position posture estimation. For an image of an infant in the supine position, the infant supine position posture estimation model extracts the position information of 8 types of body parts as posture information: both eyes, the neck, both shoulders, both elbows, both wrists, the hip, both knees, and both ankles.
Further, as a preferred aspect of the embodiment of the present invention, the method for encoding the baby posture information of each frame in the static posture feature extraction module and step S3 is as follows:
First, a two-dimensional coordinate plane is established with the midpoint between the infant's neck and hip as the origin, and the body part position information is normalized by the neck-to-hip distance to eliminate the influence of the infant's body size. Next, position offset vectors between body parts are calculated; these represent the orientation of different parts of the body. The offset vectors are: neck to hip; neck to shoulder (both sides); shoulder to elbow (both sides); elbow to wrist (both sides); neck to the midpoint of the two eyes; hip to knee (both sides); and knee to ankle (both sides). Then, the included angles between adjacent offset vectors are calculated: the angle at the shoulder between the vectors pointing to the neck and to the elbow; the angle at the elbow between the vectors pointing to the shoulder and to the wrist; the angle at the hip between the vectors pointing to the neck and to the knee; and the angle at the knee between the vectors pointing to the hip and to the ankle, each computed for both sides of the body. Finally, the body part position information, the position offset vectors, and the included angles between adjacent offset vectors are combined and encoded into a multi-dimensional vector, which serves as the baby static posture feature.
It should be noted that the specific dimension of this multi-dimensional vector depends on the encoding used; in a preferred embodiment, the body part position information, the position offset vectors, and the included angles between adjacent offset vectors are encoded as a 68-dimensional vector.
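The encoding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoding used in the patent: the keypoint names are hypothetical, and each angle is stored as a (cosine, sine) pair. The source does not say how angles are encoded; this choice is made here only because 28 position values, 24 offset values, and 16 angle values happen to total 68.

```python
import math

# Hypothetical keypoint names; the source lists the body parts but not identifiers.
KEYPOINTS = ["l_eye", "r_eye", "neck", "l_shoulder", "r_shoulder",
             "l_elbow", "r_elbow", "l_wrist", "r_wrist", "hip",
             "l_knee", "r_knee", "l_ankle", "r_ankle"]


def sub(a, b):
    """Vector from point b to point a."""
    return (a[0] - b[0], a[1] - b[1])


def angle_between(u, v):
    """Signed angle from u to v in radians."""
    return math.atan2(u[0] * v[1] - u[1] * v[0], u[0] * v[0] + u[1] * v[1])


def encode_static_pose(kp):
    """kp maps each keypoint name to its (x, y) image coordinates."""
    neck, hip = kp["neck"], kp["hip"]
    origin = ((neck[0] + hip[0]) / 2, (neck[1] + hip[1]) / 2)
    scale = math.dist(neck, hip)
    if scale == 0:
        raise ValueError("neck and hip coincide; cannot normalize")
    # Normalized positions: origin at the neck-hip midpoint, unit neck-hip distance.
    p = {k: ((kp[k][0] - origin[0]) / scale, (kp[k][1] - origin[1]) / scale)
         for k in KEYPOINTS}
    eye_c = ((p["l_eye"][0] + p["r_eye"][0]) / 2,
             (p["l_eye"][1] + p["r_eye"][1]) / 2)

    offsets = [sub(p["hip"], p["neck"]), sub(eye_c, p["neck"])]
    angles = []
    for side in ("l_", "r_"):
        sh, el, wr, kn, an = (p[side + n] for n in
                              ("shoulder", "elbow", "wrist", "knee", "ankle"))
        offsets += [sub(sh, p["neck"]), sub(el, sh), sub(wr, el),
                    sub(kn, p["hip"]), sub(an, kn)]
        angles += [
            angle_between(sub(p["neck"], sh), sub(el, sh)),              # at shoulder
            angle_between(sub(sh, el), sub(wr, el)),                     # at elbow
            angle_between(sub(p["neck"], p["hip"]), sub(kn, p["hip"])),  # at hip
            angle_between(sub(p["hip"], kn), sub(an, kn)),               # at knee
        ]

    feature = [c for k in KEYPOINTS for c in p[k]]     # 14 points x 2 = 28
    feature += [c for v in offsets for c in v]         # 12 vectors x 2 = 24
    # Assumption: each angle is stored as (cos, sin), giving 8 x 2 = 16 values.
    feature += [f(a) for a in angles for f in (math.cos, math.sin)]
    return feature
```

Because positions are expressed relative to the neck-hip midpoint and divided by the neck-hip distance, the resulting vector does not change when the infant appears larger, smaller, or shifted in the image.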
Further, as a preferable mode of the embodiment of the present invention, in the dynamic posture feature extraction module and step S4, the baby dynamic posture feature extraction network may use a Transformer encoder as its network structure, and its input is the baby static posture feature of each frame together with the time interval between that frame and the start time of the video clip.
Further, as a preferred mode of the embodiment of the present invention, when the baby dynamic posture feature extraction network is pre-trained in a self-supervised manner in step S4, mask reconstruction is used as a proxy task to generate the training signal, so as to enhance the time-sequence information in the encoded features. The specific training steps are as follows:
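As an illustration of this input format, the sketch below appends the elapsed time since the first frame to each per-frame static feature, which turns a 68-dimensional feature into the 69-dimensional input described in step 5.2 of the embodiment. The function name and the use of seconds as the time unit are assumptions; the source does not specify either.

```python
def build_input_sequence(static_features, timestamps):
    """Append the time offset from the clip start to each frame feature.

    static_features: list of per-frame feature vectors (lists of floats).
    timestamps: capture time of each frame, in seconds (assumed unit).
    """
    if len(static_features) != len(timestamps):
        raise ValueError("one timestamp is required per frame")
    if not timestamps:
        return []
    start = timestamps[0]
    # Each output row is the original feature plus one extra time dimension.
    return [list(f) + [t - start] for f, t in zip(static_features, timestamps)]
```

Carrying the time offset explicitly lets the encoder handle key frames that are not evenly spaced, which a plain frame index would not capture.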
S41: Sample training samples from the healthy infant video data set, randomly choose one frame in a sampled video clip sample as the replaced position, replace the baby static posture feature at the replaced position with a fixed random code, and keep the baby static posture features of the remaining frames unchanged to form the replaced input sequence;
S42: Encode the replaced input sequence with the baby dynamic posture feature extraction network to obtain the encoded dynamic posture feature corresponding to the replaced position;
S43: Using a multilayer perceptron network with the encoded dynamic posture feature at the replaced position as input, predict the baby static posture feature that was replaced, completing the feature reconstruction;
S44: Evaluate the feature reconstruction quality with a two-norm loss, take it as the training objective, and optimize the parameters of the baby dynamic posture feature extraction network with a back-propagation algorithm; after training to convergence, the pre-trained baby dynamic posture feature extraction network is obtained and used for the actual baby dynamic posture feature extraction task.
Further, as a preferable mode of the embodiment of the present invention, in the abnormality detection module, a high-dimensional Gaussian distribution is fitted to the dynamic posture feature distribution of a small number of healthy infants. Video clips of these healthy infants can be collected from the internet, but they must be verified by medical professionals. The baby dynamic posture features of the video clips are extracted by the baby dynamic posture feature extraction network, and each video clip corresponds to a baby dynamic posture feature sequence of variable length. The specific steps by which the abnormality detection module performs anomaly detection are as follows:
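Steps S41 to S44 can be sketched as one loss computation. This is a minimal outline under stated assumptions: `encoder` and `head` are hypothetical callables standing in for the dynamic posture feature extraction network and the multilayer perceptron, the loss is taken as the squared two-norm (the source says only "two-norm loss"), and the gradient update itself is left to whatever training framework is used.

```python
import random


def mask_one_frame(sequence, mask_code, rng):
    """S41: replace one randomly chosen frame feature with the fixed code."""
    pos = rng.randrange(len(sequence))
    masked = [list(f) for f in sequence]   # copy so the input is not modified
    target = list(sequence[pos])           # the feature the model must recover
    masked[pos] = list(mask_code)
    return masked, pos, target


def two_norm_loss(pred, target):
    """S44: squared two-norm between predicted and original feature (assumed form)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def reconstruction_loss(sequence, mask_code, encoder, head, rng):
    """Run S41 to S44 once and return the loss to be minimized."""
    masked, pos, target = mask_one_frame(sequence, mask_code, rng)
    encoded = encoder(masked)      # S42: one encoded vector per frame
    pred = head(encoded[pos])      # S43: predict the replaced static feature
    return two_norm_loss(pred, target)
```

The fixed random code would be drawn once before training, for example `mask_code = [random.Random(0).gauss(0.0, 1.0) for _ in range(dim)]`, and reused for every masked position.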
First, a high-dimensional Gaussian distribution $p(x;\mu,\Sigma)$ is fitted to the distribution of the baby dynamic posture feature sequences of all healthy infants without cerebral palsy in the feature space:
$$p(x;\mu,\Sigma)=\frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
wherein $x\in\mathbb{R}^{n}$ is a baby dynamic posture feature sequence, $\mu\in\mathbb{R}^{n}$ is the mean vector of the n-dimensional Gaussian distribution, and $\Sigma\in\mathbb{R}^{n\times n}$ is the covariance matrix of the n-dimensional Gaussian distribution; the estimated values of the parameters $\mu$ and $\Sigma$ obtained by fitting are:
$$\mu=\frac{1}{m}\sum_{i=1}^{m}x^{(i)}$$
$$\Sigma=\frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^{T}$$
wherein $x^{(i)}$ is the baby dynamic posture feature sequence of the i-th healthy infant, and m is the total number of baby dynamic posture feature sequences in the feature space;
Then, for the baby dynamic posture feature sequence $x^{*}$ of the video clip to be detected, its probability density value under the dynamic posture distribution of healthy infants is calculated:
$$p(x^{*})=\frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x^{*}-\mu)^{T}\Sigma^{-1}(x^{*}-\mu)\right)$$
Finally, a risk judgment is made according to the calculated probability density value $p(x^{*})$: if $p(x^{*})$ is less than the threshold $\varepsilon$, the infant in the video clip to be detected is considered to be at risk of cerebral palsy; otherwise, the infant in the video clip to be detected is considered to be a normal infant without cerebral palsy.
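The fitting and thresholding steps can be sketched in plain Python as below. This is an illustrative outline, not the patented implementation. Two choices here are assumptions that go beyond the source: the comparison is done on the log-density (so p(x*) < ε becomes log p(x*) < log ε, which avoids numerical underflow in high dimensions), and a small optional `ridge` term is added to the covariance diagonal so that it stays invertible when only a few healthy samples are available.

```python
import math


def fit_gaussian(samples, ridge=1e-6):
    """Maximum-likelihood mean and covariance of fixed-length feature vectors."""
    m, n = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / m for j in range(n)]
    sigma = [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in samples) / m
              for b in range(n)] for a in range(n)]
    for j in range(n):
        sigma[j][j] += ridge   # assumption: regularization, not in the source
    return mu, sigma


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_density(x, mu, sigma):
    """Logarithm of the n-dimensional Gaussian density p(x; mu, sigma)."""
    n = len(mu)
    low = cholesky(sigma)
    z = []   # forward substitution: solve L z = x - mu
    for i in range(n):
        z.append((x[i] - mu[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


def at_risk(x, mu, sigma, log_eps):
    """True when the density of x falls below the threshold (compared in log space)."""
    return log_density(x, mu, sigma) < log_eps
```

Note that this sketch assumes every feature sequence has already been reduced to a vector of the same length n, as the definition of x in the formulas requires.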
The present invention will be illustrated below by a specific embodiment to show the specific construction, training and practical application process of the early screening system for cerebral palsy based on baby dynamic posture estimation, so as to facilitate understanding.
Examples
1. Healthy infant video data set construction
Videos of healthy infants aged one to two months are collected from the internet. Clear, unobstructed infant video clips longer than 10 seconds are selected manually; the infant must be in a supine position in the video and moving autonomously. Two medical experts identify the approximate age and health status of the infant in each video, and only video clip samples in which the infant's age meets the requirement and the evaluation result is healthy are kept, thereby constructing a healthy infant video database.
2. Construction of data set for estimating supine position posture of baby
Frame extraction is performed on the video clip samples screened into the healthy infant video database, some key frames are randomly selected from them, and the infant joint point information in these key frames is labeled manually. The human posture estimation labels comprise the image positions of both eyes, the neck, both shoulders, both elbows, both wrists, the hip, both knees, and both ankles. A small infant supine position posture estimation data set is thereby constructed.
3. Training of supine position posture estimation model of baby
An OpenPose human posture estimation model pre-trained on a general large-scale human posture estimation data set is obtained, and the pre-trained OpenPose model is fine-tuned on the infant supine position posture estimation data set to obtain a more accurate infant supine position posture estimation model. This transfer learning approach requires only a relatively small amount of labeled data while largely mitigating the difference in data distribution between infant supine position posture estimation and general human posture estimation.
4. Infant static posture feature construction
The fine-tuned infant supine position posture estimation model is used to extract the baby posture information from all video clip samples in the healthy infant supine position video database. The baby posture information of each frame is then encoded to obtain the baby static posture feature of that frame.
In particular, the fine-tuned OpenPose model can accurately and efficiently identify the eyes, shoulders, neck, elbows, wrists, hip, knees, and ankles of a supine infant in a video key frame. On this basis, the baby static posture feature vector of the current key frame can be constructed. The feature vector has 68 dimensions and encodes the following information (note that some items cover both sides of the body):
4.1 Location information: (1) bilateral eyes, (2) neck, (3) bilateral shoulders, (4) bilateral elbows, (5) bilateral wrists, (6) hip, (7) bilateral knees, and (8) bilateral ankles.
4.2 Position offset vectors (i.e., orientation information) for different parts of the body: (1) from neck to hip, (2) from neck to shoulder (bilateral), (3) from shoulder to elbow (bilateral), (4) from elbow to wrist (bilateral), (5) from neck to binocular centre, (6) from hip to knee (bilateral), (7) from knee to ankle (bilateral)
4.3 Angle information: (1) the included angle of the vector pointing to the neck and the elbow (both sides) with the shoulder as the center, (2) the included angle of the vector pointing to the shoulder and the wrist (both sides) with the elbow as the center, (3) the included angle of the vector pointing to the neck and the knee (both sides) with the hip as the center, (4) the included angle of the vector pointing to the hip and the ankle (both sides) with the knee as the center.
5. Baby dynamic posture feature extraction network training
A baby dynamic posture feature extraction network is constructed, and time sequence information such as motion direction, frequency, speed and the like can be coded on the basis of frame-by-frame posture recognition. For each video clip sample in the database, the baby static posture characteristics of each frame are used as input, a baby dynamic posture characteristic extraction network is used for coding the whole video clip sample, and a baby dynamic posture characteristic sequence integrating time sequence information is constructed.
Specifically, the baby dynamic posture feature extraction network is based on a single-layer Transformer encoder structure. To avoid overfitting during training, the encoder contains only one attention head, and the hidden state vector has dimension 128. The self-attention mechanism in the Transformer encoder can automatically capture the correlations between the baby static posture features at different time instants as well as differential motion features such as the direction, frequency, speed, and acceleration of motion.
In this embodiment, the baby dynamic posture feature extraction network is trained in a self-supervised manner through a mask reconstruction task, without requiring additional labels. Sequence mask reconstruction serves as the proxy task for self-supervised representation learning, guiding the dynamic posture feature extraction network to mine the correlations and differential motion features between static posture features at different moments. The specific training process is as follows:
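To make the role of self-attention concrete, the sketch below implements a single attention head over a sequence of frame features in plain Python. It covers only the scaled dot-product attention step; a full Transformer encoder layer also has residual connections, layer normalization, and a feed-forward sub-layer, which are omitted. The weight matrices `wq`, `wk`, and `wv` are hypothetical parameters that would be learned during training.

```python
import math


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    seq: list of frame feature vectors; returns one output vector per frame,
    each a weighted mix of the value vectors of all frames in the clip.
    """
    q = [matvec(wq, x) for x in seq]
    k = [matvec(wk, x) for x in seq]
    v = [matvec(wv, x) for x in seq]
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)   # how strongly this frame attends to each frame
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out
```

Since every output vector mixes information from all frames, the encoded feature at one time step can reflect how the posture differs from earlier and later frames, which is what allows motion direction and speed to be represented.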
5.1 The weights of the baby dynamic attitude feature extraction network are initialized randomly.
5.2 Frame extraction is performed on the screened video clip samples, the fine-tuned infant supine position posture estimation model is used to extract the baby posture information for each key frame, and this information is encoded into a 68-dimensional static posture feature vector. One dimension is appended to the end of this vector to represent the time interval from the start of the clip to the current key frame. After this processing, each video clip corresponds to a variable-length sequence of 69-dimensional static features, which serves as the input of the baby dynamic posture feature extraction network.
5.3 Randomly sample a static feature sequence, consisting of the per-frame baby static posture features of one video clip sample, and replace the baby static posture feature at a random position in the sequence with a fixed random vector, that is, erase one element of the sequence with a random mask. Input the replaced static feature sequence into the baby dynamic posture feature extraction network to obtain the encoded dynamic feature sequence.
5.4 Take the feature vector of the dynamic feature sequence at the replaced position and input it into a simple multilayer perceptron network to predict the baby static posture feature that was replaced in step 5.3.
5.5 Compute the two-norm loss between the predicted baby static posture feature and the true baby static posture feature, and update the parameters of the baby dynamic posture feature extraction network using back-propagation and gradient descent so that the loss function gradually decreases.
5.6 Repeat steps 5.3 to 5.5 until the optimization algorithm converges, obtaining the trained baby dynamic posture feature extraction network.
The training method utilizes the continuity of baby posture change in the video, and reconstructs erased static posture information through the context information of the time dimension at the replaced moment, thereby enhancing the capability of baby dynamic posture feature extraction network mining time sequence posture feature.
6. Feature space construction
The trained baby dynamic posture feature extraction network is used to encode all video clips in the healthy infant supine position video database into dynamic posture feature sequences. A multi-dimensional Gaussian distribution is then fitted to the distribution of the dynamic posture features of healthy infants. In this feature space, infants at risk of cerebral palsy can be screened using density-based anomaly detection.
In this embodiment, the probability distribution used for fitting is a multidimensional gaussian distribution. In order to capture the correlation between different dimensions of the dynamic attitude feature, no independence assumption is made here, that is, the covariance matrix of the multidimensional gaussian distribution is not necessarily a diagonal matrix. The specific form of distribution is as follows:
$$p(x;\mu,\Sigma)=\frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
In this embodiment, the parameters of the multi-dimensional Gaussian distribution fitted to the dynamic posture features of healthy infants are estimated by the maximum likelihood method, as follows:
$$\mu=\frac{1}{m}\sum_{i=1}^{m}x^{(i)}$$
$$\Sigma=\frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^{T}$$
the above parameters are defined as described above and are not described in detail.
7. Practical application
In the application stage, based on the constructed feature space, the trained infant supine position posture estimation model, and the trained baby dynamic posture feature extraction network, the early screening system for cerebral palsy based on baby dynamic posture estimation can be established. It consists of a healthy infant feature space acquisition module, a baby video acquisition module, a static posture feature extraction module, a dynamic posture feature extraction module, and an abnormality detection module. In the system, a supine position video of any infant is input into the baby video acquisition module, the baby static posture feature of each frame in the video is extracted by the static posture feature extraction module, and the trained baby dynamic posture feature extraction network then extracts the dynamic posture features. The risk of abnormal posture for the current infant is then evaluated against the distribution of the dynamic posture features of healthy infants. If the risk value is larger than a certain threshold, that is, if the probability density under the healthy-infant distribution falls below a threshold, the current infant is determined to be at risk of cerebral palsy.
It should be noted that the threshold value of the probability density for determining whether the infant is at risk of cerebral palsy needs to be determined in practical applications by considering the recall rate and accuracy of the screening.
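One way to examine this trade-off is to evaluate candidate thresholds on a validation set, as sketched below. The existence of such a validation set, with clips labeled by clinicians as at risk or not, is an assumption made for illustration; the source describes only healthy-infant data. The function names are hypothetical, and thresholds are expressed on the log-density scale.

```python
def recall_and_accuracy(log_densities, labels, log_eps):
    """Screening recall and accuracy for one candidate threshold.

    log_densities: log p(x*) of each validation clip under the healthy model.
    labels: True where the infant was judged at risk (hypothetical labels).
    log_eps: candidate threshold; clips with lower log-density are flagged.
    """
    flagged = [ld < log_eps for ld in log_densities]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    tn = sum(1 for f, y in zip(flagged, labels) if not f and not y)
    positives = sum(1 for y in labels if y)
    recall = tp / positives if positives else 0.0
    accuracy = (tp + tn) / len(labels) if labels else 0.0
    return recall, accuracy


def sweep_thresholds(log_densities, labels, candidates):
    """Return (threshold, recall, accuracy) for every candidate threshold."""
    return [(c,) + recall_and_accuracy(log_densities, labels, c)
            for c in candidates]
```

For a screening tool, a threshold that keeps recall high is usually preferred even if more healthy infants are referred for follow-up, since a missed case is more costly than an extra examination.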
Also, based on the same inventive concept, another preferred embodiment of the present invention further provides a computer electronic device corresponding to the early screening system for cerebral palsy based on baby dynamic posture estimation provided by the foregoing embodiments, which is characterized by comprising a memory and a processor;
the memory for storing a computer program;
the processor, when executing the computer program, is capable of outputting the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation as described in the foregoing embodiments.
Also, based on the same inventive concept, another preferred embodiment of the present invention further provides a computer-readable storage medium corresponding to the early screening system for cerebral palsy based on baby dynamic posture estimation provided by the foregoing embodiments, wherein the storage medium has a computer program stored thereon, and when the computer program is executed by a processor, the computer program can output the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation as described in the foregoing embodiments.
The modules in the early screening system for cerebral palsy based on baby dynamic posture estimation are executed as program modules which are executed in sequence, so that the process of data processing is executed essentially. Specifically, when being executed, the computer program in the above embodiment is equivalent to executing a method for screening early cerebral palsy based on baby dynamic posture estimation, and the process is as follows:
step 1, acquiring a feature space constructed by a baby dynamic posture feature sequence of a healthy baby without cerebral palsy;
step 2, acquiring a video segment to be detected, wherein the baby to be screened is in a non-shielding supine position and moves autonomously;
step 3, extracting baby posture information in the to-be-detected video clip frame by frame through a pre-trained baby supine position posture estimation model, and coding each frame of baby posture information to obtain corresponding baby static posture characteristics; the infant supine position posture estimation model is obtained by fine tuning a pre-trained human body posture estimation model on an infant supine position posture estimation data set, wherein the infant supine position posture estimation data set is composed of key frames of a supine position posture infant motion video marked by joint points;
step 4, taking the baby static posture feature of each frame in the video clip to be detected as input, encoding the whole video clip to be detected with the pre-trained baby dynamic posture feature extraction network, and obtaining the baby dynamic posture feature sequence of the video clip to be detected, which integrates time-sequence information;
step 5, carrying out anomaly detection on the baby dynamic posture feature sequence of the video clip to be detected in the feature space, and outputting a detection result indicating the risk that the infant in the video suffers from cerebral palsy.
Because the principle of the method for screening early cerebral palsy based on baby dynamic posture estimation is similar to that of the system for screening early cerebral palsy based on baby dynamic posture estimation in the above embodiment of the present invention, the detailed implementation form of each module of the device in this embodiment may also be referred to the detailed implementation form of the above system part, and repeated details are not repeated.
It is understood that the storage medium and the memory may be Random Access Memory (RAM) or Non-Volatile Memory (NVM), such as at least one disk memory. The storage medium may also be any of various media capable of storing program code, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
It is understood that the Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
It should be further noted that, as will be clearly understood by those skilled in the art, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again. In the embodiments provided in the present application, the division of the steps or modules in the apparatus and method is only one logical function division, and in actual implementation, there may be another division manner, for example, multiple modules or steps may be combined or may be integrated together, and one module or step may also be split.
Also, based on the same inventive concept, another preferred embodiment of the present invention further provides an early screening device for cerebral palsy, which corresponds to the early screening system for cerebral palsy based on baby dynamic posture estimation provided by the foregoing embodiment, and comprises a video capturing device and a detecting device;
the video acquisition equipment is used for shooting a video clip of the infant to be screened, which is in a non-shielding supine position and moves autonomously, and storing the shot video clip for the detection equipment to read;
the detection device is used for reading the video segment shot by the video acquisition device as a video segment to be detected, detecting by using the early screening system for cerebral palsy based on baby dynamic posture estimation in the embodiment, and outputting the detection result.
It should be noted that the video acquisition equipment may be a video camera, or a camera built into a mobile phone or other device, which captures and saves the video of the infant's activity after receiving a capture instruction. The detection device may be a data processing device with a result prompting function, such as a computer that shows the detection result on a display, a computer that outputs a detection result report, or a web page backed by a cloud server that accepts an uploaded video clip and displays the detection result.
It should be further noted that, as will be clearly understood by those skilled in the art, for convenience and brevity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again. In the embodiments provided in the present application, the division of the steps or modules in the system and method is only one logical function division, and in actual implementation, there may be another division manner, for example, multiple modules or steps may be combined or may be integrated together, and one module or step may also be split.
The above-described embodiments are merely preferred embodiments of the present invention, which should not be construed as limiting the invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, the technical solutions obtained by means of equivalent substitution or equivalent transformation all fall within the protection scope of the present invention.

Claims (9)

1. An early screening system for cerebral palsy based on baby dynamic posture estimation, comprising:
the healthy infant feature space acquisition module is used for acquiring a feature space constructed by an infant dynamic posture feature sequence of a healthy infant without cerebral palsy;
the baby video acquisition module is used for acquiring a video segment to be detected, wherein the baby to be screened is in a non-shielding supine position and moves autonomously;
the static posture feature extraction module is used for extracting the baby posture information in the video clip to be detected frame by frame through a pre-trained baby supine position posture estimation model and coding each frame of baby posture information to obtain the corresponding baby static posture feature; the infant supine position posture estimation model is obtained by fine tuning a pre-trained human body posture estimation model on an infant supine position posture estimation data set, wherein the infant supine position posture estimation data set is composed of key frames of a supine position posture infant motion video marked by joint points;
the dynamic posture feature extraction module is used for taking the baby static posture feature of each frame in the video clip to be detected as input, encoding the whole video clip to be detected with the pre-trained baby dynamic posture feature extraction network, and obtaining the baby dynamic posture feature sequence of the video clip to be detected, which integrates time-sequence information;
the abnormality detection module is used for carrying out abnormality detection on the baby dynamic posture characteristic sequence of the video segment to be detected in the characteristic space and outputting a detection result that the baby in the video has cerebral palsy risk;
in the abnormality detection module, the specific steps of abnormality detection are as follows:
first, a high-dimensional Gaussian distribution $p(x; \mu, \Sigma)$ is used to fit the distribution, in the feature space, of the infant dynamic posture feature sequences of all healthy infants without cerebral palsy:
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
wherein $x \in \mathbb{R}^{n}$ is an infant dynamic posture feature sequence, $\mu \in \mathbb{R}^{n}$ is the mean vector of the n-dimensional Gaussian distribution, and $\Sigma \in \mathbb{R}^{n \times n}$ is the covariance matrix of the n-dimensional Gaussian distribution; the estimated values of the parameters $\mu$ and $\Sigma$ obtained by fitting are as follows:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}$$
$$\Sigma = \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^{T}$$
wherein $x^{(i)}$ is the infant dynamic posture feature sequence of the i-th healthy infant, and $m$ is the total number of infant dynamic posture feature sequences in the feature space;
then, for the infant dynamic posture feature sequence $x^{*}$ of the video clip to be detected, its probability density value under the healthy infant dynamic posture distribution is calculated:
$$p(x^{*}) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x^{*}-\mu)^{T}\Sigma^{-1}(x^{*}-\mu)\right)$$
finally, a risk judgment is made according to the calculated probability density value $p(x^{*})$: if $p(x^{*})$ is less than the threshold $\varepsilon$, the infant in the video clip to be detected is considered to be at risk of cerebral palsy; otherwise, the infant in the video clip to be detected is considered to be a normal infant without cerebral palsy.
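The abnormality detection steps of claim 1 can be illustrated with a minimal Python sketch. It assumes that each infant dynamic posture feature sequence has been flattened into a plain list of n floats and that the fitted covariance matrix is positive definite (which generally requires more healthy samples than feature dimensions). The names `fit_gaussian`, `log_density` and `at_risk` are hypothetical and are not taken from the patent. Evaluating the density in log form through a Cholesky factorisation is an implementation choice made here for numerical stability, not part of the claim.

```python
import math

def fit_gaussian(samples):
    """Estimate the mean vector and covariance matrix with 1/m normalisation."""
    m, n = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / m for j in range(n)]
    sigma = [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in samples) / m
              for b in range(n)] for a in range(n)]
    return mu, sigma

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def log_density(x, mu, sigma):
    """Log of the n-dimensional Gaussian density p(x; mu, sigma)."""
    n = len(x)
    low = cholesky(sigma)
    d = [xi - mi for xi, mi in zip(x, mu)]
    # Forward substitution solves L z = (x - mu); z.z is the Mahalanobis term.
    z = []
    for i in range(n):
        z.append((d[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    mahalanobis = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + mahalanobis)

def at_risk(x_star, mu, sigma, epsilon):
    """True when p(x*) < epsilon, compared in log space (epsilon must be > 0)."""
    return log_density(x_star, mu, sigma) < math.log(epsilon)

# Toy feature space: four "healthy" samples in two dimensions (illustrative data only).
healthy = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, sigma = fit_gaussian(healthy)
```

With this toy data the fitted mean is (1, 1) and the covariance is the identity, so a distant point such as (4, 4) falls below a threshold of 0.01 and is flagged, while the mean itself is not. How the threshold is chosen in practice is not specified in this claim.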
2. The early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in claim 1, wherein, in the healthy infant feature space obtaining module, the feature space is obtained as follows:
S1: acquiring a healthy infant video data set, wherein the infant in each video clip sample in the data set is within a preset age range and is a healthy infant without cerebral palsy, and the infant in the video must be in an unoccluded supine position and moving spontaneously;
S2: extracting frames from each video clip sample in the healthy infant video data set, randomly selecting some key frames from them, and annotating the infant joint points in each key frame to form annotated frames carrying infant posture information; constructing the infant supine position posture estimation data set from the annotated frames of all video clip samples; fine-tuning a pre-trained human body posture estimation model on the infant supine position posture estimation data set to obtain the infant supine position posture estimation model;
S3: extracting, frame by frame, the infant posture information in each video clip sample in the healthy infant video data set using the infant supine position posture estimation model, and encoding the infant posture information of each frame to obtain the infant static posture feature of that frame;
S4: for each video clip sample in the healthy infant video data set, taking the infant static posture feature of each frame in the video clip sample as input, encoding the whole video clip sample through the infant dynamic posture feature extraction network, and constructing an infant dynamic posture feature sequence that integrates temporal information; training the infant dynamic posture feature extraction network in a self-supervised manner through a mask reconstruction task to obtain the pre-trained infant dynamic posture feature extraction network;
S5: encoding all video clip samples in the healthy infant video data set one by one using the pre-trained infant dynamic posture feature extraction network to obtain the infant dynamic posture feature sequence corresponding to each video clip sample, thereby forming the feature space used for screening infants at risk of cerebral palsy through abnormality detection.
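The data flow of these steps can be sketched in Python as follows. This is only an outline under stated assumptions: the pose estimator, the static encoder and the dynamic encoder are passed in as hypothetical callables (`estimate_pose`, `encode_static`, `encode_dynamic`), since the claim does not fix their interfaces, and `select_key_frames` shows just one plausible way to pick random key frames for annotation.

```python
import random

def select_key_frames(num_frames, k, rng):
    """Randomly choose k distinct frame indices to annotate, in temporal order."""
    return sorted(rng.sample(range(num_frames), k))

def build_feature_space(video_clips, estimate_pose, encode_static, encode_dynamic):
    """Encode every healthy-infant clip into one dynamic posture feature sequence.

    video_clips    : list of clips, each a list of frames
    estimate_pose  : frame -> posture information (hypothetical interface)
    encode_static  : posture information -> static posture feature
    encode_dynamic : list of static features -> dynamic posture feature sequence
    """
    feature_space = []
    for frames in video_clips:
        static = [encode_static(estimate_pose(frame)) for frame in frames]
        feature_space.append(encode_dynamic(static))
    return feature_space

# Stand-in callables so the sketch runs without any trained model.
clips = [[1, 2, 3], [4, 5]]
space = build_feature_space(
    clips,
    estimate_pose=lambda frame: {"value": frame},
    encode_static=lambda pose: [float(pose["value"])],
    encode_dynamic=lambda feats: [sum(f[0] for f in feats) / len(feats)],
)
key_frames = select_key_frames(10, 3, random.Random(0))
```

Passing the models in as callables keeps the outline independent of any particular deep learning framework; a real system would substitute the fine-tuned pose model and the pre-trained encoder.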
3. The early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in claim 1 or 2, wherein the infant supine position posture estimation model takes OpenPose, a human body posture estimation model pre-trained on a human body posture estimation data set, as the training starting point, and the OpenPose model parameters are fine-tuned with the back-propagation algorithm on the manually annotated infant supine position posture estimation data set, thereby obtaining a neural network model dedicated to infant supine position posture estimation; for an image of an infant in the supine position, the infant supine position posture estimation model extracts the position information of 8 body parts as posture information, the extracted body parts including both eyes, the neck, both shoulders, both elbows, both wrists, the hip, both knees and both ankles.
4. The system as claimed in claim 3, wherein the method for encoding the infant posture information of each frame is as follows:
firstly, a two-dimensional plane is established with the center point between the infant's neck and hip as the origin, and the body-part position information is normalized by the distance from the infant's neck to hip, so as to eliminate the influence of the infant's body size; subsequently, the position offset vectors between different body parts are calculated, including: from the neck to the hip, from the neck to the shoulder on each side of the body, from the shoulder to the elbow on each side, from the elbow to the wrist on each side, from the neck to the center of the two eyes, from the hip to the knee on each side, and from the knee to the ankle on each side; then, the angles between adjacent offset vectors are calculated, including: the angle at the shoulder between the vectors pointing to the neck and to the elbow, the angle at the elbow between the vectors pointing to the shoulder and to the wrist, the angle at the hip between the vectors pointing to the neck and to the knee, and the angle at the knee between the vectors pointing to the hip and to the ankle, each angle being calculated for both sides of the body; and finally, the body-part position information, the position offset vectors and the angles between adjacent offset vectors are combined and encoded into a multi-dimensional vector, which serves as the infant static posture feature.
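The encoding described in claim 4 can be sketched in Python as below. The keypoint names (for example `shoulder_l`), the ordering of the output vector and the resulting length of 60 are assumptions made for illustration; the claim only requires that positions, offset vectors and angles are combined into one multi-dimensional vector. Angles are returned in radians, and a degenerate angle (two coincident points) is set to 0 as an assumed convention.

```python
import math

SIDES = ("l", "r")  # hypothetical left/right suffixes
KEYPOINTS = ["eye_l", "eye_r", "neck", "shoulder_l", "shoulder_r", "elbow_l",
             "elbow_r", "wrist_l", "wrist_r", "hip", "knee_l", "knee_r",
             "ankle_l", "ankle_r"]

def _sub(a, b):
    """Offset vector pointing from b to a."""
    return (a[0] - b[0], a[1] - b[1])

def angle_at(center, p, q):
    """Angle in radians at `center` between the vectors pointing to p and to q."""
    u, v = _sub(p, center), _sub(q, center)
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0  # assumed convention for coincident points
    cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))

def encode_static_posture(kp):
    """kp maps each name in KEYPOINTS to an (x, y) image position."""
    neck, hip = kp["neck"], kp["hip"]
    scale = math.hypot(neck[0] - hip[0], neck[1] - hip[1])
    if scale == 0.0:
        raise ValueError("neck and hip coincide; cannot normalise")
    ox, oy = (neck[0] + hip[0]) / 2.0, (neck[1] + hip[1]) / 2.0
    # Origin at the neck-hip center point, unit length = neck-to-hip distance.
    p = {k: ((kp[k][0] - ox) / scale, (kp[k][1] - oy) / scale) for k in KEYPOINTS}
    eye_c = ((p["eye_l"][0] + p["eye_r"][0]) / 2.0,
             (p["eye_l"][1] + p["eye_r"][1]) / 2.0)

    offsets = [_sub(p["hip"], p["neck"]), _sub(eye_c, p["neck"])]
    angles = []
    for s in SIDES:
        sh, el, wr = p["shoulder_" + s], p["elbow_" + s], p["wrist_" + s]
        kn, an = p["knee_" + s], p["ankle_" + s]
        offsets += [_sub(sh, p["neck"]), _sub(el, sh), _sub(wr, el),
                    _sub(kn, p["hip"]), _sub(an, kn)]
        angles += [angle_at(sh, p["neck"], el), angle_at(el, sh, wr),
                   angle_at(p["hip"], p["neck"], kn), angle_at(kn, p["hip"], an)]

    feature = [c for k in KEYPOINTS for c in p[k]]   # 14 points  -> 28 values
    feature += [c for v in offsets for c in v]       # 12 vectors -> 24 values
    feature += angles                                # 8 angles
    return feature
```

Because the coordinates are centered and divided by the neck-to-hip distance before anything else is computed, the resulting vector is unchanged when the whole pose is translated or uniformly scaled in the image, which is the stated purpose of the normalization.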
5. The early screening system for cerebral palsy based on baby dynamic posture estimation according to claim 1, wherein the infant dynamic posture feature extraction network uses a Transformer encoder as its network structure, and its input is the infant static posture feature of each frame together with the time interval between that frame's time and the start time of the video clip.
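As a small illustration of the input described in claim 5, the Python sketch below pairs each frame's static posture feature with its time offset from the start of the clip. The claim does not say how the time interval is injected into the Transformer encoder; appending it as one extra input dimension is purely an assumption made here, and `build_encoder_inputs` is a hypothetical helper name.

```python
def build_encoder_inputs(static_features, frame_times):
    """Return one input vector per frame: static feature plus time since clip start.

    static_features : list of per-frame static posture features (lists of floats)
    frame_times     : list of frame timestamps in seconds, same length and order
    """
    if len(static_features) != len(frame_times):
        raise ValueError("one timestamp is required per frame")
    if not frame_times:
        return []
    start = frame_times[0]
    # Assumption: the interval is appended as a final extra dimension.
    return [list(feat) + [t - start]
            for feat, t in zip(static_features, frame_times)]

tokens = build_encoder_inputs(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [2.0, 2.5, 3.0],
)
```

Using the offset from the clip start, rather than the absolute timestamp or the frame index, keeps the input meaningful when clips are recorded at different frame rates.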
6. The early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in claim 2, wherein, when the infant dynamic posture feature extraction network is pre-trained in a self-supervised manner, mask reconstruction is used as a proxy task to generate the training signal, so as to strengthen the temporal information in the encoded features, and the specific training steps are as follows:
S41: sampling a training sample from the healthy infant video data set, randomly choosing one frame of the sampled video clip sample as the replaced position, replacing the infant static posture feature at the replaced position with a fixed random code, and keeping the infant static posture features of the remaining frames of the video clip sample unchanged, to form a replaced input sequence;
S42: encoding the replaced input sequence using the infant dynamic posture feature extraction network to obtain the encoded dynamic posture feature corresponding to the replaced position;
S43: using a multilayer perceptron network, with the encoded dynamic posture feature corresponding to the replaced position as input, to predict the infant static posture feature that was replaced at that position, thereby completing the feature reconstruction;
S44: evaluating the feature reconstruction quality with a two-norm loss, taking the feature reconstruction quality as the training objective, optimizing the parameters of the infant dynamic posture feature extraction network with the back-propagation algorithm, and, after training to convergence, obtaining the pre-trained infant dynamic posture feature extraction network for use in the actual infant dynamic posture feature extraction task.
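The mask reconstruction signal of steps S41 to S44 can be sketched in Python as follows. The encoder and the multilayer perceptron head are passed in as hypothetical callables (`encoder`, `head`), and the sketch stops at computing the loss: the back-propagation update would need an automatic differentiation framework and is not shown. Taking the two-norm loss as the unsquared Euclidean distance is an assumption; the claim does not say whether the norm is squared.

```python
import math
import random

def mask_one_frame(sequence, mask_code, rng):
    """S41: replace the static feature of one random frame with the fixed code."""
    pos = rng.randrange(len(sequence))
    replaced = [list(frame) for frame in sequence]  # copy; the input is untouched
    target = list(sequence[pos])
    replaced[pos] = list(mask_code)
    return replaced, pos, target

def two_norm(pred, target):
    """Euclidean distance between the predicted and the original feature."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))

def reconstruction_loss(sequence, mask_code, encoder, head, rng):
    """S42-S44: encode the replaced sequence, predict the masked feature, score it."""
    replaced, pos, target = mask_one_frame(sequence, mask_code, rng)
    encoded = encoder(replaced)       # one encoded dynamic feature per frame
    predicted = head(encoded[pos])    # reconstruct the replaced static feature
    return two_norm(predicted, target)

# The code is drawn at random once and then kept fixed for all training samples.
code_rng = random.Random(0)
MASK_CODE = [code_rng.uniform(-1.0, 1.0) for _ in range(2)]
```

With identity functions standing in for the encoder and the head, the loss reduces to the distance between the fixed code and the original feature, which is a convenient sanity check before plugging in real networks.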
7. A computer electronic device comprising a memory and a processor;
the memory for storing a computer program;
the processor, when executing the computer program, is capable of outputting the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in any one of claims 1 to 6.
8. A computer-readable storage medium, wherein the storage medium has a computer program stored thereon, and when the computer program is executed by a processor, the computer program can output the detection result by using the early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in any one of claims 1 to 6.
9. An early screening apparatus for cerebral palsy, comprising a video acquisition device and a detection device;
the video acquisition device is used for capturing a video clip of an infant to be screened who is in an unoccluded supine position and moving spontaneously, and for storing the captured video clip for the detection device to read;
the detection device is used for reading the video clip captured by the video acquisition device as the video clip to be detected, detecting it using the early screening system for cerebral palsy based on baby dynamic posture estimation as claimed in any one of claims 1 to 6, and outputting the detection result.
CN202210622793.5A 2022-05-27 2022-06-02 Early screening system, equipment and storage medium for cerebral palsy based on baby dynamic posture estimation Active CN114999648B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210593777 2022-05-27
CN2022105937778 2022-05-27

Publications (2)

Publication Number Publication Date
CN114999648A (en) 2022-09-02
CN114999648B (en) 2023-03-24

Family

ID=83031834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210622793.5A Active CN114999648B (en) 2022-05-27 2022-06-02 Early screening system, equipment and storage medium for cerebral palsy based on baby dynamic posture estimation

Country Status (1)

Country Link
CN (1) CN114999648B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414306A (en) * 2019-04-26 2019-11-05 吉林大学 A kind of Infants With Abnormal behavioral value method based on meanshift algorithm and SVM

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BR102015005444A2 (en) * 2015-03-11 2016-09-13 Mirca Christina Da Silva Batista wireless electromyograph equipment and its operating system
KR101738923B1 (en) * 2015-11-04 2017-05-24 (주) 어드밴스드 엔티 The pharmaceutical composition for the prophylaxis or treatment of epilepsy or seizure-related disease targeting miRNA to regulate the expression of TSC1 and mTOR protein and method for screening
CN109620244B (en) * 2018-12-07 2021-07-30 吉林大学 Infant abnormal behavior detection method based on condition generation countermeasure network and SVM
CN112842261B (en) * 2020-12-30 2021-12-28 西安交通大学 Intelligent evaluation system for three-dimensional spontaneous movement of infant based on complex network
CN112932445A (en) * 2021-01-26 2021-06-11 芭雅医院投资管理(上海)有限公司 Child cerebral palsy and cranial nerve injury assessment instrument based on big data and use method thereof
CN113180641A (en) * 2021-04-12 2021-07-30 福建省立医院 Characteristic gait recognition system for instable chronic ankle joint
CN113111844B (en) * 2021-04-28 2022-02-15 中德(珠海)人工智能研究院有限公司 Operation posture evaluation method and device, local terminal and readable storage medium
CN113903468A (en) * 2021-10-30 2022-01-07 重庆喜来云科技有限公司 Mobile phone terminal-based ultra-early intelligent screening and monitoring method for high-risk infants
CN114129175A (en) * 2021-11-19 2022-03-04 江苏科技大学 LSTM and BP based motor imagery electroencephalogram signal classification method
CN114081447B (en) * 2021-11-22 2024-04-02 西安交通大学 Infant brain development state evaluation system based on common video input
CN114399838A (en) * 2022-01-18 2022-04-26 深圳市广联智通科技有限公司 Multi-person behavior recognition method and system based on attitude estimation and double classification

Also Published As

Publication number Publication date
CN114999648A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Rui et al. Segmenting visual actions based on spatio-temporal motion patterns
US11482048B1 (en) Methods and apparatus for human pose estimation from images using dynamic multi-headed convolutional attention
CN106909938B (en) Visual angle independence behavior identification method based on deep learning network
Nguyen-Thai et al. A spatio-temporal attention-based model for infant movement assessment from videos
CN110991340B (en) Human body action analysis method based on image compression
Vakanski et al. Mathematical modeling and evaluation of human motions in physical therapy using mixture density neural networks
CN110575663A (en) physical education auxiliary training method based on artificial intelligence
CN113435432B (en) Video anomaly detection model training method, video anomaly detection method and device
CN114269243A (en) Fall risk evaluation system
Williams et al. Assessment of physical rehabilitation movements through dimensionality reduction and statistical modeling
CN117690583B (en) Internet of things-based rehabilitation and nursing interactive management system and method
Elshwemy et al. A New Approach for Thermal Vision based Fall Detection Using Residual Autoencoder.
Kramer et al. Reconstructing nonlinear dynamical systems from multi-modal time series
Liu et al. Information recovery-driven deep incomplete multiview clustering network
Hristov Real-time abnormal human activity detection using 1DCNN-LSTM for 3D skeleton data
CN114027786B (en) Sleep breathing disorder detection method and system based on self-supervision type memory network
CN114999648B (en) Early screening system, equipment and storage medium for cerebral palsy based on baby dynamic posture estimation
CN111063438B (en) Sleep quality evaluation system and method based on infrared image sequence
Chen et al. An effective swimming stroke recognition system utilizing deep learning based on inertial measurement units
CN115546491B (en) Fall alarm method, system, electronic equipment and storage medium
CN116631619A (en) Postoperative leg bending training monitoring system and method thereof
CN116543455A (en) Method, equipment and medium for establishing parkinsonism gait damage assessment model and using same
CN115019388A (en) Full-automatic gait analysis method for shooting gait video by using monocular camera
Li et al. [Retracted] Human Sports Action and Ideological and Political Evaluation by Lightweight Deep Learning Model
Wahla et al. Visual fall detection from activities of daily living for assistive living

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant