WO2022022551A1 - Movement disorder assessment video analysis method and device with privacy protection function - Google Patents

Movement disorder assessment video analysis method and device with privacy protection function

Info

Publication number
WO2022022551A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
face
key points
face image
image
Prior art date
Application number
PCT/CN2021/108849
Other languages
English (en)
French (fr)
Inventor
眭亚楠
朱秉泉
李路明
Original Assignee
清华大学
Priority date
Filing date
Publication date
Application filed by 清华大学 filed Critical 清华大学
Publication of WO2022022551A1 publication Critical patent/WO2022022551A1/zh

Classifications

    • G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING; G06N — Computing arrangements based on specific computational models; G06V — Image or video recognition or understanding
    • G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/045 Combinations of networks; G06N3/08 Learning methods; G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/40 Extraction of image or video features; G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; G06V10/50 Extraction by performing operations within image blocks or by using histograms, e.g. histogram of oriented gradients [HoG]
    • G06V20/40 Scene-specific elements in video content; G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes; G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V40/10 Human or animal bodies; body parts, e.g. hands; G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions; G06V40/161 Detection; localisation; normalisation; G06V40/168 Feature extraction; face representation; G06V40/171 Local features and components; facial parts; geometrical relationships; G06V40/172 Classification, e.g. identification; G06V40/174 Facial expression recognition; G06V40/176 Dynamic expression
    • G06V40/20 Movements or behaviour, e.g. gesture recognition; G06V40/23 Recognition of whole body movements, e.g. for sport training

Definitions

  • The invention relates to the field of medical image analysis, and in particular to a method and device for analyzing movement disorder assessment videos with a privacy protection function.
  • Human facial expressions and body posture can reflect certain diseases. For example, Parkinson's disease can cause frozen expressions and movement disorders; doctors can make a corresponding diagnosis by assessing the patient's blinking, mouth opening, and walking.
  • The movement disorder assessment video contains the patient's own appearance, which can easily reveal the patient's identity and compromise the patient's privacy.
  • Existing video privacy protection technologies generally add a mosaic to the face region or modify the pixel values of the face. Such processing completely removes facial information, so disease information reflected in the face can no longer be assessed.
  • The present invention provides a movement disorder assessment video analysis method with a privacy protection function, including:
  • determining motion features used to assist in diagnosing a disease.
  • Performing face-swapping on the person in the video includes:
  • using an encoding network to extract features from the face image in the video to obtain feature data, including:
  • extracting features from the frontal face image with the encoding network to obtain feature data.
  • Replacing the face image in the video with the reconstructed face image includes:
  • Before replacing the face image in the video with the adjusted reconstructed face image, the method further includes:
  • adjusting the pixel values of the reconstructed face image according to the pixel values of the face image in the video, so that the color histogram of the adjusted reconstructed face image matches that of the face image in the video.
  • Performing face-swapping on the person in the video includes:
  • applying time-window median smoothing to the time series of face detection positions, after misdetections have been excluded, to stabilize the face detection position.
  • Excluding misdetection information includes:
  • The key points include facial key points; determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease includes:
  • determining, from the change of the facial region over the recording time, facial expression features used to assist in diagnosing a disease.
  • The facial key points include a plurality of periocular key points; determining the motion features includes:
  • determining the blinking frequency from the change of the open-eye area.
  • The facial key points include a plurality of perioral key points; determining the motion features includes:
  • determining the change of the mouth-opening area from the change of the mouth region over the recording time.
  • The key points include ankle key points; determining the motion features includes:
  • determining cadence (step-frequency) information from the stepping motion.
  • The key points include a plurality of finger-joint key points; determining the motion features includes:
  • determining, from the pinching motion, the frequency, amplitude, and time trend of index finger-thumb pinching.
  • The key points include a plurality of finger-joint key points; determining the motion features includes:
  • determining the fist-clenching frequency from the fist-clenching motion.
  • The key points include wrist key points and elbow key points; determining the motion features includes:
  • determining the arm rotation speed from the alternating motion.
  • The key points include hip joint, shoulder joint, knee, and ankle key points; determining the motion features includes:
  • determining center-of-gravity offset information and the degree of center-of-gravity sway.
  • Identifying key points in the video includes:
  • determining the positions of the key points from the key point distribution probability information.
  • The present invention also provides a movement disorder assessment video analysis device with a privacy protection function, comprising at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above movement disorder assessment video analysis method with a privacy protection function.
  • The movement disorder assessment video is processed to obtain a face-swapped video. This achieves the purpose of protecting user privacy while reproducing the expressions with high fidelity.
  • Disease-related motion features can be determined and quantifiable key diagnostic indicators of the disease obtained, giving the motion video greater medical value. This can effectively assist doctors in diagnosing related diseases and is highly practical.
  • Fig. 1 is a flowchart of the movement disorder assessment video analysis method in an embodiment of the present invention.
  • Fig. 2 is an image marked with key points in an embodiment of the present invention.
  • Fig. 3 is a key point distribution probability map obtained with a neural network model in an embodiment of the present invention.
  • Fig. 4 is a plot of open-eye area versus recording progress in an embodiment of the present invention.
  • Fig. 5 is a plot of ankle distance versus recording progress in an embodiment of the present invention.
  • Fig. 6 is a working principle diagram of the face-swapping network in the training phase in an embodiment of the present invention.
  • Fig. 7 is a working principle diagram of a specific face-swapping network in the training phase in an embodiment of the present invention.
  • Fig. 8 is a working principle diagram of the face-swapping network in the face-swapping phase in an embodiment of the present invention.
  • Fig. 9 is a plot comparing open-eye area versus recording progress before and after face-swapping in an embodiment of the present invention.
  • An embodiment of the present invention provides a movement disorder assessment video analysis method; the method can be executed by an electronic device such as a computer, a portable terminal, or a server. As shown in Fig. 1, the method includes the following steps:
  • The video is a recording of a certain length shot for a user (patient).
  • The user completes prescribed actions during shooting, such as walking, sitting down, and standing up, so that the video records the patient's movement over a period of time.
  • S3: identify the key points of the human body in the video after the face-swapping.
  • Key points include, for example, the wrists, elbows, shoulders, hips, knees, and ankles, as well as facial marker points around the eyes and mouth.
  • The image is one frame of the video, in which multiple whole-body key points are marked. Which key points are used may depend on the motor function being assessed.
  • A deep neural network is used to identify the above key points.
  • The neural network is pre-trained on person images annotated with key points so that it can identify the required key points in an image.
  • Each frame of the video is extracted and input to the trained neural network.
  • The neural network in this embodiment processes the input image and outputs, for each key point, distribution probability information, which may be a distribution probability map as shown in Fig. 3.
  • Fig. 3 is a heatmap of the distribution probability of the left ankle key point (Heatmap of Left Ankle). The abscissa is the x coordinate of the input image, the ordinate is the y coordinate, and the legend on the right indicates the heat: the higher the heat, the darker the color. The position with the highest probability value in the map, i.e., the greatest heat, is taken as the predicted position of the left ankle key point. In this way the pixel position of each body key point in the picture can be output. If the probability value is below a given threshold, the key point is judged not to appear.
  • Motion feature information includes, for example, facial expression information, which can help doctors judge whether there is facial freezing caused by Parkinson's disease, and foot/leg motion state information, which can help doctors judge whether there are movement disorders caused by various diseases.
  • The movement disorder assessment video is processed to obtain a face-swapped video.
  • The video can retain the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting user privacy and reproducing the expressions with high fidelity.
  • The key points include a plurality of periocular key points.
  • Step S4 specifically includes:
  • S41A: determining the corresponding eye region from the plurality of periocular key points, i.e., the region enclosed by the periocular key points of the same eye.
  • S42A: determining the change of the open-eye area from the change of the eye region over the recording time.
  • The area enclosed by the marked periocular key points can be calculated and normalized, for example by dividing the area of the eye region by the square of the eye width to obtain the normalized open-eye area.
  • A plot of open-eye area versus recording progress can then be drawn and presented, such as the line graph shown in Fig. 4.
  • The peak points correspond to blinking actions, from which the subject's blinking frequency can be counted.
  • The resulting open-eye area and blinking frequency can serve as key indicators for diagnosing Parkinson's disease; doctors can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
  • The key points include a plurality of perioral key points.
  • Step S4 specifically includes:
  • S41B: determining the mouth region from the plurality of perioral key points.
  • Here the mouth contour refers to the inner mouth contour, i.e., the region enclosed by the inner-mouth key points.
  • S42B: determining the change of the mouth-opening area from the change of the mouth region over the recording time.
  • The area enclosed by the marked perioral key points can be calculated and normalized, for example by dividing the area of the mouth region by the square of the mouth width to obtain the normalized mouth-opening area.
  • A plot of mouth-opening area versus recording progress can then be drawn and presented, such as a line graph similar to Fig. 4, where the abscissa is the recording progress (time) and the ordinate is the mouth-opening area.
  • The resulting mouth-opening area (size) and its changes can serve as key indicators for diagnosing Parkinson's disease; doctors can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
  • The key points include ankle key points.
  • Step S4 specifically includes:
  • S41C: determining the stepping motion from the change of the ankle key point positions over the recording time.
  • The positions of the two ankles (points) are determined separately to track the change of their relative position; the stepping motion can be identified by peak detection on the inter-ankle distance.
  • A plot of the relative position versus recording progress can be drawn, such as the line graph shown in Fig. 5, where the abscissa is the recording progress (time) and the ordinate is the distance between the left and right ankle key points.
  • The resulting cadence information can serve as a key indicator for diagnosing Parkinson's disease; doctors can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
  • The key points may also include the key points of each finger joint.
  • By detecting the positions of the finger joints, the finger-tapping motion used in Parkinson's disease testing can be detected, and the frequency, amplitude, and time trend of index finger-thumb pinching can be calculated.
  • The palm motor function (fist-clenching motion) can likewise be detected, yielding information such as the fist-clenching frequency.
  • Key points can also include wrist and elbow key points.
  • From these, the alternating motion (forearm rotation) used in Parkinson's disease testing can be detected, and the arm rotation speed can be calculated.
  • Key points can also include hip joint, shoulder joint, and knee key points.
  • By detecting the positions of the hip joint, shoulder joint, knee, and ankle key points, the degree of gait impairment can be assessed, yielding information including the center-of-gravity offset and the degree of center-of-gravity sway.
  • The present invention also provides a movement disorder assessment video analysis device, comprising at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above movement disorder assessment video analysis method.
  • A machine learning algorithm is used to replace the face image; specifically, deep learning is used to perform the face-swapping on the person images.
  • The embodiments of the present invention provide a face-swapping model and a training method for it; the method can be executed by electronic devices such as a computer or a server.
  • The face-swapping model includes an encoding network 11, a first decoding network 12, and a second decoding network 13 (the networks described in this application are neural networks).
  • Each training sample in this embodiment includes a first face image (hereinafter, face X) and a second face image (hereinafter, face Y), i.e., face images of two different people.
  • A training sample includes one face X and one face Y; face X serves as the target to be replaced during face-swapping, and face Y is used to replace face X.
  • Face X serves as the target to be replaced during face-swapping.
  • Face Y is used to replace face X.
  • Both images are images of real people.
  • The focus of this solution is protecting the privacy of face X; face Y is used to replace face X to achieve the aim of the invention, so face Y can be a public image.
  • More than 1,000 first face images and more than 1,000 second face images suffice.
  • Preferably, the training data contain equal numbers of first and second face images, for example 5,000 of each, for a total of 10,000 face images.
  • The face-swapping model shown in Fig. 6 is trained.
  • Face X and face Y serve as the input data.
  • During training, the encoding network 11 extracts features from the first face image (face X) and the second face image (face Y) respectively, obtaining first feature data of the first face image (hereinafter, feature vector X) and second feature data of the second face image (hereinafter, feature vector Y).
  • The first decoding network 12 obtains a first reconstructed face image (hereinafter, face X') from the first feature data (feature vector X); the second decoding network 13 obtains a second reconstructed face image (hereinafter, face Y') from the second feature data (feature vector Y). The parameters of the face-swapping model, which include the weights of the layers of all three networks, are then optimized according to the difference (loss1) between the first reconstructed face image (face X') and the first face image (face X) and the difference (loss2) between the second reconstructed face image (face Y') and the second face image (face Y).
  • The loss function is obtained by computing the difference between face X' and face X and the difference between face Y' and face Y.
  • The backpropagation algorithm is used to compute the weight updates for each layer of the networks and to update those weights.
  • The difference can be expressed with DSSIM (Structural Dissimilarity): $\mathrm{DSSIM}(x,x') = \frac{1-\mathrm{SSIM}(x,x')}{2}$, with $\mathrm{SSIM}(x,x') = \frac{(2\mu_x\mu_{x'}+C_1)(2\sigma_{xx'}+C_2)}{(\mu_x^2+\mu_{x'}^2+C_1)(\sigma_x^2+\sigma_{x'}^2+C_2)}$, where $\mu_x$ and $\sigma_x^2$ are the mean and variance over the pixels of face x, $\mu_{x'}$ and $\sigma_{x'}^2$ those of face x', $\sigma_{xx'}$ is the covariance of x and x', $C_1=(0.01)^2$, and $C_2=(0.03)^2$.
  • The networks then complete training.
  • The trained encoding network 11 can effectively extract face feature vectors, and the first decoding network 12 and second decoding network 13 can reconstruct the face feature vectors into the corresponding face pictures.
  • The adopted face-swapping model includes one encoding network and two decoding networks.
  • The encoding network can accurately extract feature information from the two face images.
  • The decoding networks can accurately reconstruct the face images, restoring the expressions and lighting of the original images with high fidelity.
  • The first decoding network reconstructs an image from the feature information of the user's face image.
  • The second decoding network reconstructs an image from the feature information of the public figure's image.
  • A face-swapped reconstructed image can thus be obtained.
  • The image can retain the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting the user's privacy and reproducing the expressions with high fidelity.
  • The network structure shown in Fig. 7 is adopted. The encoding network 11 consists, connected in sequence, of four Conv2D (two-dimensional convolution) layers, a Reshape (shape adjustment) layer, two Dense (fully connected) layers, another Reshape layer, and an Upscale (upscaling) layer. The convolution layers perform feature extraction, outputting high-dimensional features; the first Reshape layer flattens the extracted features into a one-dimensional vector so that the subsequent fully connected layers can extract features further; the second Reshape layer restores a suitable shape, and the Upscale layer enlarges it to a suitable size. In this way features can be extracted separately from face X and face Y, yielding two 8x8x512-dimensional feature vectors.
  • The first decoding network 12 and the second decoding network 13 have the same structure, consisting of three Upscale-Conv2D blocks connected in sequence; they reconstruct face X' and face Y' from the two 8x8x512-dimensional feature vectors respectively.
  • The decoding network first enlarges the feature vector to a suitable size, then processes it and outputs the reconstructed face image. After training, the parameters of each layer of the decoding network represent a specific face, while the feature vector represents the facial expression information.
  • The vector passes through a decoding network to form the reconstructed face.
  • Fig. 7 shows one verified network form; the present invention is not limited to this network structure.
  • The above solution can be used to process the movement disorder video, replacing the real patient's face with a public face.
  • This embodiment obtains part of the training data from the patient's movement disorder video. Specifically, a motion video of the patient is obtained first; this is the full-body video used for analyzing human motion features.
  • Faces can be detected in the movement disorder assessment video: for example, multiple frames are extracted and the face positions in them detected, and the resulting face images serve as the first face images (face X), i.e., the replacement target.
  • Multiple face images of another person (a public figure) serve as the second face images (face Y).
  • When acquiring training data, the training samples should include first face images with different shooting angles and/or different lighting conditions and/or different expressions; for example, the 5,000 face X pictures cover different angles, lighting, and expressions. Correspondingly, the training samples include second face images with different shooting angles and/or different lighting conditions and/or different expressions, for example 5,000 face Y images covering different angles, lighting, and expressions.
  • An embodiment of the present invention provides a method for face-swapping a person image; the method can be executed by electronic devices such as a computer or a server.
  • The method includes the following steps:
  • S1A: train a face-swapping model for a particular person.
  • The face image of that person is used as the first face image, and the face image of another person (a public figure) as the second face image.
  • See Fig. 6 and Fig. 7 and the related description; details are not repeated here.
  • This step differs from the model training process.
  • The first decoding network 12 is no longer needed; instead, the second decoding network 13 obtains the reconstructed face image from feature vector X. During training the second decoding network 13 learned the appearance information of the public figure, but the feature vector input at this point carries the information of face X, so the image obtained here is neither the above face X' nor the above face Y': the reconstructed face image has the appearance of face Y, while its expression and lighting retain the information of face X.
  • The adopted face-swapping model includes one encoding network and two decoding networks, and the model is trained on the user's real face images and a public figure's face images, so that the encoding network can accurately extract feature information from the user's face image; using this feature information as the input of the second decoding network yields a face-swapped reconstructed image that retains the user's expressions and lighting while bearing the appearance of the public figure, thereby protecting user privacy and reproducing the expressions with high fidelity.
  • An embodiment of the present invention provides a method for face-swapping the person in a movement disorder assessment video; the method can be executed by electronic devices such as a computer or a server.
  • The method includes the following steps:
  • Face images are recognized in each of the images to be processed.
  • A face detection method (such as the dlib face recognition tool) detects the face in each picture extracted from the motion video and gives the face position, represented by the top-corner coordinates x, y of the detection box and the box height and width h, w. According to the detected face position, the face-region box is cropped from each motion video picture, forming the picture set of face X.
  • The face image recognized in this step is the image of the facial region from the eyebrows to the chin; the forehead above the eyebrows need not be recognized and replaced, which avoids the influence of hair and other factors on the face-swapping result.
  • S3B: train a face-swapping model for the patient.
  • The patient face images extracted and recognized from the video are used as the first face images (the picture set of face X), and the face images of another person (a public figure) as the second face images.
  • Each reconstructed face image has the appearance of the public figure, while its expression and lighting retain the patient's information.
  • Each reconstructed face picture is filled into the position of the face detected in the original motion video picture to complete the face swap, and the face-swapped motion video pictures are assembled into a video in chronological order.
  • Each frame of the movement disorder assessment video is processed to obtain a face-swapped reconstructed image, and the face-swapped images are re-assembled into a video. The video retains the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting user privacy and reproducing the expressions with high fidelity, which gives the motion video greater medical value. Extracting motion key points from the face-swapped motion video and analyzing the features they form enables medical evaluation of whole-body movement as well as targeted feature extraction and disease assessment for facial bradykinesia.
  • A selection of target faces may be provided; for example, the database stores multiple public face images of different genders, ethnicities, face shapes, and facial features.
  • The user can select several groups of face images, train a face-swapping model with each, and then run a similarity analysis on the face-swapping results to determine the face-swapping model best suited to the patient's face.
  • The face detection information in the video is error-corrected: in step S2B, face detection is first performed in the video, misdetection information is excluded from all detection results, and time-window median smoothing is then applied to the time series of face detection positions with the misdetections removed, to stabilize the face detection position.
  • In the first case, the user turns around in the video and the face cannot be recognized for a long stretch; in the second case, the user does not turn around, but face detection information is occasionally missing from the time series.
  • For the first kind of misdetection, this embodiment deletes the face detection information between the first moment the face cannot be detected and the last moment it cannot be detected; for the second kind, this embodiment interpolates from the face detection information of the frames before and after the gap to fill in the missing face detection information.
  • The patient may turn their head in the video, so profile (side-face) images are detected.
  • Since the face-swapping network used in this embodiment does not handle profiles as well as frontal faces, a profile image easily leaks the original facial information.
  • Further processing is therefore applied to profile images.
  • The profile image in the video is first converted into a frontal image; for example, a rotate-and-render model can rotate the profile image into a frontal image.
  • The encoding network 11 then extracts features from the frontal image to obtain feature data.
  • In step S6B, the reconstructed frontal image must first be converted into a reconstructed profile image; the reconstructed profile image then replaces the profile image in the video. This optimizes the face-swapping result for profiles and further protects user privacy.
  • The color of the face image reconstructed in step S5B often differs from the original face image, which makes the color of the face region in the face-swapped video inconsistent with the color of the forehead region.
  • The pixel values of the reconstructed face image are adjusted according to the pixel values of the face image in the video, so that the color histogram of the adjusted reconstructed face image matches that of the face image in the video.
  • For the original face image, the value distributions of all pixels in the R, G, and B channels are computed, giving per-channel histograms.
  • For the reconstructed face image, the value distributions of all pixels in the R, G, and B channels are likewise computed. Adjusting the color means adjusting the R, G, and B distributions of the reconstructed face image to resemble those of the original face image.
  • For a brightness value a in a channel of the original face image, the share p of all pixels with values up to a is computed; the brightness value b with the same cumulative share p is found in the corresponding channel of the reconstructed face image.
  • All pixels of value b in that channel of the new image are then changed to the value a, completing the histogram matching.
  • Key point verification can also be performed: for example, checking whether key points in the video are lost, whether the image sharpness meets expectations, whether key points are occluded, and whether the size meets the requirements. If the quality of the face-swapped video falls short of expectations, or even prevents accurate key point detection, the target face (the above second face) can be replaced and the face-swapping model retrained.
  • Embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media containing computer-usable program code.
  • Computer-usable storage media include, but are not limited to, disk storage, CD-ROM, and optical storage.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a movement disorder assessment video analysis method and device with a privacy protection function. The method includes: acquiring a movement disorder assessment video; performing face-swapping on the person in the video; identifying key points in the face-swapped video; and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease. According to the movement disorder assessment video analysis method and device provided by the present invention, the movement disorder assessment video is processed to obtain a face-swapped video that retains the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting the user's privacy. By extracting key points of the human body from the movement disorder assessment video and monitoring how the key points change as the recording progresses, disease-related motion features are determined, yielding quantifiable key diagnostic indicators that can effectively assist doctors in diagnosing related diseases.

Description

Movement disorder assessment video analysis method and device with privacy protection function

Technical Field

The present invention relates to the field of medical image analysis, and in particular to a movement disorder assessment video analysis method and device with a privacy protection function.
Background Art

A person's facial expressions and body posture can reflect certain diseases. For example, Parkinson's disease can cause frozen expressions and movement disorders; by assessing the patient's blinking, mouth opening, and walking, a doctor can make a corresponding diagnosis.

Traveling is inconvenient for patients with such disorders, so the patient's actual condition can be recorded on video and the doctor can watch the recording to make a preliminary diagnosis. However, the video content lacks quantifiable indicators and depends heavily on the doctor's experience and subjective judgment, which limits the medical value of the patient's motion video and leaves its practicality wanting. In addition, a movement disorder assessment video contains the patient's own appearance, which can easily reveal the patient's identity and compromise the patient's privacy. Existing video privacy protection techniques generally add a mosaic to the face region or modify the face's pixel values; such processing removes the facial information entirely, making it impossible to assess the disease information reflected in the face.
Summary of the Invention

In view of this, the present invention provides a movement disorder assessment video analysis method with a privacy protection function, comprising:

acquiring a movement disorder assessment video;

performing face-swapping on the person in the video;

identifying key points of the human body in the video after the face-swapping;

determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease.
Optionally, performing face-swapping on the person in the video comprises:

extracting features from the face image in the video with an encoding network to obtain feature data, wherein the face image is an image of the facial region from the eyebrows to the chin;

obtaining a reconstructed face image from the feature data with a decoding network;

replacing the face image in the video with the reconstructed face image.
Optionally, when the face image in the video is a profile image, extracting features from the face image in the video with the encoding network to obtain feature data comprises:

converting the profile image in the video into a frontal image;

extracting features from the frontal image with the encoding network to obtain feature data;

and replacing the face image in the video with the reconstructed face image comprises:

converting the reconstructed frontal image into a reconstructed profile image;

replacing the profile image in the video with the reconstructed profile image.
Optionally, before replacing the face image in the video with the adjusted reconstructed face image, the method further comprises:

adjusting the pixel values of the reconstructed face image according to the pixel values of the face image in the video, so that the color histogram of the adjusted reconstructed face image matches that of the face image in the video.
Optionally, performing face-swapping on the person in the video comprises:

performing face detection in the video;

excluding misdetection information;

applying time-window median smoothing to the time series of face detection positions after the misdetections have been excluded, to stabilize the face detection position.
Optionally, excluding misdetection information comprises:

deleting the face detection information between the first moment the face cannot be detected and the last moment the face cannot be detected; and/or

interpolating from the face detection information of the frames before and after a gap to fill in the missing face detection information.
Optionally, the key points comprise facial key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the corresponding facial region from the facial key points;

determining, from the change of the facial region over the recording time, facial expression features used to assist in diagnosing a disease.

Optionally, the facial key points comprise a plurality of periocular key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the corresponding eye region from the plurality of periocular key points;

determining the change of the open-eye area from the change of the eye region over the recording time;

determining the blinking frequency from the change of the open-eye area.

Optionally, the facial key points comprise a plurality of perioral key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the mouth region from the plurality of perioral key points;

determining the change of the mouth-opening area from the change of the mouth region over the recording time.

Optionally, the key points comprise ankle key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the stepping motion from the change of the ankle key point positions over the recording time;

determining cadence information from the stepping motion.

Optionally, the key points comprise a plurality of finger-joint key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the index finger-thumb pinching motion from the change of the finger-joint key point positions over the recording time;

determining, from the pinching motion, the frequency, amplitude, and time trend of index finger-thumb pinching.

Optionally, the key points comprise a plurality of finger-joint key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the fist-clenching motion from the change of the finger-joint key point positions over the recording time;

determining the fist-clenching frequency from the fist-clenching motion.

Optionally, the key points comprise wrist key points and elbow key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the alternating motion from the change of the wrist and elbow key point positions over the recording time;

determining the arm rotation speed from the alternating motion.

Optionally, the key points comprise hip joint key points, shoulder joint key points, knee key points, and ankle key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:

determining the gait from the change of the hip joint, shoulder joint, knee, and ankle key point positions over the recording time;

determining center-of-gravity offset information and center-of-gravity sway information from the gait.

Optionally, identifying key points in the video comprises:

recognizing each frame of the video with a neural network to obtain key point distribution probability information;

determining the positions of the key points from the key point distribution probability information.
Correspondingly, the present invention also provides a movement disorder assessment video analysis device with a privacy protection function, comprising at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above movement disorder assessment video analysis method with a privacy protection function.

According to the movement disorder assessment video analysis method and device provided by the embodiments of the present invention, the movement disorder assessment video is processed to obtain a face-swapped video that retains the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting the user's privacy and reproducing the expressions with high fidelity. By extracting key points of the human body from the movement disorder assessment video and monitoring how the key points change as the recording progresses, disease-related motion features are determined, yielding quantifiable key diagnostic indicators. This gives the motion video greater medical value, effectively assists doctors in diagnosing related diseases, and is highly practical.
Brief Description of the Drawings

To explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of the movement disorder assessment video analysis method in an embodiment of the present invention;

Fig. 2 is an image marked with key points in an embodiment of the present invention;

Fig. 3 is a key point distribution probability map obtained with a neural network model in an embodiment of the present invention;

Fig. 4 is a plot of open-eye area versus recording progress in an embodiment of the present invention;

Fig. 5 is a plot of ankle distance versus recording progress in an embodiment of the present invention;

Fig. 6 is a working principle diagram of the face-swapping network in the training phase in an embodiment of the present invention;

Fig. 7 is a working principle diagram of a specific face-swapping network in the training phase in an embodiment of the present invention;

Fig. 8 is a working principle diagram of the face-swapping network in the face-swapping phase in an embodiment of the present invention;

Fig. 9 is a plot comparing open-eye area versus recording progress before and after face-swapping in an embodiment of the present invention.
Detailed Description

The technical solutions of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

In the description of the present invention, it should be noted that the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance. In addition, the technical features involved in the different embodiments of the present invention described below can be combined with one another as long as they do not conflict.
An embodiment of the present invention provides a movement disorder assessment video analysis method; the method can be executed by an electronic device such as a computer, a portable terminal, or a server. As shown in Fig. 1, the method comprises the following steps:

S1: acquire a movement disorder assessment video, hereinafter referred to as the motion video or the video. The video is a recording of a certain length shot for a particular user (patient); the user completes prescribed actions during shooting, such as walking, sitting down, and standing up, so that the video records the patient's movement over a period of time.

S2: perform face-swapping on the person in the video. Specifically, the user's actual appearance in the video is replaced with another person's appearance. In this process, a reconstructed face image is generated from the user's actual appearance and prepared public face information, and the reconstructed face image then replaces the user's face image in the video, so as to protect the user's privacy.

S3: identify the key points of the human body in the video after the face-swapping. Key points include, for example, the wrists, elbows, shoulder joints, hip joints, knees, and ankles, as well as the facial marker points around the eyes and mouth. The image shown in Fig. 2 is one frame of the video, in which multiple whole-body key points are marked. Which key points are used may depend on the motor function being assessed.

There are many ways to identify and track key points in a video, and the present invention can use existing techniques to identify the key points. To improve accuracy, a preferred embodiment uses a deep neural network to identify the above key points. Specifically, the neural network is pre-trained on person images annotated with key points so that it can identify the required key points in an image. When applied to this solution, each frame of the video is extracted and input to the trained neural network. The neural network of this embodiment processes the input image and outputs, for each key point, distribution probability information, which may be a distribution probability map as shown in Fig. 3. Fig. 3 is a heatmap of the distribution probability of the left ankle key point (Heatmap of Left Ankle): the abscissa is the x coordinate of the input image, the ordinate is the y coordinate, and the legend on the right indicates the heat, with higher heat shown darker. The position with the highest probability value in the map, i.e., the greatest heat, is taken as the predicted position of the left ankle key point. In this way the pixel position of every body key point in the picture can be output. If the probability value is below a given threshold, the key point is judged not to appear.
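As a concrete sketch of this read-out step, assuming the network returns one NumPy heatmap per key point; the 0.1 threshold is illustrative, since the text does not fix a value:

```python
import numpy as np

def keypoint_from_heatmap(heatmap: np.ndarray, threshold: float = 0.1):
    """Return the (x, y) pixel position of the heatmap's peak, or None
    when the peak probability is below the threshold (the key point is
    judged not to appear in this frame)."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    if heatmap[y, x] < threshold:
        return None
    return int(x), int(y)

# For a network emitting one heatmap per key point for each frame:
# heatmaps has shape (num_keypoints, H, W)
# positions = [keypoint_from_heatmap(h) for h in heatmaps]
```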
S4: determine, from the change of the key points over the recording time, motion feature information used to assist in diagnosing a disease. Motion feature information includes, for example, facial expression information, which can help the doctor judge whether there is facial freezing caused by Parkinson's disease, and foot/leg motion state information, which can help the doctor judge whether there are movement disorders caused by various diseases.

According to the movement disorder assessment video analysis method provided by this embodiment of the present invention, the movement disorder assessment video is processed to obtain a face-swapped video that retains the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting the user's privacy and reproducing the expressions with high fidelity. By extracting key points of the human body from the movement disorder assessment video and monitoring how they change as the recording progresses, disease-related motion features are determined, yielding quantifiable key diagnostic indicators. This gives the motion video greater medical value, effectively assists doctors in diagnosing related diseases, and is highly practical.
In a first optional embodiment, the key points include a plurality of periocular key points, and step S4 specifically comprises:

S41A: determine the corresponding eye region from the plurality of periocular key points, i.e., the region enclosed by the periocular key points of the same eye.

S42A: determine the change of the open-eye area from the change of the eye region over the recording time. The area enclosed by the marked periocular key points can be calculated and normalized, for example by dividing the area of the eye region by the square of the eye width to obtain the normalized open-eye area. A plot of open-eye area versus recording progress can then be drawn and presented, such as the line graph shown in Fig. 4, where the abscissa is the recording progress (time) and the ordinate is the open-eye area.

S43A: determine the blinking frequency from the change of the open-eye area. In the line graph of Fig. 4, for example, the peak points correspond to blinking actions, from which the subject's blinking frequency can be counted.

According to this preferred solution, the resulting open-eye area and blinking frequency can serve as key indicators for diagnosing Parkinson's disease; the doctor can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
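A minimal Python sketch of steps S41A-S43A, assuming each frame yields the periocular key points of one eye as an ordered (x, y) array; the prominence passed to the peak detector is an illustrative assumption:

```python
import numpy as np
from scipy.signal import find_peaks

def polygon_area(points: np.ndarray) -> float:
    """Area enclosed by ordered (x, y) key points (shoelace formula)."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(np.roll(x, 1), y))

def normalized_eye_area(eye_points: np.ndarray) -> float:
    """Eye-region area divided by the squared eye width (step S42A)."""
    width = eye_points[:, 0].max() - eye_points[:, 0].min()
    return polygon_area(eye_points) / width ** 2

def blink_frequency(areas: np.ndarray, fps: float) -> float:
    """Blinks per second from the per-frame normalized open-eye area.
    A blink is a brief excursion of the curve; detecting dips of the
    area amounts to the peak detection described for Fig. 4."""
    blinks, _ = find_peaks(-areas, prominence=0.05)
    return len(blinks) / (len(areas) / fps)
```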
In a second optional embodiment, the key points include a plurality of perioral key points, and step S4 specifically comprises:

S41B: determine the mouth region from the plurality of perioral key points. In this embodiment the mouth contour refers to the inner mouth contour, i.e., the region enclosed by the inner-mouth key points.

S42B: determine the change of the mouth-opening area from the change of the mouth region over the recording time. The area enclosed by the marked perioral key points can be calculated and normalized, for example by dividing the area of the mouth region by the square of the mouth width to obtain the normalized mouth-opening area. A plot of mouth-opening area versus recording progress can then be drawn and presented, such as a line graph similar to Fig. 4, where the abscissa is the recording progress (time) and the ordinate is the mouth-opening area.

According to this preferred solution, the resulting mouth-opening area (size) and its changes can serve as key indicators for diagnosing Parkinson's disease; the doctor can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
In a third optional embodiment, the key points include ankle key points, and step S4 specifically comprises:

S41C: determine the stepping motion from the change of the ankle key point positions over the recording time. Specifically, the positions of the two ankles (points) are determined separately to track the change of their relative position, and the stepping motion can be identified by peak detection on the inter-ankle distance.

S42C: determine cadence information from the stepping motion. A plot of the relative position versus recording progress can be drawn, such as the line graph shown in Fig. 5, where the abscissa is the recording progress (time) and the ordinate is the distance between the left and right ankle key points.

According to this preferred solution, the resulting cadence information can serve as a key indicator for diagnosing Parkinson's disease; the doctor can evaluate (score) the patient's condition from these quantitative data, avoiding overly subjective judgments.
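The same peak-detection idea yields the cadence; a sketch assuming per-frame (x, y) ankle positions as NumPy arrays, with an illustrative prominence value:

```python
import numpy as np
from scipy.signal import find_peaks

def cadence(left_ankle: np.ndarray, right_ankle: np.ndarray,
            fps: float) -> float:
    """Steps per second from per-frame ankle key point positions.
    The inter-ankle distance (the curve of Fig. 5) peaks once per
    step, so each detected peak is counted as one step (S41C/S42C)."""
    dist = np.linalg.norm(left_ankle - right_ankle, axis=1)
    peaks, _ = find_peaks(dist, prominence=0.1 * dist.std())
    return len(peaks) / (len(dist) / fps)
```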
Similarly to the above three embodiments, in other optional embodiments the key points may also include the key points of each finger joint. By detecting the positions of the finger joints, the finger-tapping motion used in Parkinson's disease testing can be detected, and the frequency, amplitude, and time trend of index finger-thumb pinching can be calculated; the palm motor function (fist-clenching motion) can also be detected, yielding information such as the fist-clenching frequency.

The key points may also include wrist and elbow key points. By detecting the positions of the wrist and elbow key points, the alternating motion (forearm rotation) used in Parkinson's disease testing can be detected, and the arm rotation speed can be calculated.

The key points may also include hip joint, shoulder joint, and knee key points. By detecting the positions of the hip joint, shoulder joint, knee, and ankle key points, the degree of gait impairment can be assessed, yielding information including the center-of-gravity offset and the degree of center-of-gravity sway.
The present invention also provides a movement disorder assessment video analysis device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the above movement disorder assessment video analysis method.
Regarding the face-swapping in step S2, a preferred embodiment uses a machine learning algorithm to replace the face image; specifically, deep learning is used to perform the face-swapping on the person images. To this end, an embodiment of the present invention provides a face-swapping model and a training method for it; the method can be executed by electronic devices such as a computer or a server. As shown in Fig. 6, the face-swapping model includes an encoding network 11, a first decoding network 12, and a second decoding network 13 (the networks described in this application are neural networks).

First, a large amount of training data is prepared. Each training sample in this embodiment includes a first face image (hereinafter, face X) and a second face image (hereinafter, face Y), i.e., face images of two different people. As an example, a training sample includes one face X and one face Y; face X serves as the target to be replaced during face-swapping, and face Y is used to replace face X. Both are images of real people. What this solution cares about is protecting the privacy of face X, and face Y is used to replace face X to achieve the aim of the invention, so face Y can be a public image.

As for the amount of training data, more than 1,000 first face images and more than 1,000 second face images suffice. In a preferred embodiment, the training data contain equal numbers of first and second face images, for example 5,000 of each, giving 10,000 face images in total; these training data are used to train the face-swapping model shown in Fig. 6.
The training process is illustrated with one training sample. Face X and face Y serve as input data. During training, the encoding network 11 extracts features from the first face image (face X) and the second face image (face Y) respectively, obtaining first feature data of the first face image (hereinafter, feature vector X) and second feature data of the second face image (hereinafter, feature vector Y).

The first decoding network 12 obtains a first reconstructed face image (hereinafter, face X') from the first feature data (feature vector X); the second decoding network 13 obtains a second reconstructed face image (hereinafter, face Y') from the second feature data (feature vector Y). The parameters of the face-swapping model, which include the weights of the layers of all three networks, are then optimized according to the difference (loss1) between the first reconstructed face image (face X') and the first face image (face X) and the difference (loss2) between the second reconstructed face image (face Y') and the second face image (face Y).
Specifically, the loss function is obtained by computing the difference between face X' and face X and the difference between face Y' and face Y; based on the loss function, the backpropagation algorithm computes the weight updates of each network layer, and the layer weights are updated. Taking the difference between face X' and face X as an example, it can be expressed with DSSIM (Structural Dissimilarity):

$$\mathrm{DSSIM}(x,x') = \frac{1-\mathrm{SSIM}(x,x')}{2},\qquad
\mathrm{SSIM}(x,x') = \frac{(2\mu_x\mu_{x'}+C_1)(2\sigma_{xx'}+C_2)}{(\mu_x^2+\mu_{x'}^2+C_1)(\sigma_x^2+\sigma_{x'}^2+C_2)},$$

where $\mu_x$ is the mean of the pixels of face x, $\sigma_x^2$ is the variance of the pixels of face x, $\mu_{x'}$ is the mean of the pixels of face x', $\sigma_{x'}^2$ is the variance of the pixels of face x', $\sigma_{xx'}$ is the covariance of x and x', $C_1=(0.01)^2$, and $C_2=(0.03)^2$.
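A single-window NumPy sketch of this loss, assuming images scaled to [0, 1]; practical DSSIM implementations typically average the statistic over sliding windows, which the text does not specify:

```python
import numpy as np

def dssim(x: np.ndarray, x_rec: np.ndarray,
          c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> float:
    """DSSIM between a face image x and its reconstruction x_rec,
    computed from the global means, variances, and covariance."""
    mu_x, mu_r = x.mean(), x_rec.mean()
    var_x, var_r = x.var(), x_rec.var()
    cov = ((x - mu_x) * (x_rec - mu_r)).mean()
    ssim = ((2 * mu_x * mu_r + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_r ** 2 + c1) * (var_x + var_r + c2))
    return (1.0 - ssim) / 2.0
```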
The above training process is repeated with a large amount of training data until the difference between face X' and face X and the difference between face Y' and face Y are both below a threshold, at which point the networks complete training. The trained encoding network 11 can effectively extract face feature vectors, and the first decoding network 12 and second decoding network 13 can reconstruct the face feature vectors into the corresponding face pictures.

According to the face-swapping model training method provided by this embodiment of the present invention, the adopted face-swapping model includes one encoding network and two decoding networks. After training on a large number of sample images, the encoding network can accurately extract feature information from the two face images, and the decoding networks can accurately reconstruct the face images, restoring the expressions and lighting of the original images with very high fidelity. During training, the first decoding network reconstructs images from the feature information of the user's face images and the second decoding network reconstructs images from the feature information of the public figure's images; when the trained model is used for face-swapping, simply feeding the second decoding network the feature information of the user's face image yields a face-swapped reconstructed image that retains the user's expressions and lighting while bearing the appearance of the public figure, thereby protecting user privacy and reproducing the expressions with high fidelity.
As a specific embodiment, the network structure shown in Fig. 7 is adopted. The encoding network 11 consists, connected in sequence, of four Conv2D (two-dimensional convolution) layers, a Reshape (shape adjustment) layer, two Dense (fully connected) layers, another Reshape layer, and an Upscale (upscaling) layer. The convolution layers perform feature extraction and output high-dimensional features; the first Reshape layer flattens the extracted features into a one-dimensional vector so that the subsequent fully connected layers can extract features further; the second Reshape layer then restores a suitable shape, and the Upscale layer enlarges it to a suitable size. In this way features can be extracted separately from face X and face Y, yielding two 8x8x512-dimensional feature vectors.

The first decoding network 12 and the second decoding network 13 have the same structure, consisting of three Upscale-Conv2D blocks connected in sequence; they reconstruct face X' and face Y' from the two 8x8x512-dimensional feature vectors respectively. The decoding network first enlarges the feature vector to a suitable size and then processes it, outputting the reconstructed face image. After training, the parameters of each layer of a decoding network represent a specific face, while the feature vector represents the facial expression information; passing the vector through a decoding network forms the reconstructed face.

It should be noted that Fig. 7 shows one verified network form; the present invention is not limited to this network structure.
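A Keras-style sketch of such an encoder/decoder pair. Only the 8x8x512 latent shape is fixed by the text; the 64x64 RGB input, filter counts, activations, and the use of UpSampling2D to stand in for the Upscale block are assumptions:

```python
from tensorflow.keras import Input, Model, layers

def upscale(x, filters):
    # "Upscale" block: convolution followed by 2x spatial upsampling.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.UpSampling2D(2)(x)

def build_encoder(input_shape=(64, 64, 3)):
    inp = Input(input_shape)
    x = inp
    for f in (128, 256, 512, 1024):                 # four Conv2D layers
        x = layers.Conv2D(f, 5, strides=2, padding="same",
                          activation="relu")(x)
    x = layers.Reshape((-1,))(x)                    # Reshape to 1-D
    x = layers.Dense(1024, activation="relu")(x)    # two Dense layers
    x = layers.Dense(4 * 4 * 1024, activation="relu")(x)
    x = layers.Reshape((4, 4, 1024))(x)             # Reshape to a map
    return Model(inp, upscale(x, 512), name="encoder")  # -> 8x8x512

def build_decoder():
    inp = Input((8, 8, 512))
    x = inp
    for f in (256, 128, 64):                        # three Upscale-Conv2D
        x = upscale(x, f)
    out = layers.Conv2D(3, 5, padding="same", activation="sigmoid")(x)
    return Model(inp, out, name="decoder")          # -> 64x64x3 face

encoder = build_encoder()
decoder_x, decoder_y = build_decoder(), build_decoder()
```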
The above solution can be used to process the movement disorder video, replacing the real patient's face with a public face. To obtain a face-swapping model dedicated to a particular patient, this embodiment acquires part of the training data from that patient's movement disorder video. Specifically, a motion video of the patient is obtained first; this is the full-body video used for analyzing human motion features. To obtain training data, faces can be detected in the movement disorder assessment video: for example, multiple frames are extracted and the positions of the faces in them detected, and the resulting face images serve as the first face images (face X), i.e., the replacement target. Multiple face images of another person (a public figure) can then be acquired as the second face images (face Y).

To improve the practicality of the trained model, the acquired training samples should include first face images with different shooting angles and/or different lighting conditions and/or different expressions; for example, the 5,000 face X pictures cover different angles, lighting, expressions, and so on. Correspondingly, the training samples include second face images with different shooting angles and/or different lighting conditions and/or different expressions, for example 5,000 face Y pictures with different angles, lighting, and expressions.
Once trained, the model can be used to replace faces. An embodiment of the present invention provides a method for face-swapping a person image; the method can be executed by electronic devices such as a computer or a server. The method comprises the following steps:

S1A: train a face-swapping model for a particular person according to the above training method, using that person's face images as the first face images and another person's (a public figure's) face images as the second face images. For this process see Fig. 6 and Fig. 7 and the related description, which is not repeated here.

S2A: extract features from the first face image with the trained encoding network 11 to obtain first feature data. This step is similar to the model training process and amounts to extracting features from face X alone to obtain feature vector X.

S3A: obtain the reconstructed face image from the first feature data with the trained second decoding network 13. As shown in Fig. 8, this step differs from the model training process: during face-swapping the first decoding network 12 is no longer needed; instead, the second decoding network 13 obtains the reconstructed face image from feature vector X. During training the second decoding network 13 learned the appearance information of the public figure, but the feature vector input here carries the information of face X, so the image obtained is neither the above face X' nor the above face Y': the reconstructed face image has the appearance of face Y, while its expression and lighting retain the information of face X.
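Continuing with the hypothetical builders from the sketch above, the swap itself reduces to routing the user's latent code through the second decoder:

```python
# Step S3A as code: encode the user's face, then decode it with the
# decoder that has only ever learned the public figure's appearance.
latent_x = encoder.predict(face_x_batch)   # expression/lighting of face X
swapped = decoder_y.predict(latent_x)      # rendered with face Y's looks
```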
According to the person-image face-swapping method provided by this embodiment of the present invention, the adopted face-swapping model includes one encoding network and two decoding networks, and the model is trained on the user's real face images and a public figure's face images, so that the encoding network can accurately extract feature information from the user's face image. Using the feature information of the user's face image as the input of the second decoding network yields a face-swapped reconstructed image that retains the user's expressions and lighting while bearing the appearance of the public figure, thereby protecting the user's privacy and reproducing the expressions with high fidelity.
An embodiment of the present invention provides a method for face-swapping the person in a movement disorder assessment video; the method can be executed by electronic devices such as a computer or a server. The method comprises the following steps:

S1B: extract the patient's movement disorder assessment video frame by frame into images to be processed;

S2B: recognize the face image in each of the images to be processed. Specifically, a face detection tool (for example the dlib face recognition tool) detects the face in each picture extracted from the motion video and gives the face position, represented by the top-corner coordinates x, y of the detection box and the box height and width h, w. According to the detected face position, the face-region box is cropped from each motion video picture, forming the picture set of face X. To improve the accuracy of the subsequent face-swapping, the face image recognized in this step is the image of the facial region from the eyebrows to the chin; the forehead above the eyebrows need not be recognized and replaced, which avoids the influence of hair and other factors on the face-swapping result.
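A sketch of this detection-and-crop step with the dlib frontal face detector; keeping only the first detection and clamping the box to the frame are simplifying assumptions:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def crop_face(frame_bgr):
    """Detect the face in one extracted video frame and return the
    cropped face box plus its (x, y, w, h) position (step S2B)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    boxes = detector(gray, 1)           # upsample once for small faces
    if not boxes:
        return None, None               # no face in this frame
    b = boxes[0]
    x, y = max(b.left(), 0), max(b.top(), 0)
    w, h = b.width(), b.height()
    return frame_bgr[y:y + h, x:x + w], (x, y, w, h)
```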
S3B: train a face-swapping model for this patient according to the above training method, using the patient face images extracted and recognized from the video as the first face images (the picture set of face X) and another person's (a public figure's) face images as the second face images. For this process see Fig. 6 and Fig. 7 and the related description, which is not repeated here.

S4B: extract features from each face image with the trained encoding network 11 to obtain the feature data of the face images. This step is similar to the model training process and amounts to extracting a feature vector from each patient face image.

S5B: obtain, with the second decoding network 13, the reconstructed face image corresponding to each face image from its feature data. As in step S3B above, each reconstructed face image has the appearance of the public figure, while its expression and lighting retain the patient's information.

S6B: replace the face images of the images to be processed with the corresponding reconstructed face images and assemble the result into a video. That is, each reconstructed face picture is filled into the position of the face detected in the original motion video picture to complete the face swap, and the face-swapped motion video pictures are assembled into a video in chronological order.

According to the face-swapping method provided by this embodiment of the present invention, every frame of the movement disorder assessment video is processed to obtain a face-swapped reconstructed image, and the face-swapped images are re-assembled into a video. The video retains the user's expressions and lighting while bearing the appearance of a public figure, thereby protecting user privacy, reproducing the expressions with high fidelity, and giving the motion video greater medical value. Extracting motion key points from the face-swapped motion video and analyzing the features they form enables medical evaluation of whole-body movement as well as targeted feature extraction and disease assessment for facial bradykinesia.
After face-swapping the motion video in the above manner, the swapped video is analyzed, for example to determine the blinking action as described in the earlier embodiment. The resulting relationship between open-eye area and recording progress is shown in Fig. 9; comparing the post-swap curve (dashed line) with the pre-swap curve (solid line) shows that the blinking action can still be analyzed accurately from the face-swapped video. The face-swapping method provided by this embodiment protects user privacy while reproducing the expressions with high fidelity and does not interfere with the analysis of facial expressions.
In an optional embodiment, a choice of target faces (the above second face images) may be offered when training the face-swapping model (step S3B); for example, the database stores multiple public face images of different genders, ethnicities, face shapes, and facial features. The user can select several groups of face images, train a face-swapping model with each, and then run a similarity analysis on the face-swapping results to determine the face-swapping model best suited to the patient's face.
To improve the stability of the face-swapping, a preferred embodiment error-corrects the face detection information in the video: in step S2B, face detection is first performed in the video, misdetection information is excluded from all detection results, and time-window median smoothing is then applied to the time series of face detection positions with the misdetections removed, so as to stabilize the face detection position.

Excluding misdetection information covers two cases. In the first case, the user turns around in the video and the face cannot be recognized for a long stretch; in the second case, the user does not turn around, but face detection information is occasionally missing from the time series. For the first kind of misdetection, this embodiment deletes the face detection information between the first moment the face cannot be detected and the last moment it cannot be detected; for the second kind, this embodiment interpolates from the face detection information of the frames before and after the gap to fill in the missing face detection information.
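A sketch of this correction for one coordinate of the detection box, assuming missing detections are stored as NaN and that long turn-around stretches have already been deleted; the 9-frame window is an illustrative assumption:

```python
import numpy as np
from scipy.signal import medfilt

def clean_face_track(xs: np.ndarray, window: int = 9) -> np.ndarray:
    """Stabilize one face-box coordinate over time (e.g. box x).
    Short gaps (NaN) are filled by interpolating between the frames
    before and after them, then the series is median-smoothed over
    a time window."""
    idx = np.arange(len(xs))
    valid = ~np.isnan(xs)
    filled = np.interp(idx, idx[valid], xs[valid])
    return medfilt(filled, kernel_size=window)
```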
In practice the patient may turn their head in the video, so that profile images are detected. Since the face-swapping network used in this embodiment does not handle profiles as well as frontal faces, a profile image easily leaks the original facial information. A preferred embodiment therefore processes profile images further. In step S4B, the profile image in the video is first converted into a frontal image; for example, a rotate-and-render model can rotate the profile image into a frontal one. The encoding network 11 then extracts features from the frontal image to obtain the feature data. In step S6B, the reconstructed frontal image must first be converted into a reconstructed profile image, and the reconstructed profile image then replaces the profile image in the video. This optimizes the face-swapping result for profiles and further protects user privacy.
Testing showed that the color of the face image reconstructed in step S5B often differs somewhat from the original face image, which makes the color of the face region in the face-swapped video inconsistent with the color of the forehead region. To overcome this defect, a preferred embodiment adjusts, before step S6B, the pixel values of the reconstructed face image according to the pixel values of the face image in the video, so that the color histogram of the adjusted reconstructed face image matches that of the face image in the video.

Specifically, for the original face image in the video, the value distributions of all pixels in the R, G, and B channels are computed, giving histograms for the three channels. For the reconstructed face image, the value distributions of all pixels in the R, G, and B channels are likewise computed. Adjusting the color means adjusting the R, G, and B distributions of the reconstructed face image to resemble those of the original face image. Taking the R channel as an example: for a brightness value a in the original face image, the number of pixels with that value plus the number of all pixels with smaller values, as a share of all pixels, is denoted p; the brightness value b with the same cumulative share p is found in the R channel of the reconstructed face image, and all pixels of value b in the new image's R channel are changed to the value a, completing the histogram matching. In addition, key point verification can be performed while analyzing the face-swapped motion video: for example, checking whether key points in the video are lost, whether the image sharpness meets expectations, whether key points are occluded, and whether the size meets the requirements. If the quality of the face-swapped video falls short of expectations, or even prevents accurate key point detection, the target face (the above second face) can be replaced and the face-swapping model retrained.
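A per-channel sketch of this histogram matching, assuming 8-bit RGB arrays; it maps each brightness value of the reconstructed face to the original-face value with the same cumulative pixel share p:

```python
import numpy as np

def match_histogram(recon: np.ndarray, original: np.ndarray) -> np.ndarray:
    """Match each R/G/B channel of the reconstructed face `recon`
    to the channel histogram of the `original` face image."""
    out = np.empty_like(recon)
    for c in range(3):
        src = recon[..., c].ravel()
        ref = original[..., c].ravel()
        src_vals, src_idx, src_cnt = np.unique(
            src, return_inverse=True, return_counts=True)
        ref_vals, ref_cnt = np.unique(ref, return_counts=True)
        src_cdf = np.cumsum(src_cnt) / src.size   # cumulative share p
        ref_cdf = np.cumsum(ref_cnt) / ref.size
        # value b in recon -> value a in original with the same share p
        mapped = np.interp(src_cdf, ref_cdf, ref_vals)
        out[..., c] = mapped[src_idx].reshape(
            recon[..., c].shape).astype(recon.dtype)
    return out
```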
Persons skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Obviously, the above embodiments are merely examples given for clarity of description and are not limitations on the implementations. Persons of ordinary skill in the art can make other variations or changes of different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here, and obvious variations or changes derived therefrom remain within the protection scope of the present invention.

Claims (16)

  1. A movement disorder assessment video analysis method with a privacy protection function, characterized by comprising:
    acquiring a movement disorder assessment video;
    performing face-swapping on the person in the video;
    identifying key points of the human body in the video after the face-swapping;
    determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease.
  2. The method according to claim 1, characterized in that performing face-swapping on the person in the video comprises:
    extracting features from the face image in the video with an encoding network to obtain feature data, wherein the face image is an image of the facial region from the eyebrows to the chin;
    obtaining a reconstructed face image from the feature data with a decoding network;
    replacing the face image in the video with the reconstructed face image.
  3. The method according to claim 2, characterized in that, when the face image in the video is a profile image, extracting features from the face image in the video with the encoding network to obtain feature data comprises:
    converting the profile image in the video into a frontal image;
    extracting features from the frontal image with the encoding network to obtain feature data;
    and replacing the face image in the video with the reconstructed face image comprises:
    converting the reconstructed frontal image into a reconstructed profile image;
    replacing the profile image in the video with the reconstructed profile image.
  4. The method according to claim 2 or 3, characterized by further comprising, before replacing the face image in the video with the adjusted reconstructed face image:
    adjusting the pixel values of the reconstructed face image according to the pixel values of the face image in the video, so that the color histogram of the adjusted reconstructed face image matches that of the face image in the video.
  5. The method according to claim 1, characterized in that performing face-swapping on the person in the video comprises:
    performing face detection in the video;
    excluding misdetection information;
    applying time-window median smoothing to the time series of face detection positions after the misdetections have been excluded, to stabilize the face detection position.
  6. The method according to claim 5, characterized in that excluding misdetection information comprises:
    deleting the face detection information between the first moment the face cannot be detected and the last moment the face cannot be detected; and/or
    interpolating from the face detection information of the frames before and after a gap to fill in the missing face detection information.
  7. The method according to claim 1, characterized in that the key points comprise facial key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the corresponding facial region from the facial key points;
    determining, from the change of the facial region over the recording time, facial expression features used to assist in diagnosing a disease.
  8. The method according to claim 7, characterized in that the facial key points comprise a plurality of periocular key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the corresponding eye region from the plurality of periocular key points;
    determining the change of the open-eye area from the change of the eye region over the recording time;
    determining the blinking frequency from the change of the open-eye area.
  9. The method according to claim 7, characterized in that the facial key points comprise a plurality of perioral key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the mouth region from the plurality of perioral key points;
    determining the change of the mouth-opening area from the change of the mouth region over the recording time.
  10. The method according to claim 1, characterized in that the key points comprise ankle key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the stepping motion from the change of the ankle key point positions over the recording time;
    determining cadence information from the stepping motion.
  11. The method according to claim 1, characterized in that the key points comprise a plurality of finger-joint key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the index finger-thumb pinching motion from the change of the finger-joint key point positions over the recording time;
    determining, from the pinching motion, the frequency, amplitude, and time trend of index finger-thumb pinching.
  12. The method according to claim 1, characterized in that the key points comprise a plurality of finger-joint key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the fist-clenching motion from the change of the finger-joint key point positions over the recording time;
    determining the fist-clenching frequency from the fist-clenching motion.
  13. The method according to claim 1, characterized in that the key points comprise wrist key points and elbow key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the alternating motion from the change of the wrist and elbow key point positions over the recording time;
    determining the arm rotation speed from the alternating motion.
  14. The method according to claim 1, characterized in that the key points comprise hip joint key points, shoulder joint key points, knee key points, and ankle key points, and determining, from the change of the key points over the recording time, motion features used to assist in diagnosing a disease comprises:
    determining the gait from the change of the hip joint, shoulder joint, knee, and ankle key point positions over the recording time;
    determining center-of-gravity offset information and center-of-gravity sway information from the gait.
  15. The method according to claim 1, characterized in that identifying key points in the video comprises:
    recognizing each frame of the video with a neural network to obtain key point distribution probability information;
    determining the positions of the key points from the key point distribution probability information.
  16. A movement disorder assessment video analysis device with a privacy protection function, characterized by comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the movement disorder assessment video analysis method with a privacy protection function according to any one of claims 1-15.
PCT/CN2021/108849 2020-07-29 2021-07-28 Movement disorder assessment video analysis method and device with privacy protection function WO2022022551A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010745558.8A CN111881838B (zh) 2020-07-29 2020-07-29 Movement disorder assessment video analysis method and device with privacy protection function
CN202010745558.8 2020-07-29

Publications (1)

Publication Number Publication Date
WO2022022551A1 (zh) 2022-02-03

Family

ID=73201423

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108849 WO2022022551A1 (zh) 2020-07-29 2021-07-28 Movement disorder assessment video analysis method and device with privacy protection function

Country Status (3)

Country Link
US (1) US11663845B2 (zh)
CN (1) CN111881838B (zh)
WO (1) WO2022022551A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881838B (zh) * 2020-07-29 2023-09-26 清华大学 具有隐私保护功能的运动障碍评估录像分析方法及设备
US20220398847A1 (en) * 2021-06-09 2022-12-15 Lily Vittayarukskul Machine learning based remote monitoring of moveable objects using sensor data
CN113674857A (zh) * 2021-08-23 2021-11-19 深圳创维-Rgb电子有限公司 Television-assisted disease diagnosis device and system
WO2024050122A1 (en) * 2022-09-02 2024-03-07 University Of Virginia Patent Foundation System and method for body motor function assessment
CN116919391A (zh) * 2023-07-25 2023-10-24 凝动万生医疗科技(武汉)有限公司 Movement disorder assessment method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2399513A1 (en) * 2010-06-23 2011-12-28 Qatar University System for non-invasive automated monitoring, detection, analysis, characterisation, prediction or prevention of seizures and movement disorder symptoms
CN108965740A (zh) * 2018-07-11 2018-12-07 深圳超多维科技有限公司 Real-time video face-swapping method, apparatus, device, and storage medium
CN110084259A (zh) * 2019-01-10 2019-08-02 谢飞 Comprehensive facial paralysis grading assessment system combining facial texture and optical flow features
CN110738192A (zh) * 2019-10-29 2020-01-31 腾讯科技(深圳)有限公司 Human motor function auxiliary assessment method, apparatus, device, system, and medium
CN111028144A (zh) * 2019-12-09 2020-04-17 腾讯音乐娱乐科技(深圳)有限公司 Video face-swapping method and device, and storage medium
CN111027417A (zh) * 2019-11-21 2020-04-17 复旦大学 Gait recognition method and gait assessment system based on a human keypoint detection algorithm
CN111881838A (zh) * 2020-07-29 2020-11-03 清华大学 Movement disorder assessment video analysis method and device with privacy protection function

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100682889B1 (ko) * 2003-08-29 2007-02-15 삼성전자주식회사 Method and apparatus for image-based photorealistic 3D face modeling
US9727913B2 (en) * 2009-06-26 2017-08-08 Trading Technologies International, Inc. Prioritization of trade order processing in electronic trading
US9799096B1 (en) * 2014-07-08 2017-10-24 Carnegie Mellon University System and method for processing video to provide facial de-identification
US20200281508A1 (en) * 2015-09-21 2020-09-10 Figur8, Inc. Human body mounted sensors using mapping and motion analysis
US10244990B2 (en) * 2015-09-30 2019-04-02 The Board Of Trustees Of The University Of Alabama Systems and methods for rehabilitation of limb motion
US20170178287A1 (en) * 2015-12-21 2017-06-22 Glen J. Anderson Identity obfuscation
CN106919918B (zh) * 2017-02-27 2022-11-29 腾讯科技(上海)有限公司 Face tracking method and device
WO2018232511A1 (en) * 2017-06-21 2018-12-27 H3Alth Technologies Inc. SYSTEM, METHOD AND KIT FOR 3D BODY IMAGING
US20190110736A1 (en) * 2017-10-17 2019-04-18 Beneufit, Inc. Measuring body movement in movement disorder disease
CN110321767B (zh) * 2018-03-30 2023-01-31 株式会社日立制作所 Image extraction device and method, behavior analysis system, and storage medium
CN111460871B (zh) * 2019-01-18 2023-12-22 北京市商汤科技开发有限公司 Image processing method and device, and storage medium
US11217345B2 (en) * 2019-05-22 2022-01-04 Mocxa Health Private Limited Anonymization of audio-visual medical data
US20200397345A1 (en) * 2019-06-19 2020-12-24 University Of Southern California Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks
CN110955912B (zh) * 2019-10-29 2023-08-08 平安科技(深圳)有限公司 Image-recognition-based privacy protection method, apparatus, and device, and storage medium therefor
US20210195120A1 (en) * 2019-12-19 2021-06-24 Lance M. King Systems and methods for implementing selective vision for a camera or optical sensor
US11069036B1 (en) * 2020-01-03 2021-07-20 GE Precision Healthcare LLC Method and system for real-time and offline de-identification of facial regions from regular and occluded color video streams obtained during diagnostic medical procedures
CN111274998B (zh) * 2020-02-17 2023-04-28 上海交通大学 Parkinson's disease finger-tapping action recognition method and system, storage medium, and terminal
US11748928B2 (en) * 2020-11-10 2023-09-05 Adobe Inc. Face anonymization in digital images

Also Published As

Publication number Publication date
CN111881838B (zh) 2023-09-26
US11663845B2 (en) 2023-05-30
CN111881838A (zh) 2020-11-03
US20220036058A1 (en) 2022-02-03

Similar Documents

Publication Publication Date Title
WO2022022551A1 (zh) 具有隐私保护功能的运动障碍评估录像分析方法及设备
US9996739B2 (en) System and method for automatic gait cycle segmentation
US10262196B2 (en) System and method for predicting neurological disorders
JP2023171650A (ja) Systems and methods for identifying persons and/or identifying and quantifying pain, fatigue, mood, and intent, with privacy protection
Leightley et al. Automated analysis and quantification of human mobility using a depth sensor
Leu et al. A robust markerless vision-based human gait analysis system
CN109815858B (zh) 一种日常环境中的目标用户步态识别系统及方法
Datcu et al. Noncontact automatic heart rate analysis in visible spectrum by specific face regions
CN111933275A (zh) 一种基于眼动与面部表情的抑郁评估系统
US20210386343A1 (en) Remote prediction of human neuropsychological state
CN110991268A (zh) 一种基于深度图像的帕金森手部运动量化分析方法和系统
CN111403026A (zh) 一种面瘫等级评估方法
Williams et al. Assessment of physical rehabilitation movements through dimensionality reduction and statistical modeling
Zhang et al. Comparison of OpenPose and HyperPose artificial intelligence models for analysis of hand-held smartphone videos
CN110801227B (zh) 基于可穿戴设备的立体色块障碍测试的方法和系统
CA2878374C (en) Kinetic-based tool for biometric identification, verification, validation and profiling
CN116543455A (zh) 建立帕金森症步态受损评估模型、使用方法、设备及介质
CN114240934B (zh) 一种基于肢端肥大症的图像数据分析方法及系统
Suriani et al. Non-contact Facial based Vital Sign Estimation using Convolutional Neural Network Approach
CN115147769A (zh) 一种基于红外视频的生理参数鲁棒性检测方法
CN112907635B (zh) 基于几何分析提取眼部异常运动特征的方法
TW201839635A (zh) 情緒偵測系統及方法
Katiyar et al. Clinical gait data analysis based on spatio-temporal features
Mirabet-Herranz et al. LVT Face Database: A benchmark database for visible and hidden face biometrics
Angerer Stress detection using facial expressions from video sequences: by Paul Angerer

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21849634

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21849634

Country of ref document: EP

Kind code of ref document: A1