CN115641610A - Hand-waving help-seeking identification system and method - Google Patents

Hand-waving help-seeking identification system and method

Info

Publication number: CN115641610A
Application number: CN202211259423.6A
Authority: CN (China)
Prior art keywords: waving, hand, information, distress, posture
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 刘秦, 周晓, 王磊, 孙岩
Current Assignee / Original Assignee: Shenyang Zhanyan Technology Co ltd
Application filed by Shenyang Zhanyan Technology Co ltd

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the field of artificial intelligence identification and in particular provides a hand-waving help-seeking identification system and method. The system comprises a feature extraction unit, a calculation module and a hand-waving distress detection unit. The feature extraction unit extracts acoustic features and obtains preprocessed face images, sends them to the hand-waving distress detection unit, and also obtains the skeleton key point information of the person and transmits it to the calculation module. The calculation module detects the hand waving action, the hand waving amplitude and the sitting, lying or standing posture of the person, obtains the hand waving frequency, and sends the detection results, including the extension degree index, the hand waving action state, the sitting, lying or standing posture and the hand waving frequency, to the hand-waving distress detection unit. The hand-waving distress detection unit judges whether a person is waving for help by a comprehensive weight method. The invention further provides a multi-state fusion strategy for hand-waving distress, so that the final judgment of waving for help is more robust.

Description

Hand-waving help-seeking identification system and method
Technical Field
The invention belongs to the field of artificial intelligence identification, and particularly relates to a hand-waving distress identification system and method.
Background
With the development of deep learning and machine vision, applications based on artificial intelligence have gradually matured and become widely deployed. Existing hand-waving distress recognition systems are mainly based on visual images: they acquire posture key points, build a distress recognition model using a deep learning model or hand-crafted logic rules, judge whether a hand is being waved, detect the waving frequency, and recognize distress behavior.
Such methods depend on key point recognition. Factors such as occlusion, viewing angle, illumination, scale and motion blur interfere with the recognition of human posture key points, so the recognition results are unstable and inaccurate and cannot meet the high-accuracy requirements of real usage scenarios.
Disclosure of Invention
The invention aims to provide a stable and highly reliable hand-waving help-seeking identification method, and proposes several strategies to address the above problems and improve the help-seeking recognition effect.
The technical scheme adopted by the invention for realizing this purpose is as follows: a hand-waving distress recognition system, comprising: a feature extraction unit, a calculation module and a hand-waving distress detection unit;
the feature extraction unit is used for receiving the audio stream and the video stream sent by the camera, extracting MFCC acoustic features and acquiring a preprocessed face image, respectively sending the face image and the acoustic features to the hand-waving distress detection unit, and meanwhile, extracting the posture information of the human skeleton in the video stream, acquiring the key point information of the human skeleton, and transmitting the key point information to the calculation module;
the computing module is used for detecting hand waving action, hand waving amplitude and sitting, lying or standing postures of the personnel according to the key point information of the personnel skeleton, and sending detection results comprising the extension degree index, the hand waving action state and the sitting, lying or standing postures of the personnel to the hand waving help-seeking detection unit; meanwhile, according to the coordinates of key points of the human body, obtaining an extension degree index, further obtaining a waving frequency, and sending the waving frequency to a waving distress detection unit;
and the hand-waving distress detection unit is used for processing the MFCC acoustic features extracted by the feature extraction unit, the sent face image and the detection result sent by the calculation module, and judging whether the person is waving for distress by adopting a comprehensive weight method.
The feature extraction unit includes: a sound feature extraction module, a facial feature preprocessing module and a human body posture detection module;
the voice feature extraction module is used for receiving the audio stream sent by the camera, processing continuous voice in the audio stream, acquiring N-dimensional MFCC features and sending the N-dimensional MFCC features to the waving help-seeking detection unit;
the facial feature preprocessing module is used for acquiring images from the video stream of the camera, detecting the faces of people through a machine learning method, removing background and non-face areas, acquiring face point location coordinates, performing face feature normalization processing on the face images according to the face point location coordinates, and sending the processed face images to the hand-waving distress detection unit;
and the human body posture detection module is used for receiving the video stream sent by the camera, extracting the personnel skeleton posture information through the deep neural network, acquiring the personnel skeleton key point information and transmitting the personnel skeleton key point information to the calculation module.
The calculation module comprises: a hand waving action detection module and an extension degree index calculation module;
the hand waving motion detection module is used for carrying out normalization preprocessing on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judging whether a person standing, sitting and lying in the posture information has a hand waving motion state or not, and obtaining hand waving amplitude information according to the hand waving motion state; meanwhile, outputting the hand-waving gesture confidence coefficient under the hand-waving action states of standing, sitting and lying corresponding to the person according to the gesture detection model; sending posture information of the human body corresponding to standing, sitting and lying, hand waving amplitude information and hand waving posture confidence coefficient to a hand waving distress detection unit;
the extension degree index calculation module is used for acquiring an extension degree index according to the coordinates of the key points of the human body so as to reflect the posture extension degree of the human body and calculating the hand waving frequency according to the extension degree index; and sending the hand waving frequency to a hand waving distress detection unit.
The hand waving distress detection unit comprises: a voice event recognition module, a facial expression recognition module and a hand waving distress detection module;
the voice event recognition module is used for carrying out event detection by utilizing a deep learning model according to the MFCC characteristics extracted by the characteristic extraction unit to obtain a voice classification result containing classified voice and corresponding confidence information output by the current frame voice;
the facial expression recognition module is used for extracting facial features through a deep learning model according to the facial images sent by the feature extraction unit, carrying out training reasoning and obtaining a facial expression classification result containing the facial expressions of the classified human bodies and the corresponding confidence information output by the facial expressions of the current frame;
and the hand waving distress detection module is used for judging, by a comprehensive weight method, whether the person is waving for help according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information of the person, the hand waving amplitude information, the hand waving posture confidence and the hand waving frequency sent by the calculation module.
A recognition method of a hand-waving distress recognition system comprises the following steps:
1) Sending the audio stream sent by the camera to a sound feature extraction module, and sending the video stream to a facial feature preprocessing module and a human body posture detection module respectively;
2-1) the sound feature extraction module receives the audio stream sent by the camera, processes the continuous voice, obtains MFCC acoustic features and sends the MFCC acoustic features to the waving help-seeking detection unit;
2-2) the facial feature preprocessing module acquires images through video stream of a camera, performs face detection on personnel through a machine learning method, removes background and non-face areas, acquires face point coordinates, performs face feature normalization processing on the face images according to the face point coordinates, and sends the processed face images to a waving help detection unit;
2-3) the human body posture detection module receives the video stream sent by the camera, extracts the personnel skeleton posture information through the deep neural network, acquires the personnel skeleton key point information and transmits the personnel skeleton key point information to the calculation module;
3-1) the calculation module performs normalization pretreatment on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judges whether a person standing, sitting and lying in the gesture information has a hand waving action state or not, and obtains hand waving amplitude information according to the hand waving action state; meanwhile, outputting the confidence coefficients of the hand waving postures of standing, sitting and lying corresponding to the person at the moment according to the posture detection model; sending gesture information of a person corresponding to standing, sitting and lying, hand waving amplitude information and hand waving gesture confidence to a hand waving distress detection unit;
3-2) the calculation module obtains the extension degree index $\delta_{HPS}$ according to the key point information of the human body, calculates the hand waving frequency according to the extension degree index $\delta_{HPS}$, and sends the hand waving frequency to the hand waving distress detection unit;
4) The hand-waving distress detection unit performs event detection with a deep learning model on the MFCC acoustic features extracted by the feature extraction unit, obtaining a voice classification result that contains the classified voice and the confidence information of the current frame voice output; meanwhile, facial features are extracted through a deep learning model from the facial images sent by the feature extraction unit, training and inference are carried out, and a facial expression classification result containing the classified human facial expression and the confidence information of the current frame facial expression output is obtained;
5) The hand waving distress detection unit judges whether the person is waving for help by a comprehensive weight method, according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information of the person, the hand waving amplitude information, the hand waving posture confidence and the hand waving frequency sent by the calculation module.
In step 2-1), after processing the continuous speech, obtaining MFCC acoustic characteristics, specifically:
carrying out pre-emphasis, framing and windowing operations on continuous voice of an audio stream in sequence to obtain pre-processed voice information;
and sequentially performing fast Fourier transform, Mel filter bank, logarithm operation, discrete cosine transform and dynamic feature extraction on the preprocessed sound information to finally obtain the N-dimensional MFCC acoustic features.
The step 3-1) specifically comprises the following steps:
3-1-1) obtaining the posture information of standing, sitting or lying of the person at the moment and the confidence coefficient of the hand waving posture of the current frame corresponding to the standing, sitting and lying by the calculation module through a posture detection model according to the key point information of the human body; the attitude detection model is a CNN model;
3-1-2) simultaneously acquiring included angle data among the joints according to the key point information of the human body, and acquiring the hand waving amplitude information of the person during waving according to the included angle data between the forearm and the upper arm and the included angle data between the upper arm and the shoulder in the included angle data among the joints;
3-1-3) the calculation module sends the standing, sitting and lying posture information of the person, the hand waving amplitude information and the hand waving posture confidence to the hand waving distress detection unit.
In the step 3-2), the calculation module obtains the extension degree index $\delta_{HPS}$ according to the key point information of the human body, specifically:
3-2-1) the calculation module constructs, for the human body key point information of a given posture, the matrix:

$X = [x_1, \ldots, x_n] \in \mathbb{R}^{D \times n}$

where $D$ is the dimension of the key points, $n$ is the number of key points, $x_n$ is the coordinate of the $n$-th key point, and $\mathbb{R}$ is the set of real numbers;

3-2-2) let $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ denote the mean of all key point coordinates in $X$; the sum of variances along the principal axes of the human posture key points, $\delta_{HPS}$, is defined as:

$\delta_{HPS} = \max_{U} \frac{1}{n}\sum_{i=1}^{n} \left\| U^{\mathsf{T}} (x_i - \bar{x}) \right\|_2^2 = \operatorname{tr}\!\left( U^{\mathsf{T}} S U \right) = \sum_{j=1}^{d} \lambda_j$

where $U$ is the projection matrix, $x_i$ is the $i$-th key point coordinate, $d$ is a positive number less than the key point dimension $D$, $\lambda_j$ is the $j$-th eigenvalue of the covariance matrix $S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\mathsf{T}}$, and $\operatorname{tr}$ denotes the trace of a matrix;
3-2-3) the hand waving frequency is calculated according to the extension degree index $\delta_{HPS}$, specifically:

$\delta_{HPS}$ is the sum of the variances along the principal axes of the human posture key points, i.e. it reflects the posture extension degree of the person;

when the hands swing out to the two sides of the human body, the $\delta_{HPS}$ value is maximal; when the hands swing to the top and are in line with the trunk of the human body, the $\delta_{HPS}$ value is minimal; by counting the periodic variation of $\delta_{HPS}$ on the time axis, the hand waving frequency information during the waving action is obtained; and the obtained hand waving frequency information is sent to the hand waving distress detection unit.
The step 4) is specifically as follows:
4-1) for sound event classification, specifically:
during model training, the input MFCC features are normalized, label texts representing the corresponding class names are read to generate label vectors, the MFCC features and the label vectors are bound and then preprocessed, and then sent to a deep learning model to learn parameters, so as to obtain a voice classification model;
wherein, the deep learning model is any one of a CNN model, an RCNN model and an LSTM model;
during model prediction, input MFCC features are normalized, then preprocessing is carried out, then the input MFCC features are sent into a voice classification model for prediction, and the obtained data are subjected to post-processing to obtain a voice classification result containing classified voice and corresponding confidence information of current frame voice output;
4-2) classifying facial expressions, specifically:
extracting facial features from the face image through a deep learning model;
the deep learning model is any one of a convolutional neural network, a deep confidence network, a deep automatic encoder and a recurrent neural network;
after the facial features are extracted, the facial expressions are classified through a convolutional neural network, so that a facial expression classification result containing the classified facial expressions of the human body and corresponding confidence coefficient information output by the current frame facial expressions is obtained.
The step 5) is specifically as follows:
(1) The hand-waving distress detection unit obtains the global hand-waving distress strategy by the comprehensive weight method, from the voice classification result (the classified voice and the confidence information of the current frame voice output), the facial expression classification result (the classified human facial expression and the confidence information of the current frame facial expression output), and the standing, sitting and lying posture information, hand waving amplitude information, hand waving posture confidence and hand waving frequency sent by the calculation module:

$V_t^{(sos)} = w_{pose} \cdot V_t^{(pose)} + w_{sound} \cdot V_t^{(sound)} + w_{face} \cdot V_t^{(face)}$

where $V_t^{(sos)}$ is the hand-waving distress confidence at the current time $t$, $V_t^{(pose)}$ is the hand-waving distress posture confidence at the current time $t$, $V_t^{(sound)}$ is the sound event detection confidence at the current time $t$, and $V_t^{(face)}$ is the facial expression confidence at the current time $t$; $w_{pose}$, $w_{sound}$, $w_{face}$ are the weights of the hand-waving posture, the sound event and the facial expression respectively;

where:

$w_{pose} + w_{sound} + w_{face} = 1$

$w_{pose}$, $w_{sound}$, $w_{face}$ are preset values, with $w_{pose} > w_{sound}$ and $w_{pose} > w_{face}$;
(2) $V_t^{(pose)}$, $V_t^{(sound)}$ and $V_t^{(face)}$ are obtained by means of an exponentially weighted moving average, namely:

$V_t^{(pose)} = \beta V_{t-1}^{(pose)} + (1-\beta)\, v_t^{(pose)}$

$V_t^{(sound)} = \beta V_{t-1}^{(sound)} + (1-\beta)\, v_t^{(sound)}$

$V_t^{(face)} = \beta V_{t-1}^{(face)} + (1-\beta)\, v_t^{(face)}$

where $\beta$ represents a weighting coefficient, $v_t^{(pose)}$ is the confidence of the current frame hand-waving distress posture, $v_t^{(sound)}$ is the confidence information output by the current frame sound detection, and $v_t^{(face)}$ is the confidence information output by the current frame facial expression;
(3) The current frame hand-waving distress confidence $v_t^{(pose)}$ is obtained by combining the current frame hand waving posture confidence, hand waving amplitude information, hand waving frequency information and posture addition information, namely:

$v_t^{(pose)} = p_t \cdot a_t \cdot f_t + \alpha_t$

where $p_t$ is the current frame hand waving posture confidence, $a_t$ is the current frame hand waving amplitude information, $f_t$ is the current frame hand waving frequency information, and $\alpha_t$ is the posture addition coefficient;
(4) The current frame hand waving amplitude information $a_t$ is namely:

$a_t = \min\!\left( A_t / A_{max},\; 1 \right)$

where $A_t$ is the detected hand waving amplitude and $A_{max}$ is the preset maximum hand waving amplitude; $a_t$ is a value between 0 and 1, and the larger the hand waving amplitude, the higher the value;
(5) The current frame hand waving frequency information $f_t$ is specifically:

$f_t = \min\!\left( F_t / F_{max},\; 1 \right)$

where $F_t$ is the detected hand waving frequency and $F_{max}$ is the preset maximum hand waving frequency; $f_t$ is a value between 0 and 1, and the higher the waving frequency, the higher the value;
(6) Finally, $V_t^{(sos)}$ is compared with a preset threshold; if it is larger than the preset threshold, the current frame is judged to be in the hand-waving distress state; if n consecutive frames are in the hand-waving distress state, a hand-waving distress alarm is raised.
The invention has the following beneficial effects and advantages:
1. Voice information is introduced, so that the video stream and the audio stream are combined, and the audio information assists in improving the accuracy of SOS identification;
2. Facial expression recognition is introduced, so that a panicked or fearful facial expression assists in judging the distress state;
3. For posture recognition, when the posture is considered to be a hand-waving state, the standing/sitting/lying information of the person is judged and the arm waving amplitude is obtained at the same time; when the arm waving amplitude is large, the standing, sitting or lying posture information of the person is combined to assist in judging the distress state;
4. The hand-waving action is a periodic action; its periodicity is judged through the extension degree index, so the hand-waving frequency is obtained, which assists in judging whether a person is waving normally or performing a help-seeking action;
5. The invention relates to a hand waving distress fusion strategy: and providing a multi-state fusion strategy according to the information data obtained by the hand waving posture model, the sound event detection model and the expression recognition model, so that the finally obtained hand waving distress judgment result is more robust.
Drawings
FIG. 1 is a system framework diagram of the present invention;
FIG. 2 is a flowchart illustrating the operation of the voice feature extraction module in feature extraction according to the present invention;
FIG. 3 is a schematic diagram of the present invention for building a speech classification model;
FIG. 4 is a circuit diagram of an equalization management module of the present invention;
FIG. 5 is a schematic diagram of the human body key points of the present invention;
wherein, the corresponding relationship of the key point serial numbers in fig. 5 is: 0: nose, 1: neck, 2: right shoulder, 3: right elbow, 4: right wrist, 5: left shoulder, 6: left elbow, 7: left wrist, 8: middle hip, 9: right hip, 10: right knee, 11: right ankle, 12: left hip, 13: left knee, 14: left ankle, 15: right eye, 16: left eye, 17: right ear, 18: left ear, 19: left thumb, 20: left little finger, 21: left heel, 22: right thumb, 23: right little finger, 24: right heel.
Detailed Description
As shown in fig. 1, which is a system framework diagram of the present invention, the overall structure of the system of the present invention is divided into two major parts, a front-end camera and a back-end application server.
The front-end Camera is connected to the back-end AppServer through RTSP or an API, and several modules in the AppServer perform artificial intelligence analysis and processing on the video streams and audio streams.
The AppServer comprises: a feature extraction unit, a calculation module and a hand-waving distress detection unit, wherein the functions of each unit/module are as follows:
the feature extraction unit is used for receiving the audio stream and the video stream sent by the camera, extracting MFCC acoustic features and acquiring a preprocessed face image, respectively sending the face image and the MFCC acoustic features to the hand-waving distress detection unit, and meanwhile, extracting the skeleton posture information of the personnel in the video stream, acquiring the key point information of the skeleton of the personnel and transmitting the key point information to the calculation module;
the computing module is used for detecting hand waving action, hand waving amplitude and sitting, lying or standing postures of the personnel according to the key point information of the personnel skeleton, and sending detection results comprising the extension degree index, the hand waving action state and the sitting, lying or standing postures of the personnel to the hand waving help-seeking detection unit; meanwhile, according to the coordinates of the key points of the human body, obtaining the extension degree index, further obtaining the waving frequency, and sending the waving frequency to a waving help-seeking detection unit;
and the hand-waving distress detection unit is used for processing the MFCC acoustic features extracted by the feature extraction unit, the sent face image and the detection result sent by the calculation module, and judging whether the person is waving for distress by adopting a comprehensive weight method.
Wherein, the feature extraction unit includes: a sound feature extraction module, a facial feature preprocessing module and a human body posture detection module;
the voice feature extraction module is used for receiving the audio stream sent by the camera, processing continuous voice in the audio stream, acquiring N-dimensional MFCC features and sending the N-dimensional MFCC features to the waving help-seeking detection unit;
the face feature preprocessing module is used for acquiring images through video stream of a camera, detecting faces of people through a machine learning method, removing background and non-face areas, acquiring face point coordinates, performing face feature normalization processing on the face images according to the face point coordinates, and sending the processed face images to the waving help-seeking detection unit;
and the human body posture detection module is used for receiving the video stream sent by the camera, extracting the personnel skeleton posture information through the deep neural network, acquiring the personnel skeleton key point information and transmitting the personnel skeleton key point information to the calculation module.
The computing module comprises: a hand waving action detection module and an extension degree index calculation module;
the hand waving motion detection module is used for carrying out normalization preprocessing on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judging whether a person standing, sitting and lying in the posture information has a hand waving motion state or not, and obtaining hand waving amplitude information according to the hand waving motion state; meanwhile, outputting the hand waving gesture confidence coefficient in the hand waving action state of the person corresponding to standing, sitting and lying at the moment according to the gesture detection model; sending posture information of the human body corresponding to standing, sitting and lying, hand waving amplitude information and hand waving posture confidence coefficient to a hand waving distress detection unit;
the extension degree index calculation module is used for acquiring an extension degree index according to the coordinates of the key points of the human body so as to reflect the posture extension degree of the human body and calculating the waving frequency according to the extension degree index; and sending the hand waving frequency to a hand waving distress detection unit.
The hand-waving distress detection unit includes: a voice event identification module, a facial expression identification module and a hand waving distress detection module;
the voice event recognition module is used for detecting events by utilizing a deep learning model according to the MFCC characteristics extracted by the characteristic extraction unit to obtain a voice classification result containing classified voice and corresponding confidence information output by the current frame voice;
the facial expression recognition module is used for extracting facial features through a deep learning model according to the facial images sent by the feature extraction unit, carrying out training reasoning and obtaining a facial expression classification result containing the facial expressions of the classified human bodies and the corresponding confidence information output by the facial expressions of the current frame;
and the hand waving distress detection module is used for judging, by a comprehensive weight method, whether the person is waving for help according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information of the person, the hand waving amplitude information, the hand waving posture confidence and the hand waving frequency sent by the calculation module.
As shown in fig. 1, the workflow method of the present invention is specifically implemented based on data streams transmitted by each module, and the method of the present invention includes the following steps:
1) Sending the audio stream sent by the camera to a sound feature extraction module, and sending the video stream to a facial feature preprocessing module and a human body posture detection module respectively;
2-1) the sound feature extraction module receives the audio stream sent by the camera, processes the continuous voice, obtains MFCC acoustic features and sends the MFCC acoustic features to the waving help-seeking detection unit;
2-2) the facial feature preprocessing module acquires images through video stream of a camera, performs face detection on personnel through a machine learning method, removes background and non-face areas, acquires face point coordinates, performs face feature normalization processing on the face images according to the face point coordinates, and sends the processed face images to a waving help detection unit;
2-3) the human body posture detection module receives the video stream sent by the camera, extracts the personnel skeleton posture information through the deep neural network, acquires the personnel skeleton key point information and transmits the personnel skeleton key point information to the calculation module;
3-1) the calculation module performs normalization pretreatment on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judges whether a person standing, sitting and lying in the gesture information has a hand waving action state or not, and obtains hand waving amplitude information according to the hand waving action state; meanwhile, outputting the hand waving gesture confidence coefficients of the person correspondingly standing, sitting and lying at the moment according to the gesture detection model; sending posture information of the human body corresponding to standing, sitting and lying, hand waving amplitude information and hand waving posture confidence coefficient to a hand waving distress detection unit;
3-2) the calculation module obtains the extension degree index delta according to the key point information of the human body HPS And according to the elongation index delta HPS Calculating the hand waving frequency, and sending the hand waving frequency to a hand waving distress detection unit;
4) The hand-waving distress detection unit performs event detection with a deep learning model on the MFCC acoustic features extracted by the feature extraction unit, obtaining a voice classification result that contains the classified voice and the confidence information of the current frame voice output; meanwhile, facial features are extracted through a deep learning model from the facial images sent by the feature extraction unit, training and inference are carried out, and a facial expression classification result containing the classified human facial expression and the confidence information of the current frame facial expression output is obtained;
5) The hand waving distress detection unit judges whether the person is waving for help by a comprehensive weight method, according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information of the person, the hand waving amplitude information, the hand waving posture confidence and the hand waving frequency sent by the calculation module.
As shown in fig. 2, the workflow of feature extraction by the sound feature extraction module of the present invention specifically includes the following steps:
carrying out preprocessing based on continuous voice, including the steps of pre-emphasis, framing, windowing and the like;
for the preprocessed sound information, the steps of fast Fourier transform, Mel filter bank, logarithm operation, discrete cosine transform, dynamic feature extraction and the like are performed; in this embodiment, 39-dimensional MFCC features (Mel-frequency cepstral coefficients, MFCC) are finally extracted.
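As an illustration of this pipeline, the sketch below computes the 39-dimensional MFCC features as 13 static coefficients plus their delta and delta-delta dynamic features. It assumes the librosa library, which the patent does not name; the function name and parameter choices are illustrative only.

```python
# Hypothetical sketch of the 39-dimensional MFCC pipeline described above,
# assuming librosa; the patent does not name a specific toolkit.
import librosa
import numpy as np

def extract_mfcc_39(wav_path: str) -> np.ndarray:
    y, sr = librosa.load(wav_path, sr=None)       # continuous speech signal
    y = librosa.effects.preemphasis(y)            # pre-emphasis
    # Framing, windowing, FFT, Mel filter bank, log, and DCT are performed
    # internally by librosa.feature.mfcc.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    d1 = librosa.feature.delta(mfcc)              # dynamic (delta) features
    d2 = librosa.feature.delta(mfcc, order=2)     # delta-delta features
    return np.vstack([mfcc, d1, d2])              # 39 x T feature matrix
```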
As shown in fig. 3, the speech classification model is established based on a deep learning model, including but not limited to a CNN, RCNN or LSTM network, to recognize the input speech features.
During model training, the input MFCC features are normalized, label texts representing the true class names are read to generate label vectors, the features and labels are bound and preprocessed, and then sent to the CNN/RCNN/LSTM model to learn its parameters, so that a reliable speech classification model is obtained.
As shown in fig. 4, in the model prediction, the 39-dimensional MFCC features are normalized, then preprocessed, and then sent to the speech classification model for prediction, and the obtained data is post-processed to obtain a speech classification result including classified speech and corresponding confidence information of current frame sound output;
as shown in fig. 1, the facial feature preprocessing module in the present system acquires images through a video stream, performs face detection through a machine learning manner, and then removes background and non-face regions.
Face feature normalization: brightness normalization is performed on the content of the face image, so that the distribution of face brightness pixels is as uniform as possible while the face contrast is enhanced.
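The patent does not name the normalization algorithm; as one possible realization, the sketch below uses OpenCV histogram equalization, which unifies the brightness distribution and raises contrast.

```python
# One possible face brightness normalization (OpenCV assumed); histogram
# equalization is an illustrative choice, not the patent's stated method.
import cv2

def normalize_face(face_bgr):
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.equalizeHist(gray)   # unify brightness, enhance contrast
```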
The facial expression recognition module extracts features through a deep learning model, such as a convolutional neural network (CNN), a deep belief network (DBN), a deep autoencoder (DAE) or a recurrent neural network (RNN).
After the feature extraction is completed, the facial expressions are classified. Facial expression recognition can be performed in a deep learning manner to form an end-to-end model, or classification can be performed with a traditional machine learning method, such as the SVM (support vector machine) classifier used in this embodiment, to determine whether the face shows a panicked or fearful expression.
As shown in fig. 1 and 5, the human body posture detection module trains and predicts the human body posture key points through a deep learning human body posture network model structure, specifically, as the human body key point schematic diagram in fig. 5.
In the method for identifying the hand waving for help, step 3-1) specifically comprises the following steps:
3-1-1) the hand-waving motion detection module obtains, according to the human body key point information (as shown in figure 5) and through a posture detection model, the standing, sitting or lying posture information of the person at that moment and the current frame hand-waving posture confidence corresponding to standing, sitting and lying; the posture detection model is a CNN model. Knowing whether the person in the posture information is in a waving state, together with the confidence that the person is standing, sitting or lying (the sitting and lying states in particular), provides helpful auxiliary input for judging whether to output an SOS.
3-1-2) included angle data between the joints is acquired according to the human body key point information, and the hand waving amplitude information of the person during waving is obtained according to the included angle between the forearm and the upper arm (i.e. the angle formed by key points 2, 3 and 4 in fig. 5) and the included angle between the upper arm and the shoulder (i.e. the angles formed by key points 1-4 and by key points 1, 5-7) in the included angle data between the joints; a sketch of this angle computation follows this list;
3-1-3) the extension degree index calculation module sends posture information (similarly, judged according to the key point information identified in the attached figure 5) of standing, sitting and lying corresponding to the human body, hand waving amplitude information and hand waving posture confidence coefficient to the hand waving distress detection unit.
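The angle computation referenced in 3-1-2) can be sketched as follows; only the key point indices come from FIG. 5, and the helper name and numeric guard are illustrative.

```python
# Illustrative joint-angle computation, e.g. the right elbow angle between
# the upper arm (key points 2-3) and the forearm (key points 3-4) of FIG. 5.
import numpy as np

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at b (in degrees) between segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Example: elbow = joint_angle(kp[2], kp[3], kp[4]) for a keypoint array kp.
```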
In the step 3-2), the calculation module acquires the extension degree index $\delta_{HPS}$ according to the human body key point information, specifically:
the $\delta_{HPS}$ index is introduced to calculate the extension degree of the human posture and thereby assist the hand-waving frequency judgment of the hand-waving detection. $\delta_{HPS}$ is based on the sum of the distribution variances of the human posture key points along their principal axes.
3-2-1) as shown in fig. 4, the extension degree index calculation module constructs, for the human body key point information of a given posture, the matrix:

$X = [x_1, \ldots, x_n] \in \mathbb{R}^{D \times n}$

where $D$ is the dimension of the key points, $n$ is the number of key points, $x_n$ is the coordinate of the $n$-th key point, and $\mathbb{R}$ is the set of real numbers;

3-2-2) let $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ denote the mean of all key point coordinates in $X$; the sum of variances along the principal axes of the human posture key points, $\delta_{HPS}$, is defined as Eq1:

$\delta_{HPS} = \max_{U} \frac{1}{n}\sum_{i=1}^{n} \left\| U^{\mathsf{T}} (x_i - \bar{x}) \right\|_2^2$

Eq1 is only one form of the $\delta_{HPS}$ expression, which can also be written as Eq2:

$\delta_{HPS} = \operatorname{tr}\!\left( U^{\mathsf{T}} S U \right) = \sum_{j=1}^{d} \lambda_j$

where $U$ is the projection matrix, $x_i$ is the $i$-th key point coordinate, $d$ is a positive number less than the key point dimension $D$ (in this embodiment, $D$ is 2 and $d$ is 1), $\lambda_j$ is the $j$-th eigenvalue of the covariance matrix $S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\mathsf{T}}$, and $\operatorname{tr}$ denotes the trace of a matrix;
thus, δ HPS Is the sum of the variances of the principal axes of the key points of the human posture and is used for reflecting the posture extension degree of the human.
3-2-3) the extension degree index calculation module calculates the hand waving frequency according to the extension degree index $\delta_{HPS}$, specifically:
the $\delta_{HPS}$ index is applied to the waving action. Since the waving motion is a periodic motion, $\delta_{HPS}$ is larger when the hands are waved out to the two sides of the body during the motion, and smaller when the hands swing to the top and are in line with the trunk of the human body. By counting the periodic variation of $\delta_{HPS}$ on the time axis, the hand waving frequency during the waving action can be obtained, and the obtained frequency information is sent to the hand waving distress detection unit.
Advantages of $\delta_{HPS}$: the index is related only to the human action, is independent of the camera's viewing angle, scale and similar information, and is not affected by posture interference; it can effectively reflect the periodic data of the person's waving frequency, so it helps judge whether a person is waving normally or waving vigorously for help.
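A sketch of the period counting follows; the patent only states that the periodic variation of $\delta_{HPS}$ on the time axis is counted, so the peak-based method below (scipy assumed) is one possible realization.

```python
# Estimate waving frequency from the per-frame delta_HPS sequence by
# measuring the spacing of its maxima (hands swung out to the sides).
import numpy as np
from scipy.signal import find_peaks

def waving_frequency(delta_hps: np.ndarray, fps: float) -> float:
    """Returns waving cycles per second for a camera frame rate of fps."""
    peaks, _ = find_peaks(delta_hps)
    if len(peaks) < 2:
        return 0.0
    mean_period = np.diff(peaks).mean() / fps   # seconds per waving cycle
    return 1.0 / mean_period
```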
The step 5) of the invention specifically comprises the following steps:
(1) The hand waving distress detection unit obtains the global hand-waving distress strategy by the comprehensive weight method, from the voice classification result (the classified voice and the confidence information of the current frame voice output), the facial expression classification result (the classified human facial expression and the confidence information of the current frame facial expression output), and the standing, sitting and lying posture information, hand waving amplitude information, hand waving posture confidence and hand waving frequency sent by the calculation module:

$V_t^{(sos)} = w_{pose} \cdot V_t^{(pose)} + w_{sound} \cdot V_t^{(sound)} + w_{face} \cdot V_t^{(face)}$

where $V_t^{(sos)}$ is the hand-waving distress confidence at the current time $t$, $V_t^{(pose)}$ is the hand-waving distress posture confidence at the current time $t$, $V_t^{(sound)}$ is the sound event detection confidence at the current time $t$, and $V_t^{(face)}$ is the facial expression confidence at the current time $t$; $w_{pose}$, $w_{sound}$, $w_{face}$ are the weights of the hand-waving posture, the sound event and the facial expression respectively;

where:

$w_{pose} + w_{sound} + w_{face} = 1$

$w_{pose}$, $w_{sound}$, $w_{face}$ are preset values, with $w_{pose} > w_{sound}$ and $w_{pose} > w_{face}$. In this embodiment, since hand waving is the primary cue for distress, $w_{pose}$ may take a higher value, e.g.:

$w_{pose} = 0.8, \quad w_{sound} = 0.1, \quad w_{face} = 0.1$
the above is merely an example, and other values may be selected according to the situation in practical application.
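With these example weights, the fusion of step (1) reduces to a one-line weighted sum; in the sketch below the inputs are the bias-corrected EWMA confidences of step (2), and the constant names are illustrative.

```python
# Comprehensive-weight fusion with the example weights above.
W_POSE, W_SOUND, W_FACE = 0.8, 0.1, 0.1    # preset weights, sum to 1

def sos_score(v_pose: float, v_sound: float, v_face: float) -> float:
    return W_POSE * v_pose + W_SOUND * v_sound + W_FACE * v_face
```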
In order to avoid false detections from single-frame model outputs, $V_t^{(pose)}$, $V_t^{(sound)}$ and $V_t^{(face)}$ are obtained with an exponentially weighted moving average, namely step (2):

(2) The exponentially weighted moving average considers both the current data and the historical data while giving higher weight to the current data, namely:

$V_t^{(pose)} = \beta V_{t-1}^{(pose)} + (1-\beta)\, v_t^{(pose)}$

$V_t^{(sound)} = \beta V_{t-1}^{(sound)} + (1-\beta)\, v_t^{(sound)}$

$V_t^{(face)} = \beta V_{t-1}^{(face)} + (1-\beta)\, v_t^{(face)}$

where $\beta$ represents a weighting coefficient, $v_t^{(pose)}$ is the confidence of the current frame hand-waving distress posture, $v_t^{(sound)}$ is the confidence information output by the current frame sound detection, and $v_t^{(face)}$ is the confidence information output by the current frame facial expression.

At the initial moments there is a bias, so a bias correction is introduced, with the formulas:

$\hat{V}_t^{(pose)} = \frac{V_t^{(pose)}}{1-\beta^{t}}, \qquad \hat{V}_t^{(sound)} = \frac{V_t^{(sound)}}{1-\beta^{t}}, \qquad \hat{V}_t^{(face)} = \frac{V_t^{(face)}}{1-\beta^{t}}$
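A minimal sketch of this smoothing, one instance per modality, is given below; the value of beta and the class layout are illustrative.

```python
# Exponentially weighted moving average with the bias correction described
# above; each modality (pose, sound, face) keeps its own instance.
class Ewma:
    def __init__(self, beta: float = 0.9):
        self.beta, self.v, self.t = beta, 0.0, 0

    def update(self, x: float) -> float:
        self.t += 1
        self.v = self.beta * self.v + (1.0 - self.beta) * x
        return self.v / (1.0 - self.beta ** self.t)   # bias-corrected value
```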
(3) The current frame hand-waving distress confidence $v_t^{(pose)}$ is obtained by combining the current frame hand waving posture confidence, hand waving amplitude information, hand waving frequency information and posture addition information, namely:

$v_t^{(pose)} = p_t \cdot a_t \cdot f_t + \alpha_t$

where $p_t$ is the current frame hand waving posture confidence, $a_t$ is the current frame hand waving amplitude information, $f_t$ is the current frame hand waving frequency information, and $\alpha_t$ is the posture addition coefficient;
(4) The current frame hand waving amplitude information $a_t$ is namely:

$a_t = \min\!\left( A_t / A_{max},\; 1 \right)$

where $A_t$ is the detected hand waving amplitude and $A_{max}$ is the preset maximum hand waving amplitude; $a_t$ is a value between 0 and 1, and the larger the hand waving amplitude, the higher the value;
(5) The current frame hand waving frequency information $f_t$ is specifically:

$f_t = \min\!\left( F_t / F_{max},\; 1 \right)$

where $F_t$ is the detected hand waving frequency and $F_{max}$ is the preset maximum hand waving frequency; $f_t$ is a value between 0 and 1, and the higher the waving frequency, the higher the value;
in practical application, a gamma transform can be applied as needed to perform nonlinear compensation on the hand waving amplitude information and the hand waving frequency information, adjusting their final weight in the hand waving posture judgment:

$a_t \leftarrow a_t^{\gamma}, \qquad f_t \leftarrow f_t^{\gamma}$

where $\gamma$ is an empirical value, typically taken slightly less than 1.
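Steps (4) and (5) together with the gamma compensation can be covered by one small helper; the preset maxima in the usage comment and the default gamma are illustrative values.

```python
# Normalized, gamma-compensated amplitude/frequency term: the detected value
# divided by its preset maximum, capped at 1, then raised to gamma (< 1).
def norm_term(value: float, max_value: float, gamma: float = 0.9) -> float:
    return min(value / max_value, 1.0) ** gamma

# Usage (illustrative preset maxima):
# a_t = norm_term(amplitude, max_value=120.0)   # e.g. arm swing in degrees
# f_t = norm_term(frequency, max_value=3.0)     # e.g. waves per second
```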
Figure BDA00038905757700001611
For the attitude addition coefficient, it is calculated as follows:
if the gesture detection model detects that the person is in a sitting/lying gesture, the expected probability of asking for help that the person swings hands under the sitting/lying gesture is considered to be larger, the expected probability can be set as an empirical value of 0.1, and the addition value can be increased or decreased:
Figure BDA00038905757700001612
otherwise, if the person's posture is standing:
Figure BDA00038905757700001613
(6) Finally, $V_t^{(sos)}$ is compared with a preset threshold; if it is larger than the preset threshold, the current frame is determined to be in the hand-waving distress state; and if n consecutive frames are in the hand-waving distress state, a hand-waving distress alarm is raised.
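Putting steps (1) through (6) together, a compact per-frame decision sketch is shown below. It reuses the Ewma and sos_score helpers sketched earlier; the threshold, the frame count n, and the product-plus-bonus form of the per-frame posture confidence follow the reconstruction above and are assumptions where the original text is only qualitative.

```python
# End-to-end per-frame decision: combine the posture terms, smooth each
# modality with a bias-corrected EWMA, fuse with the preset weights,
# threshold, and require n consecutive distress frames before alarming.
SOS_THRESHOLD, N_FRAMES = 0.6, 5            # illustrative preset values

ewma_pose, ewma_sound, ewma_face = Ewma(), Ewma(), Ewma()
consecutive = 0

def sos_step(p_t: float, a_t: float, f_t: float, alpha_t: float,
             v_sound: float, v_face: float) -> bool:
    """Returns True when a hand-waving distress alarm should be raised."""
    global consecutive
    v_pose = p_t * a_t * f_t + alpha_t      # step (3), reconstructed form
    score = sos_score(ewma_pose.update(v_pose),
                      ewma_sound.update(v_sound),
                      ewma_face.update(v_face))
    consecutive = consecutive + 1 if score > SOS_THRESHOLD else 0
    return consecutive >= N_FRAMES
```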
The above description is only an embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, extension, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A hand-waving distress identification system, characterized by comprising: a feature extraction unit, a calculation module and a hand waving distress detection unit;
the feature extraction unit is used for receiving the audio stream and the video stream sent by the camera, extracting MFCC acoustic features and acquiring a preprocessed face image, respectively sending the face image and the MFCC acoustic features to the hand-waving distress detection unit, and meanwhile, extracting the skeleton posture information of the personnel in the video stream, acquiring the key point information of the skeleton of the personnel and transmitting the key point information to the calculation module;
the calculation module is used for detecting hand waving actions, hand waving amplitude and sitting, lying or standing postures of personnel according to the key point information of the personnel skeleton, and sending detection results including the extension degree index, the hand waving action state and the sitting, lying or standing postures of the personnel to the hand waving help-seeking detection unit; meanwhile, according to the coordinates of the key points of the human body, obtaining the extension degree index, further obtaining the waving frequency, and sending the waving frequency to a waving help-seeking detection unit;
and the hand-waving distress detection unit is used for processing the MFCC acoustic features extracted by the feature extraction unit, the sent face image and the detection result sent by the calculation module, and judging whether the person is waving for distress by adopting a comprehensive weight method.
2. The system as claimed in claim 1, wherein the feature extraction unit comprises: a sound feature extraction module, a facial feature preprocessing module and a human body posture detection module;
the voice feature extraction module is used for receiving the audio stream sent by the camera, processing continuous voice in the audio stream, acquiring N-dimensional MFCC features and sending the N-dimensional MFCC features to the waving help-seeking detection unit;
the face feature preprocessing module is used for acquiring images through video stream of a camera, detecting faces of people through a machine learning method, removing background and non-face areas, acquiring face point coordinates, performing face feature normalization processing on the face images according to the face point coordinates, and sending the processed face images to the waving help-seeking detection unit;
and the human body posture detection module is used for receiving the video stream sent by the camera, extracting the personnel skeleton posture information through the deep neural network, acquiring the personnel skeleton key point information and transmitting the personnel skeleton key point information to the calculation module.
3. The system as claimed in claim 1, wherein the computing module comprises: a hand waving action detection module and an extension degree index calculation module;
the hand waving motion detection module is used for carrying out normalization preprocessing on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judging whether a person standing, sitting and lying in the posture information has a hand waving motion state or not, and obtaining hand waving amplitude information according to the hand waving motion state; meanwhile, outputting the hand waving gesture confidence coefficient in the hand waving action state of the person corresponding to standing, sitting and lying at the moment according to the gesture detection model; sending posture information of the human body corresponding to standing, sitting and lying, hand waving amplitude information and hand waving posture confidence coefficient to a hand waving distress detection unit;
the extension degree index calculation module is used for acquiring an extension degree index according to the coordinates of the key points of the human body so as to reflect the posture extension degree of the human body and calculating the waving frequency according to the extension degree index; and sending the hand waving frequency to a hand waving distress detection unit.
4. The system as claimed in claim 1, wherein the hand-waving distress detection unit comprises: a voice event recognition module, a facial expression recognition module and a hand waving distress detection module;
the voice event recognition module is used for detecting events by utilizing a deep learning model according to the MFCC characteristics extracted by the characteristic extraction unit to obtain a voice classification result containing classified voice and corresponding confidence information output by the current frame voice;
the facial expression recognition module is used for extracting facial features through a deep learning model according to the facial images sent by the feature extraction unit, carrying out training reasoning and obtaining a facial expression classification result containing the facial expressions of the classified human bodies and the corresponding confidence information output by the facial expressions of the current frame;
and the hand waving distress detection module is used for judging, by a comprehensive weight method, whether the person is waving for help according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information of the person, the hand waving amplitude information, the hand waving posture confidence and the hand waving frequency sent by the calculation module.
5. The identification method of a hand-waving distress identification system as claimed in claim 1, characterized by comprising the following steps:
1) Sending the audio stream sent by the camera to a sound feature extraction module, and sending the video stream to a facial feature preprocessing module and a human body posture detection module respectively;
2-1) the sound feature extraction module receives the audio stream sent by the camera, processes the continuous voice, obtains MFCC acoustic features and sends the MFCC acoustic features to the waving help-seeking detection unit;
2-2) the facial feature preprocessing module acquires images through video stream of a camera, performs face detection on personnel through a machine learning method, removes background and non-face areas, acquires face point coordinates, performs face feature normalization processing on the face images according to the face point coordinates, and sends the processed face images to a waving help detection unit;
2-3) the human body posture detection module receives the video stream sent by the camera, extracts the personnel skeleton posture information through the deep neural network, acquires the personnel skeleton key point information and transmits the personnel skeleton key point information to the calculation module;
3-1) the calculation module performs normalization preprocessing on coordinates and scales according to the key point information of the human body to obtain key point characteristics, judges whether a person standing, sitting and lying in the posture information has a hand waving action state or not, and obtains hand waving amplitude information according to the hand waving action state; meanwhile, outputting the confidence coefficients of the hand waving postures of standing, sitting and lying corresponding to the person at the moment according to the posture detection model; sending posture information of the human body corresponding to standing, sitting and lying, hand waving amplitude information and hand waving posture confidence coefficient to a hand waving distress detection unit;
3-2) the calculation module obtains the extension degree index $\delta_{HPS}$ according to the key point information of the human body, calculates the hand waving frequency according to the extension degree index $\delta_{HPS}$, and sends the hand waving frequency to the hand waving distress detection unit;
4) The hand-waving distress detection unit performs event detection with a deep learning model on the MFCC acoustic features extracted by the feature extraction unit, obtaining a voice classification result that contains the classified voice and the confidence information of the current frame voice output; meanwhile, facial features are extracted through a deep learning model from the facial images sent by the feature extraction unit, training and inference are carried out, and a facial expression classification result containing the classified human facial expression and the confidence information of the current frame facial expression output is obtained;
5) The hand-waving distress detection unit judges, by a comprehensive weight method, whether the person is waving for help according to the voice classification result, the facial expression classification result, and the standing, sitting and lying posture information, hand-waving amplitude information, hand-waving posture confidence and hand-waving frequency sent by the calculation module.
6. The identification method of the hand-waving distress identification system according to claim 5, wherein in step 2-1) the continuous speech is processed to obtain the MFCC acoustic features, specifically:
carrying out pre-emphasis, framing and windowing operations on continuous voice of an audio stream in sequence to obtain pre-processed voice information;
and sequentially performing fast Fourier transform, Mel filter bank filtering, logarithm operation, discrete cosine transform and dynamic feature extraction on the preprocessed sound information to obtain the final N-dimensional MFCC acoustic features, as sketched below.
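As a non-authoritative illustration of this pipeline (the claim names no library, sample rate or frame parameters, so all of those are assumptions here), a minimal Python sketch using librosa is:

import numpy as np
import librosa

def extract_mfcc(audio_path, n_mfcc=13):
    # Load mono audio at an assumed 16 kHz sample rate.
    y, sr = librosa.load(audio_path, sr=16000)
    # Pre-emphasis (first operation in the claim).
    y = librosa.effects.preemphasis(y)
    # Framing, windowing, FFT, Mel filter bank, logarithm and DCT are all
    # folded into librosa.feature.mfcc (25 ms frames, 10 ms hop assumed).
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=400, hop_length=160, window="hamming")
    # Dynamic feature extraction: first- and second-order deltas.
    delta = librosa.feature.delta(mfcc)
    delta2 = librosa.feature.delta(mfcc, order=2)
    # Stack into the final N-dimensional MFCC acoustic features (N = 3 * n_mfcc).
    return np.vstack([mfcc, delta, delta2])

Stacking the static and delta coefficients is one plausible reading of the claim's "N-dimensional" features; the patent does not fix N.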
7. The identification method of the hand-waving distress identification system according to claim 5, wherein the step 3-1) is specifically as follows:
3-1-1) the calculation module obtains, through a posture detection model and according to the human body key point information, the standing, sitting or lying posture of the person at that moment together with the current-frame hand-waving posture confidence for standing, sitting and lying; the posture detection model is a CNN model;
3-1-2) meanwhile, angle data between joints is obtained from the human body key point information, and the hand-waving amplitude information during waving is derived from the angle between the forearm and the upper arm and the angle between the upper arm and the shoulder;
3-1-3) the calculation module sends the standing, sitting and lying posture information, the hand-waving amplitude information and the hand-waving posture confidence to the hand-waving distress detection unit (an illustrative sketch of step 3-1-2 follows).
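A minimal sketch of step 3-1-2, assuming 2-D keypoints and illustrative joint names (the patent fixes neither the keypoint format nor the exact mapping from the two angles to an amplitude value):

import numpy as np

def joint_angle(a, b, c):
    # Angle in degrees at joint b, formed by the segments b->a and b->c.
    v1 = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v2 = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def waving_amplitude(kp):
    # kp: dict of 2-D coordinates for one arm, e.g. from a pose estimator.
    elbow = joint_angle(kp["wrist"], kp["elbow"], kp["shoulder"])    # forearm vs upper arm
    shoulder = joint_angle(kp["elbow"], kp["shoulder"], kp["hip"])   # upper arm vs shoulder line
    return 0.5 * (elbow + shoulder)   # illustrative combination of the two angles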
8. The identification method of the hand-waving distress identification system according to claim 5, wherein in step 3-2) the calculation module obtains the extension degree index δ_HPS according to the human body key point information, specifically:
3-2-1) the calculation module arranges the human body key point information of a given posture into a matrix:

X = [x_1, …, x_n] ∈ R^(D×n)

where D is the dimension of the key points, n is the number of key points, x_n is the coordinate of the nth key point, and R is the set of real numbers;

3-2-2) let

x̄ = (1/n) Σ_{i=1}^{n} x_i

denote the mean of all key point coordinates in X; the variance sum δ_HPS over the principal axes of the human posture key points is then defined as

δ_HPS = max_U tr(UᵀSU) = Σ_{j=1}^{d} λ_j,  with the covariance matrix  S = (1/n) Σ_{i=1}^{n} (x_i − x̄)(x_i − x̄)ᵀ

where U is a projection matrix, x_i is the ith key point coordinate, d is a positive number smaller than the key point dimension D, λ_j is the jth eigenvalue of S, and tr denotes the trace of a matrix;
3-2-3) calculating the hand-waving frequency according to the extension degree index δ_HPS, specifically:

δ_HPS, the variance sum over the principal axes of the human posture key points, reflects how extended the body posture is: when the hands swing out to the two sides of the body, δ_HPS reaches its maximum; when the hands swing to the top of the head, in line with the trunk, δ_HPS reaches its minimum. By tracking how δ_HPS varies along the time axis, the hand-waving frequency during the waving action is obtained and sent to the hand-waving distress detection unit (a sketch follows this claim).
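Under the definitions of claim 8, a sketch of δ_HPS and of the frequency estimate follows; counting local maxima of the δ_HPS time series is one plausible reading of tracking its variation along the time axis, not the patent's stated estimator:

import numpy as np

def delta_hps(X, d):
    # X: D x n matrix of keypoint coordinates; d < D principal axes.
    Xc = X - X.mean(axis=1, keepdims=True)        # center on the mean keypoint
    S = Xc @ Xc.T / X.shape[1]                    # D x D covariance matrix
    eig = np.sort(np.linalg.eigvalsh(S))[::-1]    # eigenvalues in descending order
    return float(eig[:d].sum())                   # variance sum of the d principal axes

def waving_frequency(series, fps):
    # series: per-frame delta_hps values; fps: video frame rate.
    s = np.asarray(series, dtype=float)
    peaks = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])   # local maxima = arms swung out
    return float(peaks.sum()) * fps / len(s)          # approximate waves per second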
9. The identification method of the hand-waving distress identification system according to claim 5, wherein the step 4) is specifically as follows:
4-1) for sound event classification, specifically:
during model training, the input MFCC features are normalized; label text representing the corresponding class names is read to generate label vectors; the MFCC features and label vectors are bound, preprocessed, and then fed into a deep learning model to learn its parameters, yielding a voice classification model;
wherein, the deep learning model is any one of a CNN model, an RCNN model and an LSTM model;
during model prediction, the input MFCC features are normalized, preprocessed, and fed into the voice classification model; the output is post-processed to obtain a voice classification result containing the classified voice and the confidence output for the current-frame voice (a model sketch follows);
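A minimal PyTorch sketch of such a voice classification model; the claim allows a CNN, RCNN or LSTM, and an LSTM variant with illustrative layer sizes and class count is shown here:

import torch
import torch.nn as nn

class SoundEventClassifier(nn.Module):
    def __init__(self, n_features=39, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):
        # x: (batch, frames, n_features) normalized MFCC sequence.
        out, _ = self.lstm(x)
        logits = self.head(out[:, -1])         # summary of the last frame
        return torch.softmax(logits, dim=-1)   # per-class confidence scores

The softmax outputs play the role of the confidence information attached to the current-frame voice in the classification result.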
4-2) classifying facial expressions, specifically:
extracting facial features from the face image through a deep learning model;
the deep learning model is any one of a convolutional neural network, a deep belief network, a deep autoencoder and a recurrent neural network;
after the facial features are extracted, the facial expressions are classified by a convolutional neural network, yielding a facial expression classification result that contains the classified human facial expression and the confidence output for the current-frame facial expression (a sketch follows).
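A companion sketch for step 4-2: a small CNN over the normalized face crop (the architecture, the 48x48 grayscale input and the seven-class output are assumptions; the claim requires only a convolutional neural network):

import torch
import torch.nn as nn

class ExpressionClassifier(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.head = nn.Linear(32 * 12 * 12, n_classes)  # 48x48 input -> 12x12 maps

    def forward(self, x):
        # x: (batch, 1, 48, 48) normalized grayscale face image.
        z = self.features(x).flatten(1)
        return torch.softmax(self.head(z), dim=-1)      # expression confidences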
10. The identification method of the hand-waving distress identification system according to claim 5, wherein the step 5) is specifically as follows:
(1) The hand-waving distress detection unit computes a global hand-waving distress score by the comprehensive weight method, from the voice classification result (classified voice plus current-frame confidence), the facial expression classification result (classified facial expression plus current-frame confidence), and the standing, sitting and lying posture information, hand-waving amplitude information, hand-waving posture confidence and hand-waving frequency sent by the calculation module:
V_t^(sos) = w_pose·V_t^(pose) + w_sound·V_t^(sound) + w_face·V_t^(face)

where V_t^(sos) is the hand-waving distress confidence at the current time t, V_t^(pose) the hand-waving distress posture confidence at time t, V_t^(sound) the sound event detection confidence at time t, and V_t^(face) the facial expression confidence at time t; w_pose, w_sound and w_face are the weights of the hand-waving posture, the sound event and the facial expression respectively, and are preset values satisfying

w_pose + w_sound + w_face = 1,  w_pose > w_sound,  w_pose > w_face;
(2) V_t^(pose), V_t^(sound) and V_t^(face) are obtained by exponentially weighted moving average, namely:

V_t^(pose) = β·V_{t−1}^(pose) + (1 − β)·V̂_t^(pose)
V_t^(sound) = β·V_{t−1}^(sound) + (1 − β)·V̂_t^(sound)
V_t^(face) = β·V_{t−1}^(face) + (1 − β)·V̂_t^(face)

where β is the weighting coefficient, V̂_t^(pose) is the current-frame hand-waving distress posture confidence, V̂_t^(sound) is the confidence output by the current-frame sound detection, and V̂_t^(face) is the confidence output by the current-frame facial expression;
(3) the current-frame hand-waving distress posture confidence V̂_t^(pose) is obtained by combining the current-frame hand-waving posture confidence, the current-frame hand-waving amplitude information, the current-frame hand-waving frequency information and the posture addition coefficient [the combining formula appears only as an image in the published record];
(4) the current-frame hand-waving amplitude information â_t is taken as

â_t = min(A_t / A_max, 1)

where A_t is the detected hand-waving amplitude and A_max the preset maximum hand-waving amplitude (symbol names substituted for formula images in the record); â_t is a value between 0 and 1, and the larger the waving amplitude, the higher the value;
(5) the current-frame hand-waving frequency information f̂_t is taken as

f̂_t = min(F_t / F_max, 1)

where F_t is the detected hand-waving frequency and F_max the preset maximum hand-waving frequency (symbol names substituted for formula images in the record); f̂_t is a value between 0 and 1, and the higher the waving frequency, the higher the value;
(6) Finally, V_t^(sos) is compared with a preset threshold; if it exceeds the threshold, the current frame is judged to be in a hand-waving distress state, and if n consecutive frames are in that state, a hand-waving distress alarm is raised (an end-to-end sketch of this fusion follows).
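Under the definitions of claim 10, a minimal end-to-end sketch of the fusion follows. The EWMA update and the weighted sum come from the claim itself; combining the per-frame pose confidence with the amplitude and frequency terms as a product scaled by the posture addition coefficient is an assumption, since that formula survives only as an image in the record, and all constants are illustrative:

W_POSE, W_SOUND, W_FACE = 0.6, 0.2, 0.2   # preset weights: sum to 1, w_pose dominant
BETA = 0.9                                 # EWMA weighting coefficient (assumed value)
THRESHOLD, N_FRAMES = 0.5, 10              # preset alarm threshold and frame count

class WavingSOSDetector:
    def __init__(self):
        self.v = {"pose": 0.0, "sound": 0.0, "face": 0.0}   # EWMA state per modality
        self.hits = 0                                        # consecutive positive frames

    def step(self, pose_conf, amp, amp_max, freq, freq_max, gamma,
             sound_conf, face_conf):
        a = min(amp / amp_max, 1.0)            # amplitude term in [0, 1]
        f = min(freq / freq_max, 1.0)          # frequency term in [0, 1]
        frame = {"pose": gamma * pose_conf * a * f,   # assumed combining formula
                 "sound": sound_conf, "face": face_conf}
        for k in self.v:                       # exponentially weighted moving average
            self.v[k] = BETA * self.v[k] + (1 - BETA) * frame[k]
        v_sos = (W_POSE * self.v["pose"] + W_SOUND * self.v["sound"]
                 + W_FACE * self.v["face"])    # comprehensive weight method
        self.hits = self.hits + 1 if v_sos > THRESHOLD else 0
        return self.hits >= N_FRAMES           # alarm after n consecutive frames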
CN202211259423.6A 2022-10-14 2022-10-14 Hand-waving help-seeking identification system and method Pending CN115641610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211259423.6A CN115641610A (en) 2022-10-14 2022-10-14 Hand-waving help-seeking identification system and method


Publications (1)

Publication Number Publication Date
CN115641610A (en) 2023-01-24

Family

ID=84944112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211259423.6A Pending CN115641610A (en) 2022-10-14 2022-10-14 Hand-waving help-seeking identification system and method

Country Status (1)

Country Link
CN (1) CN115641610A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116229581A (en) * 2023-03-23 2023-06-06 珠海市安克电子技术有限公司 Intelligent interconnection first-aid system based on big data
CN116229581B (en) * 2023-03-23 2023-09-19 珠海市安克电子技术有限公司 Intelligent interconnection first-aid system based on big data
CN118280552A (en) * 2024-05-31 2024-07-02 西安四腾环境科技有限公司 Hospital management method based on video monitoring


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination