CN115171154A - WiFi human body posture estimation algorithm based on Performer-Unet - Google Patents

WiFi human body posture estimation algorithm based on Performer-Unet

Info

Publication number
CN115171154A
Authority
CN
China
Prior art keywords
human body
performer
wifi
attitude
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210749949.6A
Other languages
Chinese (zh)
Inventor
朱艾春
周跃
徐曹洁
张帆
李义丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN202210749949.6A priority Critical patent/CN115171154A/en
Publication of CN115171154A publication Critical patent/CN115171154A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B17/00Monitoring; Testing
    • H04B17/30Monitoring; Testing of propagation channels
    • H04B17/309Measuring or estimating channel quality parameters
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30Services specially adapted for particular environments, situations or purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)

Abstract

The WiFi human body posture estimation algorithm based on the Performer-Unet collects a human activity video, extracts real posture annotation information containing the coordinates of human skeleton points together with the corresponding CSI data, and inputs them into an artificial neural network for training; the loss between the estimated posture and the annotation is computed, and the network is optimized by gradient descent to obtain a model. The CSI data stream of the monitored scene is then processed by this model to accurately recognize the human posture. The invention introduces a cross-modal technique into human posture recognition and trains a WiFi-based posture recognition algorithm; WiFi devices are low in cost, wide in application range and good at protecting privacy, so the invention greatly expands the application range of posture estimation in many fields and remedies the shortcomings of traditional algorithms.

Description

WiFi human body posture estimation algorithm based on Performer-Unet
Technical Field
The invention relates to a WiFi human body posture estimation algorithm based on a Performer-Unet, and belongs to the field of human body posture estimation based on WiFi signals.
Background
With the development of deep learning, human body posture estimation has been widely applied in fields such as human-computer interaction, motion analysis, virtual reality and security, and has gradually become a research hotspot in computer vision. Traditional human posture estimation methods rely mainly on RGB images or on multimedia sensors. However, out of concern for personal privacy and security, image-capturing devices such as cameras cannot be installed in private areas such as bedrooms and bathrooms, so RGB-image-based posture estimation leaves a large number of blind spots. In addition, a camera is easily affected by lighting conditions such as glare and occlusion, which makes it rather unstable. Most sensor-based posture estimation methods depend on wearable sensors, infrared sensors or radar equipment, all of which require the installation of specialized hardware and suffer from poor flexibility and high cost.
Disclosure of Invention
The invention aims to provide a WiFi human body posture estimation algorithm based on a Performer-Unet, addressing the limited application scenarios of traditional RGB-image-based posture recognition and the limited performance of traditional convolutional neural networks in posture estimation. The invention introduces a cross-modal technique into human posture recognition: a WiFi-based posture recognition algorithm is trained by a high-performance image-based posture recognition algorithm. In addition, the invention designs a distinctive U-shaped multi-head attention network structure, which ensures the accuracy and robustness of the posture estimation.
The technical scheme of the invention is as follows:
a WiFi human body posture estimation algorithm based on Performer-Unet comprises the following steps:
s1, collecting a human body activity video, disassembling the human body activity video into sample image frames of various human body postures, and extracting real posture marking information containing coordinates of human body skeleton points; acquiring a CSI data packet, namely a channel state information sequence, of the sample image frame according to the time stamp;
s2, inputting a sample image frame containing real attitude marking information and a channel state information sequence of the corresponding sample image frame obtained according to a time stamp into an artificial neural network for training;
obtaining human body posture estimation output according to the channel state information sequence, carrying out loss marking on the human body posture estimation output and real posture marking information of a corresponding sample image frame, optimizing an artificial neural network by adopting a gradient descent method until the neural network is converged, finishing training and obtaining a Performer-Unet human body posture estimation model;
and S3, acquiring a CSI (channel state information) data stream of the human body posture to be detected in real time by adopting a WiFi (wireless fidelity) receiving antenna, inputting a Performer-Unet human body posture estimation model, and acquiring the human body posture according to the CSI data stream.
Further, the human postures include: standing, walking, squatting, running and jumping; the real posture annotation information consists of the coordinates of 18 human skeleton points.
Further, step S1 specifically comprises: arranging a WiFi transmitting antenna and a WiFi receiving antenna on either side of the human body, and a monitoring camera on the side of the transmitting antenna, aligned with it, to shoot the video of the moving human body; decomposing the video into sample image frames containing the human postures; and collecting the CSI data stream of the human posture in real time with the WiFi receiving antenna.
Further, in step S1, the real posture annotation information is obtained as follows:
the sample image frame of the human posture is processed by a human posture recognition algorithm Alpha to obtain real posture annotation information P_A containing the human skeleton point coordinates:
P_A = Alpha(I_k)
wherein: k represents the frame number of the sample image frame, I_k represents the k-th sample image frame captured by the camera, and Alpha(·) represents the human posture recognition algorithm;
according to the real posture annotation information P_A, the posture annotation coordinates P_A^j and the confidence C are generated:
{(P_A^j, C_j)} = P_A, j = 1, …, 18
wherein: j represents the number of the skeleton point, P_A^j = (x_j, y_j) represents the coordinates of the j-th skeleton point, and C_j is the confidence of the coordinates of that skeleton point.
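For illustration, the annotation step above can be sketched in Python. The flat (x, y, confidence) output format and the 18-point layout (that of OpenPose's COCO model) are assumptions consistent with the body parts named in the description, not something the patent specifies:

```python
# Hypothetical sketch: unpack the output of the image-based estimator
# Alpha(I_k) into coordinates P_A^j and confidences C_j.
# The 18-point layout below is an ASSUMPTION (OpenPose COCO order),
# chosen to match the body parts listed in the description.
KEYPOINTS = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

def split_annotation(p_a):
    """p_a: 18 (x, y, confidence) triples -> ({name: (x, y)}, {name: C_j})."""
    coords = {n: (x, y) for n, (x, y, _) in zip(KEYPOINTS, p_a)}
    confs = {n: c for n, (_, _, c) in zip(KEYPOINTS, p_a)}
    return coords, confs

# toy annotation: point j sits at (j, 2j) with confidence 0.9
p_a = [(float(j), float(2 * j), 0.9) for j in range(18)]
coords, confs = split_annotation(p_a)
assert len(coords) == 18 and coords["neck"] == (1.0, 2.0)
```

The confidences C_j are kept alongside the coordinates because the training loss later weights each skeleton point by them.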
Further, in step S1, the channel state information, i.e. the CSI data stream, is obtained as follows: Matlab is used to divide the CSI data stream into the CSI data packets corresponding to each image frame according to the timestamps.
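The timestamp-based split (performed in Matlab in the patent) can be sketched in Python; the record formats and the assumption that the camera and CSI clocks are synchronized are illustrative:

```python
from bisect import bisect_left

def split_csi_by_frame(frame_times, csi_times, csi_packets):
    """Group CSI packets into per-frame sequences X_k.

    A packet is assigned to frame k if its timestamp falls between the
    capture time of frame k and that of frame k+1 (assumption: both
    clocks are synchronized and both timestamp lists are sorted).
    """
    sequences = []
    for k, t0 in enumerate(frame_times):
        t1 = frame_times[k + 1] if k + 1 < len(frame_times) else float("inf")
        lo = bisect_left(csi_times, t0)   # first packet at or after frame k
        hi = bisect_left(csi_times, t1)   # first packet of frame k+1
        sequences.append(csi_packets[lo:hi])
    return sequences

# toy data: 3 video frames at 30 fps, CSI packets arriving at 100 Hz
frames = [0.0, 1 / 30, 2 / 30]
csi_t = [i / 100 for i in range(10)]
pkts = list(range(10))
seqs = split_csi_by_frame(frames, csi_t, pkts)
assert [len(s) for s in seqs] == [4, 3, 3]
```

Each element of `seqs` corresponds to one channel state information sequence X_k fed to the network during training.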
Further, the training process in step S2 specifically includes:
step 21, constructing a teacher network T(·) and a student network S(·), which store the real posture annotation information of the sample image frames and the CSI data packets of the sample image frames, respectively;
step 22, the student network S(·) inputs the channel state information sequence X_k corresponding to the k-th sample image frame into the artificial neural network Performer-Unet and generates a posture estimation matrix P_S of the same size as the posture annotation coordinates P_A^j:
P_S = PU(X_k), P_S = {P_S^j}, j = 1, …, 18
wherein: j represents the number of the skeleton point; PU(·) represents processing the channel state information sequence X_k with the model Performer-Unet; P_S^j represents the coordinates of the j-th skeleton point in the posture estimation matrix;
step 23, computing the error L_S(·) of the student network between the posture annotation coordinates P_A^j and the posture estimation matrix P_S^j using the L2 loss:
L_S(·) = Σ_{j=1}^{18} C_j · ||P_A^j − P_S^j||_2^2
wherein: j represents the number of the skeleton point; C_j represents the confidence of the j-th skeleton point; P_A and P_S respectively represent the real posture annotation matrix and the posture estimation matrix; P_A^j and P_S^j respectively represent the coordinates (x_j, y_j) of the j-th skeleton point;
step 24, the student network S(·) back-propagates the computed error as gradients, and the network Performer-Unet is optimized by gradient descent until it converges, completing the training.
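Under the reconstruction above, the confidence-weighted L2 objective of step 23 can be checked numerically with a small NumPy sketch (the exact weighting and reduction used by the patent are assumptions):

```python
import numpy as np

def student_loss(p_a, p_s, conf):
    """L_S = sum_j C_j * ||P_A^j - P_S^j||_2^2 over the 18 skeleton points.

    p_a, p_s: (18, 2) arrays of annotated / estimated (x_j, y_j);
    conf:     (18,) per-point confidences C_j from the image-based teacher,
              down-weighting points the teacher is itself unsure about.
    """
    sq_err = np.sum((p_a - p_s) ** 2, axis=1)  # ||.||_2^2 per skeleton point
    return float(np.sum(conf * sq_err))

rng = np.random.default_rng(0)
p_a = rng.uniform(0, 100, size=(18, 2))
loss_zero = student_loss(p_a, p_a, np.ones(18))       # perfect estimate
loss_off = student_loss(p_a, p_a + 1.0, np.ones(18))  # every point off by (1, 1)
assert loss_zero == 0.0
assert abs(loss_off - 36.0) < 1e-9   # 18 points * (1^2 + 1^2)
```

In training, the gradient of this scalar with respect to the network parameters is what step 24 back-propagates through the Performer-Unet.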
Further, PU(·) represents the Performer-Unet human posture estimation model, which adopts a U-shaped structure with N Performer layers added in the bottommost fusion layer; the processing steps of the model are as follows:
step 22.1, the U-shaped structure model performs three downsamplings on the input channel state information sequence X_k for background semantic extraction, each downsampling comprising one convolution Conv() operation and one pooling Pool() operation, giving the downsampling outputs D_k^(1), D_k^(2), D_k^(3):
D_k^(i) = Pool(Conv(D_k^(i−1))), i = 1, 2, 3, with D_k^(0) = X_k
step 22.2, the sequence information D_k^(3) after the three downsamplings is input into the N Performer layers, and the posture feature sequence F_k is extracted through the multi-head attention mechanism MulAttn():
F_k = MulAttn(D_k^(3))
step 22.3, the extracted posture feature sequence F_k is upsampled to amplify the feature information, giving the feature amplification sequences U_k^(1), U_k^(2), U_k^(3); each upsampling operation Up() comprises a convolution operation Conv() and an interpolation operation Int():
Up(·) = Int(Conv(·))
step 22.4, the upsampled feature amplification sequences are fused in turn, through cross-layer connections, with the corresponding downsampling outputs; the three fusions give the posture prediction sequence Y_k that accounts for both context semantics and feature information:
U_k^(i) = Up(Y_k^(i−1)), Y_k^(i) = U_k^(i) ⊕ D_k^(3−i), i = 1, 2, 3, with Y_k^(0) = F_k and D_k^(0) = X_k
wherein ⊕ denotes the cross-layer fusion;
step 22.5, the posture prediction sequence Y_k^(3) finally output by the Performer-Unet is matched in scale to the posture annotation coordinates P_A^j through a double convolution, giving the posture estimation matrix P_S:
P_S = Conv(Conv(Y_k^(3)))
Further, the number of Performer layers N is 12.
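At the level of tensor shapes, steps 22.1 to 22.5 can be traced with a shape-only NumPy sketch. The channel counts, the stride-2 pooling, the concatenation-based fusion, the 64x64 input layout and the 36-channel output (18 points x 2 coordinates) are all assumptions, and the Performer attention is replaced by a shape-preserving placeholder:

```python
import numpy as np

# Shape-only walkthrough of the U-shaped pipeline; all sizes are assumed.
def conv(x, c_out):            # stand-in for Conv(): changes channels only
    return np.zeros((c_out,) + x.shape[1:])

def pool(x):                   # stand-in for Pool(): stride-2 downsampling
    return x[:, ::2, ::2]

def interp(x):                 # stand-in for Int(): nearest-neighbour 2x upsample
    return x.repeat(2, axis=1).repeat(2, axis=2)

def performer_layers(x, n=12): # N = 12 Performer layers; attention keeps shape
    for _ in range(n):
        x = x                  # identity placeholder for MulAttn()
    return x

x = np.zeros((3, 64, 64))      # CSI sequence X_k laid out as an image-like map
d, skips = x, []
for c in (32, 64, 128):        # step 22.1: three downsamplings
    d = conv(d, c)
    skips.append(d)            # kept for the cross-layer connections
    d = pool(d)

f = performer_layers(d)        # step 22.2: bottleneck, shape (128, 8, 8)

u = f
for skip in reversed(skips):   # steps 22.3-22.4: upsample, then three fusions
    u = interp(conv(u, skip.shape[0]))
    u = np.concatenate([u, skip], axis=0)   # assumed fusion: concatenation

p = conv(conv(u, 36), 36)      # step 22.5: double convolution -> P_S-sized map
assert f.shape == (128, 8, 8) and p.shape == (36, 64, 64)
```

Each upsampling exactly undoes one pooling, which is why the fused tensors line up with the stored skip outputs.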
The invention has the beneficial effects that:
compared with a human body posture estimation algorithm based on RGB images and traditional sensors, the human body posture estimation method based on the RGB images and the traditional sensors adopts WiFi equipment which is low in cost, wide in application range and good in privacy protection to carry out human body posture estimation, greatly expands the application range of posture estimation in multiple fields, and makes up for the defects of application of the traditional algorithm.
The multi-head attention mechanism is introduced into the attitude estimation algorithm, so that the performance defect of the traditional attitude estimation algorithm is effectively overcome, the noise reduction at the algorithm level is achieved, and the network robustness is enhanced.
The Performer-Unet algorithm network structure is a high-performance structure, and achieves excellent performance in human body posture estimation based on WiFi.
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings, wherein like reference numerals generally represent like parts in the exemplary embodiments of the present invention.
FIG. 1 shows a schematic diagram of the Performer-Unet model structure.
FIG. 2 shows a flow chart of a model training method of an embodiment of the invention.
Fig. 3 shows a structure diagram of an image and CSI data acquisition apparatus according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein.
As shown in fig. 2, the present invention uses an existing high-performance image-based posture estimation network to obtain the real posture annotation information. The real posture annotation information consists of the coordinates of 18 human skeleton points (ears, eyes, nose, neck, shoulders, elbows, hands, hips, knees and ankles), and is obtained as follows:
the sample image frame of the human posture is processed by a human posture recognition algorithm Alpha to obtain real posture annotation information P_A containing the human skeleton point coordinates:
P_A = Alpha(I_k)
wherein: k represents the frame number of the sample image frame, I_k represents the k-th sample image frame captured by the camera, and Alpha(·) represents the human posture recognition algorithm;
according to the real posture annotation information P_A, the posture annotation coordinates P_A^j and the confidence C are generated:
{(P_A^j, C_j)} = P_A, j = 1, …, 18
wherein: j represents the number of the skeleton point, P_A^j = (x_j, y_j) represents the coordinates of the j-th skeleton point, and C_j is the confidence of the coordinates of that skeleton point.
The CSI data packet, i.e. the channel state information sequence, of each sample image frame is acquired according to the timestamps; as shown in fig. 2, the CSI data stream is split as follows: Matlab is used to divide the CSI data stream into the CSI data packets corresponding to each image frame according to the timestamps.
The training process in fig. 2 specifically includes:
step 21, constructing a teacher network T(·) and a student network S(·), which store the real posture annotation information of the sample image frames and the CSI data packets of the sample image frames, respectively;
step 22, the student network S(·) inputs the channel state information sequence X_k corresponding to the k-th sample image frame into the artificial neural network Performer-Unet and generates a posture estimation matrix P_S of the same size as the posture annotation coordinates P_A^j:
P_S = PU(X_k), P_S = {P_S^j}, j = 1, …, 18
wherein: j represents the number of the skeleton point; PU(·) represents processing the channel state information sequence X_k with the model Performer-Unet; P_S^j represents the coordinates of the j-th skeleton point in the posture estimation matrix;
step 23, computing the error L_S(·) of the student network between the posture annotation coordinates P_A^j and the posture estimation matrix P_S^j using the L2 loss:
L_S(·) = Σ_{j=1}^{18} C_j · ||P_A^j − P_S^j||_2^2
wherein: j represents the number of the skeleton point; C_j represents the confidence of the j-th skeleton point; P_A and P_S respectively represent the real posture annotation matrix and the posture estimation matrix; P_A^j and P_S^j respectively represent the coordinates (x_j, y_j) of the j-th skeleton point;
step 24, the student network S(·) back-propagates the computed error as gradients and optimizes the network Performer-Unet by gradient descent until the network converges and the error no longer decreases appreciably, completing the training.
As shown in fig. 3, in the invention a WiFi transmitting antenna and a WiFi receiving antenna are arranged on either side of the human body, and a monitoring camera is arranged on the side of the transmitting antenna, aligned with it, to shoot the video of the moving human body; the video is decomposed into sample image frames containing the human postures, and the WiFi receiving antenna collects the CSI data stream of the human posture in real time.
As described with reference to fig. 1, the WiFi human posture estimation algorithm based on the Performer-Unet adopts a U-shaped structure with N Performer layers added in the bottommost fusion layer; the processing steps of the model are as follows:
step 22.1, the U-shaped structure model performs three downsamplings on the input channel state information sequence X_k for background semantic extraction, each downsampling comprising one convolution Conv() operation and one pooling Pool() operation, giving the downsampling outputs D_k^(1), D_k^(2), D_k^(3):
D_k^(i) = Pool(Conv(D_k^(i−1))), i = 1, 2, 3, with D_k^(0) = X_k
step 22.2, the sequence information D_k^(3) after the three downsamplings is input into the N Performer layers, and the posture feature sequence F_k is extracted through the multi-head attention mechanism MulAttn():
F_k = MulAttn(D_k^(3))
step 22.3, the extracted posture feature sequence F_k is upsampled to amplify the feature information, giving the feature amplification sequences U_k^(1), U_k^(2), U_k^(3); each upsampling operation Up() comprises a convolution operation Conv() and an interpolation operation Int():
Up(·) = Int(Conv(·))
step 22.4, the upsampled feature amplification sequences are fused in turn, through cross-layer connections, with the corresponding downsampling outputs; the three fusions give the posture prediction sequence Y_k that accounts for both context semantics and feature information:
U_k^(i) = Up(Y_k^(i−1)), Y_k^(i) = U_k^(i) ⊕ D_k^(3−i), i = 1, 2, 3, with Y_k^(0) = F_k and D_k^(0) = X_k
wherein ⊕ denotes the cross-layer fusion;
step 22.5, the posture prediction sequence Y_k^(3) finally output by the Performer-Unet is matched in scale to the posture annotation coordinates P_A^j through a double convolution, giving the posture estimation matrix P_S:
P_S = Conv(Conv(Y_k^(3)))
In the final Performer-Unet WiFi human posture estimation algorithm, the number of Performer layers N is 12.
In recent years, the rapid development of network technologies such as 5G has popularized wireless WiFi devices. Open-source drivers are commercially available for a range of network cards, such as the Atheros AR9580 and Intel WiFi Link 5300, and WiFi devices are now deployed in both public venues and private homes. Compared with other sensors, WiFi devices therefore offer low cost and high flexibility. Research on WiFi has gradually uncovered the multipath channel characteristics of WiFi signals: the electromagnetic carrier of each subcarrier undergoes multipath propagation (reflection, scattering and penetration) on various obstacles and on the human body, forming a distinctive transmission pattern.
The invention captures the characteristics of the surrounding environment by collecting and analyzing the channel state information (CSI) of each subcarrier signal during WiFi transmission. Using WiFi for human posture estimation offers good universality and privacy protection.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments.

Claims (8)

1. A WiFi human body posture estimation algorithm based on Performer-Unet is characterized by comprising the following steps:
s1, collecting a human body activity video, disassembling the human body activity video into sample image frames of various human body postures, and extracting real posture marking information containing coordinates of human body skeleton points; acquiring a CSI data packet, namely a channel state information sequence, of the sample image frame according to the time stamp;
s2, inputting a sample image frame containing real attitude marking information and a channel state information sequence of the corresponding sample image frame obtained according to a time stamp into an artificial neural network for training;
obtaining human body posture estimation output according to the channel state information sequence, carrying out loss marking on the human body posture estimation output and real posture marking information of a corresponding sample image frame, optimizing an artificial neural network by adopting a gradient descent method until the neural network is converged, finishing training and obtaining a Performer-Unet human body posture estimation model;
and S3, acquiring a CSI (channel state information) data stream of the human body posture to be detected in real time by adopting a WiFi (wireless fidelity) receiving antenna, inputting a Performer-Unet human body posture estimation model, and acquiring the human body posture according to the CSI data stream.
2. The Performer-Unet based WiFi human body posture estimation algorithm of claim 1, wherein the human postures comprise: standing, walking, squatting, running and jumping; and the real posture annotation information consists of the coordinates of 18 human skeleton points.
3. The WiFi human posture estimation algorithm based on Performer-Unet of claim 1, characterized in that step S1 specifically comprises: arranging a WiFi transmitting antenna and a WiFi receiving antenna on either side of the human body, and a monitoring camera on the side of the transmitting antenna, aligned with it, to shoot the video of the moving human body; decomposing the video into sample image frames containing the human postures; and collecting the CSI data stream of the human posture in real time with the WiFi receiving antenna.
4. The WiFi human posture estimation algorithm based on Performer-Unet of claim 1, characterized in that in step S1 the real posture annotation information is obtained as follows:
the sample image frame of the human posture is processed by a human posture recognition algorithm Alpha to obtain real posture annotation information P_A containing the human skeleton point coordinates:
P_A = Alpha(I_k)
wherein: k represents the frame number of the sample image frame, I_k represents the k-th sample image frame captured by the camera, and Alpha(·) represents the human posture recognition algorithm;
according to the real posture annotation information P_A, the posture annotation coordinates P_A^j and the confidence C are generated:
{(P_A^j, C_j)} = P_A, j = 1, …, 18
wherein: j represents the number of the skeleton point, P_A^j = (x_j, y_j) represents the coordinates of the j-th skeleton point, and C_j is the confidence of the coordinates of that skeleton point.
5. The WiFi human posture estimation algorithm based on Performer-Unet of claim 4, characterized in that in step S1 the channel state information (CSI) data stream is obtained as follows: Matlab is used to divide the CSI data stream into the CSI data packets corresponding to each image frame according to the timestamps.
6. The WiFi human posture estimation algorithm based on Performer-Unet of claim 5, characterized in that the training process in step S2 specifically comprises:
step 21, constructing a teacher network T(·) and a student network S(·), which store the real posture annotation information of the sample image frames and the CSI data packets of the sample image frames, respectively;
step 22, the student network S(·) inputs the channel state information sequence X_k corresponding to the k-th sample image frame into the artificial neural network Performer-Unet and generates a posture estimation matrix P_S of the same size as the posture annotation coordinates P_A^j:
P_S = PU(X_k), P_S = {P_S^j}, j = 1, …, 18
wherein: j represents the number of the skeleton point; PU(·) represents processing the channel state information sequence X_k with the model Performer-Unet; P_S^j represents the coordinates of the j-th skeleton point in the posture estimation matrix;
step 23, computing the error L_S(·) of the student network between the posture annotation coordinates P_A^j and the posture estimation matrix P_S^j using the L2 loss:
L_S(·) = Σ_{j=1}^{18} C_j · ||P_A^j − P_S^j||_2^2
wherein: j represents the number of the skeleton point; C_j represents the confidence of the j-th skeleton point; P_A and P_S respectively represent the real posture annotation matrix and the posture estimation matrix; P_A^j and P_S^j respectively represent the coordinates (x_j, y_j) of the j-th skeleton point;
step 24, the student network S(·) back-propagates the computed error as gradients, and the network Performer-Unet is optimized by gradient descent until it converges, completing the training.
7. The WiFi human posture estimation algorithm based on Performer-Unet of claim 6, characterized in that PU(·) represents the Performer-Unet human posture estimation model, which adopts a U-shaped structure with N Performer layers added in the bottommost fusion layer, the processing steps of the model being as follows:
step 22.1, the U-shaped structure model performs three downsamplings on the input channel state information sequence X_k for background semantic extraction, each downsampling comprising one convolution Conv() operation and one pooling Pool() operation, giving the downsampling outputs D_k^(1), D_k^(2), D_k^(3):
D_k^(i) = Pool(Conv(D_k^(i−1))), i = 1, 2, 3, with D_k^(0) = X_k
step 22.2, the sequence information D_k^(3) after the three downsamplings is input into the N Performer layers, and the posture feature sequence F_k is extracted through the multi-head attention mechanism MulAttn():
F_k = MulAttn(D_k^(3))
step 22.3, the extracted posture feature sequence F_k is upsampled to amplify the feature information, giving the feature amplification sequences U_k^(1), U_k^(2), U_k^(3); each upsampling operation Up() comprises a convolution operation Conv() and an interpolation operation Int():
Up(·) = Int(Conv(·))
step 22.4, the upsampled feature amplification sequences are fused in turn, through cross-layer connections, with the corresponding downsampling outputs; the three fusions give the posture prediction sequence Y_k that accounts for both context semantics and feature information:
U_k^(i) = Up(Y_k^(i−1)), Y_k^(i) = U_k^(i) ⊕ D_k^(3−i), i = 1, 2, 3, with Y_k^(0) = F_k and D_k^(0) = X_k
wherein ⊕ denotes the cross-layer fusion;
step 22.5, the posture prediction sequence Y_k^(3) finally output by the Performer-Unet is matched in scale to the posture annotation coordinates P_A^j through a double convolution, giving the posture estimation matrix P_S:
P_S = Conv(Conv(Y_k^(3)))
8. The Performer-Unet-based WiFi human body posture estimation algorithm of claim 6, wherein, in the N Performer layers, the number of layers N is 12.
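Claims 7 and 8 describe a U-shaped pipeline: three downsamplings, N = 12 Performer layers at the bottleneck, then three upsamplings fused with cross-layer skip connections. The data flow can be sketched minimally as follows; the stand-ins `down`, `up`, the additive fusion, and the identity "attention" are assumptions for illustration only, not the patented Conv/Pool/Int/MulAttn operators:

```python
import numpy as np

def down(x):
    """One downsampling stage: a stand-in 'convolution' (identity here)
    followed by average pooling that halves the sequence length."""
    return x.reshape(-1, 2).mean(axis=1)

def up(x):
    """One upsampling stage: a stand-in 'convolution' (identity here)
    followed by nearest-neighbour interpolation that doubles the length."""
    return np.repeat(x, 2)

def performer_unet_sketch(x, n_layers=12):
    # three downsamplings for background-semantic extraction (step 22.1)
    d1 = down(x)
    d2 = down(d1)
    d3 = down(d2)
    # N Performer layers at the bottleneck; the multi-head attention
    # MulAttn() is replaced by an identity placeholder (step 22.2)
    p = d3
    for _ in range(n_layers):
        p = p  # MulAttn() placeholder
    # three upsamplings, each fused (here: added) with the matching
    # cross-layer skip connection (steps 22.3 and 22.4)
    u1 = up(p)  + d2
    u2 = up(u1) + d1
    u3 = up(u2) + x
    return u3

x = np.arange(8, dtype=float)   # toy CSI sequence, length divisible by 8
y = performer_unet_sketch(x)
assert y.shape == x.shape       # output matches the input resolution
```

The point of the sketch is the shape bookkeeping: each downsampling halves the sequence, each upsampling doubles it back, and the skip fusion only works because the i-th upsampled length matches the (3−i)-th downsampled length.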
CN202210749949.6A 2022-06-29 2022-06-29 WiFi human body posture estimation algorithm based on Performer-Unet Pending CN115171154A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210749949.6A CN115171154A (en) 2022-06-29 2022-06-29 WiFi human body posture estimation algorithm based on Performer-Unet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210749949.6A CN115171154A (en) 2022-06-29 2022-06-29 WiFi human body posture estimation algorithm based on Performer-Unet

Publications (1)

Publication Number Publication Date
CN115171154A true CN115171154A (en) 2022-10-11

Family

ID=83490176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210749949.6A Pending CN115171154A (en) 2022-06-29 2022-06-29 WiFi human body posture estimation algorithm based on Performer-Unet

Country Status (1)

Country Link
CN (1) CN115171154A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116189382A (en) * 2023-02-01 2023-05-30 观云(山东)智能科技有限公司 Fall detection method and system based on inertial sensor network
US11892563B1 (en) 2023-08-21 2024-02-06 Project Canary, Pbc Detecting human presence in an outdoor monitored site


Similar Documents

Publication Publication Date Title
US20180186452A1 (en) Unmanned Aerial Vehicle Interactive Apparatus and Method Based on Deep Learning Posture Estimation
CN110363140B (en) Human body action real-time identification method based on infrared image
CN115171154A (en) WiFi human body posture estimation algorithm based on Performer-Unet
CN110135249B Human behavior identification method based on a temporal attention mechanism and LSTM
CN108898063B (en) Human body posture recognition device and method based on full convolution neural network
CN110956094A RGB-D multi-modal fusion person detection method based on an asymmetric dual-stream network
CN107220604A Fall detection method based on video
CN107909061A Head pose tracking device and method based on incomplete features
CN110969124A (en) Two-dimensional human body posture estimation method and system based on lightweight multi-branch network
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN104794737B Depth-information-assisted particle filter tracking method
CN111814661A Human behavior identification method based on a residual-recurrent neural network
CN110135277B (en) Human behavior recognition method based on convolutional neural network
CN104821010A (en) Binocular-vision-based real-time extraction method and system for three-dimensional hand information
CN112597814A Improved OpenPose classroom multi-person abnormal behavior and mask-wearing detection method
CN113378649A (en) Identity, position and action recognition method, system, electronic equipment and storage medium
CN113158833B (en) Unmanned vehicle control command method based on human body posture
CN112001347A (en) Motion recognition method based on human skeleton shape and detection target
CN113705445B (en) Method and equipment for recognizing human body posture based on event camera
Fang et al. Dynamic gesture recognition using inertial sensors-based data gloves
CN112926475A (en) Human body three-dimensional key point extraction method
CN113762009A Crowd counting method based on multi-scale feature fusion and a dual-attention mechanism
CN116229507A (en) Human body posture detection method and system
CN116895098A (en) Video human body action recognition system and method based on deep learning and privacy protection
CN113516232B (en) Self-attention mechanism-based wall-penetrating radar human body posture reconstruction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination