CN116129467A - Method for identifying gesture and behavior of transformer operation and maintenance personnel and gesture of tool - Google Patents
- Publication number
- CN116129467A CN116129467A CN202211645542.5A CN202211645542A CN116129467A CN 116129467 A CN116129467 A CN 116129467A CN 202211645542 A CN202211645542 A CN 202211645542A CN 116129467 A CN116129467 A CN 116129467A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention provides a method for identifying the posture and behavior of substation operation and maintenance personnel and the posture of tools, which comprises the following steps: pre-storing the standard action postures of personnel and the standard shapes of tools and instruments; collecting human body images and scene images with an RGB camera and an infrared camera; obtaining the two-dimensional coordinates of the human body joint points with an OpenPose model, and from these determining the three-dimensional coordinates of the joint points; constructing a CART decision tree; identifying the human behavior type with the CART decision tree, and thereby identifying the posture of the substation operation and maintenance personnel; segmenting and converting the scene images to obtain tool feature data; identifying the possible two-dimensional coordinate range of the tools with a U-Net network model; and identifying the tool posture from the tool shape features, the tool position features, the tool infrared image and the tool color image. The method for identifying the posture and behavior of substation operation and maintenance personnel and the posture of tools is accurate and responds quickly, and can be widely applied in the power industry.
Description
Technical Field
The invention relates to recognition technology, and in particular to a method for recognizing the posture and behavior of substation operation and maintenance personnel and the posture of tools.
Background
As a core national industry, the electric power system plays a significant role in economic production and daily life. To keep the power system working normally, substation operation and maintenance work is demanding and the workload is large. In actual substation operation and maintenance work, several problems exist: some staff insufficiently appreciate the importance of their own work, operating procedures are not standardized, and the continual update of substation equipment increases the demand for staff training. Personnel training is an important measure for solving these problems.
Personnel training mainly takes the form of theoretical teaching, apprenticeship under an experienced master, and simulated operation. Theoretical learning is limited in form and encourages rote memorization detached from practice; apprenticeship relies on a master passing on rules and experience, which is difficult to standardize and requires a long training period; simulated operation training requires large resources to manufacture simulation equipment and scales poorly. How to monitor, in an intelligent way, how closely substation operation and maintenance personnel follow standard operating procedures is currently a blank in the field of substation operation and maintenance.
Therefore, the prior art lacks an intelligent method for monitoring the degree to which substation operation and maintenance personnel follow standard operating procedures.
Disclosure of Invention
Therefore, the main purpose of the invention is to provide a method for identifying the posture and behavior of substation operation and maintenance personnel and the posture of tools and instruments that is accurate, fast and convenient to use.
In order to achieve the above purpose, the technical scheme provided by the invention is as follows:
A method for identifying the posture and behavior of substation operation and maintenance personnel and the posture of tools comprises the following steps:
Step 1, pre-store the standard action postures of substation operation and maintenance personnel and the standard appearance of tools.
Step 2, use an RGB camera and an infrared camera to acquire, respectively, a human body color image and a human body infrared image showing the posture of the substation operation and maintenance personnel; at the same time, acquire an infrared image and a color image of the scene containing the tools.
Step 3, obtain the two-dimensional human body joint points with an OpenPose model; then obtain the three-dimensional human body joint points from the two-dimensional joint points and the human body infrared and color images acquired in step 2.
Step 4, from the three-dimensional joint points obtained in step 3, extract human body posture feature data consisting of the angles of the joints and the distances between the joint points.
Step 5, construct a CART decision tree from the human body posture feature data; use the CART decision tree to identify the human behavior type, and thereby the posture of the substation operation and maintenance personnel.
Step 6, segment and convert the scene infrared image and scene color image acquired in step 2 to obtain the tool feature data.
Step 7, divide the tool feature data into a tool feature training set and a tool feature verification set.
Step 8, train a U-Net network model with the tool feature training set; verify the trained network model with the tool feature verification set to obtain a trained U-Net model.
Step 9, use the trained U-Net model to identify the possible two-dimensional coordinate range of the tool.
Step 10, identify the tool posture from the tool shape features and tool position features, combined with the tool infrared image and tool color image obtained in step 2.
In summary, the method for recognizing the posture and behavior of substation operation and maintenance personnel and the posture of tools uses an infrared camera and an RGB camera to collect infrared and color images of the human body and of the scene containing the tools. From the human body color image, an OpenPose model obtains the two-dimensional coordinates of the joint points relevant to the human posture, and the three-dimensional coordinates of those joint points are then determined with the help of the human body infrared image. The joint angles and the distances between joint points are computed from the three-dimensional coordinates, and a CART decision tree is constructed; the CART decision tree identifies the human posture. The method can therefore remotely recognize the real-time working behavior and working state of substation operation and maintenance personnel, obtain first-hand data from front-line operation, and support accurate, rapid standardized training of personnel and monitoring of how closely their working behavior follows the standard. In addition, the invention segments the scene color image containing the tools and uses a U-Net network model to identify the possible two-dimensional coordinate range of the tools; the tool posture is then identified by combining the scene infrared image, the tool shape features and the tool position features. The tool posture further confirms the actual operating actions of the substation operation and maintenance personnel and provides a solid basis for accurate, rapid and targeted personnel training.
Drawings
Fig. 1 is a general flow diagram of the method for recognizing the posture and behavior of substation operation and maintenance personnel and the posture of tools according to the present invention.
Fig. 2 is a schematic structural diagram of the human body joint points according to the present invention.
Fig. 3 is a schematic diagram of the angles of the human body joints according to the present invention.
Fig. 4 is a schematic diagram of the distances between the joint points according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the drawings and embodiments.
Fig. 1 is a general flow diagram of the method for recognizing the posture and behavior of substation operation and maintenance personnel and the posture of tools according to the present invention. As shown in fig. 1, the method comprises the following steps:
Step 1, pre-store the standard action postures of substation operation and maintenance personnel and the standard appearance of tools.
Step 2, use an RGB camera and an infrared camera to acquire, respectively, a human body color image and a human body infrared image showing the posture of the substation operation and maintenance personnel; at the same time, acquire an infrared image and a color image of the scene containing the tools.
Step 3, obtain the two-dimensional human body joint points with an OpenPose model; then obtain the three-dimensional human body joint points from the two-dimensional joint points and the human body infrared and color images acquired in step 2.
Step 4, from the three-dimensional joint points obtained in step 3, extract human body posture feature data consisting of the angles of the joints and the distances between the joint points.
Step 5, construct a CART decision tree from the human body posture feature data; use the CART decision tree to identify the human behavior type, and thereby the posture of the substation operation and maintenance personnel.
Step 6, segment and convert the scene infrared image and scene color image acquired in step 2 to obtain the tool feature data.
Step 7, divide the tool feature data into a tool feature training set and a tool feature verification set.
Step 8, train a U-Net network model with the tool feature training set; verify the trained network model with the tool feature verification set to obtain a trained U-Net model.
Step 9, use the trained U-Net model to identify the possible two-dimensional coordinate range of the tool.
Step 10, identify the tool posture from the tool shape features and tool position features, combined with the tool infrared image and tool color image obtained in step 2.
In short, the method for identifying the posture and behavior of substation operation and maintenance personnel and the posture of tools uses an infrared camera and an RGB camera to collect infrared and color images of the human body and of the scene containing the tools. From the human body color image, an OpenPose model obtains the two-dimensional coordinates of the joint points relevant to the human posture, and the three-dimensional coordinates of those joint points are then determined with the help of the human body infrared image. The joint angles and the distances between joint points are computed from the three-dimensional coordinates, and a CART decision tree is constructed; the CART decision tree identifies the human posture. The method can therefore remotely recognize the real-time working behavior and working state of substation operation and maintenance personnel, obtain first-hand data from front-line operation, and support accurate, rapid standardized training of personnel and monitoring of how closely their working behavior follows the standard. In addition, the invention segments the scene color image containing the tools and uses a U-Net network model to identify the possible two-dimensional coordinate range of the tools; the tool posture is then identified by combining the scene infrared image, the tool shape features and the tool position features. The tool posture further confirms the actual operating actions of the substation operation and maintenance personnel and provides a solid basis for accurate, rapid and targeted personnel training.
In the invention, between step 1 and step 2, the method further comprises the following steps:
and A1, calibrating the RGB camera and the infrared camera to obtain the internal parameters of the RGB camera participating in the infrared camera.
And A2, acquiring an external parameter matrix and a pose transformation matrix between the RGB camera and the infrared camera according to the step A1.
In the present invention, the step A1 includes the steps of:
Step A11, from the imaging principle and the definitions of the image physical coordinate system and the pixel coordinate system, obtain the relationship between the physical coordinate system and the pixel coordinate system of the RGB camera and of the infrared camera (the equations appear as images in the source; the standard pinhole relations consistent with the symbol definitions below are):
u_R = X_R/d_Rx + u_R0, v_R = Y_R/d_Ry + v_R0; u_h = X_h/d_hx + u_h0, v_h = Y_h/d_hy + v_h0
where (X_R, Y_R) are the coordinates of a pixel of the image in the physical coordinate system of the RGB camera, (u_R, v_R) its coordinates in the pixel coordinate system of the RGB camera, and (x_R, y_R, z_R) its coordinates in the camera coordinate system of the RGB camera; d_Rx and d_Ry are the actual dimensions of a pixel along the x_R and y_R axes of the RGB camera coordinate system, and s_Rx and s_Ry the sampling frequencies along the x_R and y_R axes; (X_R, Y_R), (u_R, v_R) and (x_R, y_R, z_R) are in one-to-one correspondence; the centre point (X_R0, Y_R0) of the RGB camera's physical coordinate system corresponds to (u_R0, v_R0) in the pixel coordinate system. Analogously, (X_h, Y_h) are the coordinates of a pixel in the physical coordinate system of the infrared camera, (u_h, v_h) its coordinates in the pixel coordinate system of the infrared camera, and (x_h, y_h, z_h) its coordinates in the camera coordinate system of the infrared camera; d_hx and d_hy are the actual dimensions of a pixel along the x_h and y_h axes, and s_hx and s_hy the sampling frequencies along the x_h and y_h axes; (X_h, Y_h), (u_h, v_h) and (x_h, y_h, z_h) are in one-to-one correspondence; the centre point (X_h0, Y_h0) of the infrared camera's physical coordinate system corresponds to (u_h0, v_h0) in the pixel coordinate system.
Step A12, obtain the relationship between the image physical coordinate system and the camera coordinate system for the RGB camera and for the infrared camera (the standard perspective projection):
X_R = f_R · x_R / z_R, Y_R = f_R · y_R / z_R; X_h = f_h · x_h / z_h, Y_h = f_h · y_h / z_h
where f_R is the focal length of the RGB camera and f_h is the focal length of the infrared camera.
Step A13, from steps A11 and A12, expressed in homogeneous form, obtain the intrinsic matrix M_R of the RGB camera and the intrinsic matrix M_h of the infrared camera:
M_R = [[f_Rx, 0, u_R0], [0, f_Ry, v_R0], [0, 0, 1]]; M_h = [[f_hx, 0, u_h0], [0, f_hy, v_h0], [0, 0, 1]]
where f_Rx, f_Ry, u_R0, v_R0 are the intrinsic parameters of the RGB camera, and f_hx, f_hy, u_h0, v_h0 are the intrinsic parameters of the infrared camera; f_Rx = f_R/d_Rx is the focal length of the RGB camera in pixels along the x_R axis, f_Ry = f_R/d_Ry its focal length in pixels along the y_R axis, and f_hx and f_hy are the corresponding pixel focal lengths of the infrared camera along the x_h and y_h axes.
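As an illustration (not part of the patent), the intrinsic matrix of step A13 and the projection it encodes can be sketched as follows; the focal lengths and principal point used here are made-up example values, not calibration results:

```python
import numpy as np

def intrinsic_matrix(fx, fy, u0, v0):
    """Build the 3x3 pinhole intrinsic matrix of step A13.

    fx, fy : focal lengths in pixels along the x and y axes
    u0, v0 : pixel coordinates of the principal point
    """
    return np.array([[fx, 0.0, u0],
                     [0.0, fy, v0],
                     [0.0, 0.0, 1.0]])

def project(M, point_cam):
    """Project a 3D point in the camera frame to pixel coordinates."""
    x, y, z = point_cam
    u, v, w = M @ np.array([x, y, z])
    return u / w, v / w

# Illustrative (not calibrated) parameters standing in for M_R
M_R = intrinsic_matrix(fx=800.0, fy=800.0, u0=320.0, v0=240.0)
u, v = project(M_R, (0.1, 0.2, 1.0))
```

The same construction applies to M_h with the infrared camera's parameters.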
In the present invention, the step A2 includes the following steps:
Step A21, from the installation positions of the infrared camera and the RGB camera, determine the relationship between the camera coordinate system of the infrared camera and that of the RGB camera:
[x_R, y_R, z_R, 1]^T = T_Rh · [x_h, y_h, z_h, 1]^T
where T_Rh denotes the extrinsic matrix between the infrared camera and the RGB camera.
Step A22, from steps A11 to A13 and step A21, determine the relationship between the pixel coordinate system of the RGB camera and that of the infrared camera.
Step A23, since the z_R axis of the RGB camera coordinate system is parallel to the z_h axis of the infrared camera coordinate system, take z_R = z_h; the pose transformation matrix H between the two pixel coordinate systems is then obtained.
In practical application, the pose transformation matrix H serves the following function: in subsequent image processing and posture recognition, it aligns the color image captured by the RGB camera with the infrared image captured by the infrared camera, so that the information corresponding to each pixel in the two images can be accurately matched.
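The alignment role described above can be sketched as follows. This is a hedged illustration, not the patent's implementation: the function name, the explicit depth argument `z` (using the z_R = z_h assumption of step A23), and the decomposition of T_Rh into a rotation `R` and translation `t` are assumptions made for the example:

```python
import numpy as np

def align_pixel(u_h, v_h, z, M_h, M_R, R, t):
    """Map an infrared-camera pixel to the RGB image (sketch of steps A22/A23).

    u_h, v_h : pixel coordinates in the infrared image
    z        : depth of the point along the optical axis (z_R = z_h assumed)
    M_h, M_R : intrinsic matrices of the infrared and RGB cameras
    R, t     : rotation and translation from the infrared frame to the
               RGB frame (together forming the extrinsic matrix T_Rh)
    """
    # Back-project the infrared pixel to a 3D point in the infrared frame
    p_h = z * np.linalg.inv(M_h) @ np.array([u_h, v_h, 1.0])
    # Move the point into the RGB camera frame
    p_R = R @ p_h + t
    # Project into the RGB image
    uvw = M_R @ p_R
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```

With identical intrinsics and identity extrinsics the mapping is, as expected, the identity on pixel coordinates.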
In practical application, the three coordinate systems commonly used with cameras are the image coordinate system, the camera coordinate system and the world coordinate system. The world coordinate system is the basis for locating the camera and other objects in space; its origin can be placed anywhere according to actual needs. The origin of the camera coordinate system is the focal point of the pinhole camera model, and its z axis is the optical axis of the camera. The image coordinate system is further divided into the image pixel coordinate system and the image physical coordinate system; the physical coordinate system is a two-dimensional rectangular coordinate system whose x and y axes are parallel to those of the camera coordinate system. The pixel coordinate system is a two-dimensional coordinate system in the image whose origin is generally the upper-left corner of the image. The transformations among the world coordinate system, the camera coordinate system, the image physical coordinate system and the pixel coordinate system are prior art and are not repeated here.
In the present invention, the step 3 specifically includes the following steps:
and 31, identifying two-dimensional coordinates of all human body joints of the transformer operation and maintenance personnel, connection relations among all the human body joints and confidence degrees of all the human body joints by using an Openphase gesture estimation system for the human body color image.
In practical application, OpenPose is an open-source pose estimation library developed at Carnegie Mellon University (CMU), based on convolutional neural networks and built on the Caffe framework; it is used to recognize body language and is not described further here.
Step 32, for each human body joint point with confidence below 0.2, search for an adjacent joint point with confidence above 0.8; set a virtual sphere whose radius is given by the distance between the low-confidence joint point and the adjacent high-confidence joint point: if the low-confidence joint point falls outside the virtual sphere, delete it; if it falls within the virtual sphere, retain it.
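A minimal sketch of the inside/outside-sphere test of step 32; the helper function is hypothetical (how the radius is derived follows the rule stated above, and is passed in here as a parameter):

```python
import math

def keep_joint(low_conf_pt, anchor_pt, radius):
    """Decide whether a low-confidence joint is retained (step 32 sketch).

    low_conf_pt : (x, y) of the joint point with confidence < 0.2
    anchor_pt   : (x, y) of the adjacent joint point with confidence > 0.8,
                  taken as the centre of the virtual sphere
    radius      : radius of the virtual sphere
    Returns True if the joint falls within the sphere (keep), False if it
    falls outside (delete).
    """
    dx = low_conf_pt[0] - anchor_pt[0]
    dy = low_conf_pt[1] - anchor_pt[1]
    return math.hypot(dx, dy) <= radius
```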
Step 33, use Kalman filtering to correct and predict the deleted joint points and the joint points lost through occlusion; then obtain the two-dimensional coordinates of the 18 human body joint points of the substation operation and maintenance personnel and the connection relations between them. Fig. 2 is a schematic structural diagram of the human body joint points according to the present invention. As shown in fig. 2, the 18 human body joint points are: nose joint point N1, shoulder-centre joint point N2, left shoulder joint point N3, left elbow joint point N4, left wrist joint point N5, right shoulder joint point N6, right elbow joint point N7, right wrist joint point N8, left crotch joint point N9, left knee joint point N10, left ankle joint point N11, right crotch joint point N12, right knee joint point N13, right ankle joint point N14, left eye joint point N15, right eye joint point N16, left ear joint point N17 and right ear joint point N18.
Step 34, determine the three-dimensional coordinates of the 18 human body joint points of the substation operation and maintenance personnel from the human body infrared image, which is aligned with the human body color image through the pose transformation matrix.
In the present invention, the step 4 specifically includes the following steps:
and step 41, obtaining 12 groups of joint angles through the rest 16 joints except for the left eye joint and the right eye joint in the 18 human joints. Fig. 3 is an angular schematic view of a human body joint according to the present invention. As shown in fig. 3, the 12 sets of joint angles include: left arm angle theta 1 Included angle theta of right arm 2 Included angle theta of left leg 3 Included angle θ of right leg 4 Included angle theta between left upper arm and left shoulder 5 Included angle θ between right upper arm and right shoulder 6 Included angle θ between left thigh and crotch 7 Included angle θ between right thigh and crotch 8 Included angle theta between left upper arm and spine 9 Included angle theta of right upper arm and spine 10 Included angle theta between left thigh and spine 11 Included angle of right thigh and spineθ 12 。
In practical application, the angle solving is the prior art, and the description is omitted here.
Step 42, take the distance from the nose joint point to the shoulder-centre joint point, d_head = sqrt((x1 − x2)² + (y1 − y2)² + (z1 − z2)²), as the actual head height, where (x1, y1, z1) are the three-dimensional coordinates of the nose joint point and (x2, y2, z2) are the three-dimensional coordinates of the shoulder-centre joint point.
Step 43, normalize the actual head height to obtain the scaling ratio k = d / d_head, where d denotes the standard head height and d = 20 cm.
Step 44, using the scaling ratio k, determine the distance between any two of the 18 joint points: L_ij = k · sqrt((x_i − x_j)² + (y_i − y_j)² + (z_i − z_j)²), where (x_i, y_i, z_i) are the three-dimensional coordinates of joint point N_i, (x_j, y_j, z_j) are the three-dimensional coordinates of joint point N_j, i and j are natural numbers with i = 1, 2, …, 18, j = 1, 2, …, 18 and i ≠ j, and N denotes a human body joint point.
Fig. 4 is a schematic diagram of the distances between the joint points according to the present invention. As shown in fig. 4, the distances between joint points include: head-to-neck distance L0, distance L1 from the head to the lower end of the spine, left shoulder to left ankle distance L2, right shoulder to right ankle distance L3, distance L4 between the left and right elbows, distance L5 between the left and right wrists, distance L6 between the left and right knees, and distance L7 between the left and right ankles.
Step 45, combine the joint angles determined in step 41 and the inter-joint distances determined in step 44 into the human body posture feature data.
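Steps 42 to 44 can be sketched in a few lines; the helper below is illustrative (the function name and argument order are assumptions), using the standard head height d = 20 cm from step 43:

```python
import math

def normalized_distance(p_i, p_j, nose, shoulder_center, d_standard=0.20):
    """Scaled distance between two joint points (sketch of steps 42-44).

    The actual head height (nose to shoulder-centre distance) yields the
    scaling ratio k = d_standard / head_height, which makes the feature
    data independent of the subject's distance from the camera.
    """
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    head_height = dist(nose, shoulder_center)   # step 42
    k = d_standard / head_height                # step 43
    return k * dist(p_i, p_j)                   # step 44
```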
In the present invention, the step 5 specifically includes the following:
Step 51, dividing the human body posture feature data obtained in step 45 into a human body posture feature training set and a human body posture feature verification set.
Step 52, constructing a CART decision tree according to the human body posture feature training set.
Step 53, verifying and evaluating the CART decision tree constructed in step 52 by adopting the human body posture feature verification set; analyzing the human body posture feature data by adopting a CART decision tree that satisfies the evaluation result, identifying the human body posture and the action corresponding to the human body posture, comparing the action corresponding to the human body posture with the pre-stored standard action postures of the power transformation operation and maintenance personnel, and determining the posture and behavior of the power transformation operation and maintenance personnel in the actual operation process.
In practical application, CART decision tree theory is prior art and is not described further here.
In the present invention, in step 53, the verification and evaluation of the CART decision tree constructed in step 52 by using the human body posture feature verification set specifically includes: randomly selecting 70% of the human body posture feature data to form the human body posture feature training set, with the remaining 30% forming the human body posture feature verification set; the CART decision tree constructed in step 52 is then verified and evaluated by the human body posture feature verification set.
In the present invention, in step 53, the verification and evaluation of the CART decision tree constructed in step 52 by using the human body posture feature verification set may alternatively include:
step B1, dividing the human body posture characteristic data into m sub-data sets; wherein m is a natural number.
Step B2, randomly selecting one sub-data set from the m sub-data sets as the human body posture feature verification set, and verifying and evaluating the CART decision tree constructed in step 52 to obtain an evaluation value.
Step B3, repeating step B2 until each of the remaining m−1 sub-data sets has been used individually as the human body posture feature verification set for verifying and evaluating the CART decision tree constructed in step 52, thereby obtaining a further (m−1) evaluation values.
Step B4, averaging the m evaluation values obtained in steps B2 to B3; the average value is the evaluation result of the CART decision tree constructed in step 52.
The two methods described above for verifying and evaluating the CART decision tree constructed in step 52 are parallel alternatives; in practical application, either one may be selected as needed. In practical application, the verification and evaluation metric is the action recognition accuracy: when the action recognition accuracy, i.e., the evaluation value or the average evaluation value, is greater than 90%, the CART decision tree is considered to satisfy the evaluation result.
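The cross-validation of steps B1–B4 can be sketched generically; here `train_and_score` stands in for building a CART decision tree on the training indices and returning its accuracy on the verification indices (the function names are illustrative assumptions; scikit-learn's `DecisionTreeClassifier`, an optimized CART implementation, could fill that role):

```python
import random

def k_fold_evaluate(data, m, train_and_score):
    """Steps B1-B4: divide the data into m sub-data sets, use each one in
    turn as the verification set, and average the m evaluation values."""
    idx = list(range(len(data)))
    random.shuffle(idx)
    folds = [idx[i::m] for i in range(m)]               # step B1: m sub-data sets
    scores = []
    for i in range(m):                                   # steps B2-B3
        verification = set(folds[i])
        training = [j for j in idx if j not in verification]
        scores.append(train_and_score(training, sorted(verification)))
    return sum(scores) / m                               # step B4: average evaluation value
```

Per the 90% criterion stated above, the tree would be taken to satisfy the evaluation result when the returned average accuracy exceeds 0.90.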
In the present invention, the step 6 specifically includes the following steps:
Step 61, labeling and segmenting more than 4000 scene infrared images and scene color images containing tools by using LabelMe, and obtaining the corresponding tool infrared data and tool RGB data after file conversion.
In practical application, the theory of LabelMe annotation, segmentation, and file conversion is prior art and is not described further here.
Step 62, combining the tool infrared data with the tool RGB data as tool feature data.
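One plausible reading of step 62 is a channel-wise combination of the aligned infrared and RGB data; the sketch below stacks them into a four-channel feature map (the concatenation choice and the function name are assumptions, as the patent does not specify the combination operation):

```python
import numpy as np

def combine_tool_features(ir_image, rgb_image):
    """Stack a single-channel infrared image onto an RGB image -> H x W x 4.
    Both inputs must already be pixel-aligned via the pose transformation matrix."""
    ir = np.asarray(ir_image, float)[..., None]    # H x W  ->  H x W x 1
    rgb = np.asarray(rgb_image, float)             # H x W x 3
    if ir.shape[:2] != rgb.shape[:2]:
        raise ValueError("infrared and RGB images must have the same resolution")
    return np.concatenate([rgb, ir], axis=-1)
```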
In the present invention, the step 10 specifically includes the following steps:
Step 101, for the possible two-dimensional coordinate ranges of the tool obtained in step 9, calculating the nearest distance between each possible two-dimensional coordinate range and the left wrist joint point or the right wrist joint point; comparing the nearest distances, the two-dimensional coordinate range corresponding to the minimum nearest distance being the actual coordinate range of the tool; at the same time, obtaining the tool color image.
And 102, comparing the actual coordinate range of the tool with a pre-stored standard shape of the tool to obtain two-dimensional coordinates of each point of the tool on the tool color image.
Step 103, obtaining the three-dimensional coordinates of each point of the tool according to the tool infrared image and the pose transformation matrix between it and the tool color image, namely, recognizing the pose of the tool.
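The wrist-proximity selection of step 101 can be sketched as follows (the axis-aligned-box representation of a "possible two-dimensional coordinate range" and the helper names are assumptions):

```python
import math

def nearest_distance_to_box(point, box):
    """Smallest Euclidean distance from a 2-D point to an axis-aligned box
    given as (x_min, y_min, x_max, y_max); 0 if the point is inside."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    dx = max(x_min - x, 0.0, x - x_max)
    dy = max(y_min - y, 0.0, y - y_max)
    return math.hypot(dx, dy)

def select_tool_region(candidate_boxes, left_wrist, right_wrist):
    """Step 101: the candidate range closest to either wrist joint point
    is taken as the actual coordinate range of the tool."""
    def score(box):
        return min(nearest_distance_to_box(left_wrist, box),
                   nearest_distance_to_box(right_wrist, box))
    return min(candidate_boxes, key=score)
```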
In summary, the above embodiments are only preferred embodiments of the present invention, and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for identifying the gesture and behavior of transformer operation and maintenance personnel and the gesture of a tool, characterized by comprising the following steps:
step 1, pre-storing standard action gestures and tool standard shapes of substation operation and maintenance personnel;
step 2, respectively acquiring human body infrared images and human body color images related to the postures of the transformer operation and maintenance personnel by adopting an RGB camera and an infrared camera; simultaneously, respectively acquiring a scene infrared image and a scene color image with tools;
step 3, acquiring two-dimensional human body joint points by adopting an OpenPose model; obtaining three-dimensional human body joint points according to the two-dimensional human body joint points and the human body infrared image and human body color image acquired in step 2;
step 4, extracting human body posture characteristic data formed by angles of all joints and distances among all joint points according to the human body joint points in the three-dimensional form obtained in the step 3;
step 5, constructing a CART decision tree according to the human body posture feature data; recognizing the human behavior category by the CART decision tree, and recognizing the gesture of the power transformation operation and maintenance personnel;
step 6, segmenting and converting the scene infrared image and the scene color image with the tool acquired in the step 2 to obtain tool feature data;
step 7, dividing the tool feature data into a tool feature training set and a tool feature verification set;
step 8, training a U-Net network model by adopting the tool feature training set; verifying the trained U-Net network model by adopting the tool feature verification set to obtain a trained U-Net model;
step 9, identifying the possible two-dimensional coordinate ranges of the tool by the trained U-Net model;
step 10, according to the shape features and position features of the tool, combining the tool infrared image and the tool color image obtained in step 5, and identifying the tool posture.
2. The method for identifying the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 1, wherein the steps after the step 1 and before the step 2 further comprise the following steps:
step A1, calibrating an RGB camera and an infrared camera to obtain the internal parameters of the RGB camera and the infrared camera;
step A2, acquiring an external parameter matrix and a pose transformation matrix between the RGB camera and the infrared camera according to step A1.
3. The method for recognizing the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 2, wherein the step A1 comprises the following steps:
step A11, according to the imaging principle and the definitions of the physical coordinate system and the pixel coordinate system, obtaining the relationship between the physical coordinate system and the pixel coordinate system of the RGB camera or the infrared camera, as follows:

u_R = X_R/d_Rx + u_R0 = s_Rx·X_R + u_R0,  v_R = Y_R/d_Ry + v_R0 = s_Ry·Y_R + v_R0;
u_h = X_h/d_hx + u_h0 = s_hx·X_h + u_h0,  v_h = Y_h/d_hy + v_h0 = s_hy·Y_h + v_h0;

wherein (X_R, Y_R) represents the coordinates of any pixel point of the image in the physical coordinate system of the RGB camera, (u_R, v_R) represents the coordinates of that pixel point in the pixel coordinate system of the RGB camera, and (x_R, y_R, z_R) represents its coordinates in the camera coordinate system of the RGB camera; d_Rx and d_Ry respectively represent the actual dimensions of a pixel in the x_R-axis and y_R-axis directions of the RGB camera, and s_Rx = 1/d_Rx and s_Ry = 1/d_Ry respectively represent the sampling frequencies in the x_R-axis and y_R-axis directions; (X_R, Y_R), (u_R, v_R) and (x_R, y_R, z_R) are in one-to-one correspondence; the center point (X_R0, Y_R0) of the physical coordinate system of the RGB camera corresponds to the coordinates (u_R0, v_R0) in the pixel coordinate system. Likewise, (X_h, Y_h) represents the coordinates of any pixel point of the image in the physical coordinate system of the infrared camera, (u_h, v_h) its coordinates in the pixel coordinate system of the infrared camera, and (x_h, y_h, z_h) its coordinates in the camera coordinate system of the infrared camera; d_hx and d_hy represent the actual pixel dimensions and s_hx = 1/d_hx and s_hy = 1/d_hy the sampling frequencies in the x_h-axis and y_h-axis directions; (X_h, Y_h), (u_h, v_h) and (x_h, y_h, z_h) are in one-to-one correspondence; the center point (X_h0, Y_h0) of the physical coordinate system of the infrared camera corresponds to the coordinates (u_h0, v_h0) in the pixel coordinate system;
step A12, respectively obtaining the relationship between the image physical coordinate system and the camera coordinate system for the RGB camera and for the infrared camera, as follows:

X_R = f_R·x_R/z_R,  Y_R = f_R·y_R/z_R;
X_h = f_h·x_h/z_h,  Y_h = f_h·y_h/z_h;

wherein f_R is the camera focal length of the RGB camera and f_h is the camera focal length of the infrared camera;
step A13, obtaining, according to steps A11 and A12 and the homogeneous-equation formulation, the internal reference matrix M_R of the RGB camera and the internal reference matrix M_h of the infrared camera:

M_R = [f_Rx, 0, u_R0; 0, f_Ry, v_R0; 0, 0, 1],  M_h = [f_hx, 0, u_h0; 0, f_hy, v_h0; 0, 0, 1];

wherein f_Rx, f_Ry, u_R0, v_R0 are the internal parameters of the RGB camera and f_hx, f_hy, u_h0, v_h0 are the internal parameters of the infrared camera; f_Rx = f_R/d_Rx represents the focal length of the RGB camera in the x_R-axis direction, f_Ry = f_R/d_Ry represents the focal length of the RGB camera in the y_R-axis direction, f_hx = f_h/d_hx represents the focal length of the infrared camera in the x_h-axis direction, and f_hy = f_h/d_hy represents the focal length of the infrared camera in the y_h-axis direction.
4. The method for recognizing gesture and behavior of power transformation operation and maintenance personnel and gesture of tool according to claim 3, wherein the step A2 comprises the following steps:
step A21, determining, according to the installation positions of the infrared camera and the RGB camera, the relationship between the camera coordinate system of the infrared camera and the camera coordinate system of the RGB camera:

[x_R, y_R, z_R, 1]ᵀ = T_Rh·[x_h, y_h, z_h, 1]ᵀ;

wherein T_Rh represents the external parameter matrix between the infrared camera and the RGB camera;
step A22, determining, according to steps A11 to A13 and step A21, the relationship between the pixel coordinate system of the RGB camera and the pixel coordinate system of the infrared camera, as follows:

z_R·[u_R, v_R, 1]ᵀ = M_R·T_Rh·M_h⁻¹·z_h·[u_h, v_h, 1]ᵀ;

wherein M_h⁻¹ denotes the inverse matrix of the internal reference matrix M_h of the infrared camera;
step A23, since the z_R axis of the camera coordinate system of the RGB camera is parallel to the z_h axis of the camera coordinate system of the infrared camera, taking z_R = z_h; the following is then obtained:

[u_R, v_R, 1]ᵀ = M_R·T_Rh·M_h⁻¹·[u_h, v_h, 1]ᵀ.
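The pixel-level relationship between the two cameras used in steps A21–A23 can be sketched numerically (a minimal illustration; the decomposition of the external parameter matrix into a rotation R and translation t, and all numeric values, are assumptions):

```python
import numpy as np

def ir_pixel_to_rgb_pixel(u_h, v_h, z_h, M_h, M_R, R, t):
    """Map an infrared pixel (u_h, v_h) with depth z_h into the RGB image:
    back-project with M_h^-1, transform by the extrinsics (R, t), re-project with M_R."""
    p_h = z_h * np.linalg.inv(M_h) @ np.array([u_h, v_h, 1.0])  # infrared camera coords
    p_R = R @ p_h + t                                           # RGB camera coords
    uvw = M_R @ p_R                                             # homogeneous RGB pixel
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```

With identical intrinsics and an identity extrinsic transform, the mapping reduces to the identity on pixels, which is a convenient sanity check.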
5. The method for recognizing the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 4, wherein the step 3 specifically comprises the following steps:
step 31, identifying, from the human body color image by using the OpenPose pose estimation system, the two-dimensional coordinates of all human body joint points of the power transformation operation and maintenance personnel, the connection relations among the human body joint points, and the confidence of each human body joint point;
step 32, for a human body joint point with confidence smaller than 0.2, searching for an adjacent human body joint point with confidence greater than 0.8; setting a virtual sphere whose radius is determined by the joint point with confidence smaller than 0.2 and the adjacent joint point with confidence greater than 0.8: if the human body joint point falls outside the virtual sphere, deleting the human body joint point; if the human body joint point falls within the virtual sphere, retaining the human body joint point;
step 33, correcting and predicting the deleted human body joint points and the joint points lost due to shielding by adopting Kalman filtering; then, obtaining the connection relation between the two-dimensional coordinates of the 18 human body joint points of the power transformation operation and maintenance personnel and each human body joint point in the 18 human body joint points; the 18 human body articulation points include: a nose joint point (N1), a shoulder center joint point (N2), a left shoulder joint point, a left elbow joint point, a left wrist joint point, a right shoulder joint point, a right elbow joint point, a right wrist joint point, a left crotch joint point, a left knee joint point, a left ankle joint point, a right crotch joint point, a right knee joint point, a right ankle joint point, a left eye joint point, a right eye joint point, a left ear joint point and a right ear joint point;
step 34, determining the three-dimensional coordinates of the 18 human body joint points of the power transformation operation and maintenance personnel according to the human body infrared image and the pose transformation matrix between it and the human body color image.
6. The method for recognizing the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 4, wherein the step 4 specifically comprises the following steps:
step 41, obtaining 12 groups of joint angles from the 16 joints remaining after the left-eye joint point and the right-eye joint point are excluded from the 18 human body joint points: the left arm included angle (θ₁), the right arm included angle (θ₂), the left leg included angle (θ₃), the right leg included angle (θ₄), the included angle between the left upper arm and the left shoulder (θ₅), the included angle between the right upper arm and the right shoulder (θ₆), the included angle between the left thigh and the crotch (θ₇), the included angle between the right thigh and the crotch (θ₈), the included angle between the left upper arm and the spine (θ₉), the included angle between the right upper arm and the spine (θ₁₀), the included angle between the left thigh and the spine (θ₁₁), and the included angle between the right thigh and the spine (θ₁₂);
step 42, taking the distance √((x₁−x₂)² + (y₁−y₂)² + (z₁−z₂)²) from the nose joint point to the joint point at the center of the two shoulders as the actual head height; wherein (x₁, y₁, z₁) are the three-dimensional coordinates of the nose joint point and (x₂, y₂, z₂) are the three-dimensional coordinates of the joint point at the center of the two shoulders;
step 43, normalizing the actual head height to obtain the scaling ratio k = d/h; wherein h denotes the actual head height obtained in step 42, and d represents the standard head height, d = 20 cm;
step 44, determining, according to the scaling ratio k, the distance between any 2 of the 18 human body joint points: Lᵢⱼ = k·√((xᵢ−xⱼ)² + (yᵢ−yⱼ)² + (zᵢ−zⱼ)²); wherein (xᵢ, yᵢ, zᵢ) are the three-dimensional coordinates of the Nᵢ-th joint point, (xⱼ, yⱼ, zⱼ) are the three-dimensional coordinates of the Nⱼ-th joint point, i and j are natural numbers with i = 1, 2, …, 18, j = 1, 2, …, 18, and i ≠ j; N denotes a human body joint point.
Step 45, combining the joint angles determined in step 41 and the distances between the joint points determined in step 44 together as human body posture feature data.
7. The method for recognizing the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 6, wherein the step 5 specifically comprises the following steps:
step 51, dividing the human body posture feature data obtained in step 45 into a human body posture feature training set and a human body posture feature verification set;
step 52, constructing a CART decision tree according to the human body posture feature training set;
step 53, verifying and evaluating the CART decision tree constructed in step 52 by adopting the human body posture feature verification set; analyzing the human body posture feature data by adopting a CART decision tree that satisfies the evaluation result, identifying the human body posture and the action corresponding to the human body posture, comparing the action corresponding to the human body posture with the pre-stored standard action postures of the power transformation operation and maintenance personnel, and determining the posture and behavior of the power transformation operation and maintenance personnel in the actual operation process.
8. The method for identifying the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 7, wherein in step 53, the verification and evaluation of the CART decision tree constructed in step 52 by using the human body posture feature verification set specifically comprises: randomly selecting 70% of the human body posture feature data to form the human body posture feature training set, with the remaining 30% forming the human body posture feature verification set; the CART decision tree constructed in step 52 is then verified and evaluated by the human body posture feature verification set.
9. The method for identifying the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 7, wherein in step 53, the verification and evaluation of the CART decision tree constructed in step 52 by using the human body posture feature verification set specifically comprises:
step B1, dividing the human body posture characteristic data into m sub-data sets; wherein m is a natural number;
step B2, randomly selecting one sub-data set from the m sub-data sets as the human body posture feature verification set, and verifying and evaluating the CART decision tree constructed in step 52 to obtain an evaluation value;
step B3, repeating step B2 until each of the remaining m−1 sub-data sets has been used individually as the human body posture feature verification set for verifying and evaluating the CART decision tree constructed in step 52, thereby obtaining a further (m−1) evaluation values;
step B4, averaging the m evaluation values obtained in steps B2 to B3; the average value is the evaluation result of the CART decision tree constructed in step 52.
10. The method for recognizing the gesture and behavior of the transformer operation and maintenance personnel and the gesture of the tool according to claim 1, wherein the step 6 specifically comprises the following steps:
step 61, labeling and segmenting more than 4000 scene infrared images and scene color images containing tools by using LabelMe, and obtaining the corresponding tool infrared data and tool RGB data after file conversion;
step 62, combining the tool infrared data with the tool RGB data as tool feature data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211645542.5A CN116129467A (en) | 2022-12-16 | 2022-12-16 | Method for identifying gesture and behavior of transformer operation and maintenance personnel and gesture of tool |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211645542.5A CN116129467A (en) | 2022-12-16 | 2022-12-16 | Method for identifying gesture and behavior of transformer operation and maintenance personnel and gesture of tool |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116129467A true CN116129467A (en) | 2023-05-16 |
Family
ID=86296545
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211645542.5A Pending CN116129467A (en) | 2022-12-16 | 2022-12-16 | Method for identifying gesture and behavior of transformer operation and maintenance personnel and gesture of tool |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116129467A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117830961A (en) * | 2024-03-06 | 2024-04-05 | 山东达斯特信息技术有限公司 | Environment-friendly equipment operation and maintenance behavior analysis method and system based on image analysis |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117830961A (en) * | 2024-03-06 | 2024-04-05 | 山东达斯特信息技术有限公司 | Environment-friendly equipment operation and maintenance behavior analysis method and system based on image analysis |
CN117830961B (en) * | 2024-03-06 | 2024-05-10 | 山东达斯特信息技术有限公司 | Environment-friendly equipment operation and maintenance behavior analysis method and system based on image analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222665B (en) | Human body action recognition method in monitoring based on deep learning and attitude estimation | |
CN109919141A (en) | A kind of recognition methods again of the pedestrian based on skeleton pose | |
WO2019120108A1 (en) | Image coding method, action recognition method, and computer device | |
CN110135277B (en) | Human behavior recognition method based on convolutional neural network | |
JP2019096113A (en) | Processing device, method and program relating to keypoint data | |
CN112435731B (en) | Method for judging whether real-time gesture meets preset rules | |
CN113077519B (en) | Multi-phase external parameter automatic calibration method based on human skeleton extraction | |
CN112215172A (en) | Human body prone position three-dimensional posture estimation method fusing color image and depth information | |
CN111145865A (en) | Vision-based hand fine motion training guidance system and method | |
CN113191243B (en) | Human hand three-dimensional attitude estimation model establishment method based on camera distance and application thereof | |
CN116129467A (en) | Method for identifying gesture and behavior of transformer operation and maintenance personnel and gesture of tool | |
CN108830861A (en) | A kind of hybrid optical motion capture method and system | |
JP2008065368A (en) | System for recognizing position and posture of object using stereoscopic image, method of recognizing position and posture of object, and program for executing method | |
CN112184898A (en) | Digital human body modeling method based on motion recognition | |
CN116524586A (en) | Dance scoring algorithm based on CNN and GCN gesture estimation and similarity matching | |
CN112183316B (en) | Athlete human body posture measuring method | |
CN114170686A (en) | Elbow bending behavior detection method based on human body key points | |
CN113435293A (en) | Human body posture estimation method based on joint relation | |
CN113011344A (en) | Pull-up quantity calculation method based on machine vision | |
CN110163112B (en) | Examinee posture segmentation and smoothing method | |
CN116749168A (en) | Rehabilitation track acquisition method based on gesture teaching | |
CN111241936A (en) | Human body posture estimation method based on depth and color image feature fusion | |
CN116030135A (en) | Real-time attitude measurement system in remote operation | |
CN116386137A (en) | Mobile terminal design method for lightweight recognition of Taiji boxing | |
CN114494341A (en) | Real-time completion method for optical motion capture mark points by fusing time-space constraints |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||