CN112454365A - Human behavior recognition technology-based human-computer interaction safety monitoring system - Google Patents

Human behavior recognition technology-based human-computer interaction safety monitoring system

Info

Publication number
CN112454365A
CN112454365A (application CN202011396870.7A)
Authority
CN
China
Prior art keywords
behavior
value
robot arm
human
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011396870.7A
Other languages
Chinese (zh)
Inventor
涂宏斌
李杰
高晓飞
段军
丁莉
聂芳华
杜变霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Great Wall Science And Technology Information Co ltd
Original Assignee
Hunan Great Wall Science And Technology Information Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Great Wall Science And Technology Information Co ltd filed Critical Hunan Great Wall Science And Technology Information Co ltd
Priority to CN202011396870.7A
Publication of CN112454365A
Legal status: Pending

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J13/00Controls for manipulators
    • B25J13/08Controls for manipulators by means of sensing devices, e.g. viewing or touching devices
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1674Programme controls characterised by safety, monitoring, diagnostic

Abstract

The invention provides a human-computer interaction safety monitoring system based on human body behavior recognition technology. The system intelligently recognizes the actions of the robot arm through an intelligent monitoring system and realizes human-computer interaction through the robot control system, and specifically comprises the following steps: step S1, establishing a robot arm behavior recognition system; step S2, recognizing the behaviors of the robot arm and the operator; and step S3, inferring the behavior category of the other party from the recognized behaviors of the robot arm and the operator by a case-based reasoning method that combines the association between preceding and subsequent actions, thereby realizing human-computer interaction. According to the invention, cameras are arranged at key positions of the robot equipment to monitor the locations where human-machine interaction occurs, and the intelligent video monitoring system can recognize the actions of the robot arm, so that human-machine interaction is realized.

Description

Human behavior recognition technology-based human-computer interaction safety monitoring system
Technical Field
The invention relates to the technical field of human-computer interaction monitoring, in particular to a human-computer interaction safety monitoring system based on a human body behavior recognition technology.
Background
At present, in factory workshops, machine equipment and operators are not covered by intelligent monitoring and alarm systems, and the safety of operation depends entirely on the proficiency and personal protection of the operators and on passive safety devices installed on the machines, such as limit switches and light curtains. As a result, operators are often injured before any alarm can be raised.
Disclosure of Invention
The invention provides a human-computer interaction safety monitoring system based on human body behavior recognition technology, and aims to solve the technical problem, noted in the background above, that the degree of human-computer interaction in factory monitoring is not high.
In order to achieve the above object, the human-computer interaction safety monitoring system based on human behavior recognition technology provided in the embodiments of the present invention intelligently recognizes the actions of the robot arm through the intelligent monitoring system and realizes human-computer interaction through the robot control system. The method specifically includes the following steps:
step S1, establishing a robot arm behavior recognition system: acquiring two parallax images of the robot arm through a binocular camera, and calculating the spatial three-dimensional coordinates and geometric information of the robot arm from the acquired parallax images, so as to reconstruct the three-dimensional shape and position of the monitored part of the robot arm;
step S2, recognizing the behaviors of the robot arm and the operator: capturing the motion attitude of the robot arm and the behavior of the operator within a certain range of the station through cameras installed at suitable stations, and computing the recognition results for the robot arm's motion attitude and the operator's behavior and actions;
and step S3, inferring the behavior category of the other party from the recognized behaviors of the robot arm and the operator by a case-based reasoning method that combines the association between preceding and subsequent actions, thereby realizing human-computer interaction.
Preferably, in step S1, calculating the spatial three-dimensional coordinates and geometric information of the robot arm specifically includes the following steps:
step S11, for a point P(X, Y, Z) on the imaged object, the imaging point coordinate on the left camera is L(x_L, y_L) and the imaging point coordinate on the right camera is R(x_R, y_R);
step S12, from the focal lengths f_L and f_R of the left and right cameras, obtaining the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera;
step S13, rotating and translating the right-camera coordinate values (x_R, y_R, z_R) to obtain the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values;
step S14, obtaining the coordinate values (X, Y, Z) of the imaged object.
Preferably, in step S12, the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera, written as the vector [X_R Y_R Z_R]^T, are obtained from a model of the camera perspective transformation:
(perspective transformation model: formula image in original)
where f_L is the focal length of the left camera, f_R is the focal length of the right camera, k_L is the scale factor of the left camera, and k_R is the scale factor of the right camera;
step S13 is specifically as follows:
the rotation transformation matrix R between the coordinate system (X, Y, Z) and the right camera coordinate system is:
R = R(α)·R(β)·R(γ);
(expanded rotation matrix: formula image in original)
where α, β and γ are the rotation angles about the x, y and z axes, respectively;
the translation transformation vector T between the coordinate system (X', Y', Z') and the right camera coordinate system is:
[T] = [t_x t_y t_z]^T;
the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values is computed by:
(formula image in original);
obtaining the coordinate values (X, Y, Z) of the imaged object in step S14 is specifically as follows:
a substitution is introduced (formula image in original), so that (formula image in original); combining this with the formula of step S12 gives (formula image in original);
the elements r_1 to r_9 are determined by the rotation transformation matrix R, and the three-dimensional coordinates of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') are obtained from the correspondence:
X = Z·x_L/f_L;  Y = Z·y_L/f_L;
and Z is given by the corresponding formula (formula image in original).
Preferably, in step S2, computing the recognition results for the robot arm motion attitude and the operator behavior specifically includes the following steps:
step S21, acquiring the space-time feature points of the moving parts of the robot arm and the operator;
step S22, converting the human body behavior in each frame into corresponding word-packet information, i.e. converting the obtained space-time feature points of each key point of the human body into word packets; then establishing a word book with an improved k-means clustering algorithm, that is, grouping behavior-action codes with high or identical similarity into one class, which generates the word book;
and step S23, recognizing and classifying the human body behavior with an improved PLSA algorithm.
Preferably, step S21 is specifically as follows:
for a spatial-domain image f^sp, given an observation scale and a smoothing scale (formula images in original; sp denotes the spatial domain), the image interest points can be found by the Harris corner detection method:
(formula image in original)
where the two Gaussian derivative functions appearing in it are defined by the corresponding formula images: one is the derivative of the smoothed image with respect to x and the other the derivative with respect to y, and g^sp(x, y; δ²) and μ^sp are the smoothing window function and the observation window function, respectively (formula image in original);
μ^sp = det(μ^sp) − k·trac²(μ^sp) = λ1λ2 − k(λ1 + λ2)²
where δ is the correlation coefficient between pixel points, det is the determinant expression in the Harris corner detection method, trac is the trace of the μ matrix, and λ1 and λ2 are the eigenvalues through which the determinant det(μ^sp) is computed;
when the image intensity value of a pixel point in the video image changes markedly in both the spatial domain and the temporal domain, i.e. at a space-time feature point, a function f: R² × R → R is defined and convolved with a Gaussian kernel function; the result is the scale space L, defined as L: R² × R × R+ → R:
(formula image in original)
where δ is the correlation coefficient between pixel points and τ is the time coefficient of the same pixel point in the scene, a specific time in seconds;
g is the Gaussian kernel function:
(formula image in original)
where δ_l is the correlation coefficient between pixels, τ_l is the time coefficient with which the same pixel appears in the scene, and t is the time at which the pixel appears; generalizing the 2 × 2 matrix of the spatial domain to a 3 × 3 matrix that also contains the time derivative gives:
(formula image in original)
where the convolution terms and the derivatives L_x, L_y and L_t are as shown in the corresponding formula images, the smoothing scales of the function in the spatial domain and the temporal domain are the two scales shown there, and these blob points are detected by the Harris method;
to model the space-time image sequence, the functions f and L are used and the following function is defined:
(formula image in original)
the space-time function with a Gaussian kernel is:
(formula images in original)
where the fused scales are as shown in the corresponding formula images;
finally, an extended Harris function is defined, whose maxima are the space-time feature points:
H = det(μ) − k·trace³(μ) = λ1λ2λ3 − k(λ1 + λ2 + λ3)³
where λ3 is the third eigenvalue of the μ matrix.
Preferably, step S22 specifically includes the following steps:
step S221, establishing the model definitions of the word packet and the word book:
parameter definitions: D = {d_1, d_2, ..., d_N} denotes the set of constructed word packets and W = {w_1, w_2, ..., w_M} denotes the word book; D and W are combined into an N × M co-occurrence matrix N, whose entries record the number of times w_j appears in d_i, as shown in the following formula:
(formula image in original)
where d_i is the i-th word packet and w_j the j-th word; in the matrix N, the rows give the number of times each word appears in a given word packet, and the columns give the number of times a given word appears in each document; a topic variable z is defined, p(d_i) is the probability that a word appears in the i-th document, p(w_j|z_k) is the probability of w_j occurring under topic z_k, and p(z_k|d_i) is the probability of topic z_k occurring in the i-th document;
here P(z|d) is the probability distribution of topics z for a given document d, and P(w|z) is the probability distribution of words w for a given topic z;
step S222, classifying the space-time feature points of the robot arm and of the operator's joint points with the model:
the category d of the robot arm behavior and the operator behavior can be obtained from clustering in the video image q, and the N_v visual words are represented as an N_v-dimensional vector d(q):
(formula image in original)
where n(p, v_i) is the number of visual words v_i contained in the video image q representing the human body behavior d;
the set of space-time feature points of the robot arm and operator joint points captured by the camera is X = {x_1, ..., x_N}, n = 1, ..., N (formula image in original); Y = {y_1, ..., y_N}, i = 1, ..., N (formula image in original) is the dimension-reduced version of the set X, i.e. the set with the noise points removed, and the cluster categories are as given in the corresponding formula image;
step S223, building the word list by a k-means clustering algorithm:
(formula image in original)
where k is the number of clusters and r_ij ∈ {0,1} is a label number: if y_i belongs to class j then r_ij = 1, otherwise r_ij = 0;
(formula image in original)
where the weight terms are the weights after dimension reduction, the expectation term is the probability expectation value of image x at time t and frame i, and K is the size of the word list; the set X = {x_1, ..., x_N} (formula image in original) is divided into k classes, and an intra-cluster variance is defined:
(formula image in original)
the weight ε_w of the intra-cluster variance is defined as:
(formula images in original), 0 ≤ p < 1
where the exponent p represents the sensitivity of the weight update between classes; an empirical value of 0.7 is chosen, giving the following formula:
(formula images in original), 0 ≤ p < 1
the weights are first fixed to find the new clusters, and m_k is then computed:
(formula image in original)
when w_k increases, m_k is computed as:
(formula image in original)
where σ_ij is the correlation coefficient between the i-th and j-th frame images, v_k is ξ_i, the difference between the i-th frame image x and the i-th frame probability expectation value μ, and the remaining term is the difference between the j-th frame image x and the probability expectation value μ;
to enhance and improve the stability of the k-means algorithm, the weight update is defined as:
(formula image in original), 0 ≤ β ≤ 1
where the term shown is the difference between the j-th frame image x and the probability expectation value μ; during the successive iterative computation, the β parameter controls the weight update and smooths the weight values over consecutive iterations;
given the weight matrix w_k, the following formula is obtained:
(formula image in original)
where W is the sum of the weight matrices w_k, X is the image matrix, C is the cluster category (formula image in original), and T denotes matrix transposition;
the minimum of J gives the word book, i.e. the coded classification of behavior actions with high or identical similarity.
Preferably, step S23 specifically includes the following steps:
a joint distribution model and a posterior probability formula are defined:
p(w, d, z) = p(w|z, z_0)·p(z, z_0|d)·p(d)
(posterior probability formula: formula image in original)
in order to find the maximum of the log-likelihood function in the EM steps of the expectation-maximization algorithm, the estimates in the EM steps are:
(formula images in original)
the parameters in the above formulas are regularized by setting appropriate Lagrange multipliers α and β, giving:
(formula images in original)
Bayesian estimation is then used to compute p(z|w, d) and p(z_0|w, d), with the following specific calculation steps:
(formula images in original)
the finally obtained p(z_0|w, d) is a probability value, and this probability value is the result of the action recognition.
Preferably, step S3 specifically includes the following steps:
step S31, defining the interactive behavior Case between the robot arm to be recognized and the operator, and dividing the Case into 3 parts: the human-computer interaction sub-behavior already determined in the previous frame, the basic sub-behavior of the current frame, and the time characteristics of the basic related sub-behaviors:
Case = {{A_prior}, {B_current}, {B_related}} = {{A1, A2, ..., An}, {B1, B2, B3, ..., Bn}, {t1, t2, ..., tn}};
where {A_prior} identifies the behavior of the previous frame, denoted as the set A1 ... An;
{B_current} is the current behavior to be identified; {B_related} is the behavior identified by the behavior recognition algorithm above as similar to the behavior already identified for A;
B is denoted as the set B1 ... Bn;
t1 ... tn is the corresponding time series;
and step S32, using a case-based reasoning mechanism, judging the current human-computer interaction behavior from the previous human-computer interaction actions according to the behavior category determined in the previous frame.
Preferably, step S32 specifically includes the following steps:
step S321, computing the similarity attribute features of the human-computer interaction behaviors in the two consecutive frames: the one-dimensional vector formed by the interaction behaviors of the robot arm and the operator in the previous frame image is defined as {A_prior} and the current behavior part as {B_current}; the similarity between the current behavior and the behavior determined in the previous frame is then computed:
Sim(A, B) = Σ_{i=1}^{n} w_i × sim(A_i, B_i)
where n is the number of features in each case base, sim(A, B) compares the case of the known, determined behavior with the ambiguous behavior to be tested, and w_i is the associated weight of case A with case base B; the weight is the frequency with which case base A occurs when behavior B occurs, computed as w_i = B_i/A;
step S322, computing the time attribute features related to the behaviors: the time series set of the basic sub-behaviors, ordered by the preceding and following frame numbers, is B_duration; the total duration of each simple sub-behavior occurring in the vector B is {T1, T2, ..., TN}, where T_i is the total duration of the i-th sub-behavior corresponding to the vector B_duration;
the specific algorithm is as follows:
for the collected video sequence, compute the two attribute features of each simple human behavior;
compute the comprehensive similarity sim(A, B);
when the maximum similarity is greater than a certain threshold T, i.e. sim(A, B) > T, the current interactive behavior is judged to belong to the same interactive behavior; the threshold is determined from field-test experience values, and the case is added to the case library so that the case library is updated in real time for classifying subsequent human-computer interaction behaviors; the result is then sent to the operator so that the corresponding interaction is carried out.
The technical effects that can be achieved by the invention are as follows: cameras are arranged at key positions of the machine equipment to monitor the locations where human-machine interaction occurs; when the camera captures the machine while, for example, an operator is handling materials, the intelligent video monitoring system can recognize the actions of the robot arm, so that human-machine interaction is realized. When the movement of the robot arm endangers the operator, the robot automatically raises an alarm and performs emergency braking.
Compared with traditional methods, the greatest advantage of this intelligent monitoring approach is that it realizes a contactless, wide-range, long-distance intelligent monitoring technology, supports repeated behavior interaction between the robot and the operators in the monitored area as well as emergency braking on dangerous behavior, can be used for human-computer interaction monitoring in an autonomously controllable intelligent factory, and ensures the safety of on-line operators while improving production efficiency.
Drawings
FIG. 1 is a schematic structural diagram of a human-computer interaction safety monitoring system based on human behavior recognition technology according to the present invention;
FIG. 2 is an installation schematic diagram and an imaging schematic diagram of the binocular camera device of the human behavior recognition technology-based human-computer interaction safety monitoring system of the present invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
The invention provides a human-computer interaction safety monitoring system based on human behavior recognition technology. As shown in FIG. 1, the intelligent monitoring system intelligently recognizes the actions of the robot arm and realizes human-computer interaction through the robot control system; the method specifically comprises the following steps:
step S1, establishing a robot arm behavior recognition system: acquiring two parallax images of the robot arm through a binocular camera, and calculating the spatial three-dimensional coordinates and geometric information of the robot arm from the acquired parallax images, so as to reconstruct the three-dimensional shape and position of the monitored part of the robot arm;
step S2, recognizing the behaviors of the robot arm and the operator: capturing the motion attitude of the robot arm and the behavior of the operator within a certain range of the station through cameras installed at suitable stations, and computing the recognition results for the robot arm's motion attitude and the operator's behavior and actions;
and step S3, inferring the behavior category of the other party from the recognized behaviors of the robot arm and the operator by a case-based reasoning method that combines the association between preceding and subsequent actions, thereby realizing human-computer interaction.
In step S1, the behaviors of the robot arm in the operation area, such as turning, grabbing and lifting, are identified. Because of the actual scene, there may be human-machine, human-human and machine-machine occlusion, so behavior recognition under occlusion must be used to recognize this type of behavior.
The invention installs a binocular camera or multiple cameras at suitable stations; for imaging, the binocular camera uses two cameras with identical performance parameters to capture the operators and the equipment. Taking the binocular case as an example, two parallax images of the robot arm are acquired from two different angles, and the spatial three-dimensional coordinates and geometric information of the robot arm can be calculated from the acquired parallax images, so that the three-dimensional shape and position of the monitored part of the robot arm are reconstructed.
Binocular camera imaging uses two cameras with identical performance parameters arranged in a left-right configuration. As shown in FIG. 2, O_L is the optical center of the left camera and O_R is the optical center of the right camera. By establishing a three-dimensional coordinate system on the object under the action of the light source (an indoor or outdoor light source), a point P(X, Y, Z) on the object can be taken; connecting the optical centers of the left and right cameras with the object point P respectively, each connecting line must intersect the XY plane of the corresponding camera. Under camera imaging, the imaging point coordinate of point P on the left camera is L(x_L, y_L) and the imaging point coordinate on the right camera is R(x_R, y_R). The two imaging points are images of the object point P from different viewing angles.
In step S1, calculating the spatial three-dimensional coordinates and geometric information of the robot arm specifically includes the following steps:
step S11, for a point P(X, Y, Z) on the imaged object, the imaging point coordinate on the left camera is L(x_L, y_L) and the imaging point coordinate on the right camera is R(x_R, y_R);
step S12, from the focal lengths f_L and f_R of the left and right cameras, obtaining the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera;
step S13, rotating and translating the right-camera coordinate values (x_R, y_R, z_R) to obtain the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values;
step S14, obtaining the coordinate values (X, Y, Z) of the imaged object.
In step S12, the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera, written as the vector [X_R Y_R Z_R]^T, are obtained from a model of the camera perspective transformation:
(perspective transformation model: formula image in original)
where f_L is the focal length of the left camera, f_R is the focal length of the right camera, k_L is the scale factor of the left camera, and k_R is the scale factor of the right camera;
step S13 is specifically as follows:
the rotation transformation matrix R between the coordinate system (X, Y, Z) and the right camera coordinate system is:
R = R(α)·R(β)·R(γ);
(expanded rotation matrix: formula image in original)
where α, β and γ are the rotation angles about the x, y and z axes, respectively;
the translation transformation vector T between the coordinate system (X', Y', Z') and the right camera coordinate system is:
[T] = [t_x t_y t_z]^T;
the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values is computed by:
(formula image in original);
obtaining the coordinate values (X, Y, Z) of the imaged object in step S14 is specifically as follows:
a substitution is introduced (formula image in original), so that (formula image in original); combining this with the formula of step S12 gives (formula image in original);
the elements r_1 to r_9 are determined by the rotation transformation matrix R, and the three-dimensional coordinates of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') are obtained from the correspondence:
X = Z·x_L/f_L;  Y = Z·y_L/f_L;
and Z is given by the corresponding formula (formula image in original).
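To make the geometry of steps S11 to S14 concrete, the following Python sketch triangulates a single point under the simplifying assumption of rectified cameras (rotation matrix R equal to the identity, translation T = [b, 0, 0] with baseline b, and equal focal lengths); the function name and the numerical values are illustrative only and are not part of the original disclosure.

```python
import numpy as np

def triangulate_rectified(x_l, y_l, x_r, f, baseline):
    """Recover (X, Y, Z) for a point seen at (x_l, y_l) in the left image
    and at x_r (same row) in the right image, assuming rectified cameras
    with focal length f (pixels) and horizontal baseline b (metres)."""
    disparity = x_l - x_r            # horizontal shift between the two views
    if abs(disparity) < 1e-9:
        raise ValueError("zero disparity: point is at infinity")
    Z = f * baseline / disparity     # depth from disparity
    X = Z * x_l / f                  # X = Z * x_L / f_L, as in step S14
    Y = Z * y_l / f                  # Y = Z * y_L / f_L
    return np.array([X, Y, Z])

# Example: a point imaged at (120, 35) px in the left view and 100 px in the
# right view, with an assumed f = 800 px and baseline = 0.12 m.
print(triangulate_rectified(120.0, 35.0, 100.0, f=800.0, baseline=0.12))
```

In the general, non-rectified case, the rotation angles α, β, γ and the translation vector [t_x t_y t_z]^T described above would have to be applied before this correspondence is used.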
In step S2, computing the recognition results for the robot arm motion attitude and the operator behavior specifically includes the following steps:
step S21, acquiring the space-time feature points of the moving parts of the robot arm and the operator;
step S22, converting the human body behavior in each frame into corresponding word-packet information, i.e. converting the obtained space-time feature points of each key point of the human body into word packets, namely codes such as 01, 11 and 10; then establishing a word book with an improved k-means clustering algorithm, that is, grouping behavior-action codes with high or identical similarity into one class, which generates the word book;
and step S23, recognizing and classifying the human body behavior with an improved PLSA algorithm.
In step S21, the camera captures the robot arm and the operator simultaneously within its field of view, and the joints of the robot arm and the operator can be regarded as corner points, so the method is designed to find these corner points. Acquiring the space-time feature points of the moving parts of the robot arm and the operator specifically comprises the following steps:
for a spatial-domain image f^sp, given an observation scale and a smoothing scale (formula images in original; sp denotes the spatial domain), the image interest points can be found by the Harris corner detection method:
(formula image in original)
where the two Gaussian derivative functions appearing in it are defined by the corresponding formula images: one is the derivative of the smoothed image with respect to x and the other the derivative with respect to y, and g^sp(x, y; δ²) and μ^sp are the smoothing window function and the observation window function, respectively (formula image in original);
μ^sp = det(μ^sp) − k·trac²(μ^sp) = λ1λ2 − k(λ1 + λ2)²
where δ is the correlation coefficient between pixel points, det is the determinant expression in the Harris corner detection method, trac is the trace of the μ matrix, and λ1 and λ2 are the eigenvalues through which the determinant det(μ^sp) is computed;
when the image intensity value of a pixel point in the video image changes markedly in both the spatial domain and the temporal domain, i.e. at a space-time feature point, a function f: R² × R → R is defined and convolved with a Gaussian kernel function; the result is the scale space L, defined as L: R² × R × R+ → R:
(formula image in original)
where δ is the correlation coefficient between pixel points and τ is the time coefficient of the same pixel point in the scene, a specific time in seconds;
g is the Gaussian kernel function:
(formula image in original)
where δ_l is the correlation coefficient between pixels, τ_l is the time coefficient with which the same pixel appears in the scene, and t is the time at which the pixel appears; generalizing the 2 × 2 matrix of the spatial domain to a 3 × 3 matrix that also contains the time derivative gives:
(formula image in original)
where the convolution terms and the derivatives L_x, L_y and L_t are as shown in the corresponding formula images, the smoothing scales of the function in the spatial domain and the temporal domain are the two scales shown there, and these blob points are detected by the Harris method;
to model the space-time image sequence, the functions f and L are used and the following function is defined:
(formula image in original)
the space-time function with a Gaussian kernel is:
(formula images in original)
where the fused scales are as shown in the corresponding formula images;
finally, an extended Harris function is defined, whose maxima are the space-time feature points:
H = det(μ) − k·trace³(μ) = λ1λ2λ3 − k(λ1 + λ2 + λ3)³
where λ3 is the third eigenvalue of the μ matrix.
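The extended Harris response H = det(μ) − k·trace³(μ) described above can be sketched numerically as follows. This is a minimal illustration using scipy.ndimage on a grey-level video volume; the smoothing scales, the integration-scale factor and the constant k are assumed values, and the sketch does not reproduce the patent's exact scale-space formulation.

```python
import numpy as np
from scipy import ndimage

def spacetime_harris(video, sigma_sp=2.0, sigma_t=1.5, k=0.005):
    """video: float array of shape (T, H, W). Returns the extended Harris
    response H = det(mu) - k * trace(mu)**3 at every (t, y, x)."""
    # Gaussian-smoothed volume, then derivatives along t, y, x
    L = ndimage.gaussian_filter(video, sigma=(sigma_t, sigma_sp, sigma_sp))
    Lt, Ly, Lx = np.gradient(L)

    # 3x3 second-moment (structure) tensor, smoothed at an integration scale
    smooth = lambda a: ndimage.gaussian_filter(
        a, sigma=(2 * sigma_t, 2 * sigma_sp, 2 * sigma_sp))
    mxx, myy, mtt = smooth(Lx * Lx), smooth(Ly * Ly), smooth(Lt * Lt)
    mxy, mxt, myt = smooth(Lx * Ly), smooth(Lx * Lt), smooth(Ly * Lt)

    det = (mxx * (myy * mtt - myt * myt)
           - mxy * (mxy * mtt - myt * mxt)
           + mxt * (mxy * myt - myy * mxt))
    trace = mxx + myy + mtt
    return det - k * trace ** 3   # local maxima of H are space-time feature points

# Example: 16 frames of 64x64 noise standing in for a real video volume
H = spacetime_harris(np.random.rand(16, 64, 64).astype(np.float32))
print(H.shape, H.max())
```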
Step S22 specifically includes the following steps:
step S221, establishing the model definitions of the word packet and the word book:
parameter definitions: D = {d_1, d_2, ..., d_N} denotes the set of constructed word packets and W = {w_1, w_2, ..., w_M} denotes the word book; D and W are combined into an N × M co-occurrence matrix N, whose entries record the number of times w_j appears in d_i, as shown in the following formula:
(formula image in original)
where d_i is the i-th word packet and w_j the j-th word; in the matrix N, the rows give the number of times each word appears in a given word packet, and the columns give the number of times a given word appears in each document; a topic variable z is defined, p(d_i) is the probability that a word appears in the i-th document, p(w_j|z_k) is the probability of w_j occurring under topic z_k, and p(z_k|d_i) is the probability of topic z_k occurring in the i-th document;
here P(z|d) is the probability distribution of topics z for a given document d, and P(w|z) is the probability distribution of words w for a given topic z;
step S222, classifying the space-time feature points of the robot arm and of the operator's joint points with the model:
the category d of the robot arm behavior and the operator behavior can be obtained from clustering in the video image q, and the N_v visual words are represented as an N_v-dimensional vector d(q):
(formula image in original)
where n(p, v_i) is the number of visual words v_i contained in the video image q representing the human body behavior d;
the set of space-time feature points of the robot arm and operator joint points captured by the camera is X = {x_1, ..., x_N}, n = 1, ..., N (formula image in original); Y = {y_1, ..., y_N}, i = 1, ..., N (formula image in original) is the dimension-reduced version of the set X, i.e. the set with the noise points removed, and the cluster categories are as given in the corresponding formula image;
step S223, building the word list by a k-means clustering algorithm:
(formula image in original)
where k is the number of clusters and r_ij ∈ {0,1} is a label number: if y_i belongs to class j then r_ij = 1, otherwise r_ij = 0;
(formula image in original)
where the weight terms are the weights after dimension reduction, the expectation term is the probability expectation value of image x at time t and frame i, and K is the size of the word list; the set X = {x_1, ..., x_N} (formula image in original) is divided into k classes, and an intra-cluster variance is defined:
(formula image in original)
the weight ε_w of the intra-cluster variance is defined as:
(formula images in original), 0 ≤ p < 1
where the exponent p represents the sensitivity of the weight update between classes; an empirical value of 0.7 is chosen, giving the following formula:
(formula images in original), 0 ≤ p < 1
the weights are first fixed to find the new clusters, and m_k is then computed:
(formula image in original)
when w_k increases, m_k is computed as:
(formula image in original)
where σ_ij is the correlation coefficient between the i-th and j-th frame images, v_k is ξ_i, the difference between the i-th frame image x and the i-th frame probability expectation value μ, and the remaining term is the difference between the j-th frame image x and the probability expectation value μ;
to enhance and improve the stability of the k-means algorithm, the weight update is defined as:
(formula image in original), 0 ≤ β ≤ 1
where the term shown is the difference between the j-th frame image x and the probability expectation value μ; during the successive iterative computation, the β parameter controls the weight update and smooths the weight values over consecutive iterations;
given the weight matrix w_k, the following formula is obtained:
(formula image in original)
where W is the sum of the weight matrices w_k, X is the image matrix, C is the cluster category (formula image in original), and T denotes matrix transposition;
the minimum of J gives the word book, i.e. the coded classification of behavior actions with high or identical similarity.
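As a rough illustration of steps S222 and S223, the sketch below clusters feature descriptors into a visual word book and quantises one clip into the normalised histogram d(q). The standard k-means of scikit-learn stands in for the weighted, improved k-means described above, and the array shapes and parameter values are assumed for the example only.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptors, k=50, seed=0):
    """Cluster space-time feature descriptors into k visual words
    (the 'word book'); plain k-means stands in for the weighted
    variant of step S223."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(descriptors)

def bag_of_words(codebook, descriptors):
    """Quantise one clip's descriptors and return the normalised
    histogram d(q), i.e. n(q, v_i) / sum_j n(q, v_j)."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Example with random 72-dimensional descriptors standing in for real features
rng = np.random.default_rng(0)
codebook = build_codebook(rng.normal(size=(1000, 72)), k=20)
print(bag_of_words(codebook, rng.normal(size=(150, 72))))
```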
Step S23 specifically includes the following steps:
a joint distribution model and a posterior probability formula are defined:
p(w, d, z) = p(w|z, z_0)·p(z, z_0|d)·p(d)
(posterior probability formula: formula image in original)
The expectation-maximization (EM) algorithm is an iterative optimization algorithm in which each iteration consists of an expectation (E) step and a maximization (M) step. In order to find the maximum of the log-likelihood function in the EM steps, the estimates in the EM steps are:
(formula images in original)
the parameters in the above formulas are regularized by setting appropriate Lagrange multipliers α and β, giving:
(formula images in original)
Bayesian estimation is then used to compute p(z|w, d) and p(z_0|w, d), with the following specific calculation steps:
(formula images in original)
the finally obtained p(z_0|w, d) is a probability value, and this probability value is the result of the action recognition.
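The EM iteration of step S23 can be illustrated with the classical PLSA update below. The background topic z_0 and the Lagrange-multiplier regularization of the improved PLSA are omitted, so this is only a simplified sketch of the E-step p(z|d,w) ∝ p(z|d)·p(w|z) and the M-step re-estimation; matrix sizes and iteration counts are assumptions.

```python
import numpy as np

def plsa(N, n_topics=5, n_iter=50, seed=0):
    """Classical PLSA EM on an (n_docs x n_words) co-occurrence matrix N.
    Returns p(w|z) and p(z|d); the z_0 background topic of the improved
    PLSA described above is not modelled in this sketch."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = N.shape
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior p(z|d,w) proportional to p(z|d) * p(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # shape (d, z, w)
        post = joint / (joint.sum(1, keepdims=True) + 1e-12)
        # M-step: re-estimate p(w|z) and p(z|d) from expected counts
        counts = N[:, None, :] * post                          # shape (d, z, w)
        p_w_z = counts.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True) + 1e-12
        p_z_d = counts.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True) + 1e-12
    return p_w_z, p_z_d

# Example on a random word-packet/word count matrix
pwz, pzd = plsa(np.random.default_rng(1).integers(0, 5, size=(30, 40)).astype(float))
print(pzd[0])
```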
Step S3 specifically includes the following steps:
step S31, defining the interactive behavior Case between the robot arm to be recognized and the operator, and dividing the Case into 3 parts: the human-computer interaction sub-behavior already determined in the previous frame, the basic sub-behavior of the current frame, and the time characteristics of the basic related sub-behaviors:
Case = {{A_prior}, {B_current}, {B_related}} = {{A1, A2, ..., An}, {B1, B2, B3, ..., Bn}, {t1, t2, ..., tn}};
where {A_prior} identifies the behavior of the previous frame, denoted as the set A1 ... An;
{B_current} is the current behavior to be identified; {B_related} is the behavior identified by the behavior recognition algorithm above as similar to the behavior already identified for A;
B is denoted as the set B1 ... Bn;
t1 ... tn is the corresponding time series;
and step S32, using a case-based reasoning mechanism, judging the current human-computer interaction behavior from the previous human-computer interaction actions according to the behavior category determined in the previous frame.
Step S32 specifically includes the following steps:
step S321, computing the similarity attribute features of the human-computer interaction behaviors in the two consecutive frames: the one-dimensional vector formed by the interaction behaviors of the robot arm and the operator in the previous frame image is defined as {A_prior} and the current behavior part as {B_current}; the similarity between the current behavior and the behavior determined in the previous frame is then computed:
Sim(A, B) = Σ_{i=1}^{n} w_i × sim(A_i, B_i)
where n is the number of features in each case base, sim(A, B) compares the case of the known, determined behavior with the ambiguous behavior to be tested, and w_i is the associated weight of case A with case base B; the weight is the frequency with which case base A occurs when behavior B occurs, computed as w_i = B_i/A;
step S322, computing the time attribute features related to the behaviors: the time series set of the basic sub-behaviors, ordered by the preceding and following frame numbers, is B_duration; the total duration of each simple sub-behavior occurring in the vector B is {T1, T2, ..., TN}, where T_i is the total duration of the i-th sub-behavior corresponding to the vector B_duration;
the specific algorithm is as follows:
for the collected video sequence, compute the two attribute features of each simple human behavior;
compute the comprehensive similarity sim(A, B);
when the maximum similarity is greater than a certain threshold T, i.e. sim(A, B) > T, the current interactive behavior is judged to belong to the same interactive behavior; the threshold is determined from field-test experience values, and the case is added to the case library so that the case library is updated in real time for classifying subsequent human-computer interaction behaviors; the result is then sent to the operator so that the corresponding interaction is carried out, for example: the workpiece is handed to the robot arm, and the robot arm automatically lifts or grabs the workpiece.
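A minimal sketch of the case-matching rule of steps S321 and S322 follows; the per-attribute similarity sim(A_i, B_i) is reduced here to a simple match/mismatch score, and the weights and threshold are illustrative values rather than the field-test values mentioned above.

```python
def weighted_similarity(prior, current, weights):
    """Sim(A, B) = sum_i w_i * sim(A_i, B_i) over two behaviour feature
    vectors of equal length; sim() here is a simple per-feature match that
    stands in for the per-attribute similarity of step S321."""
    assert len(prior) == len(current) == len(weights)
    per_feature = [1.0 if a == b else 0.0 for a, b in zip(prior, current)]
    return sum(w * s for w, s in zip(weights, per_feature))

def classify_interaction(prior, current, weights, case_library, threshold=0.6):
    """If the similarity exceeds the (assumed) threshold T, treat the current
    behaviour as the same interaction and append it to the case library,
    which is thereby updated in real time."""
    score = weighted_similarity(prior, current, weights)
    if score > threshold:
        case_library.append(current)
        return True, score
    return False, score

# Example: three sub-behaviour attributes with assumed weights
library = []
print(classify_interaction(["reach", "grasp", "lift"],
                           ["reach", "grasp", "hold"],
                           [0.5, 0.3, 0.2], library))
```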
The technical effects that can be achieved by the invention are as follows: cameras are arranged at key positions of the machine equipment to monitor the locations where human-machine interaction occurs; when the camera captures the machine while, for example, an operator is handling materials, the intelligent video monitoring system can recognize the actions of the robot arm, so that human-machine interaction is realized. When the movement of the robot arm endangers the operator, the robot automatically raises an alarm and performs emergency braking.
Compared with traditional methods, the greatest advantage of this intelligent monitoring approach is that it realizes a contactless, wide-range, long-distance intelligent monitoring technology, supports repeated behavior interaction between the robot and the operators in the monitored area as well as emergency braking on dangerous behavior, can be used for human-computer interaction monitoring in an autonomously controllable intelligent factory, and ensures the safety of on-line operators while improving production efficiency.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A human behavior recognition technology-based human-computer interaction safety monitoring system, characterized in that the actions of the robot arm are intelligently recognized by an intelligent monitoring system and human-computer interaction is realized through the robot control system, specifically comprising the following steps:
step S1, establishing a robot arm behavior recognition system: acquiring two parallax images of the robot arm through a binocular camera, and calculating the spatial three-dimensional coordinates and geometric information of the robot arm from the acquired parallax images, so as to reconstruct the three-dimensional shape and position of the monitored part of the robot arm;
step S2, recognizing the behaviors of the robot arm and the operator: capturing the motion attitude of the robot arm and the behavior of the operator within a certain range of the station through cameras installed at suitable stations, and computing the recognition results for the robot arm's motion attitude and the operator's behavior and actions;
and step S3, inferring the behavior category of the other party from the recognized behaviors of the robot arm and the operator by a case-based reasoning method that combines the association between preceding and subsequent actions, thereby realizing human-computer interaction.
2. The human behavior recognition technology-based human-computer interaction safety monitoring system according to claim 1, wherein in step S1, calculating the spatial three-dimensional coordinates and geometric information of the robot arm specifically comprises the following steps:
step S11, for a point P(X, Y, Z) on the imaged object, the imaging point coordinate on the left camera is L(x_L, y_L) and the imaging point coordinate on the right camera is R(x_R, y_R);
step S12, from the focal lengths f_L and f_R of the left and right cameras, obtaining the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera;
step S13, rotating and translating the right-camera coordinate values (x_R, y_R, z_R) to obtain the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values;
step S14, obtaining the coordinate values (X, Y, Z) of the imaged object.
3. The human behavior recognition technology-based human-computer interaction safety monitoring system according to claim 2, wherein in step S12, the coordinate values (x_R, y_R, z_R) of the imaged object point P(X, Y, Z) relative to the right camera, written as the vector [X_R Y_R Z_R]^T, are obtained from a model of the camera perspective transformation:
(perspective transformation model: formula image in original)
where f_L is the focal length of the left camera, f_R is the focal length of the right camera, k_L is the scale factor of the left camera, and k_R is the scale factor of the right camera;
step S13 is specifically as follows:
the rotation transformation matrix R between the coordinate system (X, Y, Z) and the right camera coordinate system is:
R = R(α)·R(β)·R(γ);
(expanded rotation matrix: formula image in original)
where α, β and γ are the rotation angles about the x, y and z axes, respectively;
the translation transformation vector T between the coordinate system (X', Y', Z') and the right camera coordinate system is:
[T] = [t_x t_y t_z]^T;
the value of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') relative to the right-camera coordinate values is computed by:
(formula image in original);
obtaining the coordinate values (X, Y, Z) of the imaged object in step S14 is specifically as follows:
a substitution is introduced (formula image in original), so that (formula image in original); combining this with the formula of step S12 gives (formula image in original);
the elements r_1 to r_9 are determined by the rotation transformation matrix R, and the three-dimensional coordinates of the imaged object point P(X, Y, Z) in the coordinate system (X', Y', Z') are obtained from the correspondence:
X = Z·x_L/f_L;  Y = Z·y_L/f_L;
and Z is given by the corresponding formula (formula image in original).
4. The human behavior recognition technology-based human-computer interaction safety monitoring system according to claim 1, wherein in step S2, computing the recognition results for the robot arm motion attitude and the operator behavior specifically comprises the following steps:
step S21, acquiring the space-time feature points of the moving parts of the robot arm and the operator;
step S22, converting the human body behavior in each frame into corresponding word-packet information, i.e. converting the obtained space-time feature points of each key point of the human body into word packets; then establishing a word book with an improved k-means clustering algorithm, that is, grouping behavior-action codes with high or identical similarity into one class, which generates the word book;
and step S23, recognizing and classifying the human body behavior with an improved PLSA algorithm.
5. The human behavior recognition technology-based human-computer interaction safety monitoring system according to claim 4, wherein step S21 is specifically:
for a spatial-domain image f^sp, given an observation scale and a smoothing scale (formula images in original; sp denotes the spatial domain), the image interest points can be found by the Harris corner detection method:
(formula image in original)
where the two Gaussian derivative functions appearing in it are defined by the corresponding formula images: one is the derivative of the smoothed image with respect to x and the other the derivative with respect to y, and g^sp(x, y; δ²) and μ^sp are the smoothing window function and the observation window function, respectively (formula image in original);
μ^sp = det(μ^sp) − k·trac²(μ^sp) = λ1λ2 − k(λ1 + λ2)²
where δ is the correlation coefficient between pixel points, det is the determinant expression in the Harris corner detection method, trac is the trace of the μ matrix, and λ1 and λ2 are the eigenvalues through which the determinant det(μ^sp) is computed;
when the image intensity value of a pixel point in the video image changes markedly in both the spatial domain and the temporal domain, i.e. at a space-time feature point, a function f: R² × R → R is defined and convolved with a Gaussian kernel function; the result is the scale space L, defined as L: R² × R × R+ → R:
(formula image in original)
where δ is the correlation coefficient between pixel points and τ is the time coefficient of the same pixel point in the scene, a specific time in seconds;
g is the Gaussian kernel function:
(formula image in original)
where δ_l is the correlation coefficient between pixels, τ_l is the time coefficient with which the same pixel appears in the scene, and t is the time at which the pixel appears; generalizing the 2 × 2 matrix of the spatial domain to a 3 × 3 matrix that also contains the time derivative gives:
(formula image in original)
where the convolution terms and the derivatives L_x, L_y and L_t are as shown in the corresponding formula images, the smoothing scales of the function in the spatial domain and the temporal domain are the two scales shown there, and these blob points are detected by the Harris method;
to model the space-time image sequence, the functions f and L are used and the following function is defined:
(formula image in original)
the space-time function with a Gaussian kernel is:
(formula images in original)
where the fused scales are as shown in the corresponding formula images;
finally, an extended Harris function is defined, whose maxima are the space-time feature points:
H = det(μ) − k·trace³(μ) = λ1λ2λ3 − k(λ1 + λ2 + λ3)³
where λ3 is the third eigenvalue of the μ matrix.
6. The human behavior recognition technology-based human interaction security monitoring system of claim 5, wherein the step S22 specifically comprises the following steps:
step S221, establishing a digital-analog definition word packet and a word book:
parameter definition: d ═ D1,d2,....,dNDenotes a set of good word packets, and W ═ W1,w2,....,wMDenotes a word book, then D and W are merged into an N M co-occurrence matrix N, which represents WjAppears at diThe number of times of (1) is as shown in the following formula
Figure RE-FDA0002923292460000055
Wherein d isiFor the ith word package, wjRepresenting the jth word, in the matrix N, the rows represent the number of times each word appears in a certain package, and the columns represent the number of times a certain word appears in each document; defining a subject variable z, p (d)i) For a word appearing at diProbability in the piece document; p (w)j|zk) To be on the subject zkIn the presence of wjThe probability of (d); p (z)k|di) As a subject zkProbability of occurrence in the ith document;
where P (Z | d) is the probability distribution of topic Z under a given topic Z, and P (w | Z) is the probability distribution of a topic w for a given topic Z;
step S222, classifying the space-time characteristic points of the robot arm and the joint points of the operator by applying a digital model:
the category d of the robot arm behavior and the operator behavior, N, can be obtained from the clustering in the video image qvA visual word is represented as NvVector of dimensions d (q):
Figure RE-FDA0002923292460000061
in the formula, n (p, v)i) The video image q representing the human body behavior d contains visual words viThe number of (2);
the space-time characteristic point feature set of the camera to the joint points of the robot arm and the operator is as follows:
X={x1,......xN},
Figure RE-FDA0002923292460000062
wherein Y is { Y ═ Y1,......yN},
Figure RE-FDA0002923292460000063
Is a reduced-dimension set of the set X, i.e. a set of noise points, and the cluster category is
Figure RE-FDA0002923292460000064
step S223, creating a word list through a k-means clustering algorithm, whose clustering objective is
J = Σ(i=1..N) Σ(j=1..k) rij · ||yi − mj||^2,
wherein k is the number of clusters, mj is the center of the j-th cluster, and rij ∈ {0,1} is a tag number: if yi belongs to class j then rij = 1, otherwise rij = 0;
the weighted form of the objective uses, for each dimension-reduced point, the weight value obtained after dimension reduction and the probability expectation value μ of the image x at time t in the i-th frame; k is also the size of the word list; the set X = {x1, ..., xN} and its dimension-reduced counterpart are divided into k classes, an intra-cluster variance is defined, and a weight εw of the intra-cluster variance is defined, wherein the exponent p represents the sensitivity of the weight updating between classes and is set to the empirical value 0.7;
the weight values are first fixed in order to find a new cluster, and the cluster center mk is then calculated; when wk increases, mk is recalculated, wherein σij is the correlation coefficient between the i-th and j-th frame images, εi is the difference between the i-th frame image x and the probability expectation value μ of the i-th frame, and εj is the difference between the j-th frame image x and the probability expectation value μ;
in order to enhance the stability of the k-means algorithm, an updated weight value is defined, wherein the β parameter controls the weight updating during the continuous iterative calculation and smooths the weight values of successive iterations;
given the weight matrix wk, the objective is written in matrix form, wherein W is the sum of the weight matrices wk, X is the image matrix, C is the cluster category matrix, and T denotes matrix transposition;
the minimum value of J is the word book, i.e. the coded classification of highly similar or identical behavior actions.
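For comparison, a plain (unweighted) k-means baseline for building the word book from the reduced feature set is sketched below; the intra-cluster-variance weights, the sensitivity exponent p and the smoothing parameter β of the claim are deliberately omitted, so this is only the baseline that the claimed weighted variant refines.

import numpy as np

def build_codebook(Y, k, iters=50, seed=0):
    # Baseline k-means on the reduced feature set Y of shape (N, d).
    # Returns (centers, labels); the centers play the role of the word book
    # and the labels correspond to the tag numbers rij of the claim.
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest center for every point
        labels = np.linalg.norm(Y[:, None] - centers[None, :], axis=2).argmin(axis=1)
        # update step: recompute each center as the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return centers, labels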
7. The human behavior recognition technology-based human-computer interaction safety monitoring system of claim 6, wherein the step S23 specifically comprises the following steps:
defining a joint distribution model and a posterior probability formula:
p(w, d, z) = p(w|z, z0) · p(z, z0|d) · p(d),
together with the corresponding posterior probability of the topics (z, z0) given the observed word w and document d;
in order to maximize the log-likelihood function, the corresponding estimation values are computed in the E step of the expectation-maximization (EM) algorithm;
the parameters in the above formula are regularized by setting appropriate Lagrange multipliers α and β, giving the regularized parameter estimates of the model;
Bayesian estimation of p(z|w, d) and p(z0|w, d) is then carried out;
the finally obtained p(z0|w, d) is a probability value, and this probability value is the result of the action recognition.
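As a reference point, the E step and M step of standard pLSA (without the background topic z0 and without the Lagrange regularization of the claim) take the following form; this is the conventional derivation, not the claim's improved formulas.

E-step: \quad P(z_k \mid d_i, w_j) = \frac{P(w_j \mid z_k)\, P(z_k \mid d_i)}{\sum_{l=1}^{K} P(w_j \mid z_l)\, P(z_l \mid d_i)}

M-step: \quad P(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(d_i, w_j)\, P(z_k \mid d_i, w_j)}{\sum_{m=1}^{M} \sum_{i=1}^{N} n(d_i, w_m)\, P(z_k \mid d_i, w_m)},
\qquad
P(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(d_i, w_j)\, P(z_k \mid d_i, w_j)}{\sum_{j=1}^{M} n(d_i, w_j)}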
8. The human behavior recognition technology-based human-computer interaction safety monitoring system of claim 1, wherein the step S3 specifically comprises the following steps:
step S31, defining an interactive behavior Case between the robot arm to be recognized and the operator, and dividing the Case into 3 parts: the human-computer interaction sub-behaviors already determined in the previous frame, the basic sub-behaviors of the current frame, and the time characteristics of the associated basic sub-behaviors:
Case = {{A_prior}, {B_current}, {B_related}} = {{A1, A2, ..., An}, {B1, B2, B3, ..., Bn}, {t1, t2, ..., tn}};
wherein {A_prior} denotes the behaviors already identified in the previous frame, represented as the set A1, ..., An;
{B_current} is the current behavior to be identified, represented as the set B1, ..., Bn;
{B_related} is the set of behaviors judged by the above behavior recognition algorithm to be similar to the behaviors already identified in A;
t1, ..., tn is the time series corresponding to them;
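A minimal data-structure sketch of the Case triple defined in step S31; the field names mirror the claim, while the types, the dataclass form and the example values are illustrative assumptions.

from dataclasses import dataclass
from typing import List

@dataclass
class Case:
    # Case = {{A_prior}, {B_current}, {B_related}} plus the associated time series.
    a_prior: List[str]    # sub-behaviors already determined in the previous frame (A1..An)
    b_current: List[str]  # basic sub-behaviors of the current frame to be identified (B1..Bn)
    b_related: List[str]  # behaviors judged similar to the already identified A behaviors
    t: List[float]        # time series t1..tn associated with the related sub-behaviors

example = Case(a_prior=["reach"], b_current=["hand_over"], b_related=["reach"], t=[0.4])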
and step S32, judging the current human-computer interaction behavior from the preceding human-computer interaction actions by using a case-based reasoning mechanism, according to the behavior category determined in the previous frame.
9. The human behavior recognition technology-based human-computer interaction safety monitoring system of claim 8, wherein the step S32 specifically comprises the following steps:
step S321, calculating the similarity attribute characteristics of the human-computer interaction behaviors of the preceding and current frames: a one-dimensional vector consisting of the interaction behaviors of the robot arm and the operator in the previous frame of image is defined as {A_prior}, and the current behavior part is defined as {B_current}; the similarity between the current behavior and the behavior determined in the previous frame is then calculated as:
Sim(A, B) = Σ(i=1..n) wi × sim(Ai, Bi),
wherein n is the number of characteristics in each case base, sim(A, B) compares the case with known, definite behavior against the ambiguous behavior to be measured, and wi is the associated weight between case A and case base B; the weight is the frequency of case base A when the behavior B occurs, and is calculated as wi = Bi/A;
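A sketch of Sim(A, B) = Σ wi × sim(Ai, Bi) with precomputed weights wi; the elementwise sim() used here (1 for an exact label match, 0 otherwise) is an assumption the claim leaves open.

from typing import Sequence

def case_similarity(A: Sequence[str], B: Sequence[str], weights: Sequence[float]) -> float:
    # Weighted similarity between a known case A and the behavior B to be measured:
    # Sim(A, B) = sum_i wi * sim(Ai, Bi), with weights wi supplied by the case base.
    def sim(a, b):
        return 1.0 if a == b else 0.0   # assumed elementwise similarity measure
    return sum(w * sim(a, b) for a, b, w in zip(A, B, weights))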
step S322, calculating the time attribute characteristics related to the behaviors: the time sequence set of the basic sub-behaviors, ordered by the preceding and following frame numbers, is B_duration = {T1, T2, ..., TN}, which records the total duration of the occurrences of each simple sub-behavior in the vector B; wherein Ti is the total duration of the i-th sub-behavior corresponding to the vector B_duration;
the specific algorithm is as follows:
calculating the two attribute characteristics of each simple human behavior for the collected video sequence;
calculating the comprehensive similarity sim(A, B);
when the maximum value of the similarity is larger than a certain threshold value T, namely sim(A, B) > T, judging that the current interactive behavior belongs to that interactive behavior category, wherein the threshold value is determined according to a field-test experience value; the newly classified case is added to the case library so that the case library is updated in real time for classifying subsequent human-computer interaction behaviors; and the result is sent to the operator to effect the corresponding interaction.
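A compact sketch of this matching loop, reusing the case_similarity sketch above: the current case is compared against the library, the best match is accepted if it exceeds the empirically chosen threshold T, and the newly classified case is appended so the library grows online; the container layout and the default threshold value are assumptions.

def classify_interaction(current, case_library, weights, T=0.6):
    # case_library: list of (label, stored_case) pairs; returns the label of the
    # best-matching stored case if its similarity exceeds the field-calibrated
    # threshold T, updating the case library in real time; otherwise returns None.
    best_label, best_score = None, 0.0
    for label, stored in case_library:
        score = case_similarity(stored, current, weights)
        if score > best_score:
            best_label, best_score = label, score
    if best_label is not None and best_score > T:
        case_library.append((best_label, list(current)))  # online case-library update
        return best_label
    return None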
CN202011396870.7A 2020-12-03 2020-12-03 Human behavior recognition technology-based human-computer interaction safety monitoring system Pending CN112454365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011396870.7A CN112454365A (en) 2020-12-03 2020-12-03 Human behavior recognition technology-based human-computer interaction safety monitoring system


Publications (1)

Publication Number Publication Date
CN112454365A true CN112454365A (en) 2021-03-09

Family

ID=74805426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011396870.7A Pending CN112454365A (en) 2020-12-03 2020-12-03 Human behavior recognition technology-based human-computer interaction safety monitoring system

Country Status (1)

Country Link
CN (1) CN112454365A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930302A (en) * 2012-10-18 2013-02-13 山东大学 On-line sequential extreme learning machine-based incremental human behavior recognition method
CN106228538A (en) * 2016-07-12 2016-12-14 哈尔滨工业大学 Binocular vision indoor orientation method based on logo
CN108647582A (en) * 2018-04-19 2018-10-12 河南科技学院 Goal behavior identification and prediction technique under a kind of complex dynamic environment
CN111563446A (en) * 2020-04-30 2020-08-21 郑州轻工业大学 Human-machine interaction safety early warning and control method based on digital twin

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
涂宏斌 et al.: "Research on Behavior Recognition Technology Based on Machine Learning", 30 September 2016, Intellectual Property Publishing House *
涂宏斌 et al.: "A Behavior Recognition Algorithm Based on Improved PLSA and Case-Based Reasoning", Computer Science *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689541A (en) * 2021-07-23 2021-11-23 电子科技大学 Two-person three-dimensional human body shape optimization reconstruction method in interactive scene
CN113689541B (en) * 2021-07-23 2023-03-07 电子科技大学 Two-person three-dimensional human body shape optimization reconstruction method in interactive scene
CN117576787A (en) * 2024-01-16 2024-02-20 北京大学深圳研究生院 Method, device and equipment for handing over based on active tracking and self-adaptive gesture recognition
CN117576787B (en) * 2024-01-16 2024-04-16 北京大学深圳研究生院 Method, device and equipment for handing over based on active tracking and self-adaptive gesture recognition

Similar Documents

Publication Publication Date Title
CN108447078B (en) Interference perception tracking algorithm based on visual saliency
Smith et al. Tracking the visual focus of attention for a varying number of wandering people
Leykin et al. Thermal-visible video fusion for moving target tracking and pedestrian classification
Hermes et al. Vehicle tracking and motion prediction in complex urban scenarios
KR20190038808A (en) Object detection of video data
Salih et al. Comparison of stochastic filtering methods for 3D tracking
Leykin et al. Robust multi-pedestrian tracking in thermal-visible surveillance videos
CN112454365A (en) Human behavior recognition technology-based human-computer interaction safety monitoring system
Gallego et al. Region based foreground segmentation combining color and depth sensors via logarithmic opinion pool decision
Cilla et al. Human action recognition with sparse classification and multiple‐view learning
Chomat et al. Recognizing motion using local appearance
Qin et al. Research on intelligent image recognition technology based on equalization algorithm
Naeem et al. Multiple batches of motion history images (MB-MHIs) for multi-view human action recognition
Saptharishi et al. Agent-based moving object correspondence using differential discriminative diagnosis
Dave et al. Statistical survey on object detection and tracking methodologies
Yow et al. Scale and Orientation Invariance in Human Face Detection.
Perez-Cutino et al. Event-based human intrusion detection in UAS using deep learning
Huu et al. Action recognition application using artificial intelligence for smart social surveillance system.
Itano et al. Human actions recognition in video scenes from multiple camera viewpoints
Al-Berry et al. Human action recognition via multi-scale 3D stationary wavelet analysis
Kwak et al. Human action recognition using accumulated moving information
Halder et al. Anomalous Activity Detection from Ego View Camera of Surveillance Robots
Xiankun et al. Adaptive Tracking Algorithm for Multiperson Motion Targets Based on Local Feature Similarity
Qian et al. Recognizing human activities using appearance metric feature and kinematics feature
Abdellaoui et al. Body tracking and motion analysis using hybrid model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210309