CN114463835A - Behavior recognition method, electronic device and computer-readable storage medium - Google Patents

Behavior recognition method, electronic device and computer-readable storage medium Download PDF

Info

Publication number
CN114463835A
CN114463835A
Authority
CN
China
Prior art keywords
human body
head
behavior
shoulder
key points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111605393.5A
Other languages
Chinese (zh)
Inventor
戴媛 (Dai Yuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202111605393.5A priority Critical patent/CN114463835A/en
Publication of CN114463835A publication Critical patent/CN114463835A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a behavior recognition method, an electronic device, and a computer-readable storage medium, relating to the fields of image processing and computer vision. The behavior recognition method comprises the following steps: detecting an image to be processed and determining the head-shoulder region and key points of a human body contained in the image; and determining whether the human body has a target behavior based on the head-shoulder region and the position distribution characteristics of the key points. In this way, the behavior recognition method recognizes whether a target behavior is present by processing the head-shoulder region and human-body key points of the image, improving the accuracy of human behavior recognition.

Description

Behavior recognition method, electronic device and computer-readable storage medium
Technical Field
The present application relates to the field of image processing and computer vision, and in particular, to a behavior recognition method, an electronic device, and a computer-readable storage medium.
Background
With the development of big data and deep learning, technologies based on human behavior recognition play an increasingly important role in various fields. However, when current human behavior recognition technology recognizes multiple behaviors or multiple human bodies, recognition accuracy is low due to interference from environmental factors and mutual occlusion among individuals.
Disclosure of Invention
The technical problem mainly solved by the application is to provide a behavior recognition method, an electronic device and a computer readable storage medium, so as to improve the accuracy of human behavior recognition.
To solve the above technical problem, one technical solution adopted by the application is to provide a behavior recognition method comprising: detecting an image to be processed and determining the head-shoulder region and key points of a human body contained in the image; and determining whether the human body has a target behavior based on the head-shoulder region and the position distribution characteristics of the key points.
Another technical solution adopted by the application is to provide an electronic device comprising a processor and a memory connected to the processor, wherein the memory stores program data and the processor executes the program data to implement: detecting an image to be processed and determining the head-shoulder region and key points of a human body contained in the image; and determining whether the human body has a target behavior based on the head-shoulder region and the position distribution characteristics of the key points.
Another technical solution adopted by the application is to provide a computer-readable storage medium storing program instructions that are executed to implement: detecting an image to be processed and determining the head-shoulder region and key points of a human body contained in the image; and determining whether the human body has a target behavior based on the head-shoulder region and the position distribution characteristics of the key points.
Different from the prior art, the application identifies human behavior by detecting the image to be processed: the target behavior is recognized from the head-shoulder region and the position distribution characteristics of the human-body key points contained in the image. This avoids the mis-recognition caused by insufficient single-feature information and improves the accuracy of human behavior recognition.
Drawings
FIG. 1 is a schematic flow chart diagram of a first embodiment of the behavior recognition method of the present application;
FIG. 2 is a schematic flow chart diagram of a second embodiment of the behavior recognition method of the present application;
FIG. 3 is a specific flowchart of step S202 in FIG. 2;
FIG. 4 is a specific flowchart of step S202 in FIG. 2;
FIG. 5 is a specific flowchart of step S404 in FIG. 4;
FIG. 6 is a specific flowchart of step S202 in FIG. 2;
FIG. 7 is a schematic flow chart diagram illustrating a third embodiment of the behavior recognition method of the present application;
FIG. 8 is a schematic structural diagram of an embodiment of the combined trident depth residual model of the present application;
FIG. 9 is a schematic structural diagram of another embodiment of the combined trident depth residual model of the present application;
FIG. 10 is a schematic structural diagram of an embodiment of a cascaded pyramid model according to the present application;
FIG. 11 is a schematic structural diagram of an embodiment of the overall network combining the cascaded pyramid model and the combined trident depth residual model of the present application;
FIG. 12 is a schematic flowchart of an embodiment of a training strategy for the cascaded pyramid model and the combined trident depth residual model of the present application;
FIG. 13 is a schematic flow chart of a fourth embodiment of the behavior recognition method of the present application;
FIG. 14 is a schematic structural diagram of an embodiment of an electronic device of the present application;
FIG. 15 is a schematic structural diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The present application first proposes a behavior recognition method, as shown in FIG. 1, a schematic flow chart of a first embodiment of the behavior recognition method of the present application. The method specifically includes steps S101 to S102:
step S101: and detecting the image to be processed, and determining the head and shoulder area and key points of the human body contained in the image to be processed.
An image to be processed is acquired and detected with a target detection model, which outputs detection boxes for the image, such as a face box and a head-shoulder box, together with the key points of the human body. In this embodiment, the head-shoulder region and the key points of the human body contained in the image can thus be determined with the target detection model.
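As an illustration, the per-person output of such a detection step can be held in a small container like the following; the class, field names, and coordinate convention (boxes as `(x1, y1, x2, y2)`) are hypothetical, not from the patent:

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Point = Tuple[float, float]

@dataclass
class HumanDetection:
    """Hypothetical container for what the target detection model returns
    per person: detection boxes and named body key points."""
    face_box: Tuple[float, float, float, float]
    head_shoulder_box: Tuple[float, float, float, float]
    keypoints: Dict[str, Point] = field(default_factory=dict)

    @property
    def head_shoulder_size(self) -> Tuple[float, float]:
        """(width, height) of the head-shoulder box, used by the rule checks."""
        x1, y1, x2, y2 = self.head_shoulder_box
        return x2 - x1, y2 - y1

# toy example with made-up coordinates
det = HumanDetection(
    face_box=(12, 10, 52, 60),
    head_shoulder_box=(0, 0, 80, 100),
    keypoints={"left_wrist": (20, 90), "right_wrist": (70, 88)},
)
```

The size property mirrors the width/height information that the later rule checks (e.g. the turn-around ratio) consume.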
Step S102: and determining whether the human body has target behaviors or not based on the head and shoulder regions and the position distribution characteristics of the key points.
Based on the head-shoulder region of the human body and the position distribution characteristics of the key points, preset rules associated with the target behaviors are applied to pre-select candidates and filter out other human behaviors, thereby determining whether the target behavior is present.
Different from the prior art, the behavior recognition method of the application recognizes human behavior by combining the features of human-body key points with image features, avoiding the mis-recognition caused by insufficient single-feature information and enhancing the accuracy of human behavior recognition.
The present application further provides a behavior recognition method, as shown in FIG. 2, a schematic flow chart of a second embodiment of the behavior recognition method of the present application. The method can be applied to an examination room; the target behaviors can include examination-room cheating behaviors, and the key points include sub-key points related to the hands of the human body. The second embodiment specifically includes steps S201 to S202:
step S201: and detecting the image to be processed, and determining the head and shoulder area and key points of the human body contained in the image to be processed.
An image of the examination room is acquired and input into the target detection model for detection, and the head-shoulder region and the key points of the human body contained in the image are determined.
Step S202, determining reference points of the head and shoulder areas, and determining whether the human body has examination room cheating behaviors or not based on the sub-key points, the position distribution among the reference points and the size information of the head and shoulder areas, or determining whether the human body has examination room cheating behaviors or not based on the size information of the head and shoulder areas.
Specifically, the reference points of the head-shoulder region and the sub-key points related to the hands of the human body are obtained. Preset rules for each examination-room cheating behavior are then applied to the position distribution among the reference points and sub-key points together with the size information of the head-shoulder region (or to the size information alone) to pre-select candidates and judge whether the human body exhibits examination-room cheating behavior.
Optionally, the examination-room cheating behavior includes lying on the desk (a table-lying behavior), the reference point includes the bottom-center point of the head-shoulder region, and the sub-key points include at least a finger key point and a wrist key point of the same hand of the human body. In this embodiment, the step in S202 of determining whether examination-room cheating behavior is present based on the sub-key points, the position distribution among the reference points, and the size information of the head-shoulder region can be implemented by the method shown in FIG. 3, with steps S301 to S302:
step S301: a first reference line is determined that passes through the finger keypoints and the wrist keypoints of the same hand.
Step S302: and determining whether the human body has a table bending behavior or not based on the distance from the bottom center point to the first reference straight line.
Specifically, the human body is determined to be lying on the desk in response to the distance from the bottom-center point to the first reference line being greater than a first preset threshold.
When judging whether the human body is lying on the desk, the image of the examination room is acquired and detected. Because the shooting angle and conditions of a single camera are limited, occlusion between human bodies in an examination room occurs easily, so the method needs only the bottom-center point of the head-shoulder region and the finger and wrist key points of the same hand. The first reference line through these finger and wrist key points is determined, and when the distance from the bottom-center point of the head-shoulder region to this line is greater than the first preset threshold, the human body is determined to be lying on the desk.
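As an illustration, this rule reduces to a point-to-line distance test. The sketch below assumes 2D pixel coordinates, and the threshold value is a hypothetical placeholder (the patent only refers to a "first preset threshold"):

```python
import math

def point_to_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through points a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    den = math.hypot(bx - ax, by - ay)
    # degenerate line (a == b): fall back to point-to-point distance
    return num / den if den else math.hypot(px - ax, py - ay)

def is_lying_on_desk(finger, wrist, box_bottom_center, threshold=40.0):
    """Rule from the text: lying-on-desk is flagged when the head-shoulder
    box's bottom-center point is farther than a preset threshold from the
    line through the finger and wrist key points of the same hand.
    threshold=40.0 is an illustrative value, not from the patent."""
    return point_to_line_distance(box_bottom_center, finger, wrist) > threshold
```

In practice the threshold would likely be scaled by the head-shoulder box size so the rule is invariant to how close the person sits to the camera.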
Optionally, the examination-room cheating behavior includes the behavior of transferring articles, the reference points include an upper corner point, the upper boundary, and the left-shoulder and right-shoulder key points of the head-shoulder region, and the sub-key points include a wrist key point and an elbow key point of the same hand of the human body. In this embodiment, the step in S202 of determining whether examination-room cheating behavior is present based on the sub-key points, the position distribution among the reference points, and the size information of the head-shoulder region can be implemented by the method shown in FIG. 4, with steps S401 to S404:
step S401: a second distance from the upper corner point to the corresponding wrist key point is determined.
Step S402: and judging whether the second distance is greater than half of the width of the head-shoulder area or not, and whether the minimum horizontal distance between the wrist key point and the upper boundary is smaller than a second preset threshold or not.
Step S403: and if the second distance is greater than half of the width of the head-shoulder area and the minimum horizontal distance between the wrist key point and the upper boundary is less than a second preset threshold, determining a second reference straight line between the left shoulder key point and the right shoulder key point, and determining a third reference straight line between the elbow key point and the left shoulder key point or the right shoulder key point on the same side as the elbow key point.
Step S404: and determining whether the human body has the behavior of transferring articles or not based on the second reference straight line, the third reference straight line and the wrist key points.
When judging whether the human body has the behavior of transferring articles, the image of the examination room is acquired and detected to obtain the upper corner points, the upper boundary, the left-shoulder and right-shoulder key points, and the width of the head-shoulder region, together with the wrist and elbow key points of the same hand of the human body.
The second distance from the upper corner point to the corresponding wrist key point is then determined: when the acquired key point is the left-wrist key point, the distance from the upper-left corner of the head-shoulder region to the left-wrist key point is used; when it is the right-wrist key point, the distance from the upper-right corner to the right-wrist key point is used. It is then judged whether the second distance is greater than half the width of the head-shoulder region and whether the minimum horizontal distance between the wrist key point and the upper boundary of the head-shoulder region is smaller than the second preset threshold. If the second distance is smaller than half the width of the head-shoulder region and/or the minimum horizontal distance is greater than the second preset threshold, it is judged that the behavior of transferring articles is not present.
If the second distance is greater than half the width of the head-shoulder region and the minimum horizontal distance between the wrist key point and the upper boundary is smaller than the second preset threshold, the second reference line between the left-shoulder and right-shoulder key points is determined, together with the third reference line between the elbow key point and the shoulder key point on the same side.
Whether the human body has the behavior of transferring articles is then determined from the relationship among the second reference line, the third reference line, and the wrist key points.
Optionally, the sub-key points further include the wrist key point of the other hand of the human body. In this embodiment, step S404 can be implemented by the method shown in FIG. 5, with steps S501 to S504:
step S501: and calculating the cosine distance between the second reference straight line and the third reference straight line.
Step S502: a third distance between the wrist keypoint and a wrist keypoint of the other hand is calculated.
Step S503: and judging whether the cosine distance is smaller than a third preset threshold value and whether the third distance is larger than a fourth preset threshold value.
Step S504: and determining that the human body has an article transferring behavior in response to the fact that the cosine distance is smaller than a third preset threshold and the third distance is larger than a fourth preset threshold.
When determining whether the human body has the behavior of transferring articles from the relationship among the second reference line, the third reference line, and the wrist key points, the wrist key point of the other hand must also be acquired. The cosine distance between the second and third reference lines is first calculated, and then the third distance between the two wrist key points. If the cosine distance is smaller than the third preset threshold and the third distance is greater than the fourth preset threshold, the human body is determined to have the behavior of transferring articles.
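A minimal sketch of steps S501 to S504, assuming the "cosine distance" is computed between the direction vectors of the two reference lines (the absolute cosine is used so that line orientation does not matter); both threshold values are illustrative placeholders:

```python
import math

def line_cosine_distance(u, v):
    """1 - |cos(angle)| between two direction vectors; 0 means the lines
    are parallel, 1 means they are perpendicular."""
    dot = u[0] * v[0] + u[1] * v[1]
    nu = math.hypot(u[0], u[1])
    nv = math.hypot(v[0], v[1])
    if nu == 0 or nv == 0:
        return 1.0  # degenerate vector: treat as maximally dissimilar
    return 1.0 - abs(dot) / (nu * nv)

def is_transferring_articles(l_shoulder, r_shoulder, elbow, same_side_shoulder,
                             wrist, other_wrist,
                             cos_thresh=0.05, dist_thresh=120.0):
    """Rule from the text: article transfer is flagged when the arm line
    (elbow to same-side shoulder) is nearly parallel to the shoulder line
    (cosine distance below a threshold) AND the two wrists are far apart
    (distance above a threshold). Threshold values are illustrative."""
    shoulder_vec = (r_shoulder[0] - l_shoulder[0], r_shoulder[1] - l_shoulder[1])
    arm_vec = (same_side_shoulder[0] - elbow[0], same_side_shoulder[1] - elbow[1])
    cos_d = line_cosine_distance(shoulder_vec, arm_vec)
    wrist_d = math.hypot(wrist[0] - other_wrist[0], wrist[1] - other_wrist[1])
    return cos_d < cos_thresh and wrist_d > dist_thresh
```

The parallel-arm condition captures an arm stretched sideways along the shoulder line, and the wrist-separation condition rules out postures where both hands stay near the body.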
Optionally, the examination-room cheating behavior includes a turn-around behavior, and the size information of the head-shoulder region includes its width and height. In this embodiment, the step in S202 of determining whether examination-room cheating behavior is present based on the size information of the head-shoulder region can be implemented by the method shown in FIG. 6, with steps S601 to S602:
step S601: the ratio of the width of the head-shoulder region to the height of the head-shoulder region is calculated.
Step S602: and determining whether the human body has the turning behavior or not based on the ratio.
Specifically, the human body is determined to have a turn-around behavior in response to the ratio being greater than a fifth preset threshold.
When judging whether the human body is turning around, the image of the examination room is acquired and detected, and the width and height of the head-shoulder region are obtained. The ratio of the width to the height is then calculated, and if it is greater than the fifth preset threshold, the human body is determined to have a turn-around behavior.
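The width-to-height test can be sketched in a few lines; the "fifth preset threshold" value below is a hypothetical placeholder:

```python
def is_turning_around(box_width, box_height, ratio_thresh=1.2):
    """Rule from the text: when a seated person turns around, the
    head-shoulder box seen by the camera becomes wider relative to its
    height, so width/height above a threshold flags a turn-around.
    ratio_thresh=1.2 is an illustrative value, not from the patent."""
    return box_width / box_height > ratio_thresh
```

Because the rule uses only a ratio, it needs no camera calibration and is insensitive to the person's distance from the camera.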
Compared with the prior art, the application draws on statistical analysis of a large amount of experimental data to characterize the typical distribution of head-shoulder regions and human-body key points for cheating behaviors in an examination scene. By defining the examination-room cheating behaviors and their pre-selection rules, other normal behaviors of the human body can be filtered out quickly and accurately. The examination-room cheating behaviors targeted by the application mainly include a table-lying behavior, an article-transferring behavior, and a turning behavior; identifying these three behaviors in the image to be processed greatly reduces the computation required for the subsequent secondary recognition of cheating.
As shown in fig. 7, fig. 7 is a schematic flowchart of a behavior recognition method according to a third embodiment of the present application. Specifically, the method includes steps S701 to S706:
step S701: and detecting the image to be processed, and determining the head and shoulder area and key points of the human body contained in the image to be processed.
Step S701 is identical to step S101, and is not described again.
Step S702: and determining whether the human body has target behaviors or not based on the head and shoulder regions and the position distribution characteristics of the key points.
Step S702 is the same as step S102, and is not repeated.
Step S703: and carrying out image recognition on the head and shoulder area by using the combined trident depth residual error model to obtain the image characteristics of the head and shoulder area.
After determining that the target behavior is present based on the head-shoulder region and the position distribution characteristics of the key points, the head-shoulder region of the image in which the target behavior occurs is first acquired, and image recognition is then performed on it with the combined trident depth residual model to obtain the image features of the head-shoulder region.
When constructing the combined trident depth residual model, the trident network (TridentNet) is introduced into the deep residual network and improved. The multi-branch trident network, based on dilated spatial convolution, is adapted to better suit the structure of a classification network, so that it fuses well with the deep residual network. The backbone network used for behavior recognition in the present application may be the deep residual network ResNet-50; other neural networks may be used in other implementations, without limitation here.
The trident network is a scale-aware detection framework whose training process likewise needs to be scale-aware. By introducing the trident network into the deep residual network and improving it, the behavior recognition method achieves high detection accuracy without extra parameters or extra computation, reducing the computational cost of the method.
Further, the multi-branch trident network is constructed with dilated convolutions. A dilated convolution enlarges the receptive field of a network layer by inserting zeros into an ordinary convolution kernel. It has a dilation-rate parameter d: with dilation rate d, (d − 1) gaps are inserted between adjacent elements of the ordinary convolution kernel. If the original kernel size is k and the kernel size after inserting the gaps is n, then n = k + (k − 1) × (d − 1). Accordingly, as shown in FIG. 8 and FIG. 9, which are schematic structural diagrams of two embodiments of the combined trident depth residual model, the receptive fields for the branches with dilation rates d = 1, d = 2, and d = 3 become 3 × 3, 5 × 5, and 7 × 7, respectively.
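The kernel-size formula can be checked directly; this reproduces the 3 × 3, 5 × 5, and 7 × 7 receptive fields quoted for dilation rates 1, 2, and 3:

```python
def effective_kernel(k, d):
    """Effective kernel size of a dilated (atrous) convolution:
    n = k + (k - 1) * (d - 1), as given in the text."""
    return k + (k - 1) * (d - 1)

# Three trident-style branches with a base 3x3 kernel and dilation
# rates 1, 2, 3 yield effective kernels 3, 5, 7 (fields 3x3, 5x5, 7x7).
branches = [effective_kernel(3, d) for d in (1, 2, 3)]
```

Because dilation only spaces out the existing weights, the three branches can share parameters while seeing the input at three different scales, which is the point of the trident design.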
When identifying examination-room cheating behaviors, to address the occlusion of examinees by multiple people in the examination-room environment, the dilated-convolution-based trident network is used to optimize the deep residual network for recognizing and classifying the various cheating behaviors. Analyzing the head-shoulder regions and key-point distributions of the human body against normal behavior patterns allows target behaviors to be identified and effectively improves the accuracy of the behavior recognition method.
Step S704: and carrying out key point identification on the head and shoulder area by using the cascade pyramid model to obtain a thermodynamic diagram of the key points of the head and shoulder area.
The head-shoulder region of the image in which the target behavior occurs is acquired, and key-point recognition is performed on it with the cascaded pyramid model to obtain a heat map (thermodynamic diagram) of the key points of the head-shoulder region.
When constructing the cascaded pyramid model, as shown in FIG. 10, a schematic structural diagram of an embodiment of the cascaded pyramid model of the present application, the key points of the head-shoulder region are identified in a top-down manner to obtain the key-point heat map.
In the present application, the cascaded pyramid model further improves the key-point localization accuracy of the whole network, and the model adapts well to variable image backgrounds, postures, and the like.
Step S705: and carrying out fusion processing on the image characteristics of the head and shoulder area and the thermodynamic diagrams of the key points of the head and shoulder area to obtain fusion characteristics.
The image features obtained with the combined trident depth residual model and the key-point heat maps obtained with the cascaded pyramid model are fused.
FIG. 11 is a schematic structural diagram of an embodiment of the overall network combining the cascaded pyramid model and the combined trident depth residual model. The cascaded pyramid model extracts behavior information from the important skeleton key points of the human body to obtain their heat maps: different actions change the absolute and relative positions of the skeleton key points, so important information about human actions is encoded in the key-point coordinates. The combined trident depth residual model serves as the backbone network, and the image features it extracts are general surface information of the image, such as texture and color. Within the overall network, the two models reinforce and enhance each other.
Optionally, in this embodiment, the step S703 may be implemented by the following method, and the specific implementation steps include:
and performing pixel-level multiplication operation on the image characteristic and the thermodynamic diagram characteristic to obtain a fused characteristic, wherein the pixel value of the fused characteristic is (1+ alpha) times of the pixel value of the image characteristic, and alpha is the pixel value of the corresponding pixel in the thermodynamic diagram of the image characteristic. As shown in fig. 11, when the to-be-processed image of the target behavior is transmitted into the neural network, the to-be-processed image is simultaneously transmitted into two network branches, the upper branch is a trained cascade pyramid model, and a thermodynamic diagram of a series of key points can be extracted; the lower branch is the joint trident depth residual model, which is used to extract image features. And multiplying the thermodynamic diagrams of the key points and the image features at a pixel level. The pixel value size in the key point thermodynamic diagram is between 0 and 1, and the fused calculation formula is as follows:
F_a = (1 + α) · F_b

where F_a is the pixel value of the fused features, F_b is the pixel value of the image features, and α is the value of the corresponding pixel in the key-point heat map. That is, the higher the heat-map value at a key-point location, the more strongly the image features near that key point are enhanced.
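The fusion rule above can be sketched in a few lines of NumPy; the function and variable names here are illustrative, not from the patent:

```python
import numpy as np

def fuse_features(image_features, keypoint_heatmap):
    """Pixel-level fusion described in the application: F_a = (1 + alpha) * F_b,
    where alpha is the key-point heat-map value in [0, 1].
    Names are illustrative, not from the patent."""
    alpha = np.clip(keypoint_heatmap, 0.0, 1.0)  # heat-map values lie in [0, 1]
    return (1.0 + alpha) * image_features

# Features near a key point (high alpha) are amplified; far pixels pass through.
features = np.array([[2.0, 2.0], [2.0, 2.0]])
heatmap = np.array([[1.0, 0.5], [0.0, 0.25]])
fused = fuse_features(features, heatmap)
# fused == [[4.0, 3.0], [2.0, 2.5]]
```

Note that where the heat map is 0 the image feature is passed through unchanged, so the fusion only ever amplifies features, never suppresses them.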
Step S706: perform secondary identification of the target behavior based on the fused features.

The final layer of the network is a softmax layer; as the final classification layer, it identifies the target behavior based on the fused features.

In this method, the image features of the head-shoulder area are obtained through the joint trident depth residual model, and the key-point heat maps are obtained through the cascaded pyramid model. Fully fusing the human-body key-point heat maps with the image features overcomes the misrecognition caused by insufficient information in any single feature, so that, compared with using a single feature, the accuracy of the behavior recognition method is further improved.
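The final softmax classification of step S706 can be sketched generically as follows; the class ordering and scores are hypothetical, not from the patent:

```python
import math

def softmax(logits):
    """Convert raw class scores to probabilities (generic sketch)."""
    m = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for classes (normal, table-lying, item-passing, turning):
probs = softmax([0.5, 2.0, 0.1, 0.3])
predicted = probs.index(max(probs))          # index of the most likely class
```

The predicted class is simply the one with the highest probability; here class 1 wins because it has the largest input score.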
Optionally, as shown in fig. 12, fig. 12 is a flowchart of an embodiment of the training strategy for the cascaded pyramid model and the joint trident depth residual model according to the present application. Before the joint trident depth residual model and the cascaded pyramid model secondarily identify the target behavior from the preselected behavior, they need to be trained. For example, when identifying cheating behavior in an examination room, the specific steps include steps S801 to S805:

Step S801: collect cheating behavior data.

Surveillance video data of different examination rooms are collected, video-stream images are captured at the frame rate, cheating segments are extracted from the video-stream images and labeled with cheating-type tags, the cheating behaviors of the examination scene are specifically defined, and a data set for examination-scene cheating-behavior recognition is constructed. The collected cheating data are used for statistical analysis of the target detection model and for training the cascaded pyramid model and the joint trident depth residual model.

Step S802: train the cascaded pyramid model independently using the data set.

Step S803: fix the parameters of the cascaded pyramid model once it has learned to localize the key points.

Step S804: input the pictures in the data set into the key-point localization network with fixed parameters, and train the overall network using its output together with the image features obtained from the joint trident depth residual model.

Step S805: continue training until all the neural network models converge, yielding the final trained model.

In the training strategy of the present application, the cascaded pyramid model for key-point localization is trained first, its parameters are then fixed, and the whole network is trained afterwards. This effectively saves network-model training time and improves training efficiency.
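The two-stage strategy of steps S802 to S805 (train one branch, freeze it, then train the rest) can be illustrated with a toy model; the classes and update rule below are a deliberately minimal stand-in, not the patent's actual networks:

```python
class ToyModel:
    """Minimal stand-in for one network branch; illustrative only."""
    def __init__(self, weight):
        self.weight = weight
        self.frozen = False

    def update(self, grad, lr=0.1):
        # Frozen parameters are skipped during training (step S803).
        if not self.frozen:
            self.weight -= lr * grad

# Stage 1: train the key-point branch (cascaded pyramid model) alone.
cpn = ToyModel(weight=1.0)
cpn.update(grad=0.5)            # weight moves: 1.0 -> 0.95

# Stage 2: fix the key-point branch, then train the backbone branch.
cpn.frozen = True
backbone = ToyModel(weight=2.0)
for grad in (0.4, 0.2):
    cpn.update(grad)            # no effect: parameters are fixed
    backbone.update(grad)       # backbone keeps learning
```

In a real framework the same idea is typically expressed by disabling gradient updates for the pretrained branch, so stage-2 training only touches the backbone and the classifier.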
The present application further provides a behavior recognition method. As shown in fig. 13, fig. 13 is a schematic flowchart of a fourth embodiment of the behavior recognition method of the present application. Here, the to-be-processed images comprise images of a video stream acquired for the examination room. The behavior recognition method further includes steps S901 to S902:

Step S901: determine a plurality of consecutive video frames in the video stream as the to-be-processed images, and determine whether the same human body exhibits examination-room cheating behavior in each video frame.

Step S902: based on the determination of whether the same human body exhibits examination-room cheating behavior in each video frame, if the duration for which the cheating behavior persists exceeds a preset duration, trigger early-warning information for that human body.

The present application thus builds a cheating-behavior recognition and alarm system: when a plurality of consecutive video frames in the video stream are taken as the to-be-processed images and the same human body is determined to exhibit examination-room cheating behavior in those frames for longer than the preset duration, an early warning is issued for that human body to remind the invigilator that the person may be cheating.
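The duration-threshold check of steps S901 to S902 can be sketched as a streak counter over per-frame results; function names, the frame-result format, and the thresholds are illustrative assumptions, not specified by the patent:

```python
from collections import defaultdict

def raise_alerts(frame_results, min_duration, fps):
    """Flag any person whose cheating persists for more than min_duration
    seconds of consecutive frames. Illustrative sketch, not the patent's API."""
    min_frames = int(min_duration * fps)
    streak = defaultdict(int)     # consecutive cheating frames per person
    alerts = set()
    for frame in frame_results:   # one dict of {person_id: is_cheating} per frame
        for person, cheating in frame.items():
            streak[person] = streak[person] + 1 if cheating else 0
            if streak[person] > min_frames:
                alerts.add(person)
    return alerts

# Person "A" cheats in 4 consecutive frames; "B" in only 1 frame.
frames = [{"A": True, "B": False},
          {"A": True, "B": True},
          {"A": True, "B": False},
          {"A": True, "B": False}]
alerts = raise_alerts(frames, min_duration=0.1, fps=30)  # threshold: 3 frames
```

Resetting the streak on any non-cheating frame means brief, isolated detections never trigger an alarm, which matches the intent of requiring the behavior to persist beyond the preset duration.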
Optionally, the present application further proposes an electronic device 100. As shown in fig. 14, fig. 14 is a schematic structural diagram of an embodiment of the electronic device 100 of the present application. The electronic device 100 includes a processor 101 and a memory 102 connected to the processor 101, where the memory 102 stores program data, and the processor 101 executes the program data stored in the memory 102 to perform: acquiring a video stream; acquiring key-point features and image features of a human body from the video stream; and recognizing the behavior of the human body based on the key-point features and the image features.
The processor 101 may also be referred to as a Central Processing Unit (CPU). The processor 101 may be an electronic chip having signal processing capabilities. The processor 101 may also be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 102 may be a memory module, a TF card, or the like, and can store all information in the electronic device 100, including input raw data, computer programs, intermediate operation results, and final operation results. It stores and retrieves information at locations specified by the processor 101. With the memory 102, the electronic device 100 has a storage function that ensures normal operation. The memory 102 of the electronic device 100 may be classified by purpose into main memory (internal memory) and auxiliary memory (external memory); an alternative classification distinguishes external memory from internal memory. External memory is usually a magnetic medium, an optical disc, or the like, and can store information for long periods. Internal memory refers to the storage components on the main board, which hold the data and programs currently being executed; it provides only temporary storage, and its contents are lost when the power is turned off or interrupted.
Optionally, the present application further proposes a computer-readable storage medium 110. As shown in fig. 15, fig. 15 is a schematic structural diagram of an embodiment of the computer-readable storage medium 110 of the present application. When the integrated unit of the functional units in the embodiments of the present application is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in the computer-readable storage medium 110. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product: the computer-readable storage medium 110 includes a number of instructions in program instructions 111 that enable a computer device (which may be a personal computer, a system server, a network device, or the like), an electronic device (such as an MP3 or MP4 player, a mobile terminal such as a mobile phone, a tablet computer, or a wearable device, or a desktop computer), or a processor to execute all or part of the steps of the methods of the embodiments of the present application.
Different from the prior art, the behavior recognition method of the present application can quickly and accurately recognize and classify target behaviors through the head-shoulder area and the key points of the human body. By analyzing the position distribution of the head-shoulder area and the human-body key points under normal behavior, detecting the head-shoulder area and the key points with the target detection model, and filtering the target behavior with preset preselection rules, the influence of irrelevant targets can be effectively eliminated and the amount of computation in the secondary identification greatly reduced. In the secondary identification, fusing the image features with the key-point heat maps fully combines the feature information, overcomes the misrecognition caused by insufficient information in any single feature, and further improves the accuracy of the behavior recognition and classification method. In addition, the joint trident depth residual model adopted in the secondary identification can handle human bodies occluding one another and reduces the amount of computation in the behavior recognition method, while the cascaded pyramid model improves the accuracy of localizing the human-body key points, thereby improving the accuracy of the overall behavior recognition method in identifying the target behavior.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are intended to be included within the scope of the present application.

Claims (13)

1. A method of behavior recognition, comprising:
detecting an image to be processed, and determining a head-shoulder area and key points of a human body contained in the image to be processed; and
determining whether the human body exhibits a target behavior based on the head-shoulder area and the position distribution characteristics of the key points.
2. The behavior recognition method according to claim 1, wherein the behavior recognition method is applied to an examination room, the target behavior comprises examination-room cheating behavior, and the key points comprise hand-related sub-key points of the human body;
the determining whether the human body exhibits a target behavior based on the head-shoulder area and the position distribution characteristics of the key points comprises:
determining a reference point of the head-shoulder area; and
determining whether the human body exhibits the examination-room cheating behavior based on the sub-key points, the position distribution among the reference points, and the size information of the head-shoulder area, or determining whether the human body exhibits the examination-room cheating behavior based on the size information of the head-shoulder area.
3. The behavior recognition method according to claim 2, wherein the examination-room cheating behavior comprises a table-lying behavior, the reference point comprises a bottom center point of the head-shoulder area, and the sub-key points comprise at least a finger key point and a wrist key point of the same hand of the human body;
the determining whether the human body exhibits the examination-room cheating behavior based on the sub-key points, the position distribution among the reference points, and the size information of the head-shoulder area comprises:
determining a first reference straight line passing through the finger key point and the wrist key point of the same hand; and
determining whether the human body exhibits the table-lying behavior based on the distance from the bottom center point to the first reference straight line.
4. The behavior recognition method according to claim 3, wherein the determining whether the human body exhibits the table-lying behavior based on the distance from the bottom center point to the first reference straight line comprises:
in response to the distance from the bottom center point to the first reference straight line being greater than a first preset threshold, determining that the human body exhibits the table-lying behavior.
5. The behavior recognition method according to claim 2, wherein the examination-room cheating behavior comprises an item-passing behavior, and the reference points comprise an upper corner point, an upper boundary, a left-shoulder key point, and a right-shoulder key point of the head-shoulder area; the sub-key points comprise a wrist key point and an elbow key point of the same hand of the human body;
the determining whether the human body exhibits the examination-room cheating behavior based on the sub-key points, the position distribution among the reference points, and the size information of the head-shoulder area comprises:
determining a second distance from the upper corner point to the corresponding wrist key point;
judging whether the second distance is greater than half of the width of the head-shoulder area, and whether the minimum horizontal distance between the wrist key point and the upper boundary is less than a second preset threshold;
if the second distance is greater than half of the width of the head-shoulder area and the minimum horizontal distance between the wrist key point and the upper boundary is less than the second preset threshold, determining a second reference straight line passing through the left-shoulder key point and the right-shoulder key point, and determining a third reference straight line passing through the elbow key point and whichever of the left-shoulder key point and the right-shoulder key point is on the same side as the elbow key point; and
determining whether the human body exhibits the item-passing behavior based on the second reference straight line, the third reference straight line, and the wrist key point.
6. The behavior recognition method according to claim 5, wherein the sub-key points further comprise a wrist key point of the other hand of the human body, and the determining whether the human body exhibits the item-passing behavior based on the second reference straight line, the third reference straight line, and the wrist key point comprises:
calculating the cosine distance between the second reference straight line and the third reference straight line;
calculating a third distance between the wrist key point and the wrist key point of the other hand;
judging whether the cosine distance is less than a third preset threshold and whether the third distance is greater than a fourth preset threshold; and
in response to the cosine distance being less than the third preset threshold and the third distance being greater than the fourth preset threshold, determining that the human body exhibits the item-passing behavior.
7. The behavior recognition method according to claim 2, wherein the examination-room cheating behavior comprises a turning behavior, and the size information of the head-shoulder area comprises the width of the head-shoulder area and the height of the head-shoulder area;
the determining whether the human body exhibits the examination-room cheating behavior based on the size information of the head-shoulder area comprises:
calculating the ratio of the width of the head-shoulder area to the height of the head-shoulder area; and
determining whether the human body exhibits the turning behavior based on the ratio.
8. The behavior recognition method according to claim 7, wherein the determining whether the human body exhibits the turning behavior based on the ratio comprises:
in response to the ratio being greater than a fifth preset threshold, determining that the human body exhibits the turning behavior.
9. The behavior recognition method according to claim 1, further comprising, after the determining whether the human body exhibits a target behavior based on the head-shoulder area and the position distribution characteristics of the key points:
performing image recognition on the head-shoulder area using a joint trident depth residual model to obtain image features of the head-shoulder area;
performing key-point identification on the head-shoulder area using a cascaded pyramid model to obtain heat maps of the key points of the head-shoulder area;
fusing the image features of the head-shoulder area with the heat maps of the key points of the head-shoulder area to obtain fused features; and
secondarily identifying the target behavior based on the fused features.
10. The behavior recognition method according to claim 9, wherein the fusing the image features of the head-shoulder area with the heat maps of the key points of the head-shoulder area to obtain the fused features comprises:
performing a pixel-level multiplication operation on the image features and the heat-map features to obtain the fused features, wherein
the pixel value of the fused features is (1+α) times the pixel value of the image features, and α is the value of the corresponding pixel in the key-point heat map.
11. The behavior recognition method according to any one of claims 2 to 10, wherein the image to be processed comprises an image of a video stream acquired for an examination room, and the method further comprises:
determining a plurality of consecutive video frames in the video stream as the images to be processed, and determining whether the same human body exhibits the examination-room cheating behavior in each video frame; and
based on the determination of whether the same human body exhibits the examination-room cheating behavior in each video frame, if the duration for which the cheating behavior persists for the same human body is determined to be longer than a preset duration, triggering early-warning information for the same human body.
12. An electronic device comprising a processor and a memory coupled to the processor, wherein the memory stores program data, and the processor executes the program data stored in the memory to implement:
detecting an image to be processed, and determining a head-shoulder area and key points of a human body contained in the image to be processed; and
determining whether the human body exhibits a target behavior based on the head-shoulder area and the position distribution characteristics of the key points.
13. A computer-readable storage medium having program instructions stored therein, the program instructions being executable to implement:
detecting an image to be processed, and determining a head-shoulder area and key points of a human body contained in the image to be processed; and
determining whether the human body exhibits a target behavior based on the head-shoulder area and the position distribution characteristics of the key points.
CN202111605393.5A 2021-12-25 2021-12-25 Behavior recognition method, electronic device and computer-readable storage medium Pending CN114463835A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111605393.5A CN114463835A (en) 2021-12-25 2021-12-25 Behavior recognition method, electronic device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111605393.5A CN114463835A (en) 2021-12-25 2021-12-25 Behavior recognition method, electronic device and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN114463835A true CN114463835A (en) 2022-05-10

Family

ID=81407840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111605393.5A Pending CN114463835A (en) 2021-12-25 2021-12-25 Behavior recognition method, electronic device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN114463835A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882533A (en) * 2022-05-30 2022-08-09 北京百度网讯科技有限公司 Examination room abnormal behavior detection method, device, equipment and storage medium


Similar Documents

Publication Publication Date Title
Ibrahim et al. An automatic Arabic sign language recognition system (ArSLRS)
CN107358149B (en) Human body posture detection method and device
US11062123B2 (en) Method, terminal, and storage medium for tracking facial critical area
CN109558832B (en) Human body posture detection method, device, equipment and storage medium
WO2021017606A1 (en) Video processing method and apparatus, and electronic device and storage medium
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN111444828B (en) Model training method, target detection method, device and storage medium
CN111259751B (en) Human behavior recognition method, device, equipment and storage medium based on video
WO2021042547A1 (en) Behavior identification method, device and computer-readable storage medium
CN106709404B (en) Image processing apparatus and image processing method
US9183431B2 (en) Apparatus and method for providing activity recognition based application service
CN108009466B (en) Pedestrian detection method and device
CN110555481A (en) Portrait style identification method and device and computer readable storage medium
CN111104925B (en) Image processing method, image processing apparatus, storage medium, and electronic device
CN111104930B (en) Video processing method, device, electronic equipment and storage medium
CN113095106A (en) Human body posture estimation method and device
CN114402369A (en) Human body posture recognition method and device, storage medium and electronic equipment
CN112036381B (en) Visual tracking method, video monitoring method and terminal equipment
CN110633004A (en) Interaction method, device and system based on human body posture estimation
CN111291612A (en) Pedestrian re-identification method and device based on multi-person multi-camera tracking
Zhou et al. A study on attention-based LSTM for abnormal behavior recognition with variable pooling
US20160140762A1 (en) Image processing device and image processing method
CN114463835A (en) Behavior recognition method, electronic device and computer-readable storage medium
CN112686122A (en) Human body and shadow detection method, device, electronic device and storage medium
CN109241942B (en) Image processing method and device, face recognition equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination