CN113516092A - Method and device for determining target behavior, storage medium and electronic device - Google Patents
- Publication number
- CN113516092A CN113516092A CN202110853411.5A CN202110853411A CN113516092A CN 113516092 A CN113516092 A CN 113516092A CN 202110853411 A CN202110853411 A CN 202110853411A CN 113516092 A CN113516092 A CN 113516092A
- Authority
- CN
- China
- Prior art keywords
- behavior
- target object
- target
- wrist
- head
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses a method and a device for determining a target behavior, a storage medium and an electronic device, wherein the method comprises the following steps: in the process of tracking a target object, performing bone key point detection on a first body frame of the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame; acquiring a wrist detection frame and a head detection frame of the target object, and obtaining second behavior preselection information under the condition that the positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture; and determining whether the target object performs the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the positional relationship between the hand and the chin of the target object. By adopting the above technical solution, the problem that traditional methods cannot accurately identify a teller's chin-resting behavior is solved.
Description
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for determining a target behavior, a storage medium, and an electronic apparatus.
Background
Thanks to in-depth research on and wide application of deep learning technology, artificial intelligence products have developed rapidly. In terms of both reuse rate and degree of intelligence, artificial intelligence products bring convenience to people's work and life in many areas.
In the financial industry, tellers serve customers face to face, and a good service attitude and standardized working behavior improve customer satisfaction and brand reputation. When a bank teller rests the chin on the hand for a long time during work, it conveys an inactive working attitude and reflects poorly on the bank. Supervising and warning teller behavior in real time with an artificial intelligence scheme can, to a great extent, remind tellers about their working behavior and improve their service awareness.
In the related art, illegal operations and behaviors can be judged only by recognizing the teller's voice or by processing captured images, but chin-resting behavior cannot be accurately judged.
Aiming at the problem in the related art that traditional methods cannot accurately identify a teller's chin-resting behavior, no effective solution has been proposed so far.
Accordingly, there is a need for improvement in the related art to overcome the disadvantages of the related art.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining a target behavior, a storage medium and an electronic device, which are used to at least solve the problem that traditional methods cannot accurately identify a teller's chin-resting behavior.
According to an aspect of the embodiments of the present invention, there is provided a method for determining a target behavior, including: in the process of tracking a target object, carrying out bone key point detection on a first body frame of the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame; acquiring a wrist detection frame and a head detection frame of the target object, and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture; and determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relation of the hand and the chin of the target object.
Further, determining whether the target object performs the target behavior according to the first behavior preselection information and the second behavior preselection information includes: acquiring a wrist key point corresponding to the first behavior preselection information and a wrist detection frame corresponding to the second behavior preselection information; determining whether the wrist keypoints are located in the wrist detection frame; determining that the target object performed the target behavior if it is determined that a wrist keypoint is located in the wrist detection box; determining that the target object does not perform the target behavior if it is determined that a wrist keypoint is not located in the wrist detection box.
Further, the method further comprises: under the condition that the target object is determined to execute the target behavior according to the first behavior preselection information and the second behavior preselection information, acquiring an image of the target object at a target moment, wherein the target moment is the moment when the position relation of the wrist detection frame and the head detection frame conforms to a preset posture; inputting the image into a deep learning model to again determine whether the target object performed the target behavior.
Further, determining that the hand and the head of the target object are detected in the first body frame by: acquiring wrist key points, elbow key points and shoulder key points obtained in the process of detecting bone key points of the first body frame; determining whether the wrist keypoints are located within a head-shoulder frame of the target object; under the condition that the wrist key point is located in the head-shoulder frame, acquiring an included angle between a first vector formed by the elbow key point and the wrist key point and a second vector formed by the elbow key point and the shoulder key point; and determining whether the hand and the head of the target object are detected in the first body frame or not according to the size relation between the included angle and a preset included angle.
Further, determining that the hand and the head of the target object are detected in the first body frame according to the size relationship between the included angle and a preset included angle, includes: determining that the hand and the head of the target object are detected in the first body frame under the condition that the included angle is smaller than the preset included angle; determining that the hand and the head of the target object are not detected in the first body frame under the condition that the included angle is larger than the preset included angle.
Further, obtaining the second behavior preselection information under the condition that the positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture includes: obtaining the second behavior preselection information under the condition that the distance between the wrist detection frame and the head detection frame is smaller than a preset distance and the two frames overlap.
Further, after determining whether the target object performs the target behavior according to the first behavior preselection information and the second behavior preselection information, the method further includes: and sending out an alarm event of the target behavior under the condition that the duration of the target object executing the target behavior is determined to be greater than a preset time threshold.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for determining a target behavior, including: the detection module is used for detecting bone key points of a first body frame of a target object in the process of tracking the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame; the acquisition module is used for acquiring a wrist detection frame and a head detection frame of the target object and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture; and the determining module is used for determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relation of the hand and the chin of the target object.
According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the above-mentioned determination method of the target behavior when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the method for determining the target behavior through the computer program.
According to the invention, in the process of tracking the target object, bone key point detection is performed on the target object; under the condition that the hand and the head of the target object are detected, first behavior preselection information of the target behavior is obtained; the positional relationship between a wrist detection frame and a head detection frame is further detected to obtain second behavior preselection information of the target behavior; and whether the target object performs the target behavior is determined according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the positional relationship between the hand and the chin of the target object. By adopting the above technical solution, the problem that traditional methods cannot accurately identify a teller's chin-resting behavior is solved, so that the teller's chin-resting behavior can be accurately detected.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a block diagram of a hardware configuration of a computer terminal of a determination method of a target behavior of an embodiment of the present invention;
FIG. 2 is a flowchart (I) of a method of determining a target behavior according to an embodiment of the present invention;
FIG. 3 is a flowchart (II) of a method of determining a target behavior according to an embodiment of the present invention;
fig. 4 is a block diagram of the structure of a target behavior determination apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method embodiments provided in the embodiments of the present application may be executed in a computer terminal or a similar computing device. Taking the example of the present invention running on a computer terminal, fig. 1 is a block diagram of a hardware structure of the computer terminal of the method for determining a target behavior according to the embodiment of the present invention. As shown in fig. 1, the computer terminal may include one or more processors 102 (only one is shown in fig. 1), wherein the processors 102 may include, but are not limited to, a Microprocessor (MPU) or a Programmable Logic Device (PLD), and a memory 104 for storing data, and in an exemplary embodiment, the computer terminal may further include a transmission device 106 for communication function and an input/output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the computer terminal. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration with equivalent functionality to that shown in FIG. 1 or with more functionality than that shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the method for determining the target behavior in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In this embodiment, a method for determining a target behavior is provided, and fig. 2 is a flowchart (I) of the method for determining a target behavior according to an embodiment of the present invention, where the flow includes the following steps:
step S202, in the process of tracking a target object, carrying out bone key point detection on a first body frame of the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame;
it should be noted that the target object in the present embodiment may be understood as an animal having a brain and arms, such as a human.
Step S204, acquiring a wrist detection frame and a head detection frame of the target object, and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture;
step S206, determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relation of the hand and the chin of the target object.
Through the above steps, in the process of tracking the target object, bone key point detection is performed on the target object; first behavior preselection information is obtained under the condition that the hand and the head of the target object are detected; the positional relationship between the wrist detection frame and the head detection frame is further detected to obtain second behavior preselection information; and whether the target object performs the target behavior is determined according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior indicates a positional relationship between the hand and the chin of the target object. By adopting the above technical solution, the problem that traditional methods cannot accurately identify a teller's chin-resting behavior is solved, so that the teller's chin-resting behavior can be accurately detected.
It should be noted that the target behavior in the embodiment of the present invention includes a behavior determined based on the positional relationship between the hand and the chin of the target object, and may be, for example, the behavior of supporting the chin with the hand, specifically including supporting the chin with one hand and supporting the chin with both hands; the embodiment of the present invention is not limited thereto.
Optionally, in order to better understand how the hands and the head of the target object are detected in the first body frame in step S202, the following technical solutions may be implemented: acquiring wrist key points, elbow key points and shoulder key points obtained in the process of detecting bone key points of the first body frame; determining whether the wrist keypoints are located within a head-shoulder frame of the target object; under the condition that the wrist key point is located in the head-shoulder frame, acquiring an included angle between a first vector formed by the elbow key point and the wrist key point and a second vector formed by the elbow key point and the shoulder key point; and determining whether the hand and the head of the target object are detected in the first body frame or not according to the size relation between the included angle and a preset included angle.
That is to say, the positions of the bone key points corresponding to the wrist, the elbow and the shoulder in the first body frame need to be acquired; then it is judged whether the wrist key point lies within the detected head-shoulder frame; if so, the included angle between a first vector formed by the elbow key point and the wrist key point and a second vector formed by the elbow key point and the shoulder key point is compared with a preset included angle, and whether the hand and the head of the target object are detected in the first body frame is determined according to the magnitude relationship between that included angle and the preset included angle.
Specifically, if an included angle formed by the first vector and the second vector is smaller than a preset included angle, it is determined that the hand and the head of the target object are detected in the first body frame; determining that the hand and the head of the target object are not detected in the first body frame if an angle formed by the first vector and the second vector is greater than a preset angle.
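The arm-angle test described in the two preceding paragraphs can be sketched in Python as follows (an illustrative sketch only: the function names, the 2-D pixel-coordinate convention, and the 60-degree default threshold are assumptions, since the patent does not disclose the concrete value of the preset included angle):

```python
import math

def included_angle(elbow, wrist, shoulder):
    """Angle in degrees between the elbow->wrist (first) vector and the
    elbow->shoulder (second) vector, for 2-D keypoints given as (x, y)."""
    v1 = (wrist[0] - elbow[0], wrist[1] - elbow[1])
    v2 = (shoulder[0] - elbow[0], shoulder[1] - elbow[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 180.0  # degenerate keypoints: treat as "arm not bent"
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_a = max(-1.0, min(1.0, (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)))
    return math.degrees(math.acos(cos_a))

def is_chin_rest_candidate(elbow, wrist, shoulder, angle_threshold=60.0):
    """First preselection: a sharply bent arm (included angle below the
    preset threshold) suggests the hand is raised toward the head.
    The 60-degree default is an assumed placeholder, not a patent value."""
    return included_angle(elbow, wrist, shoulder) < angle_threshold
```

A straight arm (angle near 180 degrees) fails the test, while a tightly folded arm passes, matching the patent's rule that an angle smaller than the preset angle yields the first behavior preselection information.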
In order to better understand the step S204, optionally, in a case that the positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture, the second behavior preselection information is obtained, specifically, the second behavior preselection information is obtained by: and obtaining the second behavior preselection information under the condition that the positions of the wrist detection frame and the head detection frame are smaller than a preset distance and the wrist detection frame and the head detection frame are overlapped.
It should be noted that, if the target object is about to perform the target behavior, the distance between the wrist and the head of the target object gradually decreases; that is, as soon as it is detected that the distance between the wrist detection frame and the head detection frame is smaller than the preset distance and the two frames overlap, it can be determined that the target object tends to perform the target behavior, and the second behavior preselection information is obtained at this time.
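A minimal sketch of this proximity-and-overlap check (the `(x1, y1, x2, y2)` box format, the center-distance metric, and the 80-pixel default are illustrative assumptions; the patent only requires the frames to be closer than a preset distance and overlapping):

```python
def boxes_overlap(a, b):
    """True when two axis-aligned boxes (x1, y1, x2, y2) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def center_distance(a, b):
    """Euclidean distance between the centers of two boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def is_second_preselection(wrist_box, head_box, max_dist=80.0):
    """Second preselection: the wrist detection frame is both close to
    and overlapping the head detection frame (thresholds are assumed)."""
    return (center_distance(wrist_box, head_box) < max_dist
            and boxes_overlap(wrist_box, head_box))
```

Both conditions are required: two frames can overlap slightly while their centers remain far apart, and vice versa, so checking only one would weaken the preselection.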
Further, there are various execution manners of the step S206, and in an alternative embodiment, the following manners are implemented: acquiring a wrist key point corresponding to the first behavior preselection information and a wrist detection frame corresponding to the second behavior preselection information; determining whether the wrist keypoints are located in the wrist detection frame; determining that the target object performed the target behavior if it is determined that a wrist keypoint is located in the wrist detection box; determining that the target object does not perform the target behavior if it is determined that a wrist keypoint is not located in the wrist detection box.
That is, if the first behavior preselection information and the second behavior preselection information of the target object are obtained at the same time, whether the target object performs the target behavior needs to be judged based on both. The first behavior preselection information contains the hand information of the target object, namely the wrist key point, and the second behavior preselection information contains the information of the wrist detection frame; if the wrist key point is determined to lie within the wrist detection frame according to the two pieces of preselection information, it can be judged that the target object performs the target behavior, otherwise it does not.
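The fusion step above reduces to a point-in-box test; a hedged sketch (function names are assumptions, boxes again use the `(x1, y1, x2, y2)` convention):

```python
def point_in_box(pt, box):
    """True when keypoint (x, y) lies inside box (x1, y1, x2, y2)."""
    x, y = pt
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]

def confirm_target_behavior(wrist_keypoint, wrist_box):
    """Fuse the two preselections: the skeleton-branch wrist keypoint
    (first preselection) must fall inside the wrist detection frame
    from the detector branch (second preselection)."""
    return point_in_box(wrist_keypoint, wrist_box)
```

Requiring the two independent branches (skeleton keypoints and wrist detection) to agree on the same image location is what filters out the single-branch false positives each method would produce alone.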
It should be noted that, if it is determined in step S206 that the target object has executed the target behavior according to the first behavior preselection information and the second behavior preselection information, an image of the target object at a target time is obtained, where the target time is a time when a positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture; inputting the image into a deep learning model to again determine whether the target object performed the target behavior.
That is, determining that the target object performs the target behavior based only on the first behavior preselection information and the second behavior preselection information may involve a certain error, that is, a certain probability of misjudgment. In order to determine more accurately whether the target object performs the target behavior, an image of the target object may be captured at the moment when the positional relationship between the wrist detection frame and the head detection frame conforms to the preset posture, and the image is input into the deep learning model for recognition; if the deep learning model also determines that the target object performs the target behavior, it is finally determined that the target object performs the target behavior.
Further, if it is determined in step S206 that the target object performs the target behavior, it is further judged whether the duration of the target object performing the target behavior is greater than a preset time threshold, and an alarm event of the target behavior is issued to remind the target object when the duration is greater than the preset time threshold.
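The duration check can be sketched as a small per-frame accumulator (an assumption-laden illustration: the class and method names are invented here, timing is derived from an assumed frame rate, and the patent does not specify how the duration is measured):

```python
class BehaviorTimer:
    """Accumulates how long the target behavior has persisted across
    video frames and fires an alarm once it exceeds a time threshold."""

    def __init__(self, threshold_s, fps=25.0):
        self.threshold_s = threshold_s   # preset time threshold, seconds
        self.frame_dt = 1.0 / fps        # assumed fixed frame interval
        self.elapsed = 0.0
        self.alarmed = False

    def update(self, behavior_detected):
        """Call once per frame; returns True only on the frame where the
        alarm first fires, so the alarm event is raised exactly once."""
        if not behavior_detected:
            self.elapsed = 0.0           # behavior interrupted: reset
            self.alarmed = False
            return False
        self.elapsed += self.frame_dt
        if self.elapsed > self.threshold_s and not self.alarmed:
            self.alarmed = True
            return True
        return False
```

Resetting the accumulator when the behavior is interrupted ensures that only a continuous chin-resting episode, not scattered short ones, triggers the alarm.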
It is to be understood that the above-described embodiments are only a few, but not all, embodiments of the present invention. In order to better understand the method for determining the target behavior, the following describes the above process with reference to an embodiment, but the method is not limited to the technical solution of the embodiment of the present invention, and specifically:
in an alternative embodiment, fig. 3 is a flowchart (II) of a method for determining a target behavior according to an embodiment of the present invention, and the specific steps are as follows:
step S302: inputting an image of a teller (corresponding to the target object in the above-described embodiment);
step S304: carrying out target detection on a human body;
step S306: tracking a target of a human body, and then respectively executing the step S308 and the step S312;
step S308: detecting the wrist and the palm of the human body, and then executing the step S310;
step S310: judging whether the secondary pre-alarm logic is met, if so, executing step S316, and if not, executing step S324;
step S312: detecting the human body joint point, and then executing step S314;
step S314: judging whether a one-time pre-alarm logic is met, if so, executing step S316, and if not, executing step S324;
step S316: judging whether the final pre-alarm logic is satisfied according to the results of the step S310 and the step S314, if so, executing the step S318, and if not, executing the step S324;
step S318: discriminating an image input model (corresponding to the deep learning model in the above embodiment);
step S320: judging whether a preset time threshold is met, if so, executing step S322, and if not, executing step S324;
step S322: alarming the target behavior;
step S324: outputting the result.
In order to better understand the above flow, in an alternative embodiment, the target behavior may also be determined by the following steps, specifically:
step 1: setting a behavior time threshold (corresponding to the preset time threshold in the above embodiment) for judging that the chin-resting behavior (corresponding to the target behavior in the above embodiment) violates the regulation;
step 2: detecting target frames such as the human body, the head, and the shoulders in the image through a deep learning target detection algorithm, and forming a teller tracking trajectory through multi-target association tracking;
step 3: detecting bone key points on the basis of the obtained target body frame to obtain the posture information of the human body;
step 4: computing hand-head relation information on the basis of the acquired bone key points, thereby obtaining the first chin-resting behavior preselection information (equivalent to the first behavior preselection information in the above embodiment);
step 5: performing wrist detection on the basis of the target body frame;
step 6: on the basis of the obtained wrist detection frame, obtaining the second chin-resting behavior preselection information (equivalent to the second behavior preselection information in the above embodiment) through the relationship between the wrist detection frame and the target head detection frame;
step 7: comprehensively processing the two pieces of preselection information from step 4 and step 6 to form the final chin-resting behavior preselection information;
step 8: applying a deep learning recognition technique to the result of step 7 to finally confirm whether the chin-resting behavior occurs;
step 9: if the duration of the result of step 8 is greater than the preset time threshold, finally raising an alarm for the chin-resting violation.
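The temporal filtering in step 9 can be sketched as a small accumulator that raises an alarm only after the confirmed behavior has persisted beyond the threshold. This is a minimal illustration; the class name, the 3-second default, and the timestamp-based interface are assumptions, since the patent does not specify an implementation.

```python
class DurationAlarm:
    """Raise an alarm only after the confirmed behavior has lasted
    longer than the preset time threshold (step 9 of the flow).
    The 3.0-second default is illustrative, not from the patent."""

    def __init__(self, threshold_s=3.0):
        self.threshold_s = threshold_s
        self.start = None  # timestamp when the behavior first appeared

    def update(self, behavior_confirmed, now):
        """Feed one per-frame recognition result; return True to alarm."""
        if not behavior_confirmed:
            self.start = None          # behavior interrupted: reset the timer
            return False
        if self.start is None:
            self.start = now           # behavior just began
        return (now - self.start) > self.threshold_s
```

A short run shows the reset behavior: intermittent detections never accumulate enough duration to trigger the alarm.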
In step 4, the main sub-steps are as follows:
step (1): acquiring the positions of the bone key points corresponding to the wrist, the elbow, and the shoulder; judging whether the wrist key point is within the head-shoulder frame of the detection target, if so, executing step (2), otherwise, directly returning;
step (2): calculating the included angle between the vector from the elbow to the wrist (equivalent to the first vector in the above embodiment) and the vector from the elbow to the shoulder (equivalent to the second vector in the above embodiment); if the included angle between the two vectors is smaller than a threshold (equivalent to the preset included angle in the above embodiment), the case is regarded as the first chin-resting preselection information, otherwise, directly returning.
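The two sub-steps of step 4 can be sketched as follows. The `(x, y)` point format, the `(x1, y1, x2, y2)` box format, and the 45-degree angle threshold are illustrative assumptions; the patent does not specify concrete values.

```python
import math

def chin_rest_preselect(wrist, elbow, shoulder, head_shoulder_box,
                        max_angle_deg=45.0):
    """First-stage preselection from skeleton keypoints (a sketch).
    Points are (x, y); head_shoulder_box is (x1, y1, x2, y2)."""
    x, y = wrist
    x1, y1, x2, y2 = head_shoulder_box
    # Sub-step (1): the wrist keypoint must fall inside the head-shoulder frame.
    if not (x1 <= x <= x2 and y1 <= y <= y2):
        return False
    # Sub-step (2): included angle between elbow->wrist and elbow->shoulder.
    v1 = (wrist[0] - elbow[0], wrist[1] - elbow[1])
    v2 = (shoulder[0] - elbow[0], shoulder[1] - elbow[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return False  # degenerate pose, no preselection
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle < max_angle_deg
```

A small included angle means the forearm folds back toward the shoulder, which is the characteristic geometry of a hand propping up the chin.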
In step 6, the main sub-steps are as follows:
step (1): acquiring the target wrist detection frame and head detection frame;
step (2): judging the positional relationship between the wrist detection frame and the head detection frame; if the two detection frames are close to each other and overlap, the case is regarded as the second chin-resting preselection information, otherwise, directly returning.
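The box-relation test of step 6 can be sketched with axis-aligned boxes. The "close" criterion is modeled here as a center-distance threshold; the 80-pixel default is an illustrative assumption, as the patent leaves the distance measure unspecified.

```python
import math

def boxes_overlap(a, b):
    """True if axis-aligned boxes (x1, y1, x2, y2) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def center_distance(a, b):
    """Euclidean distance between box centers."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ax - bx, ay - by)

def second_preselect(wrist_box, head_box, max_dist=80.0):
    # Sub-step (2): the frames must be both close and overlapping.
    return (boxes_overlap(wrist_box, head_box)
            and center_distance(wrist_box, head_box) < max_dist)
```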
In step 7, the main sub-steps are as follows:
step (1): acquiring the first chin-resting preselection information from step 4 and the second chin-resting preselection information from step 6; if both pieces of preselection information exist, executing step (2), otherwise, directly returning;
step (2): judging whether the wrist key point in the preselection information from step 4 is within the wrist detection frame in the preselection information from step 6; if so, outputting the final chin-resting pre-warning information, otherwise, directly returning.
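The fusion of step 7 reduces to a point-in-box test gated on both preselections existing. A minimal sketch, with the same assumed point and box formats as above:

```python
def fuse_preselections(wrist_keypoint, wrist_box, have_first, have_second):
    """Final fusion (step 7): both preselections must exist, and the
    skeleton wrist keypoint from step 4 must lie inside the wrist
    detection frame from step 6."""
    if not (have_first and have_second):
        return False  # sub-step (1): one preselection missing, return directly
    x, y = wrist_keypoint
    x1, y1, x2, y2 = wrist_box
    # Sub-step (2): point-in-box check yields the final pre-warning.
    return x1 <= x <= x2 and y1 <= y <= y2
```

Requiring the two independent cues to agree on the same wrist is what suppresses false positives from either detector alone.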
In step 8, the main sub-steps are as follows:
step (1): acquiring a large number of preselection result pictures from step 7 and labeling them manually; training a corresponding recognition model (corresponding to the deep learning model in the above embodiment);
step (2): inputting the result of step 7 into the recognition model, and finally outputting the recognition result of the model.
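Sub-step (2) of step 8 can be sketched as follows. The recognizer is represented as a plain callable returning a chin-resting probability, and the 0.5 score threshold is an illustrative assumption; the patent does not specify the model interface.

```python
def final_confirmation(image_crop, recognizer, score_threshold=0.5):
    """Step 8 sketch: feed the preselected image region to the trained
    recognition model. `recognizer` is assumed to map an image crop to a
    probability that it shows chin-resting behavior."""
    return recognizer(image_crop) > score_threshold
```

In practice `recognizer` would wrap an image classifier trained on the manually labeled preselection pictures of sub-step (1).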
In addition, the technical scheme of the embodiment of the present invention fuses multiple technologies, such as human body target detection, joint point detection, trajectory judgment, wrist detection, and target pattern recognition, to judge whether a teller performs the illegal chin-resting action. In the fusion of the key points, the positional and angular relationships of the detection results are used to complete the alarm preselection; finally, alarm filtering is performed by the recognition module, which reduces the possibility of false alarms, and meanwhile, false alarms caused by detection errors of the recognition module in certain states are eliminated through time-domain accumulation.
It should be noted that, in this embodiment, human body joints are used for the chin-resting action preselection, so the accuracy of the action judgment is high; combining the wrist detection technology with the joint point technology solves the problem that the joint point model performs poorly in certain scenarios; using the recognition model for the final confirmation of the alarm further enhances the accuracy of the action judgment; and adopting time-domain accumulation ensures the alarm accuracy.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device for determining a target behavior is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, which have already been described and are not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware or a combination of software and hardware are also possible and contemplated.
Fig. 4 is a block diagram of a structure of a target behavior determination apparatus according to an embodiment of the present invention, the apparatus including:
a detection module 42, configured to perform bone key point detection on a first body frame of a target object in a process of performing trajectory tracking on the target object, and obtain first behavior preselection information when a hand and a head of the target object are detected in the first body frame;
an obtaining module 44, configured to obtain a wrist detection frame and a head detection frame of the target object, and obtain second behavior preselection information when a position relationship between the wrist detection frame and the head detection frame matches a preset posture;
a determining module 46, configured to determine whether the target object performs a target behavior according to the first behavior preselection information and the second behavior preselection information, where the target behavior includes a behavior determined based on a position relationship between a hand and a chin of the target object.
Through the modules, in the process of tracking the target object, bone key point detection is carried out on the target object, first behavior preselection information is obtained under the condition that the hand and the head of the target object are detected, the position relation of a wrist detection frame and a head detection frame is further detected, second behavior preselection information of the target behavior is obtained, and whether the target object executes the target behavior or not is determined according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior is used for indicating the position relation of the hand and the chin of the target object. By adopting the technical scheme, the problem that the behavior of the teller under the chin cannot be accurately identified by the traditional method is solved. Therefore, the behavior of the lower jaw of the teller can be accurately detected.
It should be noted that the target behavior in the embodiment of the present invention includes a behavior determined based on the positional relationship between the hand and the chin of the target object; it may be, for example, the behavior of resting the chin on a hand, specifically including resting the chin on one hand or on both hands, which is not limited in the embodiment of the present invention.
Optionally, the determining module 46 is further configured to obtain a wrist key point corresponding to the first behavior preselection information and a wrist detection frame corresponding to the second behavior preselection information; determining whether the wrist keypoints are located in the wrist detection frame; determining that the target object performed the target behavior if it is determined that a wrist keypoint is located in the wrist detection box; determining that the target object does not perform the target behavior if it is determined that a wrist keypoint is not located in the wrist detection box.
Optionally, the determining module 46 is further configured to, when it is determined that the target object has executed the target behavior according to the first behavior preselection information and the second behavior preselection information, obtain an image of the target object at a target time, where the target time is a time when a positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture; inputting the image into a deep learning model to again determine whether the target object performed the target behavior.
Optionally, the detecting module 42 is further configured to obtain a wrist key point, an elbow key point, and a shoulder key point obtained in a process of detecting a bone key point of the first body frame; determining whether the wrist keypoints are located within a head-shoulder frame of the target object; under the condition that the wrist key point is located in the head-shoulder frame, acquiring an included angle between a first vector formed by the elbow key point and the wrist key point and a second vector formed by the elbow key point and the shoulder key point; and determining whether the hand and the head of the target object are detected in the first body frame or not according to the size relation between the included angle and a preset included angle.
Optionally, the detecting module 42 is further configured to determine that the hand and the head of the target object are detected in the first body frame when the included angle is smaller than the preset included angle; determining that the hand and the head of the target object are not detected in the first body frame under the condition that the included angle is larger than the preset included angle.
Optionally, the obtaining module 44 is further configured to obtain the second behavior preselection information when the distance between the wrist detection frame and the head detection frame is smaller than a preset distance and the two detection frames overlap.
Optionally, the determining module 46 is further configured to send an alarm event of the target behavior when determining that the duration of the target object executing the target behavior is greater than a preset time threshold.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, in the process of tracking a target object, detecting bone key points of a first body frame of the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame;
s2, acquiring a wrist detection frame and a head detection frame of the target object, and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture;
and S3, determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relationship of the hand and the chin of the target object.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, in the process of tracking a target object, detecting bone key points of a first body frame of the target object, and obtaining first behavior preselection information of a target behavior under the condition that a hand and a head of the target object are detected in the first body frame;
s2, acquiring a wrist detection frame and a head detection frame of the target object, and acquiring second behavior preselection information of the target behavior under the condition that the position relation between the wrist detection frame and the head detection frame accords with a preset posture;
and S3, determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relationship of the hand and the chin of the target object.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for determining a target behavior, comprising:
in the process of tracking a target object, carrying out bone key point detection on a first body frame of the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame;
acquiring a wrist detection frame and a head detection frame of the target object, and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture;
and determining whether the target object executes the target behavior according to the first behavior preselection information and the second behavior preselection information, wherein the target behavior comprises a behavior determined based on the position relation of the hand and the chin of the target object.
2. The method for determining a target behavior according to claim 1, wherein determining whether the target object performs the target behavior according to the first behavior preselection information and the second behavior preselection information includes:
acquiring a wrist key point corresponding to the first behavior preselection information and a wrist detection frame corresponding to the second behavior preselection information;
determining whether the wrist keypoints are located in the wrist detection frame;
determining that the target object performed the target behavior if it is determined that a wrist keypoint is located in the wrist detection box;
determining that the target object does not perform the target behavior if it is determined that a wrist keypoint is not located in the wrist detection box.
3. The method of determining target behavior of claim 1, further comprising:
under the condition that the target object is determined to execute the target behavior according to the first behavior preselection information and the second behavior preselection information, acquiring an image of the target object at a target moment, wherein the target moment is the moment when the position relation of the wrist detection frame and the head detection frame conforms to a preset posture;
inputting the image into a deep learning model to again determine whether the target object performed the target behavior.
4. The method of determining target behavior of claim 1, wherein determining that the hand and head of the target object are detected in the first body frame comprises:
acquiring wrist key points, elbow key points and shoulder key points obtained in the process of detecting bone key points of the first body frame;
determining whether the wrist keypoints are located within a head-shoulder frame of the target object;
under the condition that the wrist key point is located in the head-shoulder frame, acquiring an included angle between a first vector formed by the elbow key point and the wrist key point and a second vector formed by the elbow key point and the shoulder key point;
and determining whether the hand and the head of the target object are detected in the first body frame or not according to the size relation between the included angle and a preset included angle.
5. The method for determining target behavior according to claim 4, wherein determining that the hand and the head of the target object are detected in the first body frame according to a magnitude relationship between the included angle and a preset included angle comprises:
determining that the hand and the head of the target object are detected in the first body frame under the condition that the included angle is smaller than the preset included angle;
determining that the hand and the head of the target object are not detected in the first body frame under the condition that the included angle is larger than the preset included angle.
6. The method for determining a target behavior according to claim 1, wherein obtaining second behavior preselection information in a case where a positional relationship between the wrist detection frame and the head detection frame conforms to a preset posture includes:
and obtaining the second behavior preselection information under the condition that the distance between the wrist detection frame and the head detection frame is smaller than a preset distance and the two detection frames overlap.
7. The method of determining a target behavior according to claim 1, wherein after determining whether the target object has performed the target behavior based on the first behavior preselection information and the second behavior preselection information, the method further comprises:
and sending out an alarm event of the target behavior under the condition that the duration of the target object executing the target behavior is determined to be greater than a preset time threshold.
8. An apparatus for determining a target behavior, comprising:
the detection module is used for detecting bone key points of a first body frame of a target object in the process of tracking the target object, and obtaining first behavior preselection information under the condition that a hand and a head of the target object are detected in the first body frame;
the acquisition module is used for acquiring a wrist detection frame and a head detection frame of the target object and acquiring second behavior preselection information under the condition that the position relation of the wrist detection frame and the head detection frame accords with a preset posture;
a determining module, configured to determine whether the target object performs a target behavior according to the first behavior preselection information and the second behavior preselection information, where the target behavior includes a behavior determined based on a positional relationship between a hand and a chin of the target object.
9. A computer-readable storage medium, comprising a stored program, wherein the program is operable to perform the method of any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110853411.5A CN113516092A (en) | 2021-07-27 | 2021-07-27 | Method and device for determining target behavior, storage medium and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110853411.5A CN113516092A (en) | 2021-07-27 | 2021-07-27 | Method and device for determining target behavior, storage medium and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113516092A true CN113516092A (en) | 2021-10-19 |
Family
ID=78068772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110853411.5A Pending CN113516092A (en) | 2021-07-27 | 2021-07-27 | Method and device for determining target behavior, storage medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113516092A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114187666A (en) * | 2021-12-23 | 2022-03-15 | 中海油信息科技有限公司 | Identification method and system for watching mobile phone while walking |
CN117218678A (en) * | 2023-08-11 | 2023-12-12 | 浙江深象智能科技有限公司 | Behavior detection method and device and electronic equipment |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086729A (en) * | 2018-08-13 | 2018-12-25 | 成都盯盯科技有限公司 | Communication behavior detection method, device, equipment and storage medium |
CN110008818A (en) * | 2019-01-29 | 2019-07-12 | 北京奇艺世纪科技有限公司 | A kind of gesture identification method, device and computer readable storage medium |
CN111062239A (en) * | 2019-10-15 | 2020-04-24 | 平安科技(深圳)有限公司 | Human body target detection method and device, computer equipment and storage medium |
CN111161320A (en) * | 2019-12-30 | 2020-05-15 | 浙江大华技术股份有限公司 | Target tracking method, target tracking device and computer readable medium |
CN111275002A (en) * | 2020-02-18 | 2020-06-12 | 上海商汤临港智能科技有限公司 | Image processing method and device and electronic equipment |
WO2020181872A1 (en) * | 2019-03-12 | 2020-09-17 | 北京旷视科技有限公司 | Object detection method and apparatus, and electronic device |
WO2020215552A1 (en) * | 2019-04-26 | 2020-10-29 | 平安科技(深圳)有限公司 | Multi-target tracking method, apparatus, computer device, and storage medium |
US20210034868A1 (en) * | 2019-07-31 | 2021-02-04 | Baidu Usa Llc | Method and apparatus for determining a target object, and human-computer interaction system |
CN112528850A (en) * | 2020-12-11 | 2021-03-19 | 北京百度网讯科技有限公司 | Human body recognition method, device, equipment and storage medium |
CN112560646A (en) * | 2020-12-09 | 2021-03-26 | 上海眼控科技股份有限公司 | Detection method, device, equipment and storage medium of transaction behavior |
US20210124943A1 (en) * | 2019-10-25 | 2021-04-29 | 7-Eleven, Inc. | Event trigger based on region-of-interest near hand-shelf interaction |
CN112766091A (en) * | 2021-01-05 | 2021-05-07 | 中科院成都信息技术股份有限公司 | Video unsafe behavior recognition system and method based on human skeleton key points |
WO2021093329A1 (en) * | 2019-11-12 | 2021-05-20 | 苏宁易购集团股份有限公司 | Interactive behavior identification method and apparatus, computer device and storage medium |
-
2021
- 2021-07-27 CN CN202110853411.5A patent/CN113516092A/en active Pending
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086729A (en) * | 2018-08-13 | 2018-12-25 | 成都盯盯科技有限公司 | Communication behavior detection method, device, equipment and storage medium |
CN110008818A (en) * | 2019-01-29 | 2019-07-12 | 北京奇艺世纪科技有限公司 | A kind of gesture identification method, device and computer readable storage medium |
WO2020181872A1 (en) * | 2019-03-12 | 2020-09-17 | 北京旷视科技有限公司 | Object detection method and apparatus, and electronic device |
WO2020215552A1 (en) * | 2019-04-26 | 2020-10-29 | 平安科技(深圳)有限公司 | Multi-target tracking method, apparatus, computer device, and storage medium |
US20210034868A1 (en) * | 2019-07-31 | 2021-02-04 | Baidu Usa Llc | Method and apparatus for determining a target object, and human-computer interaction system |
CN111062239A (en) * | 2019-10-15 | 2020-04-24 | 平安科技(深圳)有限公司 | Human body target detection method and device, computer equipment and storage medium |
US20210124943A1 (en) * | 2019-10-25 | 2021-04-29 | 7-Eleven, Inc. | Event trigger based on region-of-interest near hand-shelf interaction |
WO2021093329A1 (en) * | 2019-11-12 | 2021-05-20 | 苏宁易购集团股份有限公司 | Interactive behavior identification method and apparatus, computer device and storage medium |
CN111161320A (en) * | 2019-12-30 | 2020-05-15 | 浙江大华技术股份有限公司 | Target tracking method, target tracking device and computer readable medium |
CN111275002A (en) * | 2020-02-18 | 2020-06-12 | 上海商汤临港智能科技有限公司 | Image processing method and device and electronic equipment |
CN112560646A (en) * | 2020-12-09 | 2021-03-26 | 上海眼控科技股份有限公司 | Detection method, device, equipment and storage medium of transaction behavior |
CN112528850A (en) * | 2020-12-11 | 2021-03-19 | 北京百度网讯科技有限公司 | Human body recognition method, device, equipment and storage medium |
CN112766091A (en) * | 2021-01-05 | 2021-05-07 | 中科院成都信息技术股份有限公司 | Video unsafe behavior recognition system and method based on human skeleton key points |
Non-Patent Citations (1)
Title |
---|
YAO Weiwei; ZHANG Jie: "Railway Driver Behavior Recognition Fusing Target Detection and Human Key Point Detection", Computer Measurement & Control, no. 06, 25 June 2020 (2020-06-25) *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114187666A (en) * | 2021-12-23 | 2022-03-15 | 中海油信息科技有限公司 | Identification method and system for watching mobile phone while walking |
CN114187666B (en) * | 2021-12-23 | 2022-09-02 | 中海油信息科技有限公司 | Identification method and system for watching mobile phone while walking |
CN117218678A (en) * | 2023-08-11 | 2023-12-12 | 浙江深象智能科技有限公司 | Behavior detection method and device and electronic equipment |
CN117218678B (en) * | 2023-08-11 | 2024-07-30 | 浙江深象智能科技有限公司 | Behavior detection method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113516092A (en) | Method and device for determining target behavior, storage medium and electronic device | |
CN111079699A (en) | Commodity identification method and device | |
CN113366487A (en) | Operation determination method and device based on expression group and electronic equipment | |
CN108304819B (en) | Gesture recognition system and method, and storage medium | |
CN104850846A (en) | Human behavior recognition method and human behavior recognition system based on depth neural network | |
CN112299172A (en) | Gesture help-seeking recognition method and device and storage medium | |
CN110879995A (en) | Target object detection method and device, storage medium and electronic device | |
CN108597176A (en) | Wearable device based safety early warning method and wearable device | |
CN108509890A (en) | Method and apparatus for extracting information | |
CN108614987A (en) | The method, apparatus and robot of data processing | |
CN104715226A (en) | Communication connection building method and device | |
CN107888715A (en) | A kind of binding method, device and the equipment of MAC Address and face characteristic | |
CN111881740A (en) | Face recognition method, face recognition device, electronic equipment and medium | |
CN114299546A (en) | Method and device for identifying pet identity, storage medium and electronic equipment | |
CN113537122A (en) | Motion recognition method and device, storage medium and electronic equipment | |
CN105403221B (en) | The generation method and mobile terminal of a kind of navigation way | |
CN106030442A (en) | Interaction device selecting method and apparatus | |
CN108466263A (en) | A kind of robot control method and device | |
CN107195163A (en) | Alarm method and device and wearable device | |
CN112149527A (en) | Wearable device detection method and device, electronic device and storage medium | |
CN115223196A (en) | Gesture recognition method, electronic device, and computer-readable storage medium | |
CN113469132B (en) | Illegal behavior detection method and device, electronic equipment and storage medium | |
CN113313909A (en) | Data processing method and device of intelligent glasses and intelligent glasses | |
CN111107139B (en) | Information pushing method, device, equipment and storage medium | |
CN113723355A (en) | Target monitoring method and device, storage medium and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |