CN114529874A - Behavior detection method, device and equipment and readable storage medium


Info

Publication number
CN114529874A
Authority
CN
China
Prior art keywords
frame
image
preset
identification
recognition
Legal status
Pending
Application number
CN202210262901.2A
Other languages
Chinese (zh)
Inventor
闾凡兵
王晓晶
王勇
李沁
Current Assignee
Changsha Hisense Intelligent System Research Institute Co ltd
Original Assignee
Changsha Hisense Intelligent System Research Institute Co ltd
Application filed by Changsha Hisense Intelligent System Research Institute Co ltd
Priority to CN202210262901.2A
Publication of CN114529874A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques


Abstract

The application discloses a behavior detection method, apparatus, device, and readable storage medium, belonging to the field of computer vision. The behavior detection method comprises: acquiring N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle; identifying, as object matching pairs, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image; inputting each frame of image into a preset pose classification model, determining the pose information of each object matching pair in each frame of image, and determining the pose classification result of each frame of image according to the pose information of each object matching pair; and when the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold, determining that a target behavior occurs within the first preset duration. According to the embodiments of the application, the detection efficiency of barrier object-passing behavior can be effectively improved, and the method can be effectively adapted to different detection scenes.

Description

Behavior detection method, device and equipment and readable storage medium
Technical Field
The present application relates to the field of computer vision, and in particular, to a behavior detection method, apparatus, device, and readable storage medium.
Background
A fence is a common public security facility that can physically isolate spaces. However, when a fence is not fully enclosed, pedestrians can easily pass objects through gaps in the fence; that is, barrier object-passing behavior occurs.
In scenes that require such facilities, fences are generally widely distributed. At present, barrier object-passing behavior can be judged through video analysis technology, but the detection and identification of this behavior still suffer from low detection efficiency and poor scene adaptability.
Disclosure of Invention
The embodiments of the application provide a behavior detection method, apparatus, device, and readable storage medium, which can effectively improve the detection efficiency of barrier object-passing behavior and can be effectively adapted to different detection scenes.
In a first aspect, an embodiment of the present application provides a behavior detection method, including:
acquiring N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle;
identifying, as object matching pairs, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image;
inputting each frame of image into a preset pose classification model, determining the pose information of each object matching pair in each frame of image, and determining the pose classification result of each frame of image according to the pose information of each object matching pair;
and when the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold, determining that a target behavior occurs within the first preset duration.
In some implementations of the first aspect, acquiring N frames of images within the first preset duration includes:
acquiring N frames of images within the first preset duration, wherein each frame of image comprises a second object located on either side of the target obstacle;
determining the position information of the second object in each frame of image according to a preset recognition algorithm;
and, for each frame of image, determining, from the second objects, a first object that satisfies a preset position screening condition according to the position information of the second objects, wherein the preset position screening condition is that the distance between the second object and the target obstacle is smaller than a preset distance threshold.
In some implementations of the first aspect, the preset identification area includes a first identification area located on one side of the target obstacle in each frame of image and a second identification area located on the other side of the target obstacle; and identifying, as object matching pairs, first objects in the preset identification areas located on the two sides of the target obstacle in each frame of image includes the following step:
for each frame of image, matching each first object located in the first identification area with each first object located in the second identification area to obtain the object matching pairs included in each frame of image.
In some implementations of the first aspect, for each frame of image, matching each first object located in the first identification area with each first object located in the second identification area to obtain the object matching pairs included in each frame of image includes:
for each frame of image, acquiring a first identification frame corresponding to each first object located in the first identification area, and acquiring a second identification frame corresponding to each first object located in the second identification area;
extending a target border of the first identification frame by a preset length along a first direction to obtain a third identification frame, wherein the first direction is the direction pointing from the first identification area to the second identification area;
and, when the positional relationship between the third identification frame and the second identification frame satisfies a preset condition, matching the first object in the third identification frame with the first object in the second identification frame to obtain the object matching pairs included in each frame of image.
In some implementations of the first aspect, when the positional relationship between the third identification frame and the second identification frame satisfies the preset condition, matching the first object in the third identification frame with the first object in the second identification frame to obtain the object matching pairs included in each frame of image includes:
determining the overlapping area of each third identification frame and each second identification frame according to their positional relationship;
and, for each frame of image, when the overlapping area of a third identification frame and a second identification frame satisfies a preset area condition, determining that the first object in the third identification frame and the first object in the second identification frame match as an object matching pair.
In some implementations of the first aspect, the preset area condition is that the ratio of the overlapping area of the third identification frame and the second identification frame to the area of the second identification frame satisfies a preset proportion threshold.
In a second aspect, an embodiment of the present application provides a behavior detection apparatus, including:
an acquisition module, configured to acquire N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle;
an identification module, configured to identify, as object matching pairs, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image;
a processing module, configured to input each frame of image into a preset pose classification model, determine the pose information of each object matching pair in each frame of image, and determine the pose classification result of each frame of image according to the pose information of each object matching pair;
the processing module being further configured to determine that a target behavior occurs within the first preset duration when the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold.
In a third aspect, the present application provides a behavior detection device, comprising a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements the behavior detection method of the first aspect or any implementable manner of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium on which computer program instructions are stored; the computer program instructions, when executed by a processor, implement the behavior detection method of the first aspect or any implementable manner of the first aspect.
In a fifth aspect, the present application provides a computer program product; when executed by a processor of an electronic device, the instructions of the computer program product cause the electronic device to perform the behavior detection method of the first aspect or any implementable manner of the first aspect.
The embodiments of the application provide a behavior detection method, apparatus, device, and readable storage medium. The method acquires N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle. Then, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image are identified as object matching pairs, so that a preliminary judgment can be made on users who may be interacting. Next, each frame of image is input into a preset pose classification model, the pose information of each object matching pair in each frame of image is determined, and the pose classification result of each frame of image is determined according to the pose information of each object matching pair. When the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold, the target behavior occurring within the first preset duration can be determined quickly and accurately. In this way, the object-passing behaviors of different pedestrians can be better judged, and the method can be better adapted to different detection scenes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a behavior detection method provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a captured image provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the screened first objects provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a moved target border provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of the pose information of an object matching pair provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a behavior detection apparatus according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a behavior detection device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative only and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a/an ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
A fence is a common public security facility that can physically isolate spaces. However, when a fence is not fully enclosed, pedestrians can easily pass objects through gaps in the fence; that is, barrier object-passing behavior occurs.
In some scenarios, fences are typically widely distributed. For example, a security corridor of a subway typically includes a large number of fences, and barrier object-passing behavior can pose unknown safety hazards for the subway: if the passed item is a dangerous article, the safety of passengers is threatened, and because the number of fences is large, monitoring the barriers manually is very difficult.
At present, whether barrier object-passing behavior occurs can be judged through video analysis technology, but the detection and identification of this behavior still suffer from low detection efficiency and poor scene adaptability.
In view of the above, embodiments of the present application provide a behavior detection method, apparatus, device, and computer-readable storage medium. N frames of images within a first preset duration are acquired, and a preliminary judgment is made on users who may be interacting; then, each frame of image is input into a preset pose classification model, the pose information of each object matching pair in each frame of the N frames of images is determined, and the pose classification results of the N frames of images are obtained, so that the target behavior occurring within the first preset duration is determined quickly and accurately. In this way, the object-passing behaviors of different pedestrians can be better judged, and the method can be better adapted to different detection scenes.
The following describes a behavior detection method provided in an embodiment of the present application with reference to the drawings. Fig. 1 shows a schematic flow chart of a behavior detection method according to an embodiment of the present application. As shown in fig. 1, the method may include the steps of:
and step 110, acquiring N frames of images within a first preset time duration.
Wherein each frame of image comprises a target obstacle and a first object, the first object being located on either side of the target obstacle.
And 120, identifying first objects of preset identification areas positioned at two sides of the target obstacle in each frame of image as object matching pairs.
Step 130, inputting each frame of image into a preset posture classification model, determining posture information of each object matching pair in each frame of image, and determining a posture classification result of each frame of image according to the posture information of each object matching pair.
Step 140, when the gesture classification results of the N frames of images are that the number of the target gesture classification results meets a preset number threshold, determining that the first preset time period includes a target behavior.
Specific implementations of the above steps will be described in detail below.
According to the embodiments of the application, N frames of images within a first preset duration are acquired, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle. Then, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image are identified as object matching pairs, so that a preliminary judgment is made on users who may be interacting. Next, each frame of image is input into a preset pose classification model, the pose information of each object matching pair in each frame of image is determined, and the pose classification result of each frame of image is determined according to the pose information of each object matching pair. When the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold, the target behavior occurring within the first preset duration can be determined quickly and accurately. In this way, the object-passing behaviors of different pedestrians can be better judged, and the method can be better adapted to different detection scenes.
Specific implementations of the above steps are described below.
Specifically, step 110 involves acquiring N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle.
In scenarios where barrier object-passing behavior needs to be detected, images may be captured by an image capture device, which may be, for example, a camera, a vision sensor, or another device capable of capturing images. The image capture device can capture images of the behavior detection scene in real time. The behavior detection scene may be, for example, a venue requiring security inspection, in which obstacles such as fences are arranged for isolation.
Specifically, the first object is an object capable of performing the target behavior, for example a pedestrian or a robot arm; the first object is not specifically limited herein. It will be appreciated that the first object may be located on either side of the target obstacle. For example, when the first object is a pedestrian, the pedestrian may be located on either side of the fence. In some embodiments, behavior detection may be performed directly on all objects in the captured image. Optionally, the first objects may be obtained through a screening process to improve the accuracy of the behavior detection result.
In some embodiments, the image data captured by the image capture device may be a video or the like. Optionally, the captured image data may be cached in a database to facilitate extracting the required images when behavior detection is needed.
Specifically, according to the behavior detection requirement, N frames of images within a first preset duration may be acquired, where N is a positive integer.
Taking video as the image data, the first preset duration may be set according to specific detection requirements; for example, it may be 5 seconds, and it is not specifically limited herein. For example, from the whole video data, one image may be extracted every preset number of frames, so that the N frames of images used for behavior detection are obtained. Optionally, the preset number of frames may be, for example, 5, and is not specifically limited herein. In this way, the amount of data to be processed can be effectively reduced, and the speed of obtaining the behavior detection result increased.
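As an illustration of this sampling step, the following is a minimal sketch in Python using OpenCV; the function name, the fallback frame rate, and the default parameter values are illustrative assumptions rather than values fixed by the embodiments.

```python
import cv2

def sample_frames(video_path, duration_s=5.0, frame_step=5):
    """Extract one image every `frame_step` frames from the first
    `duration_s` seconds of video, yielding the N frames used below."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # assume 25 fps if metadata is missing
    max_frames = int(duration_s * fps)       # frames within the first preset duration
    frames = []
    for i in range(max_frames):
        ok, frame = cap.read()
        if not ok:
            break
        if i % frame_step == 0:               # keep every frame_step-th frame
            frames.append(frame)
    cap.release()
    return frames
```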
In some embodiments, each frame of image may be denoised according to a preset image noise reduction algorithm, so as to improve the accuracy of the behavior detection result. For example, the preset noise reduction algorithm may be a Gaussian filtering algorithm using the standard two-dimensional Gaussian kernel, which can be expressed by formula (1):
G(x, y) = (1 / (2πσ²)) · exp(-(x² + y²) / (2σ²))    (1)
where x denotes the abscissa of a pixel, y denotes its ordinate, and σ is a preset constant denoting the width of the Gaussian function; σ may be, for example, 1.5.
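As a concrete illustration, the following sketch applies this Gaussian filtering to one frame with OpenCV; the 7x7 kernel size is an illustrative choice (it covers a little more than two standard deviations at σ = 1.5), not a value specified by the embodiments.

```python
import cv2

def denoise(frame):
    # 2-D Gaussian filtering with sigma = 1.5, as in formula (1);
    # a 7x7 kernel comfortably covers +/- 2 sigma at this width
    return cv2.GaussianBlur(frame, (7, 7), sigmaX=1.5, sigmaY=1.5)
```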
In some embodiments, in order to improve the accuracy of the behavior detection result, the screened first objects may be obtained with reference to the following steps:
Step 111: acquiring N frames of images within the first preset duration, wherein each frame of image comprises a second object located on either side of the target obstacle. Step 112: determining the position information of the second object in each frame of image according to a preset recognition algorithm. Step 113: for each frame of image, determining, from the second objects, a first object that satisfies a preset position screening condition according to the position information of the second objects, wherein the preset position screening condition is that the distance between the second object and the target obstacle is smaller than a preset distance threshold.
Specifically, the second object is an object capable of performing the target behavior, for example a pedestrian or a robot arm; the second object is not specifically limited herein. Each of the N frames of images can be recognized according to the preset recognition algorithm, so that the second objects included in each frame of image are obtained. For example, if the second objects are pedestrians, each pedestrian in the captured image may be identified according to a preset pedestrian recognition algorithm, which may be, for example, the YOLOv5 detection algorithm.
According to the preset recognition algorithm, the second objects included in an image can be recognized and the position information of each second object obtained. Taking pedestrians as the second objects, fig. 2 is a schematic diagram of a captured image provided by an embodiment of the present application; the image includes a target obstacle 201, pedestrians A1, A2, and A3 located on one side of the target obstacle, and pedestrians B1, B2, B3, and B4 located on the other side. It will be appreciated that the identification frame drawn around each pedestrian in the figure is illustrative and is not part of the actually captured image.
In some embodiments, the position information of each pedestrian in each frame of image is obtained according to the preset recognition algorithm. The position of the center point of the pedestrian may be used as the position information, or the position of a preset border of the identification frame may be used. Taking the pedestrian A2 on the left side of the target obstacle as an example, the preset border may be the right border A21 of its identification frame; taking the pedestrian B2 on the right side of the target obstacle as an example, the preset border may be the left border B23 of its identification frame.
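To make the detection step concrete, the following is a minimal sketch that obtains pedestrian identification frames with a YOLOv5 model loaded through torch.hub; the hub entry point and result layout follow the public Ultralytics YOLOv5 interface, and the function name is an illustrative assumption.

```python
import torch

# Load a small pretrained YOLOv5 model (one common way to run YOLOv5)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
PERSON_CLASS = 0  # index of the 'person' class in the COCO label set

def detect_pedestrians(frame):
    """Return one (x1, y1, x2, y2) identification frame per detected pedestrian."""
    results = model(frame)
    boxes = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        if int(cls) == PERSON_CLASS:
            boxes.append((x1, y1, x2, y2))
    return boxes
```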
According to the position information of each second object, the distance between each second object and the target obstacle can be obtained; as shown in fig. 2, once the position information of each pedestrian is obtained, the distance between each pedestrian and the target obstacle 201 can be calculated.
Next, the second objects may be screened according to the preset position screening condition; for example, a second object whose distance from the target obstacle is smaller than the preset distance threshold is selected as a first object for behavior detection. The preset distance threshold may be set according to the actual application scenario and is not specifically limited herein.
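A minimal sketch of this screening step follows, assuming a vertical obstacle at a known image x-coordinate and using the obstacle-facing border of each identification frame as the object's position, as with borders A21 and B23 above; all names and the geometry assumption are illustrative.

```python
def screen_objects(boxes, barrier_x, dist_thresh):
    """Keep the second objects whose distance to the target obstacle is
    below the preset distance threshold; these become the first objects."""
    first_objects = []
    for (x1, y1, x2, y2) in boxes:
        # obstacle-facing border: right border if the box lies left of the
        # obstacle, left border otherwise
        facing_x = x2 if x2 <= barrier_x else x1
        if abs(facing_x - barrier_x) < dist_thresh:
            first_objects.append((x1, y1, x2, y2))
    return first_objects
```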
According to the embodiments of the application, screening out some of the objects in advance significantly reduces the amount of data processed during behavior detection, which helps to speed up obtaining the behavior recognition result and reduces calculation errors during data processing, thereby improving the accuracy of the behavior detection result.
Optionally, as shown in fig. 2, preset guard lines 202 and 203 are further included, located on the two sides of the target obstacle 201 respectively. The positions of the preset guard lines may be adjusted according to factors such as the angle and scale at which the image is captured, and are not specifically limited herein.
The second objects can then be screened according to their position information and the positions of the preset guard lines. Specifically, for example, when a second object overlaps a guard line, or a second object is located inside the preset guard line, the distance between the second object and the target obstacle may be considered smaller than the preset distance threshold, thereby realizing the screening of the second objects.
After the first object in the image is determined, step 120 may next be performed.
Step 120: identifying, as object matching pairs, first objects in the preset identification areas located on the two sides of the target obstacle in each frame of image.
Specifically, the preset identification area may include a first identification area located on one side of the target obstacle in each frame of image and a second identification area located on the other side. Accordingly, for each frame of image, each first object located in the first identification area is matched with each first object located in the second identification area, so that the object matching pairs included in each frame of image are obtained.
For brevity, the embodiments of the present application are described by taking the screened first objects as an example. For example, fig. 3 is a schematic diagram of the screened first objects provided by an embodiment of the present application. As shown in fig. 3, the first identification area 301 on one side of the target obstacle includes the first objects A1 and A2, and the second identification area 302 on the other side includes the first objects B1 and B3. In the matching process, each first object in the first identification area is matched with each first object in the second identification area, so that the object matching pairs A1-B1, A1-B3, A2-B1, and A2-B3 are obtained.
In some embodiments, in order to reduce the amount of data processing, the following steps may be performed when obtaining the object matching pairs included in each frame of image: for each frame of image, acquiring a first identification frame corresponding to each first object located in the first identification area, and acquiring a second identification frame corresponding to each first object located in the second identification area; extending a target border of the first identification frame by a preset length along a first direction to obtain a third identification frame, wherein the first direction is the direction pointing from the first identification area to the second identification area; and, when the positional relationship between the third identification frame and the second identification frame satisfies a preset condition, matching the first object in the third identification frame with the first object in the second identification frame to obtain the object matching pairs included in each frame of image.
For example, referring to fig. 3, in the first identification area 301, the target border A11 of the first identification frame corresponding to the first object A1 and the target border A21 of the first identification frame corresponding to the first object A2 are each moved by a preset distance along the first direction X, so that the third identification frame corresponding to each first object is obtained. Illustratively, the position of the target border A21 after being moved by the preset distance is shown as A211 in fig. 4.
In some embodiments, the preset distance may be determined according to the width of the original identification frame. For example, the width of the identification frame obtained after the target border is extended by the preset distance may be a preset multiple, such as 2 times, of the width of the original identification frame; this is not specifically limited herein.
For each frame of image, after the target border of the first identification frame corresponding to each first object in the first identification area is extended by the preset distance along the first direction, the third identification frame corresponding to each first object is obtained. Optionally, the preset condition may be, for example, whether the third identification frame overlaps the second identification frame; if so, the first object in the third identification frame is matched with the first object in the second identification frame, and an object matching pair is obtained. In this way, the number of object matching pairs in each frame of image can be reduced, and since the calculation is simple, the detection speed of the whole behavior detection process can be effectively improved and the adaptability to different application scenarios improved.
In some embodiments, obtaining the object matching pairs included in each frame of image may further include: determining the overlapping area of each third identification frame and each second identification frame according to their positional relationship; and, for each frame of image, when the overlapping area of a third identification frame and a second identification frame satisfies a preset area condition, determining that the first object in the third identification frame and the first object in the second identification frame match as an object matching pair.
Specifically, after the third identification frames are obtained, the overlapping area of each third identification frame and each second identification frame may be calculated, so that the matching pairs corresponding to each frame of image can be determined more accurately according to the overlapping area and the preset area condition. Specifically, the preset area condition may be that the ratio of the overlapping area of the third identification frame and the second identification frame to the area of the second identification frame satisfies a preset proportion threshold.
In this way, the number of object matching pairs in each frame of image can be reduced, and since the calculation is simple, the detection speed of the whole behavior detection process can be effectively improved.
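The border extension and the area-ratio condition can be sketched as follows, again assuming axis-aligned (x1, y1, x2, y2) boxes and a first direction of +x; the function names and the ratio threshold parameter are illustrative.

```python
def extend_box(box, preset_len):
    """Extend the target border of a first identification frame along the
    first direction (+x, toward the second identification area), yielding
    the third identification frame."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 + preset_len, y2)

def overlap_area(a, b):
    """Area of the intersection of two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def match_pairs(first_boxes, second_boxes, preset_len, ratio_thresh):
    """Pair objects whose third frame covers enough of the second frame:
    overlap / area(second frame) >= preset proportion threshold."""
    pairs = []
    for i, fb in enumerate(first_boxes):
        third = extend_box(fb, preset_len)
        for j, sb in enumerate(second_boxes):
            area_sb = (sb[2] - sb[0]) * (sb[3] - sb[1])
            if area_sb > 0 and overlap_area(third, sb) / area_sb >= ratio_thresh:
                pairs.append((i, j))
    return pairs
```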
In some embodiments, optionally, in order to improve the accuracy of the object matching pairs, the matching-pair reduction step is also performed on the first objects in the other identification area of the same image. That is, with the first objects in the first identification area kept unmoved, the target border of the second identification frame corresponding to each first object in the second identification area is extended by the preset distance along a second direction to obtain a fourth identification frame, where the second direction may be the direction pointing from the second identification area to the first identification area. The object matching pairs in each frame of image are then determined according to the positional relationship between the fourth identification frame and the first identification frame, and further refined according to the preset area condition.
After the matching-pair reduction step has been performed on the first objects of both identification areas of the same image, two groups of object matching pairs are obtained for one frame of image; the two groups can then be merged and a deduplication operation performed, so that the object matching pairs are more accurate, as sketched below.
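Reusing the helpers from the sketch above, a minimal version of this bidirectional matching and deduplication could look as follows; extending second-area frames along -x mirrors the first direction, and the set union performs the deduplication.

```python
def extend_box_back(box, preset_len):
    """Extend a second identification frame along the second direction
    (-x, toward the first identification area), yielding the fourth frame."""
    x1, y1, x2, y2 = box
    return (x1 - preset_len, y1, x2, y2)

def match_both_directions(first_boxes, second_boxes, preset_len, ratio_thresh):
    # forward group: third frames tested against second frames
    pairs = set(match_pairs(first_boxes, second_boxes, preset_len, ratio_thresh))
    # backward group: fourth frames tested against first frames
    for j, sb in enumerate(second_boxes):
        fourth = extend_box_back(sb, preset_len)
        for i, fb in enumerate(first_boxes):
            area_fb = (fb[2] - fb[0]) * (fb[3] - fb[1])
            if area_fb > 0 and overlap_area(fourth, fb) / area_fb >= ratio_thresh:
                pairs.add((i, j))  # the set union deduplicates the two groups
    return sorted(pairs)
```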
It is to be understood that the terms first identification area and second identification area are used only for convenience of description, to distinguish the areas on the two sides of the target obstacle, and do not constitute a specific limitation on those areas in the embodiments of the present application.
After the object matching pairs included in each frame of image are obtained, step 130 may be performed next to detect the pose information of the object matching pairs.
Specifically, step 130 involves inputting each frame of image into a preset pose classification model, determining the pose information of each object matching pair in each frame of image, and determining the pose classification result of each frame of image according to the pose information of each object matching pair. The preset pose classification model is a model capable of detecting the poses of the object matching pairs.
For example, the preset pose classification model may include a skeleton detection algorithm, such as the AlphaPose algorithm. Specifically, the input image can be spatially transformed by the skeleton detection algorithm to obtain more accurate human body candidate regions, and pose detection can then be performed on the human body candidate regions to obtain the skeleton map of each person as the pose information of the object matching pair. Optionally, only the arm and hand parts of the skeleton map of a human body candidate region may be retained to generate the final skeleton map as the pose information of the object matching pair. For example, fig. 5 shows a schematic diagram of the pose information of one object matching pair provided by an embodiment of the present application, namely the pose information of the object matching pair A1-B1; as shown in the figure, it includes the skeleton map of A1 and the skeleton map of B1. The estimated skeleton map is then mapped back onto the original input image through spatial transformation, and the pose classification result of each frame of image is determined according to the pose information of each object matching pair.
The preset pose classification model determines the pose classification result of each frame of image according to the pose information of each object matching pair. Specifically, when the pose classification result is the target pose classification result, the image includes barrier object-passing behavior; when the pose classification result is a non-target pose classification result, the image does not include such behavior.
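The per-frame classification can be sketched as below. Here estimate_skeleton stands in for a skeleton detector such as AlphaPose and classify_pair for a trained pose classifier; both are hypothetical names, as the embodiments do not fix a particular interface.

```python
def classify_frame(frame, pair_boxes, estimate_skeleton, classify_pair):
    """Return True when the frame's pose classification result is the
    target result, i.e. some matching pair shows an object-passing pose.

    pair_boxes: list of (box_a, box_b) identification-frame pairs.
    estimate_skeleton(frame, box) -> arm/hand keypoints (hypothetical).
    classify_pair(skel_a, skel_b) -> 'target' or 'non-target' (hypothetical).
    """
    for box_a, box_b in pair_boxes:
        skel_a = estimate_skeleton(frame, box_a)  # skeleton map of one side
        skel_b = estimate_skeleton(frame, box_b)  # skeleton map of the other side
        if classify_pair(skel_a, skel_b) == 'target':
            return True
    return False
```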
According to the embodiments of the application, a preliminary behavior detection result for each image can thus be determined rapidly, improving the accuracy of the overall behavior detection result. Step 140 may be performed next.
Referring to step 140: when the number of target pose classification results among the pose classification results of the N frames of images satisfies the preset number threshold, it is determined that the target behavior occurs within the first preset duration.
Specifically, for the N frames of images within the first preset duration, when the number of target pose classification results among their pose classification results satisfies the preset number threshold, it may be determined that the target behavior occurs within the first preset duration, that is, that barrier object-passing behavior occurs within the first preset duration. For example, if the preset number threshold is 3, then when the N pose classification results include more than 3 target pose classification results, it may be determined that barrier object-passing behavior occurs within the first preset duration. In some embodiments, the preset number threshold may be set according to actual detection requirements; for example, its size may be adjusted according to the size of N.
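A minimal sketch of this final decision, tying the per-frame results together (function and parameter names are illustrative):

```python
def detect_target_behavior(frame_results, count_thresh=3):
    """frame_results: one boolean per frame of the N sampled frames, True
    when that frame's pose classification result is the target result.
    The first preset duration contains the target behavior when more than
    count_thresh frames report the target result."""
    return sum(frame_results) > count_thresh
```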
According to the embodiments of the application, the target behavior occurring within the first preset duration can be determined quickly and accurately. In this way, the object-passing behaviors of different pedestrians can be better judged, and the method can be better adapted to different detection scenes.
Based on the same inventive concept, the present application further provides a behavior detection apparatus 600 corresponding to the behavior detection method, which is described in detail below with reference to fig. 6. Fig. 6 is a schematic structural diagram of a behavior detection apparatus according to an embodiment of the present application; as shown in fig. 6, the behavior detection apparatus 600 may include:
an acquisition module 610, configured to acquire N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle;
an identification module 620, configured to identify, as object matching pairs, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image;
a processing module 630, configured to input each frame of image into a preset pose classification model, determine the pose information of each object matching pair in each frame of image, and determine the pose classification result of each frame of image according to the pose information of each object matching pair;
the processing module 630 being further configured to determine that the target behavior occurs within the first preset duration when the number of target pose classification results among the pose classification results of the N frames of images satisfies the preset number threshold.
In some embodiments, the acquisition module 610 is further configured to acquire N frames of images within the first preset duration, wherein each frame of image comprises a second object located on either side of the target obstacle;
the processing module 630 is further configured to determine the position information of the second object in each frame of image according to a preset recognition algorithm;
and the processing module 630 is further configured to determine, for each frame of image, a first object satisfying a preset position screening condition from the second objects according to the position information of the second objects, where the preset position screening condition is that the distance between the second object and the target obstacle is smaller than a preset distance threshold.
In some embodiments, the preset identification area includes a first identification area located on one side of the target obstacle in each frame of image and a second identification area located on the other side of the target obstacle;
and the identification module 620 is further configured to match, for each frame of image, each first object located in the first identification area with each first object located in the second identification area, so as to obtain the object matching pairs included in each frame of image.
In some embodiments, the acquisition module 610 is further configured to acquire, for each frame of image, a first identification frame corresponding to each first object located in the first identification area, and a second identification frame corresponding to each first object located in the second identification area;
the processing module 630 is further configured to extend a target border of the first identification frame by a preset length along a first direction to obtain a third identification frame, where the first direction is the direction pointing from the first identification area to the second identification area;
and the processing module 630 is further configured to match, when the positional relationship between the third identification frame and the second identification frame satisfies a preset condition, the first object in the third identification frame with the first object in the second identification frame to obtain the object matching pairs included in each frame of image.
In some embodiments, the processing module 630 is further configured to determine the overlapping area of each third identification frame and each second identification frame according to their positional relationship;
and the processing module 630 is further configured to determine, for each frame of image, that the first object in the third identification frame matches the first object in the second identification frame as an object matching pair when the overlapping area of the third identification frame and the second identification frame satisfies a preset area condition.
In some embodiments, the preset area condition is that the ratio of the overlapping area of the third identification frame and the second identification frame to the area of the second identification frame satisfies a preset proportion threshold.
It can be understood that the behavior detection apparatus 600 according to the embodiments of the present application may correspond to the execution body of the behavior detection method provided in the embodiments of the present application. For specific details of the operations and/or functions of each module/unit of the behavior detection apparatus 600, reference may be made to the description of the corresponding parts of the behavior detection method in fig. 1, which is not repeated here for brevity.
The behavior detection apparatus of the embodiments of the application acquires N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle. Then, first objects in preset identification areas located on the two sides of the target obstacle in each frame of image are identified as object matching pairs, so that a preliminary judgment can be made on users who may be interacting. Next, each frame of image is input into a preset pose classification model, the pose information of each object matching pair in each frame of image is determined, and the pose classification result of each frame of image is determined according to the pose information of each object matching pair. When the number of target pose classification results among the pose classification results of the N frames of images satisfies a preset number threshold, the target behavior occurring within the first preset duration can be determined quickly and accurately. In this way, the object-passing behaviors of different pedestrians can be better judged, and the apparatus can be better adapted to different detection scenes.
Based on the same inventive concept, the present application further provides a behavior detection device 700 corresponding to the behavior detection method, which is described in detail below with reference to fig. 7. Fig. 7 shows a schematic structural diagram of a behavior detection device according to an embodiment of the present application. As shown in fig. 7, the apparatus may include a processor 701 and a memory 702 storing computer program instructions.
Specifically, the processor 701 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Memory 702 may include mass storage for data or instructions. By way of example and not limitation, memory 702 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, a universal serial bus (USB) drive, or a combination of two or more of these. In one example, memory 702 may include removable or non-removable (or fixed) media, or memory 702 is non-volatile solid-state memory. The memory 702 may be internal or external to the behavior detection device.
The memory may include Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors), it is operable to perform operations described with reference to the methods according to an aspect of the present disclosure.
The processor 701 reads and executes the computer program instructions stored in the memory 702 to implement the method described in the embodiment of the present application, and achieves the corresponding technical effect achieved by executing the method in the embodiment of the present application, which is not described herein again for brevity.
In one example, the behavior detection device may also include a communication interface 703 and a bus 710. As shown in fig. 7, the processor 701, the memory 702, and the communication interface 703 are connected by a bus 710 to complete mutual communication.
The communication interface 703 is mainly used for implementing communication between modules, apparatuses, units and/or devices in this embodiment of the application.
Bus 710 comprises hardware, software, or both, coupling the components of the behavior detection device to each other. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), another suitable bus, or a combination of two or more of these. Bus 710 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated.
The behavior detection device can execute the behavior detection method in the embodiment of the application, so that the corresponding technical effect of the behavior detection method described in the embodiment of the application is achieved.
In addition, in combination with the behavior detection method in the foregoing embodiment, the embodiment of the present application may be implemented by providing a readable storage medium. The readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the behavior detection methods in the above embodiments. Examples of a readable storage medium may be a non-transitory machine-readable medium, such as an electronic circuit, a semiconductor Memory device, a Read-Only Memory (ROM), a floppy disk, a Compact Disc Read-Only Memory (CD-ROM), an optical disk, a hard disk, and so forth.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the structural block diagrams described above may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, application-specific integrated circuits (ASICs), suitable firmware, plug-ins, function cards, and so on. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, read-only memories (ROMs), flash memories, erasable programmable read-only memories (EPROMs), floppy disks, compact disc read-only memories (CD-ROMs), optical disks, hard disks, optical fiber media, radio frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the Internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
In addition, in combination with the behavior detection method, apparatus, and readable storage medium in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer program product. The instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform any of the behavior detection methods in the above embodiments.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. A method of behavior detection, comprising:
acquiring N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle;
identifying, as object matching pairs, first objects located in preset recognition areas on two sides of the target obstacle in each frame of image;
inputting each frame of image into a preset pose classification model, determining pose information of each object matching pair in each frame of image, and determining a pose classification result of each frame of image according to the pose information of each object matching pair;
and when the number of target pose classification results among the pose classification results of the N frames of images meets a preset number threshold, determining that a target behavior occurs within the first preset duration.
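For illustration only, and not part of the claim language: the following minimal Python sketch shows one way the per-frame counting logic of claim 1 could be realized. The helpers `detect_object_pairs` and `classify_pose`, and the result label `"crossing"`, are hypothetical stand-ins for the recognition step and the preset pose classification model.

```python
from typing import Callable, Sequence

def detect_target_behavior(
    frames: Sequence,                 # the N frames captured within the first preset duration
    detect_object_pairs: Callable,    # hypothetical: frame -> list of object matching pairs
    classify_pose: Callable,          # hypothetical: (frame, pairs) -> pose classification result
    target_result: str = "crossing",  # hypothetical label for the target pose classification result
    count_threshold: int = 5,         # the preset number threshold
) -> bool:
    """Return True when enough frames yield the target pose classification result."""
    target_count = 0
    for frame in frames:
        pairs = detect_object_pairs(frame)  # first objects on both sides of the obstacle
        if not pairs:
            continue
        if classify_pose(frame, pairs) == target_result:
            target_count += 1
    # The target behavior is deemed present when the count meets the preset threshold.
    return target_count >= count_threshold
```

Conditioning the decision on a count over N frames rather than on any single frame presumably suppresses transient misclassifications, which is consistent with the preset number threshold in the claim.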
2. The method according to claim 1, wherein the acquiring N frames of images within a first preset duration comprises:
acquiring N frames of images within the first preset duration, wherein each frame of image comprises second objects located on either side of the target obstacle;
determining position information of the second objects in each frame of image according to a preset recognition algorithm;
and for each frame of image, determining, from the second objects according to their position information, the first objects that meet a preset position screening condition, wherein the preset position screening condition is that the distance between a second object and the target obstacle is smaller than a preset distance threshold.
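For illustration only: a minimal sketch of the position screening in claim 2, assuming axis-aligned bounding boxes and a horizontal center-to-center distance. The claim fixes only the screening condition (distance below a preset threshold), not the distance measure, so both the `BBox` container and the distance computation here are assumptions.

```python
from dataclasses import dataclass

@dataclass
class BBox:
    """Axis-aligned bounding box; an assumed container for position information."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center_x(self) -> float:
        return (self.x1 + self.x2) / 2.0

def screen_first_objects(second_objects: list[BBox], obstacle: BBox,
                         distance_threshold: float) -> list[BBox]:
    """Keep only the second objects close enough to the obstacle (the first objects)."""
    return [
        obj for obj in second_objects
        # One possible distance measure: horizontal center-to-center distance.
        if abs(obj.center_x - obstacle.center_x) < distance_threshold
    ]
```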
3. The method according to claim 1, wherein the preset recognition areas comprise, in each frame of image, a first recognition area located on one side of the target obstacle and a second recognition area located on the other side of the target obstacle; and the identifying, as object matching pairs, first objects located in preset recognition areas on two sides of the target obstacle in each frame of image comprises:
for each frame of image, matching the first object located in the first recognition area with the first object located in the second recognition area to obtain the object matching pairs included in each frame of image.
4. The method according to claim 3, wherein the matching, for each frame of image, the first object located in the first recognition area with the first object located in the second recognition area to obtain the object matching pairs included in each frame of image comprises:
for each frame of image, acquiring a first recognition frame corresponding to the first object located in the first recognition area, and acquiring a second recognition frame corresponding to the first object located in the second recognition area;
extending the first recognition frame by a preset length in a first direction to obtain a third recognition frame, wherein the first direction is the direction from the first recognition area to the second recognition area;
and when a positional relationship between the third recognition frame and the second recognition frame satisfies a preset condition, matching the first object in the third recognition frame with the first object in the second recognition frame to obtain the object matching pairs included in each frame of image.
5. The method according to claim 4, wherein the matching, when the positional relationship between the third recognition frame and the second recognition frame satisfies the preset condition, the first object in the third recognition frame with the first object in the second recognition frame to obtain the object matching pairs included in each frame of image comprises:
determining an overlapping area of the third recognition frame and the second recognition frame according to the positional relationship between the third recognition frame and the second recognition frame;
and for each frame of image, when the overlapping area of the third recognition frame and the second recognition frame satisfies a preset area condition, determining the first object in the third recognition frame and the first object in the second recognition frame as an object matching pair.
6. The method according to claim 5, wherein the preset area condition is that the ratio of the overlapping area of the third recognition frame and the second recognition frame to the area of the second recognition frame satisfies a preset ratio threshold.
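For illustration only: a minimal sketch of the matching rule built up in claims 4 to 6, assuming recognition frames are (x1, y1, x2, y2) tuples and the first direction is the +x axis. The third recognition frame is the first frame extended by a preset length toward the second recognition area, and a pair is formed when its overlap with a second recognition frame, divided by the second frame's own area, meets the preset ratio threshold.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed representation

def extend_frame(first_frame: Box, preset_length: float) -> Box:
    """Claim 4: extend the first recognition frame along the first direction (+x here)."""
    x1, y1, x2, y2 = first_frame
    return (x1, y1, x2 + preset_length, y2)

def overlap_area(a: Box, b: Box) -> float:
    """Claim 5: intersection area of two axis-aligned recognition frames."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def is_object_matching_pair(first_frame: Box, second_frame: Box,
                            preset_length: float, ratio_threshold: float) -> bool:
    """Claim 6: compare the overlap, relative to the second frame's area, with a ratio threshold."""
    third_frame = extend_frame(first_frame, preset_length)
    second_area = (second_frame[2] - second_frame[0]) * (second_frame[3] - second_frame[1])
    if second_area <= 0:
        return False
    return overlap_area(third_frame, second_frame) / second_area >= ratio_threshold
```

Normalizing by the second frame's area rather than by the union (as IoU would) keeps the ratio in [0, 1] regardless of how far the first frame is extended, which appears to match the asymmetric condition of claim 6.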
7. A behavior detection apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire N frames of images within a first preset duration, wherein each frame of image comprises a target obstacle and a first object, and the first object is located on either side of the target obstacle;
an identification module, configured to identify, as object matching pairs, first objects located in preset recognition areas on two sides of the target obstacle in each frame of image;
a processing module, configured to input each frame of image into a preset pose classification model, determine pose information of each object matching pair in each frame of image, and determine a pose classification result of each frame of image according to the pose information of each object matching pair;
wherein the processing module is further configured to determine that a target behavior occurs within the first preset duration when the number of target pose classification results among the pose classification results of the N frames of images meets a preset number threshold.
8. A behavior detection device, characterized in that the device comprises: a processor, and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the behavior detection method according to any one of claims 1 to 6.
9. A readable storage medium, having stored thereon computer program instructions which, when executed by a processor, implement the behavior detection method according to any one of claims 1 to 6.
10. A computer program product, characterized in that the instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform the behavior detection method according to any one of claims 1 to 6.
CN202210262901.2A 2022-03-17 2022-03-17 Behavior detection method, device and equipment and readable storage medium Pending CN114529874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210262901.2A CN114529874A (en) 2022-03-17 2022-03-17 Behavior detection method, device and equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210262901.2A CN114529874A (en) 2022-03-17 2022-03-17 Behavior detection method, device and equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114529874A true CN114529874A (en) 2022-05-24

Family

ID=81627200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210262901.2A Pending CN114529874A (en) 2022-03-17 2022-03-17 Behavior detection method, device and equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114529874A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563396A (en) * 2022-12-06 2023-01-03 成都智元汇信息技术股份有限公司 Degradable centralized intelligent pushing method and device


Similar Documents

Publication Publication Date Title
Farag et al. Road lane-lines detection in real-time for advanced driving assistance systems
CN107305627A (en) A kind of automobile video frequency monitoring method, server and system
US11321945B2 (en) Video blocking region selection method and apparatus, electronic device, and system
CN108009466B (en) Pedestrian detection method and device
CN108694399B (en) License plate recognition method, device and system
CN111127508B (en) Target tracking method and device based on video
CN112541396A (en) Lane line detection method, device, equipment and computer storage medium
CN112818885B (en) Face recognition method, device, equipment and storage medium
Farag Real-time detection of road lane-lines for autonomous driving
CN112580460A (en) Traffic signal lamp identification method, device, equipment and storage medium
CN104680133A (en) Real-time detecting method for pedestrian avoidance behavior of illegal vehicle
CN104573680A (en) Image detection method, image detection device and traffic violation detection system
CN112651359A (en) Obstacle detection method, obstacle detection device, electronic apparatus, and storage medium
CN111223078A (en) Method for determining defect grade and storage medium
CN114529874A (en) Behavior detection method, device and equipment and readable storage medium
CN113516099A (en) Traffic behavior recognition method and device, electronic equipment and storage medium
CN113255444A (en) Training method of image recognition model, image recognition method and device
Radiuk et al. Convolutional neural network for parking slots detection
CN112949785B (en) Object detection method, device, equipment and computer storage medium
CN115690046B (en) Article carry-over detection and tracing method and system based on monocular depth estimation
Amin et al. An automatic number plate recognition of Bangladeshi vehicles
CN116030663A (en) Vehicle early warning method and device, electronic equipment and storage medium
CN117523914A (en) Collision early warning method, device, equipment, readable storage medium and program product
CN110119769A (en) A kind of detection method for early warning based on multi-modal vehicle characteristics
CN114758414A (en) Pedestrian behavior detection method, device, equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination