CN113452954B - Behavior analysis method, apparatus, device and medium - Google Patents


Info

Publication number: CN113452954B (application CN202010223353.3A)
Authority: CN (China)
Prior art keywords: behavior, monitoring, scene, pixel point, determining
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN113452954A
Inventor: 单远达
Current assignee: Zhejiang Uniview Technologies Co Ltd
Original assignee: Zhejiang Uniview Technologies Co Ltd
Application CN202010223353.3A filed by Zhejiang Uniview Technologies Co Ltd
Publication of CN113452954A; application granted; publication of CN113452954B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G: PHYSICS
    • G08: SIGNALLING
    • G08B: SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B 21/00: Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for

Abstract

The embodiments of the invention disclose a behavior analysis method, apparatus, device, and medium. The method comprises the following steps: framing a monitoring target in a video of a monitoring scene, and displaying the monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene; determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene; and sending out early warning information when the behavior of the monitoring target is abnormal. The embodiments thus realize automatic analysis of the monitored target's behavior, with automatic early warning when the behavior is abnormal, thereby reducing labor cost and improving the efficiency of behavior analysis.

Description

Behavior analysis method, apparatus, device, and medium
Technical Field
Embodiments of the present invention relate to the technical field of computer vision, and in particular to a behavior analysis method, apparatus, device, and medium.
Background
By fusing surveillance video with a three-dimensional virtual environment, services such as identifying and tracking a monitoring target and analyzing its behavior can be realized intuitively and accurately within a single virtual environment.
At present, in a three-dimensional monitoring system obtained by fusing surveillance video with a three-dimensional virtual environment, analyzing the behavior of a monitoring target generally relies on manually marking the surveillance video and then analyzing the target's behavior information according to the marked information, so as to give an alarm when the behavior is abnormal. This approach is not only cumbersome to operate, but also consumes a large amount of labor cost and is inefficient.
Disclosure of Invention
Embodiments of the present invention provide a behavior analysis method, apparatus, device, and medium, which implement automatic analysis of a monitored target behavior, so as to automatically perform an early warning when the monitored target behavior is abnormal, thereby reducing labor cost and improving the behavior analysis efficiency of the monitored target.
In a first aspect, an embodiment of the present invention provides a behavior analysis method, where the method includes:
selecting a monitoring target in a video of a monitoring scene, and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
and when the behavior of the monitoring target is abnormal, sending out early warning information.
In a second aspect, an embodiment of the present invention further provides a behavior analysis apparatus, where the apparatus includes:
the target display module is used for framing a monitoring target in a video of a monitoring scene and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
the behavior determining module is used for determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
and the early warning module is used for sending out early warning information when the behavior of the monitoring target is abnormal.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the behavior analysis method according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the behavior analysis method according to any embodiment of the present invention.
The technical scheme disclosed by the embodiment of the invention has the following beneficial effects:
the method comprises the steps of selecting a monitoring target in a video of a monitoring scene, displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene, determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene, and sending early warning information when the behavior of the monitoring target is abnormal. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
Drawings
FIG. 1 is a schematic flow chart of a behavior analysis method in an embodiment of the present invention;
fig. 2 is a schematic flow chart illustrating a process of fusing and displaying a video of a monitored scene with a three-dimensional virtual scene in an embodiment of the present invention;
FIG. 3 is a schematic flow chart of determining at least one unobstructed visible object in a video according to an embodiment of the present invention;
FIG. 4 is a schematic illustration of an embodiment of the invention in which an object EFGH is determined to be an occluded visible object;
FIG. 5 is a schematic diagram of a video image showing a monitoring target frame and other non-occluded visible objects in a three-dimensional virtual scene according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart diagram of another behavior analysis method in an embodiment of the invention;
FIG. 7 is a schematic flow chart diagram illustrating a further method for behavior analysis in an embodiment of the present invention;
fig. 8 is a flowchart for determining the behavior of the vehicle when the preset area type is the parking area in the embodiment of the invention;
fig. 9 is a flowchart illustrating a process of determining a behavior of a vehicle when a preset zone type is a no-parking zone in the embodiment of the present invention;
fig. 10 is a schematic structural view of a behavior analysis device in an embodiment of the present invention;
fig. 11 is a schematic structural diagram of an electronic device in the embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
In the related art, monitoring target behavior is analyzed by manually marking the surveillance video and then analyzing the behavior information of the monitoring target according to the marked information, giving an alarm when the behavior is abnormal; this is cumbersome to operate, consumes a large amount of labor cost, and is inefficient. The embodiment of the invention provides a behavior analysis method to address these problems.
The embodiment of the invention selects the monitoring target in the video of the monitoring scene, displays the monitoring target frame in the three-dimensional virtual scene corresponding to the monitoring scene, determines the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene, and sends the early warning information when the behavior of the monitoring target is abnormal. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
A behavior analysis method, an apparatus, a device, and a storage medium according to embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
First, referring to fig. 1, a behavior analysis method provided by an embodiment of the present invention is specifically described.
Fig. 1 is a schematic flow chart of a behavior analysis method in an embodiment of the present invention, where the embodiment is applicable to a situation where a behavior of a monitoring target in a surveillance video is automatically analyzed, and the method may be executed by a behavior analysis device, and the device may be composed of hardware and/or software and may be integrated in an electronic device. The behavior analysis method specifically comprises the following steps:
s101, selecting a monitoring target in a video of a monitoring scene, and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene.
In an embodiment of the invention, the monitoring target comprises a person and/or a vehicle.
It is understood that the monitoring target may refer to a person; alternatively, it may refer to a vehicle; or it may refer to both a person and a vehicle. This embodiment places no specific limitation on this.
It should be noted that the embodiment of the present invention can be applied to any surveillance video (i.e., video of a monitoring scene) captured from a real monitoring scene and fused with a three-dimensional virtual scene, so as to automatically analyze and warn about the behavior of a monitoring target based on the fused scene. It is particularly suited to presenting the surveillance videos of multiple areas simultaneously, so that the behaviors of monitoring targets in different areas can be analyzed. For example, to monitor a building and present the pictures of several key areas in one display interface at the same time, a three-dimensional virtual scene can be constructed according to the real size of the building, corresponding virtual cameras can be deployed in the three-dimensional virtual scene according to the real cameras in the building, and the visual fields of the virtual cameras can be adjusted to be consistent with those of the real cameras. The surveillance video can then be projected into the three-dimensional virtual scene to realize panoramic real-time monitoring while the behavior of the monitoring target is analyzed automatically. Alternatively, panoramic presentation and target behavior analysis can be performed for other scenes such as schools, prisons, or airports.
As introduced above, the video of the monitoring scene is fused with the three-dimensional virtual scene so that services such as analyzing the behavior of the monitoring target can be realized intuitively and accurately in a virtual environment. Before executing S101, a three-dimensional virtual scene of the same size as the monitoring scene can be constructed, based on modeling software or otherwise, according to the size of the monitoring scene, and the constructed three-dimensional virtual scene can be imported into the electronic equipment, laying a foundation for the subsequent analysis of the monitoring target's behavior in the three-dimensional virtual scene.
Because a real camera is arranged in the monitoring scene, and the three-dimensional virtual scene is the same as the monitoring scene, the embodiment of the invention can set a corresponding virtual camera in the three-dimensional virtual scene according to the installation position and model of the real camera, and adjust the visual field parameters of the virtual camera according to the visual field of the real camera so that the two visual fields coincide, thereby ensuring that the images output by the real camera and the virtual camera are consistent. The visual field parameters of the virtual camera comprise: the horizontal field angle, the vertical field angle, and the imaging aspect ratio.
When the virtual camera is set in the three-dimensional virtual scene, the virtual camera with the same model can be selected from the camera model base according to the model of the real camera, and the virtual camera is installed at the same position according to the installation position of the real camera. In the embodiment of the invention, the virtual camera applied to the three-dimensional virtual scene can shoot and record the three-dimensional virtual scene environment.
After the virtual camera is installed, the embodiment of the invention can adjust the parameters of the visual field of the virtual camera, so that the visual field of the virtual camera is consistent with the visual field of the real camera.
Specifically, the visual field parameters of the virtual camera include: the imaging aspect ratio Aspect, the horizontal field angle AG_hor, and the vertical field angle AG_ver. The imaging aspect ratio Aspect is determined by the size of the photosensitive element in the virtual camera:

$$\mathrm{Aspect} = \frac{w}{h}$$

where w is the width of the photosensitive element and h is the height of the photosensitive element. Since the size of the photosensitive element is determined by the model of the real camera, and cameras of different models have corresponding photosensitive element sizes, the imaging aspect ratio Aspect of the virtual camera is fixed once the virtual camera is selected according to the model of the real camera.
In general, in the basic imaging principle of a camera, the focal length F and the size of the photosensitive element satisfy the trigonometric relationship of formula (1):

$$\tan\left(\frac{AG_{hor}}{2}\right) = \frac{w}{2F}, \qquad \tan\left(\frac{AG_{ver}}{2}\right) = \frac{h}{2F} \tag{1}$$

where w is the width of the photosensitive element, h is the height of the photosensitive element, and F is the focal length.
Based on formula (1), when adjusting the horizontal field angle AG_hor and the vertical field angle AG_ver among the visual field parameters of the virtual camera, the embodiment of the present invention only needs to adjust the focal length F to obtain the required AG_hor and AG_ver.

It can be understood that this embodiment adjusts the visual field parameters of the virtual camera, namely the horizontal field angle AG_hor and the vertical field angle AG_ver, so that the visual field of the virtual camera coincides with that of the real camera.
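As a minimal sketch of this relationship (assuming a simple pinhole model; the sensor dimensions and focal length below are hypothetical example values, not specified in the text), the field angles follow directly from formula (1):

```python
import math

def field_of_view(sensor_w: float, sensor_h: float, focal: float):
    """Field angles (degrees) and aspect ratio of a pinhole camera.

    Applies formula (1): tan(AG/2) = (sensor dimension) / (2 * F),
    so adjusting F alone yields the required AG_hor and AG_ver.
    All lengths must be in the same unit, e.g. millimeters.
    """
    ag_hor = math.degrees(2 * math.atan(sensor_w / (2 * focal)))
    ag_ver = math.degrees(2 * math.atan(sensor_h / (2 * focal)))
    return ag_hor, ag_ver, sensor_w / sensor_h

# Hypothetical sensor of 5.37 mm x 4.04 mm behind a 4 mm lens
print(field_of_view(5.37, 4.04, 4.0))  # ~(67.8, 53.6, 1.33)
```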
Further, after the visual field of the virtual camera in the three-dimensional virtual scene has been adjusted, the embodiment of the invention can acquire the video of the monitoring scene and fuse it with the three-dimensional virtual scene for display, which includes framing the monitoring target in the video of the monitoring scene and displaying the monitoring target frame in the three-dimensional virtual scene corresponding to the monitoring scene.

It should be noted that the fusion and display of the video of the monitoring scene with the three-dimensional virtual scene will be described in detail in the following embodiments and is not elaborated here.
S102, determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene.
Since the monitoring target may be a person or a vehicle, this embodiment determines the behavior of the monitoring target using different analysis modes according to the type of target.
For example, if the monitoring target is a person, it may be determined whether the behavior of the monitoring person is normal behavior or abnormal behavior according to the vertical height of the monitoring person frame.
For another example, if the monitored target is a vehicle, it may be determined whether the monitored vehicle is parked in a designated area according to the position of at least one corner of the monitored vehicle frame.
And S103, sending out early warning information when the behavior of the monitoring target is abnormal.
Abnormal behavior refers to behavior with a potential safety hazard, such as a person climbing or falling, or a vehicle parking illegally.

That is to say, when it is determined that the behavior of the monitoring target poses a potential safety hazard, the electronic equipment automatically sends out early warning information, so that staff can stop the monitored target's dangerous behavior in time or take remedial measures, thereby ensuring the safety of the monitored target.
According to the behavior analysis method provided by the embodiment of the invention, the monitoring target is framed in the video of the monitoring scene, the monitoring target frame is displayed in the three-dimensional virtual scene corresponding to the monitoring scene, the behavior of the monitoring target is determined according to the position of the monitoring target frame in the three-dimensional virtual scene, and when the behavior of the monitoring target is abnormal, the early warning information is sent. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
The following describes, with reference to fig. 2, a process of fusing and displaying a video of a monitored scene and a three-dimensional virtual scene in an embodiment of the present invention. As shown in fig. 2, the method specifically includes:
s201, obtaining the video of the monitoring scene, and determining whether the video format of the video is consistent with the video format supported by the three-dimensional virtual scene, if not, executing S202, otherwise, executing S203.
In this embodiment, the video format refers to a data stream format of a video. Wherein the video format comprises: RGB format, YUV format, and the like, and are not particularly limited herein.
Illustratively, the embodiment of the present invention may acquire the video of the monitored scene through the video connection interface with the monitored scene.
Because the video format of the video of the monitoring scene may be different from the video format supported by the three-dimensional virtual scene, the embodiment of the invention can match the video format of the acquired video with the video format supported by the three-dimensional virtual scene after the video of the monitoring scene is acquired, so as to determine whether the video format of the video is consistent with the video format supported by the three-dimensional virtual scene. If so, performing subsequent operations based on the acquired video; and if not, converting the video format of the video according to the video format supported by the three-dimensional virtual scene so as to enable the converted video format to be consistent with the video format supported by the three-dimensional virtual scene, and further normally displaying the video picture in the three-dimensional virtual scene.
And S202, if the video formats are not consistent, converting the video formats of the videos according to the video formats supported by the three-dimensional virtual scene.
For example, if the video format of the video is YUV format and the video format supported by the three-dimensional virtual scene is RGB format, the video format of the video may be converted from YUV format to RGB format according to a format conversion algorithm.
It should be noted that the fluency of video playing should be ensured when converting the video format. In this embodiment, format conversion can be accelerated by multithreading, which avoids the frame skipping, stuttering, and unsmooth playback that format conversion could otherwise cause.
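A minimal sketch of such a conversion, assuming packed YUV444 input and BT.601 full-range coefficients (the text does not specify the exact YUV variant; planar formats such as I420 would need chroma upsampling first):

```python
import numpy as np

def yuv_to_rgb(yuv: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 packed YUV image (BT.601, full range) to RGB."""
    y = yuv[..., 0].astype(np.float32)
    u = yuv[..., 1].astype(np.float32) - 128.0
    v = yuv[..., 2].astype(np.float32) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    # stack channels and clamp back to displayable 8-bit range
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```

In practice such a per-frame conversion would be dispatched across worker threads (or done on the GPU), in line with the multithreading note above.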
And S203, if the formats are consistent, no processing is needed.
In the embodiment of the present invention, after the video format has been converted (or found not to need conversion), the visible objects that are not occluded in the video may further be determined according to the virtual camera in the three-dimensional virtual scene, so that the video pictures of the unoccluded objects are displayed in the three-dimensional virtual scene. The unoccluded visible objects comprise monitoring targets and non-monitoring targets; a non-monitoring target in this embodiment may refer to objects such as green plants, street lamps, and fences. Steps S204-S207 below describe how the unoccluded visible objects in the video are determined.
S204, determining at least one unoccluded visible object in the video according to the virtual camera in the three-dimensional virtual scene.
In a specific implementation, the at least one unoccluded visible object in the video can be determined by the following steps, as shown in fig. 3.
s31, determining at least two visible objects in the video according to the visible field of the virtual camera.
The view frustum is a mathematical model of the camera's field of view: objects within the frustum are visible objects, while objects outside it are invisible. Therefore, this embodiment can quickly determine at least one visible object according to the view frustum corresponding to the visual field of the virtual camera.
S32, determining the depth value of each pixel point of each visible object on the depth map and the space distance between each pixel point of each visible object and the virtual camera.
Wherein, the depth map represents the distance between each pixel point in the visual field of the camera and the camera.
Optionally, a depth map corresponding to the virtual camera may be obtained first, and then the depth value of each pixel point of each visible object on the depth map and the spatial distance between each pixel point of each visible object and the virtual camera may be determined.
In a specific implementation, the spatial distance between every pixel point in the virtual camera's visual field and the virtual camera can be calculated first; the pixel value of each pixel point is then computed from that spatial distance together with the near clipping plane distance and far clipping plane distance of the virtual camera's view frustum. The depth map corresponding to the virtual camera is then obtained from the pixel values of all pixel points in the visual field.
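The text leaves the exact encoding unspecified; a minimal sketch under the assumption of a linear mapping between the near and far clipping planes:

```python
import numpy as np

def encode_depth_map(distances: np.ndarray, near: float, far: float) -> np.ndarray:
    """Map per-pixel spatial distances to normalized depth values in [0, 1].

    distances: HxW array of spatial distances from each pixel in the
    virtual camera's visual field to the camera. A linear encoding is
    assumed here; real rendering pipelines often store nonlinear depth.
    """
    return ((np.clip(distances, near, far) - near) / (far - near)).astype(np.float32)
```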
After obtaining the depth map, the present embodiment may determine a depth value of each pixel point of each visible object on the depth map and a spatial distance between each pixel point of each visible object and the virtual camera.
For example, determining the depth value of each pixel point of each visible object on the depth map and the spatial distance between each pixel point of each visible object and the virtual camera includes: determining the depth value of each pixel point on the depth map according to the pixel value of the pixel point, the near clipping plane distance of the virtual camera, and the far clipping plane distance of the virtual camera; and determining the spatial distance between each pixel point and the virtual camera according to the world coordinates of the pixel point and the world coordinates of the virtual camera.
To determine the depth value of each pixel point of each visible object on the depth map, the embodiment of the invention may first determine the world coordinate pos_wi of each pixel point of each visible object based on the video of the monitored scene and the world coordinate system of the real camera, where i denotes the i-th pixel point and i is a positive integer. Then, from the world coordinate pos_wi of each pixel point, the two-dimensional coordinate V_i of the pixel point on the imaging image of the virtual camera is calculated. According to V_i, the pixel value R at the corresponding position is sampled from the depth map, so that the depth value of each pixel point on the depth map can be determined based on the pixel value of the pixel point, the near clipping plane distance of the virtual camera, and the far clipping plane distance of the virtual camera.
In this embodiment, when the spatial distance between each pixel point of each visible object and the virtual camera is determined, the spatial distance can be calculated according to the world coordinates of the virtual camera and the world coordinates of each pixel point.
It should be noted that, in this embodiment, reference is made to the prior art for a specific process of obtaining a depth map, and determining, based on the depth map, a depth value of each pixel point of each visible object on the depth map, and a spatial distance between each pixel point of each visible object and a virtual camera, which is not described in detail herein.
S33, determining at least one un-occluded visible object according to the depth value of each pixel point of each visible object on the depth map and the space distance between each pixel point of each visible object and the virtual camera.
Specifically, the depth value of each pixel point of each visible object on the depth map may be compared with the spatial distance between each pixel point of each visible object and the virtual camera. If the depth values of all pixel points in any visible object on the depth map are smaller than the spatial distance, determining that the visible object is blocked; and if the depth value of at least one pixel point in any visible object on the depth map is larger than or equal to the spatial distance, determining that the visible object is not blocked.
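A minimal sketch of this visibility test, reusing the linear depth encoding assumed above (the tolerance value is an assumption):

```python
import numpy as np

def is_unoccluded(depth_map: np.ndarray, pixels: np.ndarray,
                  distances: np.ndarray, near: float, far: float) -> bool:
    """Visibility test for one object per the rule above.

    pixels: (N, 2) integer (x, y) image coordinates of the object's pixel
    points; distances: (N,) spatial distances of those points to the
    virtual camera. The object is occluded only when every stored depth
    is smaller than the point's own distance (something nearer covers
    every pixel); one pixel with stored depth >= distance means visible.
    """
    stored = near + depth_map[pixels[:, 1], pixels[:, 0]] * (far - near)
    eps = 1e-3  # depth-precision tolerance (assumption)
    return bool(np.any(stored >= distances - eps))
```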
When any visible object is determined to be occluded, the virtual camera can eliminate the occluded visible object, namely, the video image of the occluded visible object is not projected in the three-dimensional virtual scene.
For example, as shown in fig. 4, the object EFGH is located on the back side of the object ABCD, i.e., the object EFGH is completely occluded, and thus the object EFGH is an occluded visible object for the virtual camera.
S205, if the at least one unobstructed visible object is determined to include the monitoring target, the monitoring target is selected in the video of the monitored scene.
For example, after determining the unoccluded visible objects, the embodiment of the present invention may determine whether an unoccluded visible object is the monitoring target by using a recognition algorithm. If it is, the monitoring target is framed in the video of the monitoring scene, and the set of visible pixel points of the framed monitoring target is sent to the graphics processor in the electronic equipment, so that the graphics processor performs display processing based on that set. The recognition algorithm may be chosen according to the monitored target: for example, if the monitored target is a person, it may be a face recognition or human body recognition algorithm; if the monitored target is a vehicle, it may be a license plate recognition algorithm or another recognition algorithm, which this embodiment does not specifically limit.
It should be noted that, in the embodiment of the present invention, the sets of visible pixel points of non-monitoring targets (the other unoccluded visible objects) other than the monitoring target may also be sent to the graphics processor, so that it performs display processing based on those sets.
And S206, acquiring the pixel values of the visible pixel points of the monitoring target and the pixel values of the visible pixel points of the non-monitoring target.
In the embodiment of the invention, after receiving the set of visible pixel points of the framed monitoring target and the sets of visible pixel points of the non-monitoring targets, the graphics processor can use GPU acceleration to calculate, for each visible pixel point in each set, the two-dimensional image position corresponding to its position in three-dimensional space, and sample the video at each of these positions to obtain the pixel value at each visible pixel point's coordinate.
And S207, replacing the pixel value of the pixel point which is positioned at the same position as the visible pixel point in the three-dimensional virtual scene with the pixel value of the visible pixel point according to a texture mapping method, and performing frame selection display.
For example, the graphics processor may replace the pixel value of the pixel point located at the same position as the visible pixel point in the three-dimensional virtual scene with the pixel value of the visible pixel point based on a texture mapping method, and perform framing display.
For example, as shown in fig. 5, monitoring target frames (marked 51 and 52 in the figure) and the video images of other unoccluded visible objects are displayed in the three-dimensional virtual scene corresponding to the monitored scene.
According to the technical scheme provided by the embodiment of the invention, the video of the monitoring scene and the three-dimensional virtual scene are fused and displayed, so that the traditional video monitoring service can be efficiently introduced into the three-dimensional virtual monitoring, and a user can experience the dynamic panoramic monitoring service in the immersive virtual scene.
As can be seen from the foregoing description, in the embodiment of the present invention, the monitoring target is selected by frames, the monitoring target frame is displayed in the three-dimensional virtual scene, and the monitoring target behavior is determined according to the position of the monitoring target frame, so that the early warning information is sent when the monitoring target behavior is abnormal.
In an implementation scenario of the present invention, when the monitored target is a person, the present invention determines a behavior of the monitored target according to the position of the monitored target frame, specifically, determines the behavior of the person according to a first vertical distance between target pixel points taken from horizontal boundary lines of the monitored person frame and/or a second vertical distance between the first target pixel point and the ground. The first target pixel point is a pixel point close to the ground. The following describes the above-mentioned situation of the behavior analysis method provided by the embodiment of the present invention with reference to fig. 6. The method specifically comprises the following steps:
s601, selecting a monitoring person in a video of a monitoring scene, and displaying a monitoring person frame in a three-dimensional virtual scene corresponding to the monitoring scene.
S602, taking one target pixel point from each of the two horizontal boundary lines of the monitored person frame, and determining a first vertical distance between the two target pixel points and a second vertical distance between the first target pixel point and the ground according to the space coordinate of each target pixel point, where the first target pixel point is the pixel point close to the ground.
Based on the above embodiment, taking the monitored person frame A in fig. 5 as an example, one target pixel point can be taken from each of the two horizontal boundary lines of frame A: for example, one target pixel point is the pixel point Q at the upper-left corner and the other is the pixel point S at the lower-left corner, in which case the first target pixel point is the pixel point S. Alternatively, the pixel point at the lower-left corner of the monitored person frame can be taken as one target pixel point and the pixel point at the upper-right corner as the other, in which case the first target pixel point is the pixel point at the lower-left corner, and so on.
After one target pixel point has been taken from each of the two horizontal boundary lines of the monitored person frame, the embodiment of the invention can determine the space coordinate of each target pixel point. In a specific implementation, the pixel value of each target pixel point is obtained from the depth map; the distance between each target pixel point and the virtual camera is determined according to the pixel value of the target pixel point, the near clipping plane distance of the virtual camera, and the far clipping plane distance of the virtual camera; and the space coordinate of each target pixel point is determined according to the world coordinate of the virtual camera, the direction of the target pixel point, and the distance between the target pixel point and the virtual camera. For the specific determination process, reference is made to existing schemes, which are not detailed here.
Further, the first vertical distance between the two target pixel points and the second vertical distance between the first target pixel point and the ground are determined according to the space coordinates of each target pixel point; both can be computed with the two-point distance formula, and the specific calculation is not detailed here. It should be noted that the second vertical distance between the first target pixel point and the ground is, specifically, the vertical distance between the first target pixel point and its projection onto the ground.
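A minimal sketch of both steps (the up axis and ground height are assumptions; the text does not fix a coordinate convention):

```python
import numpy as np

def target_pixel_world(cam_pos: np.ndarray, ray_dir: np.ndarray,
                       distance: float) -> np.ndarray:
    """Space coordinate of a target pixel: step from the virtual camera's
    world coordinate along the pixel's viewing direction by the distance
    recovered from the depth map."""
    return cam_pos + distance * ray_dir / np.linalg.norm(ray_dir)

def vertical_distances(p_upper: np.ndarray, p_lower: np.ndarray,
                       ground_height: float = 0.0):
    """First vertical distance (between the two target pixel points) and
    second vertical distance (first, ground-side target pixel point to the
    ground), taking the y axis as 'up' (assumption)."""
    first = abs(p_upper[1] - p_lower[1])
    second = abs(p_lower[1] - ground_height)
    return first, second
```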
S603, determining the behavior of the person according to the first vertical distance and/or the second vertical distance.
Specifically, the determining the behavior of the person according to the first vertical distance and/or the second vertical distance may be implemented by:
first mode
And if the second vertical distance between the first target pixel point and the ground is greater than or equal to a first distance threshold, determining that the behavior of the person is climbing.
The first distance threshold is set according to a distance of the person in the actual scene when the person performs the climbing action, for example, set to 0.3m or 0.5m, and the like, which is not specifically limited herein.
For example, if the first distance threshold is 0.5m, when the second vertical distance between the point a of the first target pixel and the ground is 0.7m, it is determined that the behavior of the person is climbing.
Second mode
And if the second vertical distance between the first target pixel point and the ground is smaller than the first distance threshold and the first vertical distance is smaller than the second distance threshold, determining that the person's behavior is falling.
The first distance threshold is set according to a distance of the person in the actual scene when the person performs the climbing action, for example, set to 0.3m or 0.5m, and the like, which is not specifically limited herein.
The second distance threshold is set according to the vertical height of the human body when a person falls, or may be set in other ways, which are not specifically limited here. For example, it may be set to 0.3 m or 0.4 m; or, alternatively, to 0.5 m, etc.

Assuming that the first distance threshold is 0.3 m, the second distance threshold is 0.5 m, the first target pixel point selected from the monitored person frame is point A, and the second target pixel point is point B: when the second vertical distance between point A and the ground is 0.1 m and the first vertical distance between point A and point B is 0.46 m, the person's behavior is determined to be falling.
Third mode
And if the first vertical distance is greater than or equal to the second distance threshold and less than the third distance threshold, determining that the person squats or sits.
The third distance threshold is set according to the normal height parameter of the person, for example, set to 1m or 1.1 m.
That is, when the first vertical distance between the two target pixel points of the monitored person frame does not exceed the third distance threshold, it indicates that the person is in a non-standing state such as squatting or sitting.
Fourth mode
And if the first vertical distance is greater than the third distance threshold, determining that the person's behavior is standing.
That is, in this embodiment, the behavior of the person may be determined according to a first vertical distance between two target pixel points respectively taken from two horizontal boundary lines of the monitoring person frame and/or a second vertical distance between a target pixel point close to the ground and the ground.
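A minimal rule-based sketch of the four modes, with defaults taken from the example thresholds in the text (0.3 m, 0.5 m, 1.0 m):

```python
def classify_person_behavior(first_vertical: float, second_vertical: float,
                             t1: float = 0.3, t2: float = 0.5,
                             t3: float = 1.0) -> str:
    """first_vertical: distance between the two target pixel points;
    second_vertical: distance from the first (ground-side) target pixel
    point to the ground. t1/t2/t3 are the first/second/third distance
    thresholds described above."""
    if second_vertical >= t1:
        return "climbing"            # first mode: lifted off the ground
    if first_vertical < t2:
        return "falling"             # second mode: near ground, low box
    if first_vertical < t3:
        return "squatting or sitting"  # third mode: t2 <= height < t3
    return "standing"                # fourth mode

# Example from the text: ground clearance 0.1 m, frame height 0.46 m
print(classify_person_behavior(0.46, 0.1))  # -> falling
```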
It should be noted that the above-mentioned several ways of determining the behavior of the monitoring person are only exemplary, and should not be taken as a specific limitation to the present invention.
It should be noted that, in the embodiment of the present invention, multiple frames of images of the same monitored person may also be obtained, and whether the person's behavior poses a potential safety hazard can be determined by comparing the person's behavior across those frames. For example, by comparing multiple frames it can be determined that a person is crossing a road, and thus whether that behavior is safe.
S604, if the behavior of the person is abnormal, early warning information is sent out.
When the behavior of the monitored person is determined to be climbing, falling, or another abnormal behavior with a potential safety hazard, early warning information is sent to the staff, so that the staff can stop the person's dangerous behavior in time or take protective measures, improving the person's safety.

That is, in this embodiment, if the monitoring target is a person and the person's behavior is climbing or falling, it is determined that the behavior of the monitoring target is abnormal, and early warning information is issued.
According to the technical scheme provided by the embodiment of the invention, when the monitored target is a person, the behavior of the monitored person is determined according to the position of the frame of the monitored person in the three-dimensional virtual scene, so that early warning information is sent out when the behavior of the monitored person is abnormal, a guardian can take protective measures in time based on the early warning information, and the safety of the person is improved. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
In an implementation scenario of the invention, when the monitored target is a vehicle, the behavior of the monitored target is determined according to the position of the monitored target frame, specifically, when the position of the vehicle is unchanged when the preset time duration is exceeded, the behavior of the monitored target is determined according to the position of the vehicle frame. The above-mentioned situation of the behavior analysis method provided by the embodiment of the present invention is explained with reference to fig. 7. The method specifically comprises the following steps:
s701, selecting a monitoring vehicle in a video of a monitoring scene, and displaying a monitoring vehicle frame in a three-dimensional virtual scene corresponding to the monitoring scene.
S702, at least one vertex on the monitoring vehicle frame is obtained and used as a target pixel point.
It should be noted that, in order to determine whether the monitored vehicle is parked illegally, this embodiment may first determine whether the staying time of the monitored vehicle frame at the current position exceeds a preset time period; when it does, the monitored vehicle is determined to be in a parking state. At that moment, at least one corner of the monitored vehicle frame is obtained as a target pixel point: for example, the upper-left corner; or the lower-right corner; or the lower-left corner, and so on, which this embodiment does not specifically limit.
The preset duration can be adaptively set according to the actual application scenario, and is not specifically limited here. For example, 2 minutes (min) or 3min, etc.
For example, if the preset duration is 2 min, then when the staying time of the monitored vehicle frame at the current position exceeds it (say, reaches 3 min), the monitored vehicle is determined to be in a parking state, and at least one corner of the monitored vehicle frame is obtained as a target pixel point.
And S703, determining the behavior of the vehicle according to the matching degree between the space coordinate of each target pixel point and the space coordinate in the preset area and the type of the preset area.
In this embodiment, the preset area types include: parking areas and no parking areas.
Before performing S703, the space coordinate of each target pixel point may be determined. In a specific implementation, the pixel value of each target pixel point is obtained from the depth map; the distance between each target pixel point and the virtual camera is determined according to the pixel value of the target pixel point, the near clipping plane distance of the virtual camera, and the far clipping plane distance of the virtual camera; and the space coordinate of each target pixel point is determined according to the world coordinate of the virtual camera, the direction of the target pixel point, and the distance between the target pixel point and the virtual camera.
After the space coordinates of each target pixel point are determined, the behavior of the vehicle can be determined according to the matching degree between the space coordinates of each target pixel point and the space coordinates in the preset area and the type of the preset area.
During specific implementation, if the type of the preset area is a parking area, determining the behavior of the vehicle according to the matching degree between the space coordinates of each target pixel point and the space coordinates in the preset area and the type of the preset area, and implementing the following steps:
and S801, acquiring a parking space closest to the vehicle frame.
Specifically, the distances between at least two corner positions of the vehicle frame and at least two corner positions of the surrounding parking spaces can be calculated with the two-point distance formula. The two smallest of all these distance values are selected, and the parking space corresponding to them is determined to be the parking space closest to the vehicle frame.
S802, determining the space coordinate of each target pixel point, and determining whether the matching degree between the space coordinate of each target pixel point and the space coordinate in the parking space reaches a matching degree threshold value.
And S803, if the matching degree between the space coordinate of any target pixel point and the space coordinate in the parking space does not reach the threshold value of the matching degree, determining that the behavior of the vehicle is an illegal parking behavior.
In this embodiment, the matching degree threshold is set according to the actual application requirement, for example, to 1. For example, if the distance between any two pixel points is 0, the matching degree between the two pixel points is determined to be 1; otherwise, it is 0.
Optionally, the distance between the space coordinate of each target pixel point and each space coordinate in the parking space may be calculated by using an euclidean distance or other methods. And then, determining the matching degree between the space coordinate of each target pixel point and each space coordinate in the parking space according to the distance, and comparing the matching degree with a matching degree threshold value so as to determine the behavior of the vehicle according to the comparison result.
In a specific implementation, with a matching degree threshold of 1: if, for every target pixel point, the distance between its space coordinate and some space coordinate in the parking space is 0, the matching degree of each target pixel point reaches the threshold 1, and the vehicle's behavior is determined to be normal. If the distance between the space coordinate of any target pixel point and all space coordinates in the parking space is nonzero, the matching degree of that target pixel point does not reach the threshold 1 (i.e., it is below the matching degree threshold), and the vehicle's current behavior is determined to be illegal parking.
In another embodiment of the present invention, if the preset area type is a no-parking area, the behavior of the vehicle is determined according to the matching degree between the space coordinate of each target pixel point and the space coordinate in the preset area and the preset area type, and the method includes the following steps:
s901, determining whether the spatial coordinates of each target pixel point and the matching degree between the spatial coordinates in the no-parking area reach a matching degree threshold value;
and S902, if the matching degree between the space coordinate of any target pixel point and the space coordinate in the no-parking area reaches the matching degree threshold value, determining that the behavior of the vehicle is an illegal parking behavior.
In this embodiment, the threshold of the matching degree may be set to be the same as the threshold of the matching degree in S802, for example, set to 1. Of course, it may be different, and is not particularly limited herein.
Optionally, the distance between the spatial coordinate of each target pixel point and each spatial coordinate in the no-parking area may be calculated by using an euclidean distance or other methods. And then, determining the matching degree between the space coordinate of each target pixel point and each space coordinate in the no-parking area according to the distance, so as to compare the matching degree with a matching degree threshold value, and determine the behavior of the vehicle according to the comparison result.
In a specific implementation, if the distance between the space coordinate of any target pixel point and some space coordinate in the no-parking area is 0, the matching degree of that target pixel point reaches the threshold 1, and the vehicle's current behavior is determined to be illegal parking; if the distances between the space coordinate of every target pixel point and all space coordinates in the no-parking area are nonzero, no matching degree reaches the threshold 1 (i.e., all are below the matching degree threshold), and the vehicle's behavior is determined to be normal.
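A minimal sketch covering both area types, with the zero-distance matching rule above relaxed by a small tolerance (an assumption, since exact floating-point equality rarely holds; the region-type labels are also assumptions):

```python
import numpy as np

def matches(point: np.ndarray, region: np.ndarray, tol: float = 1e-3) -> bool:
    """Matching degree of one target pixel point against a region: 1 when
    its distance to some space coordinate of the region is (near) zero."""
    return bool(np.any(np.linalg.norm(region - point, axis=1) <= tol))

def vehicle_behavior(corners: np.ndarray, region: np.ndarray,
                     region_type: str) -> str:
    """corners: (N, 3) space coordinates of the monitored vehicle frame's
    corner target pixel points; region: (M, 3) space coordinates of the
    parking space or no-parking area; region_type: 'parking' or 'no_parking'."""
    hits = [matches(c, region) for c in corners]
    if region_type == "parking":
        # any corner failing to match the parking space -> illegal parking
        return "normal" if all(hits) else "illegal parking"
    # no-parking area: any corner matching it -> illegal parking
    return "illegal parking" if any(hits) else "normal"
```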
And S704, if the behavior of the vehicle is abnormal, sending out early warning information.
When the behavior of the monitored vehicle is determined to be an abnormal behavior with a potential safety hazard, such as illegal parking, early warning information is sent to the staff so that they can stop the vehicle owner's dangerous behavior in time or take protective measures, thereby improving safety.
That is, in this embodiment, if the monitored target is a vehicle and the behavior of the vehicle is an illegal parking behavior, it is determined that the behavior of the monitored target is abnormal, and warning information is sent.
According to the technical scheme provided by the embodiment of the invention, when the monitored target is the vehicle, the behavior of the monitored vehicle is determined according to the position of the frame of the monitored vehicle in the three-dimensional virtual scene, so that early warning information is sent out when the behavior of the monitored vehicle is abnormal, therefore, a guardian can take protective measures in time based on the early warning information, and the safety of the guardian is improved. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
In order to achieve the above object, an embodiment of the present invention further provides a behavior analysis device.
Fig. 10 is a schematic structural diagram of a behavior analysis device in an embodiment of the present invention. As shown in fig. 10, the behavior analysis device 1000 includes: a target display module 1010, a behavior determination module 1020, and an early warning module 1030.
The target display module 1010 is configured to select a monitoring target in a video of a monitoring scene, and display a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
a behavior determining module 1020, configured to determine a behavior of the monitoring target according to a position of the monitoring target frame in the three-dimensional virtual scene;
and the early warning module 1030 is used for sending early warning information when the behavior of the monitoring target is abnormal.
As an optional implementation manner of the embodiment of the present invention, the behavior analysis device 1000 further includes: a format determining module, a format conversion module, and a first determining module;
the format determining module is used for acquiring the video of the monitoring scene and determining whether the video format of the video is consistent with the video format supported by the three-dimensional virtual scene;
the format conversion module is used for converting the video format of the video according to the video format supported by the three-dimensional virtual scene if the video format is inconsistent with the video format supported by the three-dimensional virtual scene;
a first determining module, configured to determine, according to a virtual camera in the three-dimensional virtual scene, at least one unobstructed visible object in the video;
accordingly, the target display module 1010 is specifically configured to:
and if the monitoring target is determined to be included in the at least one unobstructed visible object, the monitoring target is selected in the video of the monitored scene.
As an optional implementation manner of the embodiment of the present invention, the first determining module includes: a first determination unit, a second determination unit, and a third determination unit;
wherein the first determining unit is used for determining at least two visible objects in the video according to the visible field of the virtual camera;
a second determining unit, configured to determine a depth value of each pixel point of each visible object on the depth map, and a spatial distance between each pixel point of each visible object and the virtual camera;
and the third determining unit is used for determining at least one un-occluded visible object according to the depth value of each pixel point of each visible object on the depth map and the spatial distance between each pixel point of each visible object and the virtual camera.
As an optional implementation manner of the embodiment of the present invention, if the monitoring target is a person, the behavior determining module 1020 is configured to:
take one target pixel point from each of the two horizontal boundary lines of the monitored person frame, and determine a first vertical distance between the two target pixel points and a second vertical distance between the first target pixel point and the ground according to the space coordinate of each target pixel point, where the first target pixel point is the pixel point close to the ground;

and determine the behavior of the person according to the first vertical distance and/or the second vertical distance.
As an optional implementation manner of the embodiment of the present invention, the behavior determining module 1020 is specifically configured to:
if the second vertical distance between the first target pixel point and the ground is larger than or equal to a first distance threshold value, determining that the behavior of the person is climbing;
and if the second vertical distance between the first target pixel point and the ground is smaller than the first distance threshold value and the first vertical distance is smaller than the second distance threshold value, determining that the behavior of the person is falling down.
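By way of illustration only, the person-behavior rule may be sketched in Python as follows; the concrete threshold values and the flat ground plane at a fixed height are assumptions of the sketch, not values given by the embodiment.

FIRST_DISTANCE_THRESHOLD = 0.5   # hypothetical value, in metres
SECOND_DISTANCE_THRESHOLD = 1.0  # hypothetical value, in metres

def classify_person(upper_pixel_y, lower_pixel_y, ground_y=0.0):
    # First vertical distance: between the two target pixel points.
    first_vertical = upper_pixel_y - lower_pixel_y
    # Second vertical distance: from the first target pixel point
    # (the one closer to the ground) down to the ground.
    second_vertical = lower_pixel_y - ground_y
    if second_vertical >= FIRST_DISTANCE_THRESHOLD:
        return "climbing"
    if first_vertical < SECOND_DISTANCE_THRESHOLD:
        # second_vertical is already below the first threshold on this branch.
        return "falling"
    return "normal"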
As an optional implementation manner of the embodiment of the present invention, if the monitoring target is a vehicle, the behavior determination module 1020 includes: a target pixel point obtaining unit and a behavior determining unit;
the target pixel point acquisition unit is used for acquiring at least one vertex on the monitoring vehicle frame as a target pixel point;
and the behavior determining unit is used for determining the behavior of the vehicle according to the matching degree between the space coordinate of each target pixel point and the space coordinate in the preset area and the type of the preset area.
As an optional implementation manner of the embodiment of the present invention, if the preset area type is a parking area, the behavior determining unit is specifically configured to:
acquiring a parking space closest to the vehicle frame;
determining whether the matching degree between the spatial coordinates of each target pixel point and the spatial coordinates in the parking space reaches a matching degree threshold value;
and if the matching degree between the spatial coordinates of any target pixel point and the spatial coordinates in the parking space does not reach the matching degree threshold value, determining that the behavior of the vehicle is an illegal parking behavior.
As an optional implementation manner of the embodiment of the present invention, if the preset area type is a no-parking area, the behavior determining unit is further configured to:
determining whether the matching degree between the spatial coordinates of each target pixel point and the spatial coordinates in the no-parking area reaches a matching degree threshold value;
and if the matching degree between the spatial coordinates of any target pixel point and the spatial coordinates in the no-parking area reaches the matching degree threshold value, determining that the behavior of the vehicle is an illegal parking behavior.
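By way of illustration only, both vehicle rules may be sketched in Python as follows; modelling each preset area as an axis-aligned bounding box and reducing the per-vertex matching degree to an inside/outside test are assumptions of the sketch.

def vertex_matches(vertex, region_min, region_max):
    # Binary stand-in for the matching-degree test: the vertex's spatial
    # coordinates fall inside the region's bounding box.
    return all(lo <= c <= hi for c, lo, hi in zip(vertex, region_min, region_max))

def vehicle_behavior(vertices, region_min, region_max, region_type):
    if region_type == "parking":
        # Illegal when any frame vertex fails to match the nearest parking space.
        if not all(vertex_matches(v, region_min, region_max) for v in vertices):
            return "illegal parking"
    elif region_type == "no_parking":
        # Illegal when any frame vertex matches the no-parking area.
        if any(vertex_matches(v, region_min, region_max) for v in vertices):
            return "illegal parking"
    return "normal"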
As an optional implementation manner of the embodiment of the present invention, the early warning module 1030 is specifically configured to:
if the monitoring target is a person and the behavior of the person is climbing or falling down, determining that the behavior of the monitoring target is abnormal and sending out early warning information;
and if the monitored target is a vehicle and the behavior of the vehicle is illegal parking behavior, determining that the behavior of the monitored target is abnormal and sending out early warning information.
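By way of illustration only, the early-warning decision of the early warning module may be sketched in Python as follows; the alert channel (a simple print) is an assumption of the sketch.

ABNORMAL_BEHAVIORS = {
    "person": {"climbing", "falling"},
    "vehicle": {"illegal parking"},
}

def maybe_warn(target_type, behavior):
    # Send out early warning information when the behavior is abnormal
    # for the given monitoring target type.
    if behavior in ABNORMAL_BEHAVIORS.get(target_type, set()):
        print(f"EARLY WARNING: abnormal {target_type} behavior: {behavior}")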
It should be noted that the foregoing explanation of the embodiment of the behavior analysis method is also applicable to the behavior analysis device of the embodiment, and the implementation principle thereof is similar and will not be described herein again.
According to the technical scheme provided by the embodiment of the invention, the monitoring target is framed in the video of the monitoring scene, the monitoring target frame is displayed in the three-dimensional virtual scene corresponding to the monitoring scene, the behavior of the monitoring target is determined according to the position of the monitoring target frame in the three-dimensional virtual scene, and when the behavior of the monitoring target is abnormal, the early warning information is sent. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
In order to achieve the above object, an embodiment of the present invention further provides an electronic device.
Fig. 11 is a schematic structural diagram of an exemplary electronic device 1100 suitable for implementing embodiments of the present invention. The electronic device 1100 shown in fig. 11 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present invention.
As shown in fig. 11, electronic device 1100 is embodied in the form of a general purpose computing device. The components of the electronic device 1100 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 1100 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 1100 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 1100 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 11, and commonly referred to as a "hard drive"). Although not shown in FIG. 11, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
The electronic device 1100 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the electronic device 1100, and/or with any devices (e.g., network card, modem, etc.) that enable the electronic device 1100 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 1100 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the electronic device 1100 over a bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1100, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, to implement a behavior analysis method provided by the embodiment of the present invention, including:
selecting a monitoring target in a video of a monitoring scene, and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
and when the behavior of the monitoring target is abnormal, sending out early warning information.
It should be noted that the foregoing explanation of the embodiment of the behavior analysis method is also applicable to the electronic device of the embodiment, and the implementation principle thereof is similar and will not be described herein again.
According to the electronic device provided by the embodiment of the invention, the monitoring target is framed in the video of the monitoring scene, the monitoring target frame is displayed in the three-dimensional virtual scene corresponding to the monitoring scene, the behavior of the monitoring target is determined according to the position of the monitoring target frame in the three-dimensional virtual scene, and when the behavior of the monitoring target is abnormal, the early warning information is sent. Therefore, automatic analysis of the monitoring target behaviors is achieved, and early warning is automatically carried out when the monitoring target behaviors are abnormal, so that the labor cost is reduced, and the behavior analysis efficiency of the monitoring target is improved.
In order to achieve the above object, the present invention also provides a computer-readable storage medium.
A computer-readable storage medium provided in an embodiment of the present invention stores thereon a computer program, which when executed by a processor implements a behavior analysis method according to an embodiment of the present invention, the method including:
selecting a monitoring target in a video of a monitoring scene, and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
and when the behavior of the monitoring target is abnormal, sending out early warning information.
Computer storage media for embodiments of the present invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in some detail by the above embodiments, the invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the invention, and the scope of the invention is determined by the scope of the appended claims.

Claims (9)

1. A method of behavioral analysis, comprising:
constructing a three-dimensional virtual scene with the same size as the monitoring scene according to the size of the monitoring scene;
setting a corresponding virtual camera in the three-dimensional virtual scene according to the installation position and the model of a real camera in the monitoring scene, correspondingly adjusting the visual field parameters of the virtual camera according to the visual field of the real camera, and acquiring the video of the monitoring scene through the virtual camera so as to fuse and display the video of the monitoring scene with the three-dimensional virtual scene, wherein the virtual camera is capable of capturing the three-dimensional virtual scene environment;
determining at least one unoccluded visible object in the video according to a virtual camera in the three-dimensional virtual scene;
selecting a monitoring target in a video of a monitoring scene, and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
when the behavior of the monitoring target is abnormal, sending out early warning information;
wherein the determining at least one unoccluded visible object in the video comprises:
determining at least two visible objects in the video according to the visible field of the virtual camera;
determining the depth value of each pixel point of each visible object on the depth map and the space distance between each pixel point of each visible object and the virtual camera;
and if the depth value of each pixel point of any visible object on the depth map is greater than or equal to the corresponding spatial distance, determining that the visible object is unoccluded.
2. The method of claim 1, wherein prior to framing a surveillance target in a video of a surveillance scene, the method further comprises: acquiring a video of the monitoring scene, and determining whether the video format of the video is consistent with the video format supported by the three-dimensional virtual scene;
if not, converting the video format of the video according to the video format supported by the three-dimensional virtual scene;
correspondingly, the selecting a monitoring target in a video of a monitoring scene specifically includes:
and if it is determined that the monitoring target is included in the at least one unoccluded visible object, selecting the monitoring target in the video of the monitoring scene.
3. The method of claim 1, wherein if the monitoring target is a person, determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene comprises:
respectively taking one target pixel point from each of the two horizontal boundary lines of the monitoring person frame, and determining a first vertical distance between the two target pixel points and a second vertical distance between the first target pixel point and the ground according to the spatial coordinates of each target pixel point, wherein the first target pixel point is the pixel point closer to the ground;
and determining the behavior of the person according to the first vertical distance and/or the second vertical distance.
4. The method of claim 3, wherein determining the behavior of the person according to the first vertical distance and/or the second vertical distance comprises:
if the second vertical distance between the first target pixel point and the ground is larger than or equal to a first distance threshold value, determining that the behavior of the person is climbing;
and if the second vertical distance between the first target pixel point and the ground is smaller than the first distance threshold value and the first vertical distance is smaller than the second distance threshold value, determining that the behavior of the person is falling down.
5. The method according to claim 1, wherein if the monitoring target is a vehicle, determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene comprises:
acquiring at least one vertex on a monitoring vehicle frame as a target pixel point;
and determining the behavior of the vehicle according to the matching degree between the space coordinate of each target pixel point and the space coordinate in the preset area and the type of the preset area.
6. The method of claim 5, wherein if the preset area type is a parking area, determining the behavior of the vehicle according to the matching degree between the spatial coordinates of each target pixel and the spatial coordinates in the preset area and the preset area type comprises:
acquiring a parking space closest to the vehicle frame;
determining the spatial coordinates of each target pixel point, and determining whether the matching degree between the spatial coordinates of each target pixel point and the spatial coordinates in the parking space reaches a matching degree threshold value;
and if the matching degree between the spatial coordinates of any target pixel point and the spatial coordinates in the parking space does not reach the matching degree threshold value, determining that the behavior of the vehicle is an illegal parking behavior.
7. The method of claim 5, wherein if the preset area type is a no-parking area, determining the behavior of the vehicle according to a matching degree between the spatial coordinates of each target pixel and the spatial coordinates in the preset area and the preset area type comprises:
determining whether the matching degree between the spatial coordinates of each target pixel point and the spatial coordinates in the no-parking area reaches a matching degree threshold value;
and if the matching degree between the spatial coordinates of any target pixel point and the spatial coordinates in the no-parking area reaches the matching degree threshold value, determining that the behavior of the vehicle is an illegal parking behavior.
8. A behavior analysis device, comprising:
the target display module is used for framing a monitoring target in a video of a monitoring scene and displaying a monitoring target frame in a three-dimensional virtual scene corresponding to the monitoring scene;
the behavior determining module is used for determining the behavior of the monitoring target according to the position of the monitoring target frame in the three-dimensional virtual scene;
the early warning module is used for sending out early warning information when the behavior of the monitoring target is abnormal;
the scene construction module is used for constructing a three-dimensional virtual scene with the same size as the monitoring scene according to the size of the monitoring scene; setting a corresponding virtual camera in the three-dimensional virtual scene according to the installation position and the model of a real camera in the monitored scene, correspondingly adjusting the visual field parameters of the virtual camera according to the visual field of the real camera, acquiring the video of the monitored scene through the three-dimensional virtual camera so as to fuse and display the video of the monitored scene and the three-dimensional virtual scene, wherein the virtual camera can shoot the three-dimensional virtual scene environment;
a first determining module, configured to determine, according to a virtual camera in the three-dimensional virtual scene, at least one unoccluded visible object in the video;
wherein the first determining module further comprises:
a first determining unit, configured to determine at least two visible objects in the video according to the visible field of the virtual camera;
a second determining unit, configured to determine a depth value of each pixel point of each visible object on the depth map, and a spatial distance between each pixel point of each visible object and the virtual camera;
and a third determining unit, configured to determine, according to the depth value of each pixel point of each visible object on the depth map and the spatial distance between each pixel point of each visible object and the virtual camera, that a visible object is unoccluded if the depth value of each of its pixel points on the depth map is greater than or equal to the corresponding spatial distance.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the behavior analysis method according to any one of claims 1-7.
CN202010223353.3A 2020-03-26 2020-03-26 Behavior analysis method, apparatus, device and medium Active CN113452954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010223353.3A CN113452954B (en) 2020-03-26 2020-03-26 Behavior analysis method, apparatus, device and medium

Publications (2)

Publication Number Publication Date
CN113452954A CN113452954A (en) 2021-09-28
CN113452954B true CN113452954B (en) 2023-02-28

Family

ID=77807185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010223353.3A Active CN113452954B (en) 2020-03-26 2020-03-26 Behavior analysis method, apparatus, device and medium

Country Status (1)

Country Link
CN (1) CN113452954B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096382A (en) * 2015-07-09 2015-11-25 浙江宇视科技有限公司 Method and apparatus for associating actual object information in video monitoring image
CN105141885A (en) * 2014-05-26 2015-12-09 杭州海康威视数字技术股份有限公司 Method for video monitoring and device
CN105303563A (en) * 2015-09-22 2016-02-03 北京格灵深瞳信息技术有限公司 Fall-down detection method and device
CN107396037A (en) * 2016-05-16 2017-11-24 杭州海康威视数字技术股份有限公司 Video frequency monitoring method and device
CN108616718A (en) * 2016-12-13 2018-10-02 杭州海康威视系统技术有限公司 Monitor display methods, apparatus and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8269834B2 (en) * 2007-01-12 2012-09-18 International Business Machines Corporation Warning a user about adverse behaviors of others within an environment based on a 3D captured image stream

Also Published As

Publication number Publication date
CN113452954A (en) 2021-09-28

Similar Documents

Publication Publication Date Title
JP6801760B2 (en) Image processing equipment, surveillance systems, image processing methods, and programs
WO2018107910A1 (en) Method and device for fusing panoramic video images
US7944454B2 (en) System and method for user monitoring interface of 3-D video streams from multiple cameras
US9210312B2 (en) Virtual mask for use in autotracking video camera images
US8212872B2 (en) Transformable privacy mask for video camera images
CN105678748A (en) Interactive calibration method and apparatus based on three dimensional reconstruction in three dimensional monitoring system
CA2534747A1 (en) Method and apparatus for placing sensors using 3d models
KR102253989B1 (en) object tracking method for CCTV video by use of Deep Learning object detector
CN109859325B (en) Method and device for displaying room guide in house VR video
US20210248817A1 (en) Data processing method and apparatus
CN111462249B (en) Traffic camera calibration method and device
CN113518996A (en) Damage detection from multiview visual data
CN109816745A (en) Human body thermodynamic chart methods of exhibiting and Related product
JP5525495B2 (en) Image monitoring apparatus, image monitoring method and program
WO2014194501A1 (en) Combining a digital image with a virtual entity
CN110807413B (en) Target display method and related device
CN113132708B (en) Method and apparatus for acquiring three-dimensional scene image using fisheye camera, device and medium
JP6991045B2 (en) Image processing device, control method of image processing device
CN113452954B (en) Behavior analysis method, apparatus, device and medium
US20220156497A1 (en) Multi-view visual data damage detection
CN113570622A (en) Obstacle determination method and device, electronic equipment and storage medium
Zhang et al. Automated visibility field evaluation of traffic sign based on 3d Lidar point clouds
Chen et al. A 3-D surveillance system using multiple integrated cameras
CN115018967B (en) Image generation method, device, equipment and storage medium
CN113379838B (en) Method for generating roaming path of virtual reality scene and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant