CN113887384A - Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion

Info

Publication number: CN113887384A
Authority: CN (China)
Prior art keywords: target, pixel, image, trajectory, frame
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202111150133.3A
Other languages: Chinese (zh)
Inventors: 李会璟, 赖众程, 王晟宇, 谢鹏
Current assignee: Ping An Bank Co Ltd
Original assignee: Ping An Bank Co Ltd
Application filed by Ping An Bank Co Ltd
Priority to CN202111150133.3A
Publication of CN113887384A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features

Abstract

The invention relates to artificial intelligence technology, and discloses a pedestrian trajectory analysis method based on multi-trajectory fusion, comprising the following steps: constructing a three-dimensional model of a preset space; analyzing the monitoring videos captured by cameras in different directions in the preset space to obtain the position information, human face features and human body features of the targets in the monitoring videos of the different cameras; and judging, according to the position information, the human face features and the human body features, whether the motion tracks from the different cameras belong to the same target. In addition, the invention relates to blockchain technology: the monitoring picture can be stored in a node of the blockchain. The invention also provides a pedestrian trajectory analysis device based on multi-trajectory fusion, an electronic device and a storage medium. The invention can improve the accuracy of pedestrian track identification.

Description

Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a pedestrian trajectory analysis method and device based on multi-trajectory fusion, electronic equipment and a computer-readable storage medium.
Background
As people attach increasing importance to personal safety, devices such as cameras and video recorders are increasingly used to monitor and analyze living environments, for example by analyzing camera video so as to distinguish the track of a target pedestrian from the tracks of many other pedestrians. However, the monitoring picture of a camera is a two-dimensional plane, so the monitoring range of a single camera is limited, and even when the track of the same person appears in the pictures monitored by several cameras, those pictures cannot simply be superimposed. How to analyze the monitoring pictures so as to determine whether the tracks monitored by different cameras are the tracks of the same person is therefore a problem to be solved urgently.
The existing method for tracking and distinguishing the track of a target pedestrian relies mainly on the pedestrian's face as captured by the camera: face recognition technology is used to judge whether pedestrians are the same person, and the monitoring pictures of different cameras are then analyzed according to that judgment, so as to track the pedestrian's trajectory. However, owing to factors such as technology or cost, the resolution of most cameras is not high enough, and it is difficult to capture a clear frontal face of a pedestrian, so the face recognition effect is not ideal.
Disclosure of Invention
The invention provides a pedestrian trajectory analysis method and device based on multi-trajectory fusion and a computer readable storage medium, and mainly aims to solve the problem of low accuracy of pedestrian trajectory identification.
In order to achieve the above object, the present invention provides a pedestrian trajectory analysis method based on multi-trajectory fusion, which includes:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target;
extracting a first human body characteristic and a first face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
Optionally, the establishing a three-dimensional model of the preset space according to the monitoring picture includes:
acquiring shooting pictures for shooting the same target from different angles in the monitoring pictures;
selecting the shooting pictures corresponding to one angle one by one as target pictures, and randomly selecting any pixel point of the target from the target pictures as a target pixel point;
generating a vector towards the direction of the target pixel point by taking a camera for shooting the shot picture as an origin;
measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
calculating the space coordinate of a camera for shooting the shot picture according to the modular length of the vector, the horizontal included angle and the vertical included angle;
and constructing a three-dimensional coordinate system by taking the target pixel point as an origin and taking the space coordinate of each camera at different positions as a known coordinate, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
Optionally, the identifying the position information of the first target in each frame of image in the first monitoring video includes:
selecting one frame of image from the first monitoring video one by one as a target image;
performing convolution and pooling operations on the target image to obtain image characteristics of the target image;
and determining the position of the image feature in the target image as the position information of the first target.
Optionally, the mapping the position information of the target in each frame of image to the three-dimensional model to obtain a first motion trajectory of the first target includes:
constructing a plane coordinate system in the target image by taking a central pixel of the target image as an origin;
calculating position coordinates corresponding to position information contained in the target image from the plane coordinate system;
mapping the position coordinates to the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
and connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion track of the first target in the three-dimensional model.
Optionally, the extracting the first human body feature and the first face feature of the first target from each frame of image includes:
cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image;
selecting human body image areas corresponding to one frame of image in the first monitoring video one by one as target areas, generating global features of the target areas according to pixel gradients in the target areas, and taking the global features as the first human body features;
calculating the probability value of each pixel point in the human body image region as a human face pixel by using a preset activation function, and determining the region where the pixel point with the probability value larger than a preset threshold value is located as a human face region;
performing frame selection on the face regions one by using a preset sliding window to obtain a pixel window;
and generating local features of the target area according to the pixel values in each pixel window, and taking the local features as the first face features.
Optionally, the generating a global feature of the target region according to the pixel gradient in the target region includes:
counting the pixel value of each pixel point in the target area;
taking the maximum pixel value and the minimum pixel value in the pixel values as the input of a preset mapping function, and mapping the pixel value of each pixel point in the target area to a preset range by using the mapping function;
calculating the pixel gradient of each line of pixels in the mapped target area, converting the pixel gradient of each line of pixels into a line vector, and splicing the line vector into the global feature of the target area.
Optionally, the generating a local feature of the target region according to the pixel value in each of the pixel windows includes:
selecting one pixel point from the pixel window one by one as a target pixel point;
judging whether the pixel value of the target pixel point is an extreme value in the pixel window;
when the pixel value of the target pixel point is not an extreme value in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point;
when the pixel value of the target pixel point is an extreme value in the pixel window, determining the target pixel point as a key point;
vectorizing the pixel values of all key points in all the pixel windows, and collecting the obtained vectors as the local features of the target area.
In order to solve the above problem, the present invention further provides a pedestrian trajectory analysis device based on multi-trajectory fusion, the device comprising:
the three-dimensional model building module is used for acquiring monitoring pictures of cameras at different positions in a preset space and building a three-dimensional model of the preset space according to the monitoring pictures;
the position identification module is used for acquiring a first monitoring video of a camera at a first position in the preset space and identifying the position information of a first target in each frame of image in the first monitoring video;
the first track analysis module is used for mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target and extracting a first human body feature and a first face feature of the first target from each frame of image;
the second track analysis module is used for acquiring a second monitoring video of the camera at a second position in the preset space and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and the track judgment module is used for calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and confirming whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
In order to solve the above problem, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the multi-trajectory fusion based pedestrian trajectory analysis method described above.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the pedestrian trajectory analysis method based on multi-trajectory fusion described above.
According to the embodiment of the invention, a three-dimensional model of the preset space can be constructed from the monitoring pictures at different positions in the space, and the first target and the second target are analyzed from different shooting angles, so that their motion tracks are mapped into the three-dimensional model; a comprehensive judgment is then made by combining the human body features and human face features of the first target and the second target, so as to determine whether the motion tracks of the first target and the second target belong to the same person, realizing accurate analysis of the motion tracks. Therefore, the pedestrian trajectory analysis method and device based on multi-trajectory fusion, the electronic equipment and the computer-readable storage medium can solve the problem of low accuracy of pedestrian trajectory identification.
Drawings
Fig. 1 is a schematic flow chart of a pedestrian trajectory analysis method based on multi-trajectory fusion according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of analyzing location information of a first target according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of analyzing a first motion trajectory according to an embodiment of the present invention;
FIG. 4 is a functional block diagram of a pedestrian trajectory analysis apparatus based on multi-trajectory fusion according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device for implementing the pedestrian trajectory analysis method based on multi-trajectory fusion according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides a pedestrian track analysis method based on multi-track fusion. The execution subject of the pedestrian trajectory analysis method based on multi-trajectory fusion includes, but is not limited to, at least one of electronic devices such as a server and a terminal that can be configured to execute the method provided by the embodiments of the present application. In other words, the pedestrian trajectory analysis method based on multi-trajectory fusion may be executed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
Fig. 1 is a schematic flow chart of a pedestrian trajectory analysis method based on multi-trajectory fusion according to an embodiment of the present invention. In this embodiment, the pedestrian trajectory analysis method based on multi-trajectory fusion includes:
s1, acquiring monitoring pictures of the cameras at different positions in the preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures.
In the embodiment of the present invention, the preset space may be any space that can be monitored by a camera, for example, a bedroom, a hospital monitoring area, a working room, or an outdoor park.
In detail, the monitoring picture refers to a picture obtained by monitoring the preset space from multiple directions by using cameras in different directions in the preset space, for example, a picture obtained by monitoring the preset space from at least two directions of east, south, west, north, and the like of the preset space by using a camera.
Specifically, the monitoring picture can be captured, by using computer statements with a data-grabbing function (such as Java or Python statements), from the data storage areas corresponding to the cameras at different positions in the preset space, where the data storage areas include, but are not limited to, a database, a blockchain node, and a network cache.
In the embodiment of the invention, because the monitoring picture comprises pictures monitored by cameras at different positions in the preset space, the spatial reference frames of the different pictures are not unified and cannot be used directly for subsequent unified analysis of the pedestrian track. A three-dimensional model of the preset space can therefore be established from the monitoring picture, so that the coordinate dimensions of objects in the pictures monitored by the cameras at different positions are unified, improving the accuracy of pedestrian track analysis.
In an embodiment of the present invention, the establishing a three-dimensional model of the preset space according to the monitoring picture includes:
acquiring shooting pictures for shooting the same target from different angles in the monitoring pictures;
selecting the shooting pictures corresponding to one angle one by one as target pictures, and randomly selecting any pixel point of the target from the target pictures as a target pixel point;
generating a vector towards the direction of the target pixel point by taking a camera for shooting the shot picture as an origin;
measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
calculating the space coordinate of a camera for shooting the shot picture according to the modular length of the vector, the horizontal included angle and the vertical included angle;
and constructing a three-dimensional coordinate system by taking the target pixel point as an origin and taking the space coordinate of each camera at different positions as a known coordinate, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
In detail, any pixel point on the target can be selected, from the pictures captured of the same target by cameras at different positions, as the origin, so as to unify the coordinates of the cameras; a vector is then generated towards the target pixel point with the camera that shot the picture as the starting point, and the horizontal included angle between the vector and the horizontal direction and the vertical included angle between the vector and the vertical direction are measured by means of the law of cosines.
Specifically, the target pixel point may be used as an origin of the preset space, and coordinate information of the camera in the preset space is determined according to the horizontal included angle, the vertical included angle and the modular length of the vector, so as to construct a three-dimensional coordinate system including the origin and each camera, and use the three-dimensional coordinate system as a three-dimensional model of the preset space.
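Illustratively, the calculation of a camera's spatial coordinate from the module length of the vector and the two included angles may be sketched as follows; the angle conventions (radians, horizontal angle measured in the ground plane from a shared reference axis, vertical angle measured from the vertical axis) are assumptions for illustration, since the text does not fix them:

```python
import math

def camera_coordinates(mod_length: float, horizontal_angle: float, vertical_angle: float):
    """Place one camera in the coordinate system whose origin is the target pixel point.

    Assumed conventions (not fixed by the text): angles in radians, the
    vertical angle measured from the vertical axis, the horizontal angle
    measured in the ground plane from a shared reference axis.
    """
    # Vertical component of the camera-to-target vector.
    z = mod_length * math.cos(vertical_angle)
    # Projection of the vector onto the ground plane, split by the horizontal angle.
    ground = mod_length * math.sin(vertical_angle)
    x = ground * math.cos(horizontal_angle)
    y = ground * math.sin(horizontal_angle)
    # The vector points from the camera towards the target pixel point, so with
    # the target pixel point as origin the camera sits at the negated components.
    return (-x, -y, -z)
```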
S2, acquiring a first monitoring video of the camera at a first position in the preset space, and identifying the position information of the first target in each frame of image in the first monitoring video.
In this embodiment of the present invention, the first monitoring video is a picture of the preset space captured by the camera at the first position in the preset space, and the step of obtaining the first monitoring video of the camera at the first position in the preset space is consistent with the step of obtaining the monitoring pictures of the cameras at different positions in the preset space in S1, which is not described herein again.
In the embodiment of the present invention, each frame of image in the first surveillance video may be analyzed to obtain the position information of the first target in each frame of image, where the first target may be an object, a pedestrian, or the like moving in the preset space, which is monitored by the camera at the first position.
In an embodiment of the present invention, referring to fig. 2, the identifying the position information of the first target in each frame of image in the first monitoring video includes:
s21, selecting one frame of image from the first monitoring video one by one as a target image;
s22, performing convolution and pooling operations on the target image to obtain the image characteristics of the target image;
and S23, determining the position of the image feature in the target image as the position information of the first target.
In detail, the target image may be subjected to convolution, pooling and similar operations by an artificial intelligence model with a feature extraction function to obtain a plurality of image features in the target image, wherein the artificial intelligence model includes but is not limited to the VGG-Net network model, the R-CNN network model and the like.
Specifically, after the image feature of the target image is obtained, the position information of the image feature in the target image may be used as the position information of the first target, and then the above operation is performed on each frame of image in the first monitoring video to obtain the position information of the first target in each frame of image in the first monitoring video.
And S3, mapping the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion track of the first target.
In this embodiment of the present invention, the position information of the first target in each frame of the first surveillance video extracted in step S2 is position information within the planar picture acquired by the camera corresponding to the first surveillance video, which is poorly suited to analyzing a spatial trajectory; the position information of the target in each frame of image may therefore be mapped into the three-dimensional model, so as to obtain the first motion trajectory of the first target in the three-dimensional model.
In the embodiment of the invention, the position information of the target in each frame of image can be analyzed by using the YOLOv5 network, and the position information can be tracked by using the DeepSORT tracking technique, so as to obtain the change track of the position information in the three-dimensional model, namely the first motion track.
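Illustratively, the per-frame detection step may be sketched with the public ultralytics/yolov5 model loaded through torch.hub (the tracking stage, which the text assigns to DeepSORT, is omitted here); the video path argument and the small yolov5s variant are assumptions for illustration:

```python
import cv2
import torch

# Publicly released YOLOv5 detector loaded through torch.hub; the yolov5s
# variant is an illustrative choice, not prescribed by the text.
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

def person_positions(video_path: str):
    """Yield the bounding-box centre of every detected person, frame by frame."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # the model expects RGB
        detections = model(rgb).xyxy[0].tolist()      # rows: [x1, y1, x2, y2, conf, cls]
        for x1, y1, x2, y2, conf, cls in detections:
            if int(cls) == 0:                         # COCO class 0 = person
                yield ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    cap.release()
```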
In another embodiment of the present invention, referring to fig. 3, the mapping the position information of the object in each frame of image to the three-dimensional model to obtain the first motion trajectory of the first object includes:
s31, constructing a plane coordinate system in the target image by taking the central pixel of the target image as an origin;
s32, calculating position coordinates corresponding to the position information contained in the target image from the plane coordinate system;
s33, mapping the position coordinates to the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
and S34, connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion track of the first target in the three-dimensional model.
In detail, a planar coordinate system may be constructed in the target image with a central pixel point as an origin, and then a position coordinate corresponding to the position information is calculated in the planar coordinate system, and the position coordinate is mapped to the three-dimensional model by using a preset mapping function, so as to obtain a three-dimensional coordinate of the position information in the three-dimensional model, where the mapping function includes but is not limited to a gaussian function and a map function.
Specifically, after the position information of the target in each frame in the first monitoring video is mapped into the three-dimensional model, all the mapped three-dimensional coordinates in the three-dimensional model may be connected by using a smooth curve, and the curve obtained by the connection is used as the first motion trajectory of the first target.
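Illustratively, the construction of the plane coordinate system and the mapping into the three-dimensional model may be sketched as follows; the per-camera homography standing in for the "preset mapping function" is a hypothetical choice for illustration, projecting image-plane points onto the ground plane of the model:

```python
import numpy as np

def plane_coordinates(position, image_shape):
    """Express a detected position relative to the central pixel of the frame."""
    h, w = image_shape[:2]
    u, v = position
    return np.array([u - w / 2.0, v - h / 2.0])

def model_coordinates(plane_xy, homography):
    """Map a plane coordinate into the three-dimensional model.

    `homography` is a hypothetical per-camera 3x3 matrix standing in for the
    preset mapping function of the text.
    """
    p = homography @ np.append(plane_xy, 1.0)
    return p / p[2]  # normalise the homogeneous coordinate

def first_motion_trajectory(per_frame_positions, image_shape, homography):
    """Connect the mapped per-frame coordinates, in time order, into a track."""
    return np.stack([model_coordinates(plane_coordinates(p, image_shape), homography)
                     for p in per_frame_positions])
```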
And S4, extracting the first human body feature and the first face feature of the first target from each frame of image.
In one practical application scenario of the invention, analyzing the motion trajectory of a pedestrian from image features alone makes the analysis criterion too narrow, so the accuracy of the analysis result is low. Therefore, in order to improve the accuracy of the final pedestrian-trajectory analysis, the first human body feature and the first face feature of the first target can be extracted from each frame of image, which facilitates analyzing the pedestrian trajectory by combining the motion trajectory, the human body features and the human face features.
In detail, the human body characteristics refer to the physical characteristics of the first target, such as fat, thin, tall, short, etc.; the first face feature refers to a local feature of the face portion of the first target, such as face texture, face key points, and the like.
In an embodiment of the present invention, the extracting a first human body feature and a first face feature of the first target from each frame of image includes:
cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image;
selecting human body image areas corresponding to one frame of image in the first monitoring video one by one as target areas, generating global features of the target areas according to pixel gradients in the target areas, and taking the global features as the first human body features;
calculating the probability value of each pixel point in the human body image region as a human face pixel by using a preset activation function, and determining the region where the pixel point with the probability value larger than a preset threshold value is located as a human face region;
performing frame selection on the face regions one by using a preset sliding window to obtain a pixel window;
and generating local features of the target area according to the pixel values in each pixel window, and taking the local features as the first face features.
In one embodiment of the present invention, the global features of the target image may be extracted by using a Histogram of Oriented Gradients (HOG), a Deformable Part Model (DPM), a Local Binary Pattern (LBP), or the like, or may be extracted by using a pre-trained artificial intelligence Model with a specific image feature extraction function, where the artificial intelligence Model includes, but is not limited to, a VGG-net Model and a U-net Model.
In another embodiment of the present invention, the generating the global feature of the target region according to the pixel gradient in the target region includes:
counting the pixel value of each pixel point in the target area;
taking the maximum pixel value and the minimum pixel value in the pixel values as the input of a preset mapping function, and mapping the pixel value of each pixel point in the target area to a preset range by using the mapping function;
calculating the pixel gradient of each line of pixels in the mapped target area, converting the pixel gradient of each line of pixels into a line vector, and splicing the line vector into the global feature of the target area.
Illustratively, the preset mapping function may be:
Yi = (xi - min(x)) / (max(x) - min(x))
wherein Yi is the pixel value of the ith pixel point in the target region after being mapped to the preset range, xi is the pixel value of the ith pixel point in the target region, max(x) is the maximum pixel value in the target region, and min(x) is the minimum pixel value in the target region.
Further, a preset activation function can be used to calculate the probability that each pixel point in the human body image region is a human face pixel, and the region formed by the pixel points whose probability value is greater than a preset threshold is then selected from the human body image region as the human face region, wherein the preset activation function includes but is not limited to the softmax activation function, the sigmoid activation function and the relu activation function.
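Illustratively, the thresholding of per-pixel face probabilities may be sketched as follows, assuming the sigmoid variant from the list above and per-pixel scores produced by some upstream model that the text does not specify:

```python
import numpy as np

def face_region_mask(pixel_scores: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn per-pixel face scores into a boolean face-region mask.

    `pixel_scores` is assumed to be a 2-D array of raw scores from an
    upstream per-pixel face scorer; the 0.5 threshold is illustrative.
    """
    prob = 1.0 / (1.0 + np.exp(-pixel_scores))  # sigmoid activation
    return prob > threshold                     # True where a face pixel is likely
```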
In the embodiment of the invention, the pixel gradient of each line of pixels in the mapped target region can be calculated by using a preset gradient algorithm, wherein the gradient algorithm includes but is not limited to a two-dimensional discrete derivative algorithm, the Sobel operator and the like.
In the embodiment of the present application, the pixel gradient of each row of pixels may be converted into a row vector, and the row vector may be spliced into a global feature of the target region.
For example, suppose the selected target region includes three rows of pixels, where the pixel gradients of the first row are a, b and c, those of the second row are d, e and f, and those of the third row are g, h and i; the pixel gradient of each row can then be used as a row vector, and the row vectors spliced into the following global feature:
[ a b c ]
[ d e f ]
[ g h i ]
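Illustratively, the whole global-feature construction (min-max mapping followed by row-gradient splicing) may be sketched as follows; the target range [0, 1] and the simple horizontal-difference gradient are assumptions for illustration, since the text leaves both open:

```python
import numpy as np

def global_feature(region: np.ndarray) -> np.ndarray:
    """Min-max map a body-region image, then splice its row gradients.

    Assumptions for illustration: pixel values are mapped to [0, 1], and the
    pixel gradient of a row is its horizontal first difference.
    """
    x_min, x_max = float(region.min()), float(region.max())
    mapped = (region.astype(np.float32) - x_min) / (x_max - x_min)
    # Each row's gradient becomes one row vector; stacking the row vectors
    # yields the global feature matrix described above.
    return np.diff(mapped, axis=1)
```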
further, the generating a local feature of the target region according to the pixel values in each of the pixel windows includes:
selecting one pixel point from the pixel window one by one as a target pixel point;
judging whether the pixel value of the target pixel point is an extreme value in the pixel window;
when the pixel value of the target pixel point is not an extreme value in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point;
when the pixel value of the target pixel point is an extreme value in the pixel window, determining the target pixel point as a key point;
vectorizing the pixel values of all key points in all the pixel windows, and collecting the obtained vectors as the local features of the target area.
In this embodiment of the application, the sliding window may be a pre-constructed selection box with a certain area, and may be used to frame pixels in the target region, for example, a square selection box constructed with 10 pixels as height and 10 pixels as width.
In detail, the extreme value includes a maximum value and a minimum value, and when the pixel value of the target pixel point is the maximum value or the minimum value in the pixel window, the target pixel point is determined to be the key point of the pixel window.
Specifically, the step of vectorizing the pixel values of all the key points in the pixel windows is consistent with the steps, described above, of calculating the pixel gradient of each line of pixels in the mapped target area and converting each line's pixel gradient into a line vector, and is not repeated here.
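Illustratively, the key-point selection over pixel windows may be sketched as follows; the 10x10 window comes from the example above, while the non-overlapping window placement is an assumption, since the text does not fix the sliding stride:

```python
import numpy as np

def local_feature(face_region: np.ndarray, win: int = 10) -> np.ndarray:
    """Collect the extreme pixel values of each window as key-point features.

    Assumptions for illustration: the 10x10 window from the example above,
    placed without overlap; a pixel is a key point when its value is the
    maximum or minimum within its window.
    """
    values = []
    h, w = face_region.shape[:2]
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            window = face_region[r:r + win, c:c + win]
            values.append(float(window.max()))  # key point: window maximum
            values.append(float(window.min()))  # key point: window minimum
    # Vectorise the collected key-point pixel values into the local feature.
    return np.array(values, dtype=np.float32)
```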
S5, obtaining a second monitoring video of the camera at a second position in the preset space, and generating a second motion track, a second human body feature and a second face feature of a second target according to the second monitoring video.
In the embodiment of the present invention, the camera at the second position is any camera in the preset space, which is located at a different position from the camera at the first position.
In detail, the step of obtaining the second surveillance video of the camera at the second position in the preset space, and generating the second motion trajectory, the second human body feature and the second human face feature of the second target according to the second surveillance video is consistent with the step of extracting the first motion trajectory, the first human body feature and the first human face feature of the first target from the first surveillance video in S2 to S4, and is not repeated again.
S6, calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and confirming whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
In the embodiment of the present invention, a preset distance function may be used to calculate a distance value between the first motion trajectory and the second motion trajectory, a distance value between the first human body feature and the second human body feature, and a distance value between the first human face feature and the second human face feature; the coincidence degree of the first target and the second target is then calculated from the three distance values, so as to determine, according to the coincidence degree, whether the first target and the second target are the same target.
For example, the distance value between the first motion trajectory and the second motion trajectory may be calculated using the following distance algorithm:
D = dist(n, m)    (the concrete form of the distance formula is shown only as an image in the original publication)
wherein D is a distance value between the first motion trajectory and the second motion trajectory, n is the first motion trajectory, and m is the second motion trajectory.
In other embodiments of the present invention, the distance value between the first motion trajectory and the second motion trajectory may be calculated by using an algorithm having a distance value calculation function, such as an euclidean distance algorithm, a cosine distance algorithm, or the like.
In detail, the step of calculating the distance value between the first human body feature and the second human body feature and the step of calculating the distance value between the first human face feature and the second human face feature are consistent with the step of calculating the distance value between the first motion trajectory and the second motion trajectory, and are not repeated herein.
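Illustratively, a trajectory distance of the kind described above may be sketched with the Euclidean alternative named in the text; truncating both tracks to a common length is an alignment choice assumed for illustration:

```python
import numpy as np

def trajectory_distance(track_n: np.ndarray, track_m: np.ndarray) -> float:
    """Mean Euclidean distance between two trajectories.

    Each track is an array of model coordinates in time order; the tracks
    are truncated to a common length, an illustrative alignment choice.
    """
    length = min(len(track_n), len(track_m))
    diffs = track_n[:length] - track_m[:length]
    return float(np.mean(np.linalg.norm(diffs, axis=1)))
```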
In an embodiment of the present invention, the calculating a coincidence ratio of the first target and the second target according to the first motion trajectory, the second motion trajectory, the first human body feature, the second human body feature, the first face feature, and the second face feature includes:
calculating the coincidence degree of the first target and the second target by using the following weight algorithm:
S = α*A + β*B + γ*C
wherein S is the coincidence degree of the first target and the second target, A is the distance value between the first motion trajectory and the second motion trajectory, B is the distance value between the first human body feature and the second human body feature, C is the distance value between the first human face feature and the second human face feature, and α, β and γ are preset weight coefficients.
In the embodiment of the present invention, the coincidence degree may be compared with a preset coincidence threshold. When the coincidence degree is greater than the preset coincidence threshold, it may be determined that the trajectories of the first target and the second target coincide, and therefore that the first target and the second target are the same target; when the coincidence degree is smaller than or equal to the preset coincidence threshold, it is determined that the trajectories of the first target and the second target do not coincide, and therefore that the first target and the second target are different targets.
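Illustratively, the weighted fusion and the threshold comparison may be sketched as follows; the weight coefficients and the coincidence threshold are illustrative values, not values given in the text:

```python
def same_person(d_track: float, d_body: float, d_face: float,
                alpha: float = 0.4, beta: float = 0.3, gamma: float = 0.3,
                threshold: float = 0.5) -> bool:
    """Fuse the three distance values into a coincidence degree and decide.

    Mirrors the rule above: S = alpha*A + beta*B + gamma*C, with the result
    compared against a preset coincidence threshold; all numeric values
    here are illustrative assumptions.
    """
    coincidence = alpha * d_track + beta * d_body + gamma * d_face
    return coincidence > threshold  # greater than the threshold: same target
```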
According to the embodiment of the invention, a three-dimensional model of the preset space can be constructed from the monitoring pictures at different positions in the space, and the first target and the second target are analyzed from different shooting angles, so that their motion tracks are mapped into the three-dimensional model; a comprehensive judgment is then made by combining the human body features and human face features of the first target and the second target, so as to determine whether the motion tracks of the first target and the second target belong to the same person, realizing accurate analysis of the motion tracks. Therefore, the pedestrian trajectory analysis method based on multi-trajectory fusion can solve the problem of low accuracy of pedestrian trajectory identification.
Fig. 4 is a functional block diagram of a pedestrian trajectory analysis apparatus based on multi-trajectory fusion according to an embodiment of the present invention.
The pedestrian trajectory analysis device 100 based on multi-trajectory fusion can be installed in electronic equipment. According to the realized functions, the pedestrian trajectory analysis device 100 based on multi-trajectory fusion can comprise a three-dimensional model building module 101, a position identification module 102, a first trajectory analysis module 103, a second trajectory analysis module 104 and a trajectory judgment module 105. The module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a fixed function, and that are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the three-dimensional model building module 101 is configured to obtain monitoring pictures of cameras at different positions in a preset space, and build a three-dimensional model of the preset space according to the monitoring pictures;
the position identification module 102 is configured to obtain a first monitoring video of a camera at a first position in the preset space, and identify position information of a first target in each frame of image in the first monitoring video;
the first trajectory analysis module 103 is configured to map the position information of the first target in each frame of image into the three-dimensional model to obtain a first motion trajectory of the first target, and is configured to extract a first human body feature and a first face feature of the first target from each frame of image;
the second track analysis module 104 is configured to obtain a second monitoring video of the camera at a second position in the preset space, and generate a second motion track, a second human body feature, and a second face feature of a second target according to the second monitoring video;
the trajectory determination module 105 is configured to calculate a coincidence degree of the first target and the second target according to the first motion trajectory, the second motion trajectory, the first human body feature, the second human body feature, the first face feature, and the second face feature, and determine whether the first motion trajectory and the second motion trajectory belong to a motion trajectory of the same person according to the coincidence degree.
In detail, when used, each module in the multi-track fusion-based pedestrian track analysis apparatus 100 according to the embodiment of the present invention adopts the same technical means as the multi-track fusion-based pedestrian track analysis method described in fig. 1 to 3, and can produce the same technical effect, which is not described herein again.
Fig. 5 is a schematic structural diagram of an electronic device for implementing a pedestrian trajectory analysis method based on multi-trajectory fusion according to an embodiment of the present invention.
The electronic device 1 may include a processor 10, a memory 11, a communication bus 12, and a communication interface 13, and may further include a computer program, such as a pedestrian trajectory analysis program based on multi-trajectory fusion, stored in the memory 11 and executable on the processor 10.
In some embodiments, the processor 10 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, and includes one or more Central Processing Units (CPUs), a microprocessor, a digital Processing chip, a graphics processor, a combination of various control chips, and the like. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules stored in the memory 11 (for example, executing a pedestrian trajectory analysis program based on multi-trajectory fusion, etc.), and calling data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium including flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, for example a removable hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in the electronic device and various types of data, such as codes of a pedestrian trajectory analysis program based on multi-trajectory fusion, etc., but also to temporarily store data that has been output or is to be output.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
The communication interface 13 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), which are typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable, among other things, for displaying information processed in the electronic device and for displaying a visualized user interface.
Fig. 5 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 5 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The multi-track fusion based pedestrian track analysis program stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions, and when running in the processor 10, can realize:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target;
extracting a first human body characteristic and a first face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
Specifically, the specific implementation method of the instruction by the processor 10 may refer to the description of the relevant steps in the embodiment corresponding to the drawings, which is not described herein again.
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).
The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target;
extracting a first human body characteristic and a first face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is, in essence, a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A pedestrian trajectory analysis method based on multi-trajectory fusion is characterized by comprising the following steps:
acquiring monitoring pictures of cameras at different positions in a preset space, and establishing a three-dimensional model of the preset space according to the monitoring pictures;
acquiring a first monitoring video of a camera at a first position in the preset space, and identifying the position information of a first target in each frame of image in the first monitoring video;
mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target;
extracting a first human body characteristic and a first face characteristic of the first target from each frame of image;
acquiring a second monitoring video of a camera at a second position in the preset space, and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and calculating the coincidence degree of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and determining whether the first motion track and the second motion track belong to the motion track of the same person according to the coincidence degree.
2. The pedestrian trajectory analysis method based on multi-trajectory fusion as claimed in claim 1, wherein the building of the three-dimensional model of the preset space according to the monitoring picture comprises:
acquiring, from the monitoring pictures, shooting pictures in which the same target is shot from different angles;
selecting, one by one, the shooting picture corresponding to each angle as a target picture, and randomly selecting any pixel point of the target in the target picture as a target pixel point;
generating a vector towards the direction of the target pixel point by taking a camera for shooting the shot picture as an origin;
measuring a horizontal included angle between the vector and the horizontal direction, and measuring a vertical included angle between the vector and the vertical direction;
calculating the spatial coordinates of the camera that shot the shooting picture according to the modulus of the vector, the horizontal included angle, and the vertical included angle;
and constructing a three-dimensional coordinate system by taking the target pixel point as an origin and taking the space coordinate of each camera at different positions as a known coordinate, and determining the three-dimensional coordinate system as a three-dimensional model of the preset space.
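As a non-limiting sketch of the geometry in claim 2, the camera's spatial coordinate can be recovered from the modulus of the vector and the two included angles by a spherical-coordinate conversion. The axis convention used below is an assumption:

```python
import math

def camera_space_coordinates(mod_length, horizontal_angle, vertical_angle):
    # Spherical-coordinate reconstruction of the camera position
    # relative to the target pixel point taken as the origin.
    # Interpretation (an assumption): vertical_angle is measured from
    # the vertical axis, horizontal_angle is the azimuth measured in
    # the horizontal plane from a reference axis; angles in radians.
    x = mod_length * math.sin(vertical_angle) * math.cos(horizontal_angle)
    y = mod_length * math.sin(vertical_angle) * math.sin(horizontal_angle)
    z = mod_length * math.cos(vertical_angle)
    return (x, y, z)

# A camera 5 m from the target pixel point, 60 degrees from the
# vertical axis, at a 30-degree azimuth.
print(camera_space_coordinates(5.0, math.radians(30), math.radians(60)))
```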
3. The method for analyzing pedestrian trajectories based on multi-trajectory fusion as claimed in claim 1, wherein the identifying the position information of the first target in each frame of image in the first surveillance video comprises:
selecting one frame of image from the first monitoring video one by one as a target image;
performing convolution and pooling operations on the target image to obtain image characteristics of the target image;
and determining the position of the image feature in the target image as the position information of the first target.
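A minimal sketch of the convolution and pooling operations of claim 3, with a hand-written edge kernel standing in for learned weights and the strongest pooled response taken as a crude proxy for the target position (both assumptions):

```python
import numpy as np

def conv2d(image, kernel):
    # Valid 2D convolution (cross-correlation) over a grayscale image.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    # Non-overlapping max pooling.
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    fm = feature_map[:h, :w]
    return fm.reshape(h // size, size, w // size, size).max(axis=(1, 3))

frame = np.random.rand(64, 64)  # stand-in for one frame of the video
features = max_pool(conv2d(frame, np.array([[1.0, 0.0, -1.0]] * 3)))
# Location of the strongest response, used here as an illustrative
# simplification of "determining the position of the image feature".
position = np.unravel_index(np.argmax(features), features.shape)
```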
4. The method for analyzing pedestrian trajectories based on multi-trajectory fusion as claimed in claim 1, wherein the mapping of the position information of the first target in each frame of image to the three-dimensional model to obtain the first motion trajectory of the first target comprises:
constructing a plane coordinate system in the target image by taking a central pixel of the target image as an origin;
calculating position coordinates corresponding to position information contained in the target image from the plane coordinate system;
mapping the position coordinates to the three-dimensional model by using a preset mapping function to obtain three-dimensional coordinates of the position information in the three-dimensional model;
and connecting the three-dimensional coordinates of the position information of the target in each frame of image in the three-dimensional model to obtain a first motion track of the first target in the three-dimensional model.
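One plausible reading of the "preset mapping function" in claim 4 is a calibrated ground-plane homography. The sketch below assumes the target moves on a floor plane at z = 0, and the identity matrix is a placeholder for real calibration data:

```python
import numpy as np

def to_plane_coordinates(pixel_xy, image_shape):
    # Plane coordinate system with the image's central pixel as origin.
    h, w = image_shape
    cx, cy = w / 2.0, h / 2.0
    return np.array([pixel_xy[0] - cx, pixel_xy[1] - cy])

def map_to_3d(plane_xy, homography):
    # Ground-plane homography mapping image-plane coordinates to world
    # (x, y) on the floor, with z fixed at 0 (an assumed convention).
    p = homography @ np.array([plane_xy[0], plane_xy[1], 1.0])
    return np.array([p[0] / p[2], p[1] / p[2], 0.0])

H = np.eye(3)  # placeholder for calibration data (an assumption)
track = [map_to_3d(to_plane_coordinates((px, py), (480, 640)), H)
         for px, py in [(100, 200), (110, 205), (120, 212)]]
# Connecting consecutive 3D points yields the first motion track.
```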
5. The method for analyzing pedestrian trajectories based on multi-trajectory fusion according to any one of claims 1 to 4, wherein the extracting first human features and first face features of the first target from each frame of image comprises:
cutting each frame of image in the first monitoring video according to the position information to obtain a human body image area corresponding to each frame of image;
selecting human body image areas corresponding to one frame of image in the first monitoring video one by one as target areas, generating global features of the target areas according to pixel gradients in the target areas, and taking the global features as the first human body features;
calculating, by using a preset activation function, the probability that each pixel point in the human body image region is a face pixel, and determining the region formed by pixel points whose probability values are greater than a preset threshold as a face region;
performing frame selection on the face regions one by one by using a preset sliding window to obtain pixel windows;
and generating local features of the target area according to the pixel values in each pixel window, and taking the local features as the first face features.
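A hedged sketch of the face-region step of claim 5: a sigmoid is assumed as the "preset activation function", 0.5 as the preset threshold, and an 8x8, stride-8 window as the preset sliding window:

```python
import numpy as np

def sigmoid(x):
    # One common choice of "preset activation function" (an assumption).
    return 1.0 / (1.0 + np.exp(-x))

def face_mask(score_map, threshold=0.5):
    # Pixels whose face probability exceeds the preset threshold
    # form the face region (0.5 is an assumed value).
    return sigmoid(score_map) > threshold

def sliding_windows(image, win=8, stride=8):
    # Frame-select the region with a preset sliding window, yielding
    # the pixel windows used for local-feature extraction.
    h, w = image.shape
    for i in range(0, h - win + 1, stride):
        for j in range(0, w - win + 1, stride):
            yield image[i:i + win, j:j + win]

body_region = np.random.rand(64, 64)  # stand-in cropped body image
scores = np.random.randn(64, 64)      # stand-in per-pixel face scores
face_pixels = np.where(face_mask(scores), body_region, 0.0)
windows = list(sliding_windows(face_pixels))
```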
6. The method for analyzing pedestrian trajectories based on multi-trajectory fusion as claimed in claim 5, wherein the generating global features of the target region according to pixel gradients in the target region comprises:
counting the pixel value of each pixel point in the target area;
taking the maximum pixel value and the minimum pixel value in the pixel values as the input of a preset mapping function, and mapping the pixel value of each pixel point in the target area to a preset range by using the mapping function;
calculating the pixel gradient of each line of pixels in the mapped target area, converting the pixel gradient of each line of pixels into a line vector, and splicing the line vector into the global feature of the target area.
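The global feature of claim 6, sketched directly from its three steps; the preset range [0, 1] is an assumed value:

```python
import numpy as np

def global_feature(region, lo=0.0, hi=1.0):
    # Min-max map the pixel values into the preset range, take the
    # pixel gradient along each row, and splice the row vectors into
    # one global feature vector.
    pmin, pmax = region.min(), region.max()
    scaled = lo + (region - pmin) * (hi - lo) / max(pmax - pmin, 1e-8)
    row_gradients = np.diff(scaled, axis=1)  # gradient along each row
    return row_gradients.reshape(-1)         # splice rows into one vector

feat = global_feature(np.random.rand(32, 16))
```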
7. The method for analyzing pedestrian trajectories based on multi-trajectory fusion as claimed in claim 5, wherein the generating local features of the target region according to the pixel values in each of the pixel windows comprises:
selecting one pixel point from the pixel window one by one as a target pixel point;
judging whether the pixel value of the target pixel point is an extreme value in the pixel window;
when the pixel value of the target pixel point is not an extreme value in the pixel window, returning to the step of selecting one pixel point from the pixel window one by one as the target pixel point;
when the pixel value of the target pixel point is an extreme value in the pixel window, determining the target pixel point as a key point;
vectorizing the pixel values of all key points in all the pixel windows, and collecting the obtained vectors as the local features of the target area.
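The local feature of claim 7, sketched with window extrema as key points; treating both the window maximum and minimum as "extreme values" is an interpretation:

```python
import numpy as np

def local_feature(windows):
    # A pixel point is a key point when its value is an extreme value
    # (max or min) of its pixel window; key-point pixel values are
    # vectorized and collected as the local feature.
    keypoint_values = []
    for win in windows:
        wmin, wmax = win.min(), win.max()
        for value in win.reshape(-1):
            if value == wmin or value == wmax:
                keypoint_values.append(value)
    return np.array(keypoint_values)

feat = local_feature([np.random.rand(8, 8) for _ in range(4)])
```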
8. A pedestrian trajectory analysis device based on multi-trajectory fusion, characterized in that the device comprises:
the three-dimensional model building module is used for acquiring monitoring pictures of cameras at different positions in a preset space and building a three-dimensional model of the preset space according to the monitoring pictures;
the position identification module is used for acquiring a first monitoring video of a camera at a first position in the preset space and identifying the position information of a first target in each frame of image in the first monitoring video;
the first track analysis module is used for mapping the position information of the first target in each frame of image to the three-dimensional model to obtain a first motion track of the first target and extracting a first human body feature and a first face feature of the first target from each frame of image;
the second track analysis module is used for acquiring a second monitoring video of the camera at a second position in the preset space and generating a second motion track, a second human body characteristic and a second face characteristic of a second target according to the second monitoring video;
and the track judgment module is used for calculating the contact ratio of the first target and the second target according to the first motion track, the second motion track, the first human body characteristic, the second human body characteristic, the first human face characteristic and the second human face characteristic, and confirming whether the first motion track and the second motion track belong to the motion track of the same person according to the contact ratio.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of pedestrian trajectory analysis based on multi-trajectory fusion according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the method for analyzing a pedestrian trajectory based on multi-trajectory fusion according to any one of claims 1 to 7.
CN202111150133.3A 2021-09-29 2021-09-29 Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion Pending CN113887384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111150133.3A CN113887384A (en) 2021-09-29 2021-09-29 Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111150133.3A CN113887384A (en) 2021-09-29 2021-09-29 Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion

Publications (1)

Publication Number Publication Date
CN113887384A true CN113887384A (en) 2022-01-04

Family

ID=79008031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111150133.3A Pending CN113887384A (en) 2021-09-29 2021-09-29 Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion

Country Status (1)

Country Link
CN (1) CN113887384A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862089A (en) * 2022-10-11 2023-03-28 深圳捷易建设集团有限公司 Security monitoring method, device, equipment and medium based on face recognition
CN115862089B (en) * 2022-10-11 2023-09-26 深圳捷易建设集团有限公司 Security monitoring method, device, equipment and medium based on face recognition

Similar Documents

Publication Publication Date Title
CN110458895B (en) Image coordinate system conversion method, device, equipment and storage medium
CN112446919A (en) Object pose estimation method and device, electronic equipment and computer storage medium
CN112100425B (en) Label labeling method and device based on artificial intelligence, electronic equipment and medium
CN114758249B (en) Target object monitoring method, device, equipment and medium based on field night environment
CN109934873B (en) Method, device and equipment for acquiring marked image
CN114241338A (en) Building measuring method, device, equipment and storage medium based on image recognition
CN113011280A (en) Method and device for detecting person contact distance, computer equipment and storage medium
CN113903068A (en) Stranger monitoring method, device and equipment based on human face features and storage medium
CN114049568A (en) Object shape change detection method, device, equipment and medium based on image comparison
CN111353429A (en) Interest degree method and system based on eyeball turning
CN112528903B (en) Face image acquisition method and device, electronic equipment and medium
CN113887384A (en) Pedestrian trajectory analysis method, device, equipment and medium based on multi-trajectory fusion
CN113658265A (en) Camera calibration method and device, electronic equipment and storage medium
CN113255456B (en) Inactive living body detection method, inactive living body detection device, electronic equipment and storage medium
CN112102398B (en) Positioning method, device, equipment and storage medium
CN114463685A (en) Behavior recognition method and device, electronic equipment and storage medium
CN113888086A (en) Article signing method, device and equipment based on image recognition and storage medium
CN112488076A (en) Face image acquisition method, system and equipment
CN114049676A (en) Fatigue state detection method, device, equipment and storage medium
CN112541436A (en) Concentration degree analysis method and device, electronic equipment and computer storage medium
CN111753766A (en) Image processing method, device, equipment and medium
CN117333929B (en) Method and system for identifying abnormal personnel under road construction based on deep learning
CN115862089B (en) Security monitoring method, device, equipment and medium based on face recognition
CN113723520A (en) Personnel trajectory tracking method, device, equipment and medium based on feature update
CN112581525B (en) Method, device and equipment for detecting state of human body wearing article and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination