CN115423735A - Passenger flow volume statistical method and system - Google Patents

Passenger flow volume statistical method and system

Info

Publication number
CN115423735A
CN115423735A
Authority
CN
China
Prior art keywords
human body
passenger flow
target
tracking
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110519307.2A
Other languages
Chinese (zh)
Inventor
刘宏炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Xiongan ICT Co Ltd
China Mobile System Integration Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Xiongan ICT Co Ltd
China Mobile System Integration Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Xiongan ICT Co Ltd, China Mobile System Integration Co Ltd
Priority to CN202110519307.2A
Publication of CN115423735A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30242Counting objects in image

Abstract

The invention provides a passenger flow volume statistical method and system. The method includes: extracting human body boundary features from the video frame sequence images of the passenger flow to be counted through a pre-trained coding and decoding model to obtain a human body boundary feature map corresponding to each video frame sequence image; inputting the human body boundary feature map into a trained passenger flow tracking model and outputting the tracking result of each target human body in the video frame sequence images; and counting the passenger flow in the target area according to the tracking result to obtain the passenger flow statistics for the target area. By performing target human body detection and identity embedding simultaneously and sharing the feature map across the jointly trained tracking model, the invention reduces repeated computation, avoids wasting computing resources, and improves the efficiency and real-time performance of target tracking.

Description

Passenger flow volume statistical method and system
Technical Field
The invention relates to the technical field of artificial intelligence and computer vision, in particular to a passenger flow volume statistical method and system.
Background
Passenger flow statistics can support business decisions for shopping malls. Counting the total number of people inside a mall can be used to control when air conditioning is switched on, which helps save energy, reduce emissions, and create a greener, more environmentally friendly mall. Counting the number of people passing through each entrance of a large mall or supermarket helps managers issue effective early warnings when the passenger flow at a particular entrance rises rapidly and take diversion measures, such as temporarily adding access passages and maintenance staff, to effectively prevent crowding and stampede accidents.
Traditional passenger flow statistics usually rely on mechanical or manual counting methods, which are inefficient, interfere with normal pedestrian movement and flow speed, and cannot meet real-time requirements. In recent years, the rapid development of artificial intelligence, computer vision, and deep learning has greatly advanced research on passenger flow statistics based on surveillance video. Video-based passenger flow statistics uses image processing and computer vision theory and technology (in particular, deep learning applied to these fields), together with the efficient computing power of cloud computing and deep learning accelerator chips, to accurately detect, identify, and track target pedestrians in a monitored area in real time, thereby completing the passenger flow statistics task.
In existing passenger flow statistics schemes, the two-step approach to multi-object tracking generally treats target detection and re-identification as two separate tasks. Although this allows the most suitable model to be used for each task, an image patch must be cropped according to each detected bounding box and resized to a common size before the re-identification features are predicted. This process is usually very slow: target detection and re-identification feature embedding both require a large amount of computation and share none of it, so high-rate inference is impossible and the practical requirement of real-time video cannot be met. The single-step approach, which detects objects and extracts re-identification features at the same time, has the potential to significantly reduce inference time, but its tracking accuracy is generally lower than that of the two-step approach.
Disclosure of Invention
The invention provides a passenger flow volume statistical method and system to overcome the defects of low passenger flow statistics efficiency and low accuracy in the prior art, achieving more efficient passenger flow statistics and improving human body target tracking precision.
In a first aspect, the present invention provides a passenger flow volume statistical method, including:
extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image;
inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and the re-recognition characteristic;
and according to the tracking result, carrying out statistics on the passenger flow in the target area to obtain a passenger flow statistical result in the target area.
In one embodiment, the trained passenger traffic tracking model is trained by the following steps:
acquiring a sample human body boundary characteristic diagram marked with a human body boundary frame;
marking the central point position and the corresponding central point deviant of the human body boundary box in each frame of the sample human body boundary characteristic diagram, marking the length and the width of the human body boundary box, and constructing a training sample set;
inputting the training sample set into a re-recognition model for training to obtain a trained re-recognition model;
inputting the training sample set into a neural network for training to obtain a trained human body target detection model;
and constructing and obtaining a trained passenger flow tracking model through the trained re-recognition model and the trained human body target detection model.
In an embodiment, the performing statistics on the passenger flow volume in the target area according to the tracking result to obtain a passenger flow volume statistical result in the target area includes:
acquiring a video frame sequence image corresponding to a target area;
constructing a preset elliptical boundary area in a video frame sequence image based on the directions of an in-out area in the video frame sequence image;
and counting the passenger flow entering and exiting the preset elliptical boundary area according to the central point position of each frame of human body boundary frame in the tracking result.
In one embodiment, after the obtaining of the sample human body boundary feature map marked with the human body boundary box, the method further comprises:
and marking the position of the central point of the human body boundary box in each frame of the sample human body boundary characteristic map based on thermodynamic diagram.
In one embodiment, the pre-trained codec model is a ResNet-34 network.
In one embodiment, the trained traffic tracking model comprises 4 identical hierarchies, each hierarchy being obtained by connecting a 3 × 3 convolutional layer, a Relu layer, and a 1 × 1 convolutional layer in sequence.
In one embodiment, the inputting the human body boundary feature map into a trained passenger flow tracking model, and outputting a tracking result of a target human body in the video frame sequence image includes:
on the basis of Kalman filtering, calculating to obtain a prediction mean value and a prediction covariance matrix of a tracking target in a tracking result according to a tracking result obtained by prediction of the trained passenger flow tracking model;
and calculating the Mahalanobis distance between the predicted value and the true value according to the predicted mean value and the predicted covariance matrix, and reserving the tracking target with the Mahalanobis distance meeting the preset threshold distance to obtain the tracking result of the target human body.
In a second aspect, the present invention provides a passenger flow volume statistics system, comprising:
the coding and decoding module is used for extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image;
the human body target tracking module is used for inputting the human body boundary characteristic diagram into a trained passenger flow tracking model and outputting a tracking result of a target human body in the video frame sequence image; the trained passenger flow tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and the re-recognition characteristic;
and the passenger flow volume counting module is used for counting the passenger flow volume in the target area according to the tracking result and acquiring the passenger flow volume counting result in the target area.
In a third aspect, the present invention provides an electronic device, which includes a processor and a memory storing a computer program; when the processor executes the program, the steps of the passenger flow volume statistics method in the first aspect are implemented.
In a fourth aspect, the present invention provides a processor-readable storage medium storing a computer program for causing a processor to perform the steps of the passenger flow volume statistics method of the first aspect.
According to the passenger flow volume statistical method and system, target human body detection and identity embedding are performed simultaneously, and the jointly trained tracking model shares the feature map, which reduces repeated computation, avoids wasting computing resources, and improves the efficiency and real-time performance of target tracking.
Drawings
To illustrate the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments or in the description of the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present invention, and that other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic flow chart of a passenger flow volume statistical method according to the present invention;
FIG. 2 is a flow chart illustrating the arrangement of passenger flow statistics provided by the present invention;
FIG. 3 is a schematic diagram of a passenger flow statistics system according to the present invention;
FIG. 4 is a schematic diagram of an overall framework of a passenger flow statistics system according to the present invention;
fig. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Existing deep-learning-based passenger flow statistics schemes are roughly divided into two stages: a multi-object tracking (MOT) module and line-crossing people counting. Multi-object tracking generally treats target detection and re-identification as two separate tasks. First, a convolutional neural network detector locates all objects of interest in the image with a set of bounding boxes; then, in a separate step, the image is cropped around each box and fed to an identity-embedding network to extract re-identification features, after which the boxes are linked to form multiple tracks. These schemes generally follow a standard box-linking convention: a cost matrix is computed from the re-identification features and the Intersection over Union (IoU) of the bounding boxes, and the linking task is then completed with Kalman filtering and the Hungarian algorithm. A few other schemes use more complex association strategies such as recurrent neural networks.
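As a minimal, non-limiting sketch of the box-linking convention described above (a 1 − IoU cost matrix solved with the Hungarian algorithm), the following Python code assumes axis-aligned boxes in (x1, y1, x2, y2) format; the function names and the 0.7 gating threshold are illustrative assumptions, not taken from the original disclosure.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def link_detections(track_boxes, det_boxes, max_cost=0.7):
    """Build a 1 - IoU cost matrix and solve the assignment with the Hungarian algorithm."""
    cost = np.array([[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    # Keep only matches whose cost is below the threshold; the rest stay unmatched.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
```

In practice the re-identification feature distance would be mixed into the same cost matrix before the assignment step.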
Line-crossing people counting in existing schemes usually uses a two-line counting method: two counting lines, an upper line and a lower line, are drawn in the video, and the area between them is used as the tracking and counting area. For a target entering the monitored area, tracking starts when the target crosses the upper counting line and stops when the upper edge of its bounding box crosses the lower counting line. For a target leaving the monitored area, tracking starts when the upper edge of its bounding box crosses the lower counting line and stops when the lower edge crosses the upper counting line. Counting starts and stops according to whether the tracked target's center point lies between the two lines: counting ends when the center point leaves the counting area and starts when it enters. Counting a target requires four parameters: its initial direction, its current direction, its position in the previous frame, and its position in the current frame. If the target's center point is inside the counting area in the current frame but was not in the previous frame, counting starts and the initial direction of motion is obtained; if the center point is outside the counting area in the current frame but was inside in the previous frame, counting ends and the current direction is determined. The initial direction and the current direction are then compared, and if they are consistent, the number of people entering or leaving is incremented by one.
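A minimal sketch of this two-line counting rule follows, assuming the image y coordinate grows downward and the upper counting line lies above the lower one; the class and attribute names are illustrative.

```python
class TwoLineCounter:
    """Two-line counting: the band between the upper and lower counting lines is
    the counting area; a target is counted when its center point crosses the
    whole band in a consistent direction."""

    def __init__(self, y_upper, y_lower):
        self.y_upper, self.y_lower = y_upper, y_lower
        self.entry_side = {}                      # track_id -> 'top' or 'bottom'
        self.count_in = self.count_out = 0

    def update(self, track_id, prev_cy, cur_cy):
        in_band = lambda y: self.y_upper <= y <= self.y_lower
        if not in_band(prev_cy) and in_band(cur_cy):
            # Center entered the counting area: record the initial direction.
            self.entry_side[track_id] = 'top' if prev_cy < self.y_upper else 'bottom'
        elif in_band(prev_cy) and not in_band(cur_cy):
            # Center left the counting area: compare exit side with entry side.
            exit_side = 'bottom' if cur_cy > self.y_lower else 'top'
            entered = self.entry_side.pop(track_id, None)
            if entered == 'top' and exit_side == 'bottom':
                self.count_in += 1                # e.g. walking into the store
            elif entered == 'bottom' and exit_side == 'top':
                self.count_out += 1
```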
In existing passenger flow statistics schemes, the target detection and re-identification feature embedding of the two-step multi-object tracking method require a large amount of computation with no computation sharing, so high-rate inference cannot be performed and the practical requirement of real-time video cannot be met. As multi-task learning has matured, single-step methods that detect objects and extract re-identification features at the same time have attracted more attention; the two tasks share most of their features, which can greatly reduce inference time. However, the tracking accuracy of existing single-step methods is generally lower than that of two-step methods, and anchor-based single-step tracking is unfavorable for re-identification: the identity embedding features extracted at anchors are not aligned with the target center point, and several anchor boxes may correspond to the same object, which causes ambiguity in re-identification.
Fig. 1 is a schematic flow diagram of a passenger flow volume statistical method provided by the present invention, and as shown in fig. 1, the present invention provides a passenger flow volume statistical method, including:
step 101, extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image.
In the invention, a video image of the target area is first converted into a sequence of video frame images, where the target area is the entrance/exit area of a smart retail store or a large shopping mall; for entrance/exit areas in other scenes, such as tourist areas and hospitals, the flow of people can likewise be counted through the subsequent steps.
Further, the human body boundary features in the video frame sequence images are extracted through the pre-trained coding and decoding model to obtain a human body boundary feature map for each frame. The human body boundary feature map contains both global information about the human body boundary (the boundary of the whole body) and local information (parts of the boundary, such as a hand or a leg).
102, inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow volume tracking model is obtained by training a neural network and a re-recognition model through a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and the trained passenger flow volume tracking model is used for fusing a human body target and the re-recognition characteristic.
In the invention, each frame's human body boundary feature map is input in turn into the trained passenger flow tracking model, which performs human body target detection and re-identification (re-ID) feature extraction at the same time. The offset of the human body bounding box center point between frames is fused with the re-identification features to obtain a bounding-box movement trajectory based on the re-identification features, and the tracking result of each target human body in the video frame sequence images, that is, the movement of each human body from the previous frame to the current frame, is obtained from this trajectory.
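As a non-limiting illustration of how such fused outputs can be decoded into per-person boxes and identity features, the following sketch assumes a CenterNet/FairMOT-style output layout (a center heatmap, box sizes, center offsets, and an embedding map, matching the {hm, wh, reg, id} heads described in the training section below); the tensor names, shapes, and the top-k value are assumptions.

```python
import torch
import torch.nn.functional as F

def decode_detections(hm, wh, reg, id_emb, k=100):
    """Sketch of turning the fused head outputs into boxes plus re-ID features.

    hm:     (1, 1, H, W) center-point heatmap (after sigmoid)
    wh:     (1, 2, H, W) box width/height at each center
    reg:    (1, 2, H, W) sub-pixel offset of each center
    id_emb: (1, D, H, W) re-identification embedding map
    """
    # 3x3 max-pooling keeps only local peaks (a simple NMS on the heatmap).
    peaks = hm * (F.max_pool2d(hm, 3, stride=1, padding=1) == hm).float()
    scores, inds = torch.topk(peaks.flatten(), k)
    ys, xs = inds // hm.shape[-1], inds % hm.shape[-1]

    boxes, feats = [], []
    for s, y, x in zip(scores, ys, xs):
        w, h = wh[0, :, y, x]
        dx, dy = reg[0, :, y, x]
        cx, cy = x.float() + dx, y.float() + dy
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, s))
        feats.append(id_emb[0, :, y, x])   # embedding taken exactly at the center
    return boxes, feats
```

Because the embedding is read at the detected center point itself, the detection and the identity feature stay aligned, which is the property the anchor-free design relies on.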
And 103, counting the passenger flow in the target area according to the tracking result to obtain a passenger flow counting result in the target area.
In the invention, a counting area can be defined within the target area, and the passenger flow entering and leaving the counting area in each frame is counted according to the tracking result. Fig. 2 is a flow chart of the passenger flow statistics arrangement provided by the present invention. Referring to fig. 2, the video frame sequence is input into the coding and decoding model to obtain the human body boundary feature map; then, for each frame's feature map, human body target detection and re-identification feature extraction are performed simultaneously and the two kinds of extracted features are fused; tracking of the target human bodies in the images is then achieved from the fusion result, and the passenger flow in the preset region is counted from the tracking result.
According to the passenger flow volume statistical method, target human body detection and identity embedding are performed simultaneously, and the jointly trained tracking model shares the feature map, which reduces repeated computation, avoids wasting computing resources, and improves the efficiency and real-time performance of target tracking.
On the basis of the above embodiment, the trained passenger flow tracking model is obtained by training through the following steps:
step 201, a sample human body boundary characteristic diagram marked with a human body boundary frame is obtained.
In the invention, video frame sample sequence images are input into the pre-trained coding and decoding model to obtain sample human body boundary feature maps, and human body bounding boxes are marked in these feature maps; the pre-trained coding and decoding model is a ResNet-34 network. The invention adopts ResNet-34 as the backbone network, giving a good balance between the accuracy and the speed of human body boundary extraction. To adapt to human body boundary extraction at different scales, an encoder-decoder structure is applied to the backbone network, with additional skip connections between low-level and high-level feature aggregation. Furthermore, all convolutional layers in the decoding stage are replaced by deformable convolutional layers so that the receptive field can be adjusted dynamically according to the size and posture of the extracted object.
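As an illustrative, non-limiting sketch of such a backbone, the following PyTorch code builds a ResNet-34 encoder and a decoder whose convolutions are deformable and which adds skip connections from the encoder stages. The class names, channel widths, upsampling scheme, and output resolution are assumptions, not taken from the original disclosure; the offsets for each deformable convolution are predicted by an ordinary 3 × 3 convolution, which is how torchvision's DeformConv2d expects to be driven.

```python
import torch.nn as nn
from torchvision.models import resnet34
from torchvision.ops import DeformConv2d

class DeformUpBlock(nn.Module):
    """Decoder block: deformable conv (adaptive receptive field), ReLU, 2x upsample."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)  # sampling offsets
        self.dcn = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, skip=None):
        x = self.up(self.relu(self.dcn(x, self.offset(x))))
        return x + skip if skip is not None else x   # skip connection from the encoder

class BoundaryEncoderDecoder(nn.Module):
    """ResNet-34 encoder with a deformable-conv decoder and skip connections."""
    def __init__(self):
        super().__init__()
        r = resnet34(weights=None)                   # untrained weights for the sketch
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.enc = nn.ModuleList([r.layer1, r.layer2, r.layer3, r.layer4])
        chans = [64, 128, 256, 512]
        self.dec = nn.ModuleList(
            [DeformUpBlock(chans[i], chans[i - 1]) for i in range(3, 0, -1)])

    def forward(self, x):
        x = self.stem(x)
        skips = []
        for layer in self.enc:
            x = layer(x)
            skips.append(x)
        for block, skip in zip(self.dec, reversed(skips[:-1])):
            x = block(x, skip)
        return x            # 64-channel boundary feature map at 1/4 input resolution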
Step 202, marking the central point position of the human body boundary box in each frame of the sample human body boundary characteristic diagram and the corresponding central point deviation value, marking the length and the width of the human body boundary box, and constructing a training sample set.
In the invention, the center point position of the human body bounding box is marked in each frame of the sample human body boundary feature map; to resolve the ambiguity problem of re-ID, the center point position is marked on a heatmap (thermodynamic diagram). By using this anchor-free method, the ambiguity caused by multiple anchor boxes is eliminated, and tracking accuracy on all benchmarks is significantly improved.
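The "thermodynamic diagram" in the translated text is simply a heatmap. A minimal sketch of how an anchor-free center-point training target can be rendered is shown below; the fixed Gaussian radius is an assumption (in practice it is often derived from the box size).

```python
import numpy as np

def render_center_heatmap(centers, height, width, sigma=2.0):
    """Splat each annotated box center onto the heatmap as a small Gaussian peak."""
    heatmap = np.zeros((height, width), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]
    for cx, cy in centers:                       # centers in feature-map pixels
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        heatmap = np.maximum(heatmap, g)         # overlapping peaks keep the max
    return heatmap
```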
Based on this marking of the bounding-box center point position, the offset of the center point between consecutive sample human body boundary feature maps is marked, and the length and width of the human body bounding box are marked, so that after the model is trained with the constructed training sample set, the trained model can derive the tracking result of each target human body in the frame sequence from the human body boundary feature maps. Specifically, in the present invention, after encoding and decoding are completed, object tracking is performed in parallel over four identical hierarchical structures, namely {hm, wh, reg, id}: hm, wh, and reg are produced by the human body target detection model obtained through neural network training, where hm locates the center point of the human body bounding box on a heatmap (realizing the anchor-free method), wh regresses the length and width of the target human body bounding box, and reg is the offset of the bounding-box center point; id is the re-identification feature and is produced by the re-identification model.
Optionally, on the basis of the foregoing embodiment, the trained passenger flow tracking model includes 4 identical hierarchical structures, and each hierarchical structure is obtained by sequentially connecting a 3 × 3 convolutional layer, a ReLU layer, and a 1 × 1 convolutional layer. Each hierarchy acts as an extraction task, and each has its own loss constraint.
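A minimal sketch of the four parallel heads is given below. The patent specifies only the 3 × 3 convolution → ReLU → 1 × 1 convolution structure and the four outputs {hm, wh, reg, id}; the 64-channel input, 256-channel intermediate width, and 128-dimensional re-identification embedding are assumptions.

```python
import torch.nn as nn

def make_head(in_ch, out_ch, mid_ch=256):
    """One head: a 3x3 convolution, a ReLU, then a 1x1 convolution."""
    return nn.Sequential(
        nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(mid_ch, out_ch, kernel_size=1),
    )

class TrackingHeads(nn.Module):
    """Four parallel branches on top of the shared boundary feature map."""
    def __init__(self, in_ch=64, emb_dim=128):
        super().__init__()
        self.hm = make_head(in_ch, 1)        # center-point heatmap (anchor-free)
        self.wh = make_head(in_ch, 2)        # box width and height
        self.reg = make_head(in_ch, 2)       # center-point offset
        self.id = make_head(in_ch, emb_dim)  # re-identification embedding

    def forward(self, feat):
        return {'hm': self.hm(feat).sigmoid(), 'wh': self.wh(feat),
                'reg': self.reg(feat), 'id': self.id(feat)}
```

Each of the four outputs would be supervised by its own loss term, matching the per-hierarchy loss constraint described above.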
Step 203, inputting the training sample set into a re-recognition model for training to obtain a trained re-recognition model;
step 204, inputting the training sample set into a neural network for training to obtain a trained human body target detection model;
and step 205, constructing a trained passenger flow tracking model through the trained re-recognition model and the trained human target detection model.
In the invention, the re-identification model and the neural network are trained on the same training sample set, so that a passenger flow tracking model for target tracking is obtained. During tracking with the trained model, four state containers are defined: an activated-state container, which stores target human body tracks that have just been initialized; a stacking container, which stores target human bodies that have appeared only once; a tracking-state container, which stores target human bodies that are being tracked normally; and a lost container, which stores target human bodies whose tracks have been lost.
Specifically, the sample data of the first frame in the training sample set is input into the re-identification model and the neural network to obtain the human body bounding boxes and the corresponding re-identification features (during training, any frame of sample data input into the model can be taken as the first frame, and the human body bounding boxes and corresponding re-identification features of the following three frames are predicted). The sample data is kept in the stacking container, the tracking trajectories are initialized, and each sample target in the frame (i.e., each human body bounding box present in the first-frame sample image) is assigned a target human body track number.
Further, the human body bounding boxes and corresponding re-identification features of the second frame of sample data are predicted, and the pairwise Intersection over Union between the predicted bounding boxes and the first-frame bounding boxes retained in the database is calculated, giving the distance between the human body target boxes of the first and second frames as 1 − IoU. Each pair of first- and second-frame targets whose distance is below the distance threshold is then stored into the tracking-state container with the target human body track number unchanged; these are regarded as targets being tracked. Targets that appear for the first time in the second frame are found and their initialized tracks are kept in the activated-state container. Finally, the activated-state tracks are merged into the stacking container.
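The patent does not state whether the 1 − IoU matching is greedy or globally optimal; the sketch below uses a simple greedy pass to illustrate the container bookkeeping described above, and the dictionary keys 'box' and 'track_id', the threshold value, and the id generator are illustrative assumptions.

```python
import itertools

_track_ids = itertools.count(1)      # simple global id generator for this sketch

def _iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def associate_frame(prev_tracks, detections, dist_thresh=0.6):
    """Greedy 1 - IoU association: matched detections keep the existing track
    number and go to the tracking-state container; first-time detections are
    initialised and go to the activated-state container."""
    tracking, activated, matched = [], [], set()
    for track in prev_tracks:
        best, best_d = None, dist_thresh
        for j, det in enumerate(detections):
            if j in matched:
                continue
            d = 1.0 - _iou(track['box'], det['box'])
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matched.add(best)
            detections[best]['track_id'] = track['track_id']   # number unchanged
            tracking.append(detections[best])
    for j, det in enumerate(detections):
        if j not in matched:
            det['track_id'] = next(_track_ids)                 # newly appeared target
            activated.append(det)
    return tracking, activated
```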
Further, the human body bounding boxes and corresponding re-identification features of the third frame of sample data are predicted, and the cosine distance between the re-identification features of the targets being tracked in the previous frames (i.e., the targets in the tracking-state container after the second frame) and those of the tracked targets in the third frame is calculated, so that the appearance feature similarity of each target is obtained from the cosine distance. In the invention, both during training of the passenger flow tracking model and during tracking in application, the predicted tracks are further filtered; in this embodiment, the prediction of the tracking result from the human body boundary feature map is explained using the trained passenger flow tracking model. On the basis of the above embodiment, inputting the human body boundary feature map into the trained passenger flow tracking model and outputting the tracking result of the target human body in the video frame sequence images includes:
on the basis of Kalman filtering, calculating to obtain a prediction mean value and a prediction covariance matrix of a tracking target in a tracking result according to a tracking result obtained by prediction of the trained passenger flow tracking model;
and calculating the Mahalanobis distance between the predicted value and the true value according to the predicted mean value and the predicted covariance matrix, and reserving the tracking target with the Mahalanobis distance meeting the preset threshold distance so as to obtain the tracking result of the target human body.
In the invention, Kalman filtering is used to calculate the prediction mean and prediction covariance matrix of each tracked target in the tracking-state container (i.e., of the human body bounding box corresponding to the tracked target), and the Mahalanobis distance between the predicted value and the true value is calculated. The Mahalanobis distance of any tracked target exceeding the preset threshold distance is set to infinity so that it can be discarded; only the tracks of targets whose distance is below the preset threshold are retained and kept in the activated-state container.
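A minimal sketch of the Kalman prediction and Mahalanobis gating step follows. The (cx, cy, aspect, height) measurement parameterisation and the 9.4877 gate (the standard 95% chi-square quantile for four degrees of freedom, as used in DeepSORT-style trackers) are assumptions; the patent only specifies that targets whose Mahalanobis distance exceeds a preset threshold are discarded.

```python
import numpy as np

CHI2_95_4DOF = 9.4877    # 95% chi-square gate for a 4-dimensional measurement

def kalman_predict(mean, cov, F, Q):
    """One Kalman prediction step: x' = F x, P' = F P F^T + Q."""
    return F @ mean, F @ cov @ F.T + Q

def mahalanobis_gate(pred_mean, pred_cov, measurements, gate=CHI2_95_4DOF):
    """Keep only measurements whose squared Mahalanobis distance to the predicted
    state is below the gate; in the tracker the distance of the others is set to
    infinity so they are dropped from association.

    pred_mean: (4,) predicted measurement, e.g. (cx, cy, aspect, height)
    pred_cov:  (4, 4) predicted measurement covariance
    measurements: (N, 4) detections in the same parameterisation
    """
    diffs = measurements - pred_mean
    d2 = np.einsum('ni,ij,nj->n', diffs, np.linalg.inv(pred_cov), diffs)
    return measurements[d2 < gate], d2
```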
Optionally, in an embodiment, the lost target human bodies from the tracking-state container in the above embodiment are stored in the lost-state container, and IoU matching of the human body target boxes is performed against the unmatched targets of the third frame; if a match is found, the target is merged into the activated-state container with its track number unchanged.
optionally, in an embodiment, the target human body initialized in the previous frame and the target human body not matched with the current frame are subjected to the IOU processing of the human body target frame, if there is a matched target, that is, a target with consistent feature matching, is regarded as the same target, and is merged into the activated state container, with the target human body estimation number unchanged;
optionally, in an embodiment, the target human body on which the current frame does not match is merged into the active state container, and the track is initialized.
On the basis of the above embodiment, the performing statistics on the passenger flow volume in the target area according to the tracking result to obtain a passenger flow volume statistical result in the target area includes:
acquiring a video frame sequence image corresponding to a target area;
constructing a preset elliptical demarcation area in a video frame sequence image based on the directions of an in-out area in the video frame sequence image;
and counting the passenger flow entering and exiting the preset elliptical boundary area according to the central point position of each frame of human body boundary frame in the tracking result.
In existing people flow statistics schemes, the dividing line is mostly a single line or two parallel lines, and neither the angle at which a tracked target crosses the boundary nor the crossing conditions are optimized. The invention instead presets an elliptical boundary area in the monitored area: the elliptical boundary area is placed over the entrance/exit area in the video image, and entries into and exits from the area are counted for each target human body, which reduces the chance of a crossing being missed or miscounted and removes the influence of the crossing angle. Preferably, several elliptical boundary areas may be set in the monitored area; for example, when several mall entrance/exit areas appear in the video image, a preset elliptical boundary area can be placed at each of them, so that the passenger flow of multiple entrances captured by the same camera can be counted. According to the invention, the passenger flow entering and leaving each elliptical boundary area is counted from the target tracking result; such multi-area passenger flow statistics provides a more intuitive basis for the reasonable layout of malls and supermarkets, improves economic benefits, reduces potential safety hazards, and makes counting of people entering and leaving more flexible.
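A minimal sketch of counting against one preset elliptical boundary area follows, assuming an axis-aligned ellipse and using only the tracked center point of the previous and current frame; the names and the in/out convention are illustrative. One such counter can be instantiated per entrance/exit area captured by the camera.

```python
def inside_ellipse(cx, cy, ex, ey, a, b):
    """True if point (cx, cy) lies inside the axis-aligned ellipse centred at
    (ex, ey) with semi-axes a and b."""
    return ((cx - ex) / a) ** 2 + ((cy - ey) / b) ** 2 <= 1.0

class EllipseCounter:
    """Counts entries into and exits from one elliptical boundary region, using
    only the track center of the previous and current frame, so the crossing
    angle does not matter."""
    def __init__(self, ex, ey, a, b):
        self.params = (ex, ey, a, b)
        self.entered = self.left = 0

    def update(self, prev_center, cur_center):
        was_in = inside_ellipse(*prev_center, *self.params)
        is_in = inside_ellipse(*cur_center, *self.params)
        if not was_in and is_in:
            self.entered += 1      # target moved into the entrance/exit region
        elif was_in and not is_in:
            self.left += 1         # target moved out of the region
```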
The following describes the passenger flow volume statistical system provided by the present invention, and the passenger flow volume statistical system described below and the passenger flow volume statistical method described above can be referred to correspondingly.
Fig. 3 is a schematic structural diagram of a passenger flow volume statistical system provided by the present invention, and as shown in fig. 3, the present invention provides a passenger flow volume statistical system, which includes an encoding and decoding module 301, a human body target tracking module 302, and a passenger flow volume statistical module 303, where the encoding and decoding module 301 is configured to perform human body boundary feature extraction on a video frame sequence image of passenger flow volume to be counted through a pre-trained encoding and decoding model, so as to obtain a human body boundary feature map corresponding to the video frame sequence image; the human body target tracking module 302 is configured to input the human body boundary feature map into a trained passenger flow tracking model, and output a tracking result of a target human body in the video frame sequence image; the trained passenger flow volume tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and a re-recognition characteristic; the passenger flow volume statistics module 303 is configured to perform statistics on the passenger flow volume in the target area according to the tracking result, and obtain a passenger flow volume statistics result in the target area.
In the present invention, fig. 4 is a schematic diagram of the overall framework of the passenger flow volume statistical system provided by the present invention. As shown in fig. 4, a video image of the passenger flow monitoring area is first obtained through the data access device and stored in the data storage device; the video image of the target monitoring area is then converted into video frame sequence images and passed to the coding and decoding model (i.e., the encoding and decoding module 301); the human body boundary features in the video frame sequence images are extracted through the pre-trained coding and decoding model to obtain a human body boundary feature map for each frame, which contains both global information about the human body boundary (the boundary of the whole body) and local information (parts of the boundary, such as a hand or a leg).
Further, each frame's human body boundary feature map is input in turn into the trained passenger flow tracking model, i.e., simultaneously into the detection model and the re-ID model of the tracking module (the human body target tracking module 302). The tracking module performs human body target detection and re-identification feature extraction at the same time and fuses the inter-frame offset of the bounding-box center point with the re-identification features to obtain a bounding-box movement trajectory based on the re-identification features; the tracking result of each target human body in the video frame sequence images, that is, the movement of each human body from the previous frame to the current frame, is obtained from this trajectory. Finally, the passenger flow volume statistics module 303 can define a counting area within the target area, count the passenger flow entering and leaving the counting area in each frame according to the tracking result, and display the result through the visualization terminal.
The passenger flow volume statistical system provided by the invention performs target human body detection and identity embedding simultaneously, and the jointly trained tracking model shares the feature map, which reduces repeated computation, avoids wasting computing resources, and improves the efficiency and real-time performance of target tracking.
On the basis of the above embodiment, the system further includes:
the sample characteristic map acquisition module is used for acquiring a sample human body boundary characteristic map marked with a human body boundary frame;
the sample set construction module is used for marking the central point position and the corresponding central point deviation value of the human body boundary box in each frame of the sample human body boundary characteristic diagram, marking the length and the width of the human body boundary box and constructing a training sample set;
the first training module is used for inputting the training sample set into a re-recognition model for training to obtain a trained re-recognition model;
the second training module is used for inputting the training sample set into a neural network for training to obtain a trained human body target detection model;
and the model construction module is used for constructing and obtaining a trained passenger flow tracking model through the trained re-recognition model and the trained human target detection model.
Fig. 5 is a schematic structural diagram of an electronic device provided in the present invention, and as shown in fig. 5, the electronic device may include: a processor (processor) 510, a Communication Interface (Communication Interface) 520, a memory (memory) 530 and a Communication bus 540, wherein the processor 510, the Communication Interface 520 and the memory 530 are communicated with each other via the Communication bus 540. Processor 510 may invoke a computer program in memory 530 to perform the steps of the passenger flow statistics method, including, for example: extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image; inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow volume tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and a re-recognition characteristic; and according to the tracking result, counting the passenger flow in the target area to obtain a passenger flow counting result in the target area.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a passenger flow statistics method provided by the above methods, the method comprising: extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image; inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow volume tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and a re-recognition characteristic; and according to the tracking result, counting the passenger flow in the target area to obtain a passenger flow counting result in the target area.
In another aspect, the present invention further provides a processor-readable storage medium, where the processor-readable storage medium stores a computer program, where the computer program is configured to cause the processor to execute the method provided in the foregoing embodiments, for example, the method includes: extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image; inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow volume tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and a re-recognition characteristic; and according to the tracking result, carrying out statistics on the passenger flow in the target area to obtain a passenger flow statistical result in the target area.
The processor-readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memories (NAND FLASH), solid State Disks (SSDs)), etc.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A passenger flow volume statistical method is characterized by comprising the following steps:
extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image;
inputting the human body boundary characteristic diagram into a trained passenger flow tracking model, and outputting to obtain a tracking result of a target human body in the video frame sequence image; the trained passenger flow tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and the re-recognition characteristic;
and according to the tracking result, counting the passenger flow in the target area to obtain a passenger flow counting result in the target area.
2. The passenger flow statistical method of claim 1, wherein the trained passenger flow tracking model is trained by:
acquiring a sample human body boundary characteristic diagram marked with a human body boundary frame;
marking the central point position and the corresponding central point deviant of the human body boundary box in each frame of the sample human body boundary characteristic diagram, marking the length and the width of the human body boundary box, and constructing a training sample set;
inputting the training sample set into a re-recognition model for training to obtain a trained re-recognition model;
inputting the training sample set into a neural network for training to obtain a trained human body target detection model;
and constructing and obtaining a trained passenger flow tracking model through the trained re-recognition model and the trained human body target detection model.
3. The passenger flow volume statistical method according to claim 1, wherein the performing statistics on the passenger flow volume in the target area according to the tracking result to obtain the passenger flow volume statistical result in the target area comprises:
acquiring a video frame sequence image corresponding to a target area;
constructing a preset elliptical demarcation area in a video frame sequence image based on the directions of an in-out area in the video frame sequence image;
and counting the passenger flow entering and exiting the preset elliptical boundary area according to the central point position of each frame of human body boundary frame in the tracking result.
4. The method of passenger flow statistics according to claim 2, wherein after said obtaining a sample human body boundary feature map marked with a human body boundary box, the method further comprises:
and marking the position of the central point of the human body boundary box in each frame of the sample human body boundary characteristic map based on the thermodynamic diagram.
5. The method of claim 1, wherein the pre-trained codec model is a ResNet-34 network.
6. The passenger flow statistical method according to claim 2, wherein the trained passenger flow tracking model comprises 4 identical hierarchies, and each hierarchy is obtained by sequentially connecting a 3 x 3 convolutional layer, a Relu layer, and a 1 x 1 convolutional layer.
7. The passenger flow volume statistical method according to claim 1, wherein the inputting the human body boundary feature map into a trained passenger flow volume tracking model and outputting a tracking result of a target human body in the video frame sequence image comprises:
on the basis of Kalman filtering, calculating to obtain a prediction mean value and a prediction covariance matrix of a tracking target in a tracking result according to a tracking result obtained by prediction of the trained passenger flow tracking model;
and calculating the Mahalanobis distance between the predicted value and the true value according to the predicted mean value and the predicted covariance matrix, and reserving the tracking target with the Mahalanobis distance meeting the preset threshold distance so as to obtain the tracking result of the target human body.
8. A passenger flow volume statistic system, comprising:
the coding and decoding module is used for extracting human body boundary characteristics of a video frame sequence image of passenger flow volume to be counted through a pre-trained coding and decoding model to obtain a human body boundary characteristic diagram corresponding to the video frame sequence image;
the human body target tracking module is used for inputting the human body boundary characteristic diagram into a trained passenger flow tracking model and outputting a tracking result of a target human body in the video frame sequence image; the trained passenger flow tracking model is obtained by training a neural network and a re-recognition model by using a sample human body boundary characteristic diagram marked with a human body boundary frame and a central point position, and is used for fusing a human body target and the re-recognition characteristic;
and the passenger flow volume counting module is used for counting the passenger flow volume in the target area according to the tracking result and acquiring the passenger flow volume counting result in the target area.
9. An electronic device comprising a processor and a memory having a computer program stored thereon, wherein the processor when executing the computer program performs the steps of the passenger flow statistics method according to any of claims 1 to 7.
10. A processor-readable storage medium, characterized in that the processor-readable storage medium stores a computer program for causing a processor to perform the steps of the passenger flow volume statistic method according to any one of claims 1 to 7.
CN202110519307.2A 2021-05-12 2021-05-12 Passenger flow volume statistical method and system Pending CN115423735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110519307.2A CN115423735A (en) 2021-05-12 2021-05-12 Passenger flow volume statistical method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110519307.2A CN115423735A (en) 2021-05-12 2021-05-12 Passenger flow volume statistical method and system

Publications (1)

Publication Number Publication Date
CN115423735A true CN115423735A (en) 2022-12-02

Family

ID=84230483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110519307.2A Pending CN115423735A (en) 2021-05-12 2021-05-12 Passenger flow volume statistical method and system

Country Status (1)

Country Link
CN (1) CN115423735A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661586A (en) * 2022-12-09 2023-01-31 云粒智慧科技有限公司 Model training and people flow statistical method, device and equipment
CN115661586B (en) * 2022-12-09 2023-04-18 云粒智慧科技有限公司 Model training and people flow statistical method, device and equipment
CN115661137A (en) * 2022-12-12 2023-01-31 宁德时代新能源科技股份有限公司 Detection device, detection method, terminal device, and computer-readable storage medium
CN115661137B (en) * 2022-12-12 2023-12-22 宁德时代新能源科技股份有限公司 Detection device, detection method, terminal device, and computer-readable storage medium
CN116823533A (en) * 2023-06-25 2023-09-29 广东盈香生态园有限公司 Intelligent visit guiding method and system for ecological garden
CN116823533B (en) * 2023-06-25 2024-01-26 广东盈香生态产业集团有限公司 Intelligent visit guiding method and system for ecological garden


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination