CN109410245B - Video target tracking method and device - Google Patents

Video target tracking method and device

Info

Publication number
CN109410245B
CN109410245B (application CN201811070047.XA)
Authority
CN
China
Prior art keywords
target
tracked
current frame
tracking
monitoring unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811070047.XA
Other languages
Chinese (zh)
Other versions
CN109410245A (en)
Inventor
张恒
苏俊
陈国良
杨伟杰
季家友
潘贵族
Current Assignee
Wuxi Yingchuang Tech Co.,Ltd.
Original Assignee
Beijing Miwen Power Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Miwen Power Technology Co., Ltd.
Priority to CN201811070047.XA
Publication of CN109410245A
Application granted
Publication of CN109410245B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/292: Multi-camera tracking
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/30: Subject of image; context of image processing
    • G06T 2207/30196: Human being; person

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention discloses a video target tracking method and device. The middle channel region is set as a target candidate region and divided into a plurality of grids of the same size; the identification features and corresponding grid positions of each target to be tracked are acquired in the current frame, and the position information of the target is determined from its identification features and grid positions in the previous frame and in the current frame, so that each target to be tracked is tracked within the monitoring unit area. With this scheme, targets in a wide area can be tracked accurately and continuously on video, and the continuity, efficiency, and accuracy of monitoring are greatly improved.

Description

Video target tracking method and device
Technical Field
The invention relates to the technical field of video tracking, in particular to a video target tracking method and device.
Background
In recent years, artificial intelligence technology has gradually moved from the laboratory into everyday life, and intelligent "new retail" is undoubtedly representative of it. As one class of intelligent new retail, the unmanned supermarket integrates technologies such as computer vision, sensors, and deep learning: when a customer takes goods from a shelf, the sales management system detects it automatically, the customer can leave the supermarket directly without queuing to pay, and a bill is received on the mobile phone.
Through research on the prior art, the inventor finds that current unmanned-supermarket solutions focus mainly on goods-state recognition and pedestrian-state recognition. On the goods side, state changes (for example, whether an item has been taken away or opened) are identified mainly by RFID (radio frequency identification) tags or by a visual recognition system. For pedestrian-state recognition, current unmanned supermarkets usually install a camera covering one scene to be recognized, so recognition and tracking are limited to that single fixed scene. Because the field of view of one camera is limited while a supermarket floor is generally large, a pedestrian who walks out of the scene into another area of the store must be re-identified in the prior art; moreover, in crowded supermarket scenes, multiple targets (people) frequently occlude one another.
For the above technical problems, the prior art offers the following solution:
first, a plurality of cameras are installed at multiple viewing angles, a common monitoring area of the cameras is defined, and a plurality of height layers are calibrated. The method comprises a foreground-extraction step, a homography-matrix calculation step, a foreground-likelihood fusion step, and a multilayer fusion step: the positioning information of each of the selected height layers is processed by a shortest-path algorithm, a multilayer tracking track is obtained, and three-dimensional tracking of the multiple targets is completed in combination with the foreground-extraction result.
In this process, however, the shortest-path algorithm and codebook model used in that scheme do not solve the ID (identification) confusion caused by targets occluding one another as multiple targets cross paths. How to track targets in a wide area accurately and continuously on video therefore remains a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a video target tracking method for continuously and accurately video-tracking targets in a wide area. The method is applied to a detection area comprising a plurality of monitoring unit areas: in each monitoring unit area a shelf is arranged on either side, video acquisition devices are arranged at the front and rear sides of the middle channel region of the shelves and above each shelf, and the fields of view of all the video acquisition devices above the shelves together cover the middle channel region. The method comprises the following steps:
setting the middle channel region as a target candidate region, wherein the target candidate region is composed of a plurality of grids with the same size;
acquiring identification features of a target to be tracked in a current frame and corresponding grid positions, wherein the grid positions are generated according to video images of the current frame synchronously acquired by all video acquisition equipment, and the identification features are generated according to the video images of the current frame synchronously acquired by all the video acquisition equipment;
determining the position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame;
and tracking each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
Preferably, the identification feature of the target to be tracked in the current frame is obtained specifically as follows:
respectively carrying out feature detection on the target to be tracked through the video images acquired by the video acquisition equipment in the current frame;
and constructing the identification features according to feature detection results of the video acquisition devices.
Preferably, the grid position corresponding to the target to be tracked in the current frame is obtained specifically by the following method:
detecting the target to be tracked in the upper area and the lower area of the middle channel area through the video images of the current frame synchronously acquired by all the video acquisition equipment;
establishing a corresponding relation between the upper part and the lower part according to the detection results of the upper part region and the lower part region;
and determining the grid position of the target to be tracked in the middle channel region according to the corresponding relation and the position of the upper part in each video image.
Preferably, the determining the position information of the target to be tracked specifically includes:
constructing a characteristic energy function according to the grid position and the identification characteristics of the target to be tracked in the previous frame and the grid position and the identification characteristics of the target to be tracked in the current frame;
and performing energy minimization processing on the characteristic energy function, and acquiring the position information of the target to be tracked in the current frame according to the energy minimization processing result.
Preferably, the method further comprises the following steps:
if the current frame is the first frame, allocating a tracking identifier (ID) corresponding to the target to be tracked to the identification feature.
Preferably, the tracking of each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area specifically includes:
judging whether any target has entered or left the monitoring unit area according to the position information of each target to be tracked;
if a target has entered the monitoring unit area, allocating a new tracking ID to it or assigning an existing tracking ID to it;
and if a target has left the monitoring unit area, notifying the other monitoring unit areas of the target's identification features and corresponding tracking ID.
Preferably, assigning a new tracking ID to the target or assigning a currently existing tracking ID to the target specifically includes:
matching the identification features of all the targets to be tracked in the detection area according to the identification features of the targets;
if a target to be tracked with the characteristic similarity higher than a preset threshold exists, designating the tracking ID of the target to be tracked for the target;
and if the target to be tracked with the characteristic similarity higher than the preset threshold does not exist, distributing a new tracking ID for the target.
Correspondingly, the invention also provides a video target tracking device, applied to a detection area comprising a plurality of monitoring unit areas, in which a shelf is arranged on either side of each monitoring unit area, video acquisition devices are arranged at the front and rear sides of the middle channel region of the shelves and above each shelf, and the fields of view of all the video acquisition devices above the shelves cover the middle channel region. The device comprises:
the setting module is used for setting the middle channel region as a target candidate region, and the target candidate region consists of a plurality of grids with the same size;
the system comprises an acquisition module, a tracking module and a tracking module, wherein the acquisition module is used for acquiring the identification characteristics of a target to be tracked in a current frame and the corresponding grid position, the grid position is generated according to the video image of the current frame synchronously acquired by all video acquisition equipment, and the identification characteristics are generated according to the video image of the current frame synchronously acquired by all the video acquisition equipment;
the determining module is used for determining the position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame;
and the tracking module is used for tracking each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
Accordingly, the present invention also provides a computer-readable storage medium, in which instructions are stored, and when the instructions are executed on a terminal device, the terminal device is caused to execute the video target tracking method as described above.
Correspondingly, the present invention further provides a computer program product, which is characterized in that when the computer program product runs on a terminal device, the terminal device is caused to execute the video target tracking method as described above.
By applying the technical scheme of the present application, the middle channel region is set as a target candidate region and divided into a plurality of grids of the same size; the identification features and corresponding grid positions of each target to be tracked in the current frame are acquired, the target's position information is determined from its identification features and grid positions in the previous frame and in the current frame, and each target to be tracked in the monitoring unit area is then tracked. With this scheme, targets in a wide area can be tracked accurately and continuously on video, and the continuity, efficiency, and accuracy of monitoring are greatly improved.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a video target tracking method proposed in the present application;
fig. 2 is a schematic structural diagram of a basic monitoring unit in a specific application scenario according to an embodiment of the present application;
fig. 3 is a flowchart of a video target tracking method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a video target tracking device according to the present application.
Detailed Description
As described in the Background, the prior art handles the tracking problem in the supermarket space by using a plurality of cameras and processing the positioning information of several selected height layers with a shortest-path algorithm, obtaining a multilayer tracking track and completing three-dimensional tracking of multiple targets in combination with the foreground-extraction result. However, it still suffers ID confusion when multiple targets cross paths and occlude one another.
In view of the foregoing problems, embodiments of the present invention provide a cross-scene multi-target real-time tracking method, applied to the basic spatial layout of a large site comprising multiple object-placement structures with channel structures between them. The technical solution of the present invention will be described clearly and completely below with reference to the accompanying drawings.
In the embodiments of the present invention, the shelves and channels may be laid out as described in the implementation scenario, or replaced by other similar types of structures or areas; such changes do not affect the protection scope of the present invention. Examples include goods shelves, bookshelves, wardrobes, channels, corridors, and the like.
In the application scenario shown in fig. 2, for a large site such as a supermarket, shopping mall, or library, the floor area is generally large, so the embodiment of the present invention divides it into different basic monitoring units, whose basic layout mostly consists of two shelves and the channel between them.
Based on this arrangement, the technical scheme of the present application sets a plurality of video acquisition devices in the basic detection unit. In a specific application scenario, a technician may use an NVIDIA TX1 or TX2 as the hardware core, with each TX1 or TX2 connecting four gigabit cameras to form one detection unit. The purpose of the detection unit is to cover all space in the basic monitoring unit completely and clearly; whatever the type, number, or arrangement of the video acquisition devices, any configuration that achieves this purpose falls within the protection scope of the present application.
Specifically, in the detection unit of the embodiment of the present invention, cameras A and C are located at the two ends of the channel and shoot along it. To ensure the best shooting effect, cameras A and C are mounted on the channel's centre line, with the shooting direction essentially parallel to the channel's length. Cameras B and D shoot from the two sides of the channel, perpendicular to it; their fields of view must cover different positions of the channel and, together, the whole middle channel region. Preferably, to better handle occlusion, each of the four cameras is mounted at a height of at least 1.5 times the average adult height, ensuring that head regions do not occlude one another.
In the above application scenario, as shown in fig. 1, the video target tracking method specifically includes the following steps:
s101, setting the middle channel region as a target candidate region, wherein the target candidate region is composed of a plurality of grids with the same size.
This step divides the space of the candidate region, and many divisions are possible, for example into quadrilaterals, pentagons, and so on. Any division that partitions the space effectively falls within the protection scope of the present application.
In a specific application scenario, the channel plane is uniformly divided into N x N small grids (N can be set according to the channel length and the floor space occupied by an adult). Exploiting the approximate parallelism of the bottom edges of the two shelves and the asymptotic behaviour of camera imaging, cooperative markers (a common term in computer vision and photogrammetry, usually checkerboards, placed (1) perpendicular to the ground at different positions of the channel and (2) horizontally on the ground) are placed on the ground at different distances, and the imaging models of the four cameras can each be calibrated.
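As an illustration of the grid scheme above, the following is a minimal pure-Python sketch (not from the patent; the function name, coordinate convention, and metric units are illustrative assumptions) of mapping a calibrated ground-plane point to one of the N x N channel grid cells:

```python
def grid_cell(x_m, y_m, channel_length_m, channel_width_m, n):
    """Map a calibrated ground-plane point (in metres, with the origin
    at one corner of the channel) to a (row, col) cell of the n x n
    channel grid; returns None if the point lies outside the channel."""
    if not (0 <= x_m < channel_length_m and 0 <= y_m < channel_width_m):
        return None
    row = int(y_m / channel_width_m * n)   # position across the width
    col = int(x_m / channel_length_m * n)  # position along the length
    return row, col
```

With 30 cm x 30 cm cells (as in the example given later in the description), `channel_length_m` and `channel_width_m` would simply be multiples of 0.3.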
S102, obtaining identification features of a target to be tracked in a current frame and corresponding grid positions, wherein the grid positions are generated according to video images of the current frame synchronously acquired by all video acquisition equipment, and the identification features are generated according to the video images of the current frame synchronously acquired by all the video acquisition equipment.
This step determines the identification feature and corresponding grid position of the target to be tracked in the current frame using image acquisition devices. Any image acquisition device may be used, and the identification feature may be any feature usable for object detection, for example HOG (Histogram of Oriented Gradients), LBP (Local Binary Pattern), Haar-like features, and so on; any method that can determine the identification feature and corresponding grid position of the target to be tracked falls within the protection scope of the present application.
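For readers unfamiliar with HOG, the following is a minimal pure-Python sketch of an HOG-style descriptor: gradient orientations, weighted by gradient magnitude, are binned per cell and L2-normalised. It is illustrative only, not the patent's detector; a production system would use an optimised library implementation:

```python
import math

def hog_descriptor(img, cell=8, bins=9):
    """Minimal HOG-style descriptor for a grayscale image given as a
    list of rows of pixel intensities. Each cell x cell block
    contributes a 'bins'-bin histogram of unsigned gradient
    orientations (0-180 degrees), weighted by gradient magnitude."""
    h, w = len(img), len(img[0])
    feats = []
    for cy in range(0, h - cell + 1, cell):
        for cx in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # central differences, clamped at the borders
                    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                    mag = math.hypot(gx, gy)
                    ang = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(ang / 180.0 * bins) % bins] += mag
            norm = math.sqrt(sum(v * v for v in hist)) or 1.0
            feats.extend(v / norm for v in hist)
    return feats
```

A vertical edge, for instance, concentrates all of its weight in the 0-degree bin of the cell containing it.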
Specifically, in order to better acquire the identification feature of the target to be tracked in the current frame, the preferred steps are as follows:
(1) and respectively carrying out feature detection on the target to be tracked through the video images acquired by the video acquisition equipment in the current frame.
(2) And constructing the identification features according to feature detection results of the video acquisition devices.
Specifically, in order to better acquire the grid position corresponding to the target to be tracked in the current frame, the preferable steps are as follows:
(1) and detecting the target to be tracked in the upper area and the lower area of the middle channel area through the video images of the current frame synchronously acquired by all the video acquisition equipment.
(2) And establishing a corresponding relation between the upper part and the lower part according to the detection results of the upper part region and the lower part region.
(3) And determining the grid position of the target to be tracked in the middle channel region according to the corresponding relation and the position of the upper part in each video image.
Meanwhile, this preferred embodiment aims to determine the grid position of the target to be tracked in the current frame; various methods of fixing the position of the target's region using the selected upper and lower regions of the target are possible and do not affect the protection scope of the present application.
In a specific application scenario, feature detection is performed on the target to be tracked in the video images acquired by the four cameras at the current frame. A target candidate region is obtained by background subtraction; within the candidate region, regions are detected with Haar operators so that the grid position in the N x N space can be back-projected. Then an HOG feature is constructed for each target from the feature detection results of the four cameras.
Preferably, the head region and body region of the target to be tracked in the middle channel region are detected in the current-frame images synchronously acquired by the four cameras, using a multi-part HOG + SVM combined pedestrian-detection method; a correspondence between head region and body region is established from the two detection results, and the grid position of the target in the middle channel region is then determined from this correspondence and the position of the head region in each video image. For example: the channel ground is divided into 30 cm x 30 cm grids; cameras A and C in fig. 2 judge the target's position across the channel width, cameras B and D judge its position along the channel length, and the two position estimates vote in the grid to complete positioning.
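The two-axis voting just described (the along-channel cameras estimate the position across the width, the side cameras the position along the length) can be sketched as a simple majority vote per axis; the function and its input layout are illustrative assumptions, not the patent's exact procedure:

```python
from collections import Counter

def vote_grid_position(width_votes, length_votes):
    """Combine per-camera estimates into one channel-grid cell.
    width_votes: column indices estimated by the along-channel cameras
    (A and C); length_votes: row indices estimated by the side cameras
    (B and D). The most common value on each axis wins."""
    col = Counter(width_votes).most_common(1)[0][0]
    row = Counter(length_votes).most_common(1)[0][0]
    return row, col
```

When the cameras disagree (for example under partial occlusion), the majority estimate on each axis still yields a single cell.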
To obtain a better tracking effect when several pedestrians occlude one another severely and cameras at different angles cannot separate the human targets, in a specific application scenario several human-body models (average male height 175 cm, average female height 165 cm) can be placed at different positions of the channel to simulate mutual occlusion, yielding the spatial correspondence between the head region and the body region of a human. Then, from the position of the head region in the image, combined with the imaging models of the four cameras, the grid of the N x N space in which the single human body stands can be deduced.
S103, determining the position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame.
The method is not limited to comparing the current frame with the immediately previous frame; various comparison methods, identification features, and grid divisions fall within the protection scope of the present application, as long as the target to be tracked is tracked continuously and accurately by comparing identification features and determining grid positions.
Specifically, in order to accurately and continuously track the target to be tracked, it is preferable that, if the current frame is the first frame, a tracking identifier ID corresponding to the target to be tracked is allocated to the identification feature.
Specifically, in order to determine the position information of the target to be tracked, the following steps are preferably performed:
(1) and constructing a characteristic energy function according to the grid position and the identification characteristic of the target to be tracked in the previous frame and the grid position and the identification characteristic of the current frame.
(2) And performing energy minimization processing on the characteristic energy function, and acquiring the position information of the target to be tracked in the current frame according to the energy minimization processing result.
In a specific application scenario, if the current frame is the first frame, a Track ID (tracking identifier) corresponding to the target to be tracked is allocated to each HOG feature. If the current frame is not the first frame, the grid position in the current frame is roughly determined from the Track ID of the tracking target, and a "distance + HOG" feature energy function is constructed from the grid position and HOG feature of the target to be tracked in the previous frame and those in the current frame; energy minimization is then performed on this function, and the accurate position information of the target in the current frame is obtained from the result. The energy function is as follows:
[Equation image not reproduced: a weighted combination E = α · D_grid + β · D_HOG of a grid-distance term D_grid between the previous-frame and current-frame positions and an HOG-feature-distance term D_HOG.]
where α is 0.2 and β is 0.55. Each time a target newly enters the field of view, a Track ID is allocated to it; in subsequent tracking, the Track IDs in subsequent frames are reconciled once according to the energy minimization.
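A greedy sketch of the "distance + HOG" energy minimization, under the assumption (not stated in the patent) that matching proceeds track by track; the distance measures and data layout are illustrative:

```python
import math

ALPHA, BETA = 0.2, 0.55  # weights given in the embodiment

def hog_distance(f1, f2):
    """Euclidean distance between two HOG feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def grid_distance(c1, c2):
    """Manhattan distance between two (row, col) grid cells."""
    return abs(c1[0] - c2[0]) + abs(c1[1] - c2[1])

def assign_tracks(prev_tracks, detections):
    """Greedy energy minimization: each previous-frame track
    (track_id, cell, feature) claims the unmatched current-frame
    detection (cell, feature) with the lowest combined energy
    ALPHA * grid_distance + BETA * hog_distance."""
    assignments, used = {}, set()
    for tid, p_cell, p_feat in prev_tracks:
        best, best_e = None, float("inf")
        for i, (cell, feat) in enumerate(detections):
            if i in used:
                continue
            e = ALPHA * grid_distance(p_cell, cell) + BETA * hog_distance(p_feat, feat)
            if e < best_e:
                best, best_e = i, e
        if best is not None:
            used.add(best)
            assignments[tid] = best
    return assignments
```

A full implementation would solve the assignment globally (e.g. with the Hungarian algorithm) rather than greedily, but the energy being minimized is the same weighted sum.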
And S104, tracking each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
Preferably, whether any target has entered or left the monitoring unit area is judged from the position information of each target to be tracked; if a target has entered the monitoring unit area, a new tracking ID is allocated to it or an existing tracking ID is assigned to it; and if a target has left the monitoring unit area, its identification features and corresponding tracking ID are notified to the other monitoring unit areas.
Specifically, in order to assign a new tracking ID to the target or to designate a currently existing tracking ID for the target, the following steps are preferably performed:
(1) and matching the identification features of all the targets to be tracked in the detection area according to the identification features of the targets.
(2) And if the target to be tracked with the characteristic similarity higher than the preset threshold exists, designating the tracking ID of the target to be tracked for the target.
(3) And if the target to be tracked with the characteristic similarity higher than the preset threshold does not exist, distributing a new tracking ID for the target.
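The similarity-threshold ID assignment in steps (1)-(3) might be sketched as follows, using cosine similarity as an assumed feature-similarity measure (the patent does not specify one):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def resolve_track_id(new_feature, known_tracks, threshold, next_id):
    """Match an entering target against the identification features of
    all tracked targets in the detection area ({track_id: feature});
    reuse the best-matching Track ID if its similarity exceeds the
    preset threshold, otherwise allocate the new ID."""
    best_id, best_sim = None, threshold
    for tid, feat in known_tracks.items():
        sim = cosine_similarity(new_feature, feat)
        if sim > best_sim:
            best_id, best_sim = tid, sim
    return best_id if best_id is not None else next_id
```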
In a specific application scenario, for the current basic monitoring unit, the multi-part HOG target features are combined with the contents of adjacent frames and the target feature information held by each basic monitoring unit to judge conditions such as a newly appearing target, a target from another basic unit entering the current unit, or a target leaving the current unit for another.
Specifically, whether there is a target entering or leaving the monitoring unit area is judged according to the position information of each target to be tracked, as follows:
(1) If there is a target entering the monitoring unit area:
Allocate a new Track ID to the target or designate a currently existing Track ID for it, by matching the target's HOG features against the HOG features of all targets to be tracked in the detection area. If a target to be tracked whose feature similarity is higher than a preset threshold exists, designate that target's Track ID for the target; if no such target exists, allocate a new Track ID to the target.
(2) If there is a target leaving the monitoring unit area:
Notify the other monitoring unit areas of the target's HOG features and the corresponding Track ID.
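The hand-off in case (2) can be sketched as below. The `MonitoringUnit` class and its method names are illustrative inventions, not terms from the patent; the point is only that the departing target's feature and Track ID are broadcast so a neighbouring unit can re-identify the target instead of allocating a fresh ID:

```python
class MonitoringUnit:
    """Minimal sketch of cross-unit Track ID hand-off."""

    def __init__(self, name):
        self.name = name
        # Track ID -> identification feature received from other units.
        self.handed_off = {}

    def notify_departure(self, track_id, feature, other_units):
        # Retain the off-field target info and pass it to every other unit,
        # which can later match an entering target against it.
        for unit in other_units:
            unit.handed_off[track_id] = feature
```

When a target later appears in another unit, that unit would first match its features against `handed_off` before allocating a new ID.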
By applying the technical solution of the present application, the middle channel area is set as a target candidate region and divided into a plurality of grids of the same size; the identification feature and corresponding grid position of each target to be tracked in the current frame are acquired; the position information of each target to be tracked is determined from the identification feature and grid position in the previous frame together with those in the current frame; and each target to be tracked in the monitoring unit area is then tracked accordingly. With this solution, targets in a large-scale area can be continuously and accurately tracked on video, and the continuity, efficiency and accuracy of monitoring and tracking are greatly improved.
In order to further illustrate the technical idea of the present invention, the technical solution of the present invention will now be described with reference to specific application scenarios.
Referring to fig. 3, which shows a flowchart of a video target tracking method proposed in an embodiment of the present application, as shown in fig. 3, the method may include:
s301: and judging whether the current image is the first frame image.
If yes, go to step S302;
if not, go to step S303.
S302: acquiring a target candidate region by using four cameras and a background subtraction method; and in the candidate regions, detecting the regions by using a haar operator, and reversely deducing the N x N space grid positions. For each camera, each target builds the HOG feature and assigns a Track ID to each target. Then, the next frame of image is read, and step S301 is executed.
S303: acquiring a target candidate region by using four cameras and a background subtraction method; and in the candidate regions, detecting the regions by using a haar operator, and reversely deducing the N x N space grid positions. For each camera, each target builds a HOG feature. Step S304 is then performed.
S304: and according to the grid position and the HOG characteristic of the target in the previous frame, minimizing by using an energy function to obtain the position information of the target in the current frame.
S305: and judging whether a new target enters the basic monitoring unit.
If yes, go to step S306;
if not, go to step S309.
S306: and judging whether the target is a newly entered target or not according to the existing targets in all the basic monitoring units.
If yes, go to step S307;
if not, go to step S308.
S307: a new Track ID is assigned to the newly entered target. Step S309 is then performed.
S308: and establishing the original Track ID for the target according to the characteristic similarity. Step S309 is then performed.
S309: and judging whether a target leaves the basic monitoring unit.
If yes, go to step S310;
if not, go to step S311.
S310: and reserving off-field target information, and transmitting the information to other basic monitoring units for judging the ID information of the target Track. Step S311 is then performed.
S311: asking whether to accept the task.
If yes, ending the current task;
if not, the next frame of image is read, and step S301 is continued.
By applying the technical solution of the present application, the middle channel area is set as a target candidate region and divided into a plurality of grids of the same size; the identification feature and corresponding grid position of each target to be tracked in the current frame are acquired; the position information of each target to be tracked is determined from the identification feature and grid position in the previous frame together with those in the current frame; and each target to be tracked in the monitoring unit area is then tracked accordingly. With this solution, targets in a large-scale area can be continuously and accurately tracked on video, and the continuity, efficiency and accuracy of monitoring and tracking are greatly improved.
In order to achieve the above technical object, the present application further provides a video target tracking device. As shown in fig. 4, the device is applied to a detection area comprising a plurality of monitoring unit areas, shelves are respectively arranged on both sides of each monitoring unit area, video acquisition devices are respectively arranged on the front and rear sides of the middle channel area of the shelves, a video acquisition device is respectively arranged above each shelf, and the field of view of all the video acquisition devices above the shelves covers the middle channel area. The device includes:
a setting module 410, configured to set the middle channel region as a target candidate region, where the target candidate region is composed of multiple grids with the same size;
the acquiring module 420 is configured to acquire an identification feature of a target to be tracked in a current frame and a corresponding grid position, where the grid position is generated according to video images of the current frame synchronously acquired by all video acquisition devices, and the identification feature is generated according to video images of the current frame synchronously acquired by each video acquisition device;
a determining module 430, configured to determine position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame;
a tracking module 440, configured to track each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
In a specific application scenario, the obtaining module 420 obtains the identification feature of the tracking target in the current frame specifically by:
respectively carrying out feature detection on the target to be tracked through the video images acquired by the video acquisition equipment in the current frame;
and constructing the identification features according to feature detection results of the video acquisition devices.
In a specific application scenario, the obtaining module 420 obtains a grid position corresponding to the target to be tracked in the current frame specifically by the following means:
detecting the target to be tracked in the upper area and the lower area of the middle channel area through the video images of the current frame synchronously acquired by all the video acquisition equipment;
establishing a corresponding relation between the upper part and the lower part according to the detection results of the upper part region and the lower part region;
and determining the grid position of the target to be tracked in the middle channel region according to the corresponding relation and the position of the upper part in each video image.
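The final mapping from the upper part's image position to a grid cell can be sketched as below. A linear, top-down mapping from overhead-camera image coordinates onto the N x N grid is an assumption; the patent's camera calibration and correspondence details are not given:

```python
def grid_position(upper_xy, image_size, n):
    """Map the detected upper part's (x, y) image position to its (row, col)
    cell in the N x N grid covering the middle channel region, assuming the
    overhead view maps linearly onto the grid."""
    x, y = upper_xy
    w, h = image_size
    col = min(int(x * n / w), n - 1)
    row = min(int(y * n / h), n - 1)
    return row, col
```

With several overhead cameras, each would vote for a cell and the correspondences established above would resolve disagreements.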
In a specific application scenario, the determining module 430 determines the position information of the target to be tracked, specifically:
constructing a characteristic energy function according to the grid position and the identification characteristics of the target to be tracked in the previous frame and the grid position and the identification characteristics of the target to be tracked in the current frame;
and performing energy minimization processing on the characteristic energy function, and acquiring the position information of the target to be tracked in the current frame according to the energy minimization processing result.
In a specific application scenario, the method further includes:
an allocating module 450, configured to allocate, if the current frame is the first frame, a tracking identifier (ID) corresponding to the target to be tracked to the identification feature.
In a specific application scenario, the tracking module 440 tracks each target to be tracked in the monitoring unit area according to the position information of each target to be tracked, specifically by:
judging, according to the position information of each target to be tracked, whether there is a target entering or leaving the monitoring unit area;
if there is a target entering the monitoring unit area, allocating a new tracking ID to the target or designating a currently existing tracking ID for it;
and if there is a target leaving the monitoring unit area, notifying the other monitoring unit areas of the target's identification features and the corresponding tracking ID.
In a specific application scenario, the tracking module 440 allocates a new tracking ID to the target or designates a currently existing tracking ID for it, specifically by:
matching the identification features of the target against the identification features of all targets to be tracked in the detection area;
if a target to be tracked whose feature similarity is higher than a preset threshold exists, designating that target's tracking ID for the target;
and if no target to be tracked whose feature similarity is higher than the preset threshold exists, allocating a new tracking ID to the target.
By applying the technical solution of the present application, the middle channel area is set as a target candidate region and divided into a plurality of grids of the same size; the identification feature and corresponding grid position of each target to be tracked in the current frame are acquired; the position information of each target to be tracked is determined from the identification feature and grid position in the previous frame together with those in the current frame; and each target to be tracked in the monitoring unit area is then tracked accordingly. With this solution, targets in a large-scale area can be continuously and accurately tracked on video, and the continuity, efficiency and accuracy of monitoring and tracking are greatly improved.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by hardware, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method according to the implementation scenarios of the present invention.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above-mentioned invention numbers are merely for description and do not represent the merits of the implementation scenarios.
The above disclosure is only a few specific implementation scenarios of the present invention, however, the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims (9)

1. A video target tracking method is characterized in that the method is applied to a detection area comprising a plurality of monitoring unit areas, shelves are respectively arranged on two sides of each monitoring unit area, video acquisition equipment is respectively arranged on the front side and the rear side of a middle channel area of each shelf, video acquisition equipment is respectively arranged above each shelf, and the field of view of all the video acquisition equipment above the shelves covers the middle channel area, and the method comprises the following steps:
setting the middle channel region as a target candidate region, wherein the target candidate region is composed of a plurality of grids with the same size;
acquiring identification features of a target to be tracked in a current frame and corresponding grid positions, wherein the grid positions are generated according to video images of the current frame synchronously acquired by all video acquisition equipment, and the identification features are generated according to the video images of the current frame synchronously acquired by all the video acquisition equipment;
determining the position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame;
determining the position information of the target to be tracked, specifically:
constructing a characteristic energy function according to the grid position and the identification characteristics of the target to be tracked in the previous frame and the grid position and the identification characteristics of the target to be tracked in the current frame;
performing energy minimization processing on the characteristic energy function, and acquiring the position information of the target to be tracked in the current frame according to the energy minimization processing result;
and tracking each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
2. The method of claim 1, wherein the identifying characteristic of the tracking target in the current frame is obtained by:
respectively carrying out feature detection on the target to be tracked through the video images acquired by the video acquisition equipment in the current frame;
and constructing the identification features according to feature detection results of the video acquisition devices.
3. The method of claim 2, wherein the grid position corresponding to the target to be tracked in the current frame is obtained by:
detecting the target to be tracked in the upper area and the lower area of the middle channel area through the video images of the current frame synchronously acquired by all the video acquisition equipment;
establishing a corresponding relation between the upper region and the lower region according to the detection results of the upper region and the lower region;
and determining the grid position of the target to be tracked in the middle channel region according to the corresponding relation and the position of the upper part in each video image.
4. The method of claim 1, further comprising:
and if the current frame is the first frame, distributing a tracking identification ID corresponding to the target to be tracked for the identification feature.
5. The method according to any one of claims 1 to 4, wherein tracking each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area specifically comprises:
judging whether a target entering the monitoring unit area or a target leaving the monitoring unit area exists according to the position information of each target to be tracked;
if the target entering the monitoring unit area exists, distributing a new tracking ID for the target or appointing the currently existing tracking ID for the target;
and if the target leaves the monitoring unit area, notifying the identification characteristics of the target and the corresponding tracking ID of the target to other monitoring unit areas.
6. The method of claim 5, wherein assigning a new tracking ID to the target or assigning a currently existing tracking ID to the target is by:
matching the identification features of all the targets to be tracked in the detection area according to the identification features of the targets;
if a target to be tracked with the characteristic similarity higher than a preset threshold exists, designating the tracking ID of the target to be tracked for the target;
and if the target to be tracked with the characteristic similarity higher than the preset threshold does not exist, distributing a new tracking ID for the target.
7. A video target tracking device, wherein the device is applied to a detection area comprising a plurality of monitoring unit areas, shelves are respectively arranged on both sides of each monitoring unit area, video acquisition devices are respectively arranged on the front and rear sides of a middle channel area of the shelves, a video acquisition device is respectively arranged above each shelf, and the field of view of all the video acquisition devices above the shelves covers the middle channel area, the device comprising:
the setting module is used for setting the middle channel region as a target candidate region, and the target candidate region consists of a plurality of grids with the same size;
the system comprises an acquisition module, a tracking module and a tracking module, wherein the acquisition module is used for acquiring the identification characteristics of a target to be tracked in a current frame and the corresponding grid position, the grid position is generated according to the video image of the current frame synchronously acquired by all video acquisition equipment, and the identification characteristics are generated according to the video image of the current frame synchronously acquired by all the video acquisition equipment;
the determining module is used for determining the position information of the target to be tracked according to the identification feature and the corresponding grid position of the target to be tracked in the previous frame of the current frame, and the identification feature and the corresponding grid position of the target to be tracked in the current frame;
determining the position information of the target to be tracked, specifically:
constructing a characteristic energy function according to the grid position and the identification characteristics of the target to be tracked in the previous frame and the grid position and the identification characteristics of the target to be tracked in the current frame;
performing energy minimization processing on the characteristic energy function, and acquiring the position information of the target to be tracked in the current frame according to the energy minimization processing result;
and the tracking module tracks each target to be tracked in the monitoring unit area according to the position information of each target to be tracked in the monitoring unit area.
8. A computer-readable storage medium having stored therein instructions that, when run on a terminal device, cause the terminal device to perform the video object tracking method of any one of claims 1-6.
9. A computer program product, which, when run on a terminal device, causes the terminal device to perform the video object tracking method of any one of claims 1-6.
CN201811070047.XA 2018-09-13 2018-09-13 Video target tracking method and device Active CN109410245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811070047.XA CN109410245B (en) 2018-09-13 2018-09-13 Video target tracking method and device

Publications (2)

Publication Number Publication Date
CN109410245A CN109410245A (en) 2019-03-01
CN109410245B true CN109410245B (en) 2021-08-10

Family

ID=65464846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811070047.XA Active CN109410245B (en) 2018-09-13 2018-09-13 Video target tracking method and device

Country Status (1)

Country Link
CN (1) CN109410245B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060276B (en) * 2019-04-18 2023-05-16 腾讯科技(深圳)有限公司 Object tracking method, tracking processing method, corresponding device and electronic equipment
CN112019868A (en) * 2019-05-31 2020-12-01 广州虎牙信息科技有限公司 Portrait segmentation method and device and electronic equipment
KR102340988B1 (en) * 2019-10-04 2021-12-17 에스케이텔레콤 주식회사 Method and Apparatus for Detecting Objects from High Resolution Image
CN111292352B (en) * 2020-01-20 2023-08-25 杭州电子科技大学 Multi-target tracking method, device, equipment and storage medium
CN111190175A (en) * 2020-01-22 2020-05-22 中科蓝卓(北京)信息科技有限公司 Method and system for detecting foreign matters on airport pavement
CN113963375A (en) * 2021-10-20 2022-01-21 中国石油大学(华东) Multi-feature matching multi-target tracking method for fast skating athletes based on regions

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101018324A (en) * 2007-02-08 2007-08-15 华为技术有限公司 A video monitoring controller and video monitoring method and system
CN101081513A (en) * 2006-05-31 2007-12-05 中国科学院自动化研究所 Method for tracking special person at the situation that the vision system is shielded
CN102129785A (en) * 2011-03-18 2011-07-20 沈诗文 Intelligent management system for large-scene parking lot
CN102176246A (en) * 2011-01-30 2011-09-07 西安理工大学 Camera relay relationship determining method of multi-camera target relay tracking system
CN103164858A (en) * 2013-03-20 2013-06-19 浙江大学 Adhered crowd segmenting and tracking methods based on superpixel and graph model
US8548203B2 (en) * 2010-07-12 2013-10-01 International Business Machines Corporation Sequential event detection from video
CN104270713A (en) * 2014-09-09 2015-01-07 西北大学 Passive type moving target track mapping method based on compressed sensing
CN104680559A (en) * 2015-03-20 2015-06-03 青岛科技大学 Multi-view indoor pedestrian tracking method based on movement behavior mode
CN105139418A (en) * 2015-08-04 2015-12-09 山东大学 Novel video tracking method based on partitioning policy
CN106203260A (en) * 2016-06-27 2016-12-07 南京邮电大学 Pedestrian's recognition and tracking method based on multiple-camera monitoring network
CN107798686A (en) * 2017-09-04 2018-03-13 华南理工大学 A kind of real-time modeling method method that study is differentiated based on multiple features
CN108198200A (en) * 2018-01-26 2018-06-22 福州大学 The online tracking of pedestrian is specified under across camera scene
CN108196240A (en) * 2018-02-07 2018-06-22 中国人民解放军国防科技大学 Ground moving target track reconstruction method suitable for CSAR imaging

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multicamera People Tracking with a Probabilistic Occupancy Map;Francois Fleuret 等;《IEEE Transactions on Pattern Analysis and Machine Intelligence》;20071218;第30卷(第2期);第1-2页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220907

Address after: A1701, Yankuang Xinda Building, No. 66 Danshan Road, Anzhen Street, Xishan District, Wuxi City, Jiangsu Province 214104

Patentee after: Wuxi Yingchuang Tech Co.,Ltd.

Address before: 100000 1408, block a, urban construction building, Haidian District, Beijing

Patentee before: BEIJING MIWEN POWER TECHNOLOGY CO.,LTD.