CN112862864A - Multi-pedestrian tracking method and device, electronic equipment and storage medium - Google Patents

Multi-pedestrian tracking method and device, electronic equipment and storage medium

Info

Publication number
CN112862864A
CN112862864A
Authority
CN
China
Prior art keywords
pedestrian
information
cost
information pool
pool
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110245128.4A
Other languages
Chinese (zh)
Other versions
CN112862864B (en)
Inventor
Qin Hao (秦豪)
Zhao Ming (赵明)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yogo Robot Co Ltd
Original Assignee
Shanghai Yogo Robot Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yogo Robot Co Ltd filed Critical Shanghai Yogo Robot Co Ltd
Priority to CN202110245128.4A priority Critical patent/CN112862864B/en
Priority claimed from CN202110245128.4A external-priority patent/CN112862864B/en
Publication of CN112862864A publication Critical patent/CN112862864A/en
Application granted granted Critical
Publication of CN112862864B publication Critical patent/CN112862864B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection


Abstract

The application relates to a multi-pedestrian tracking method and device, electronic equipment and a storage medium. An SSD (Single Shot MultiBox Detector) target detection algorithm detects an image to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian.

Description

Multi-pedestrian tracking method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for tracking multiple pedestrians, an electronic device, and a storage medium.
Background
Multi-target tracking is a technology that establishes associations between successive frames of a video target, records the target's historical track, and predicts its possible future trend. Generally, a multi-target tracking algorithm predicts the position where a target may appear in the next frame by means of Kalman filtering, calculates the similarity of each target when matching across frames, and finally establishes the data association.
In the robot industry, multi-target tracking is widely applied. When the intelligent robot moves in an indoor environment and meets moving targets such as pedestrians, the robot needs to predict the moving trend of the pedestrians, and therefore the robot is required to establish association between front and rear frames of a camera picture and record the moving track of the moving target.
However, when the multi-target tracking technology is actually applied, the robot's particular visual angle means that when a pedestrian moves nearby, the pedestrian image captured by the camera changes greatly, and tracking loss can occur.
Disclosure of Invention
In order to overcome the problems in the related art, the application provides a multi-pedestrian tracking method and device, electronic equipment and a storage medium. The method enables a robot detection system to track multiple pedestrians, establishes the pedestrian matching relation between successive image frames, and provides underlying algorithm support for subsequent pedestrian trajectory prediction; meanwhile, it maintains a good tracking effect in scenes where nearby pedestrians vary greatly, reducing the risk of losing track of pedestrians.
The technical scheme for solving the above technical problem is as follows: a multi-pedestrian tracking method, comprising the following steps: step 1, reading an image captured by a camera of a robot; step 2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image; step 3, keeping the information of all pedestrians in a to-be-matched pedestrian information pool; step 4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and step 5, judging through a cost matching matrix whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool.
Preferably, after step 5, the method further comprises: and when the ith candidate in the to-be-matched pedestrian information pool is matched with the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
Preferably, the pedestrian information includes a pedestrian box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the size of the pedestrian's box is s and the proportion of the pedestrian's box is r, with (s, r) given by:
s = (u2 - u1) · (v2 - v1); r = (u2 - u1)/(v2 - v1)
Preferably, after step 2, the method further comprises: obtaining the relative coordinates (x, y) of the pedestrian in space through a space estimation algorithm according to the internal and external parameters of the camera device and the height of the camera device; and converting the relative coordinates (x, y) of the pedestrian in space into absolute coordinates (X, Y) of the pedestrian in space according to the positioning coordinates of the robot in space, to obtain the spatial position of the pedestrian.
Preferably, the predicting of the pedestrian information of the next frame through the Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically includes:
predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian history information pool, to obtain a pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr);
according to a state transition matrix F and the pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr), obtaining the predicted information state I_pre of the pedestrian in the next frame, where I_pre = F · I.
Preferably, before step 5, the method further comprises: obtaining a similarity cost according to the similarity between the ith candidate in the pedestrian information pool to be matched and the jth predicted candidate in the pedestrian prediction information pool; obtaining a distance cost according to a GIOU distance measurement algorithm; and obtaining the cost matching matrix according to the similarity cost and the distance cost.
Preferably, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
Cost[i][j] = 1e4 if Cost_similarity[i][j] > 0.5 or Cost_GIOU[i][j] > 1/3
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
A second aspect of embodiments of the present application provides a multi-pedestrian tracking apparatus, comprising: the camera device is used for acquiring an image; the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image; the space estimation module is used for calculating the relative coordinates of the pedestrian in the space according to the internal and external parameters of the camera device and the height of the camera device; the pedestrian information pool to be matched is used for storing information of all pedestrians in the image, and the information of the pedestrians comprises a square frame of the pedestrians and a feature vector of the pedestrians; the Kalman filtering estimation module is used for predicting the information of the pedestrian of the next frame according to a pedestrian historical information pool to obtain a pedestrian prediction information pool; and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched is matched with the jth predicted candidate in the pedestrian prediction information pool or not.
Preferably, the matching module includes a cost matching matrix, and the cost matching matrix function is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
Cost[i][j] = 1e4 if Cost_similarity[i][j] > 0.5 or Cost_GIOU[i][j] > 1/3
where Cost_similarity is the similarity cost, Cost_GIOU is the distance cost, 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
A third aspect of an embodiment of the present application provides an electronic device, including:
one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing the methods described above.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described above.
The application provides a multi-pedestrian tracking method and device, electronic equipment and a storage medium. An SSD target detection algorithm detects an image to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian. Furthermore, the application adopts a strategy based on GIOU box distance measurement, which improves the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application, as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
FIG. 1 is a schematic flow chart diagram illustrating a multi-pedestrian tracking method according to an embodiment of the present application;
FIG. 2 is a block diagram of a matching loop of a multi-pedestrian tracking method, shown in an embodiment of the present application;
FIG. 3 is another schematic flow diagram of a multi-pedestrian tracking method according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a pedestrian frame according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a multi-pedestrian tracking device shown in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Detailed Description
Preferred embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The robot of embodiments of the present invention may be configured in any suitable shape to perform a particular business function, for example as a delivery robot, a transfer robot, a care robot, and the like. The robot generally includes a housing, a sensor unit, a driving wheel assembly, a memory assembly and a controller. The housing may be substantially circular in shape; in some embodiments, it may be substantially oval, triangular, D-shaped, cylindrical, or otherwise shaped. The sensor unit is used to collect motion parameters of the robot and various data of the environment space. In some embodiments, the sensor unit includes a lidar mounted above the housing at a mounting height greater than the top deck height of the housing; the lidar is configured to detect the distance between the robot and obstacles. In some embodiments, the sensor unit may also include an inertial measurement unit (IMU), a gyroscope, a magnetometer, an accelerometer or speedometer, an optical camera, and so forth. The driving wheel assembly is mounted on the housing and drives the robot to move through various spaces. In some embodiments, the driving wheel assembly includes a left driving wheel, a right driving wheel and an omnidirectional wheel, the left and right driving wheels being mounted on opposite sides of the housing. The left and right driving wheels are configured to be at least partially extendable into and retractable from the bottom of the housing. The omnidirectional wheel is mounted near the front of the bottom of the housing and is a movable caster that can rotate 360 degrees horizontally, so that the robot can steer flexibly. The left driving wheel, the right driving wheel and the omnidirectional wheel form a triangle, which improves the walking stability of the robot.
Of course, in some embodiments, the driving wheel component may also adopt other structures, for example, the omni wheel may be omitted, and only the left driving wheel and the right driving wheel may be left to drive the robot to normally walk. In some embodiments, the robot is further configured with a storage component that is mounted within the receiving slot to accomplish a delivery task or the like. The controller is respectively and electrically connected with the left driving wheel, the right driving wheel, the omnidirectional wheel and the laser radar. The controller is used as a control core of the robot and is used for controlling the robot to walk, retreat and some business logic processing.
In some embodiments, the controller may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a single-chip microcomputer, an ARM (Acorn RISC Machine) processor or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. Also, the controller may be any conventional processor, controller, microcontroller, or state machine. A controller may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration. In some embodiments, during the movement of the robot, the controller employs SLAM (simultaneous localization and mapping) technology to construct a map and localize itself according to the environmental data, so as to move to a target position and complete a delivery task, a cleaning task, and the like. Based on the established map and the robot's position, the controller instructs the robot to completely traverse an environment space through a full-coverage path planning algorithm. For example, during the traversal, the sensor unit acquires an image of a traversal region, which may be an image of the entire traversal region or of a local part of it. The controller generates a map from the image of the traversal area; the map indicates the area that the robot needs to traverse and the coordinate locations of obstacles in the traversal area. After each location or area is traversed, the robot marks it as traversed based on the map.
In addition, since obstacles are marked by coordinates in the map, when the robot passes by, the distance between the robot and an obstacle can be judged according to the coordinate point corresponding to the current position and the coordinate points of the obstacle, allowing the robot to travel around the obstacle. Similarly, after a position or area has been traversed and marked, when the robot next moves to that position or area, it makes a strategy of turning around or stopping the traversal based on the map and the mark of that position or area. It will be appreciated that the controller may also identify traversed locations or areas, or identify obstacles, in a variety of ways to develop a control strategy that meets product needs.
The technical solutions of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1 and 2, fig. 1 is a schematic flow chart of a multi-pedestrian tracking method according to a first embodiment of the present application, and fig. 2 is a block diagram of a matching cycle of the multi-pedestrian tracking method according to the first embodiment of the present application, as shown in fig. 1 and 2, the method includes the following steps:
step S1, reading an image captured by a camera of a robot;
specifically, in this example, the camera mechanism may be a camera, or other devices capable of acquiring an image of an object; in the present embodiment, the robot moves indoors, and the imaging device is configured to capture an image of the indoor environment, where a plurality of objects such as pedestrians are included in the image.
Step S2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image;
specifically, the pedestrian detection module in this embodiment adopts a destination detection algorithm SSD based on a deep neural network, and detects the image by calling the SSD destination detection algorithm, so as to detect all pedestrians in the image, thereby obtaining all pedestrian information in the image.
In one embodiment, the pedestrian information includes a pedestrian box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the size of the pedestrian's box is s, the proportion of the pedestrian's box is r, and (s, r) is given by:
s = (u2 - u1) · (v2 - v1); r = (u2 - u1)/(v2 - v1)
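As a hedged illustration (assuming, as in standard SORT-style trackers, that s is the box area and r the width-to-height ratio; the function names are illustrative, not from the patent), the (s, r) parameterization and its inverse can be sketched as:

```python
# Sketch of the (s, r) box parameterization, assuming s = area and
# r = width / height.
def box_to_sr(u1, v1, u2, v2):
    """Map a pedestrian box (u1, v1, u2, v2) to (size, proportion)."""
    w = u2 - u1          # box width
    h = v2 - v1          # box height
    return w * h, w / h

def sr_to_wh(s, r):
    """Invert the mapping: recover (width, height) from (s, r)."""
    return (s * r) ** 0.5, (s / r) ** 0.5
```

Under this convention the inverse uses w = sqrt(s · r) and h = sqrt(s / r), which is the same recovery of width and height that appears later when the predicted box is reconstructed from the predicted state.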
step S3, keeping the information of all pedestrians in a pedestrian information pool to be matched;
specifically, the pedestrian information pool to be matched includes: a pedestrian characteristic vector, a pedestrian spatial position, and a pedestrian frame. Wherein the pedestrian spatial position is the absolute coordinate of the pedestrian in space. The absolute coordinates of the pedestrian in space are converted from the relative coordinates of the pedestrian in space.
In one embodiment, the following steps are further included after step S2:
step S201, obtaining relative coordinates (x, y) of the pedestrian in the space through a space estimation algorithm according to the internal and external parameters of the camera device and the height of the camera device;
specifically, the calculation formula of the relative coordinates (x, y) of the pedestrian in space is as follows:
x = fy · Hc/(v2 - cy); y = (u_c - cx) · x/fx
wherein (fx, fy, cx, cy) are the internal parameters of the camera device, Hc is the height of the camera device, and (u_c, v2) is the bottom-centre pixel of the pedestrian's box.
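A minimal sketch of this ground-plane back-projection, assuming a pinhole model, a level camera at height Hc, and the bottom-centre pixel of the box as the foot point (the patent's own formula survives only as an image, so these assumptions are the author's reading, not the patent's text):

```python
def pedestrian_relative_xy(u_c, v2, fx, fy, cx, cy, Hc):
    """Back-project the bottom-centre pixel (u_c, v2) of a pedestrian box
    to relative ground-plane coordinates (x, y)."""
    # The foot point lies on the ground plane, Hc below the camera, so
    # v2 - cy = fy * Hc / x  =>  x = fy * Hc / (v2 - cy)
    x = fy * Hc / (v2 - cy)
    # Lateral offset from the optical axis at that depth
    y = (u_c - cx) * x / fx
    return x, y
```

For example, with fx = fy = 500, cx = cy = 320 and Hc = 1.0, a foot point 100 pixels below the principal point gives a forward distance of 5 m.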
Step S202, converting the relative coordinates (x, y) of the pedestrian in space into absolute coordinates (X, Y) of the pedestrian in space according to the positioning coordinates of the robot in space, to obtain the spatial position of the pedestrian.
Step S4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
specifically, the pedestrian history information pool includes: historical pedestrian characteristic vectors, historical pedestrian spatial positions and historical pedestrian frames; the pedestrian prediction information pool includes: predicting a pedestrian feature vector, predicting a pedestrian spatial position, and predicting a pedestrian box.
In one embodiment, the step of predicting the pedestrian information of the next frame through a kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically comprises the following steps:
step S401, according to a pedestrian history information pool, predicting the information of the pedestrian of the next frame through a Kalman filter to obtain a pedestrian information module I (X, Y, S, r, v)x,vy,vs,vr);
Step S402, according to the state transition matrix F and the pedestrian information module I (X, Y, S, r, v)x,vy,vs,vr) Obtaining the predicted information state I of the pedestrian in the next framepreIn which IpreF × I, to obtain
Figure BDA0002963821140000095
In this embodiment, the pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr) is established by predicting the information of the pedestrian of the next frame from the pedestrian history information pool through a Kalman filter, where (X, Y) is the absolute position of the pedestrian in space, s is the size of the pedestrian's box, r is the proportion of the pedestrian's box, (vx, vy) is the moving speed of the pedestrian, and (vs, vr) is the rate of change of the box. The predicted pedestrian box (u_pre1, v_pre1, u_pre2, v_pre2) is calculated as follows:
w = sqrt(s_pre · r_pre);
h = sqrt(s_pre / r_pre);
v_pre2 = fy · Hc / X_pre + cy;
v_pre1 = v_pre2 - h;
u_pre1 = fx · Y_pre / X_pre + cx - w/2;
u_pre2 = fx · Y_pre / X_pre + cx + w/2;
wherein s_pre is the size of the predicted pedestrian box, r_pre is the proportion of the predicted pedestrian box, (fx, fy, cx, cy) are the internal parameters of the camera device, Hc is the height of the camera device, and (X_pre, Y_pre) is the predicted absolute position of the pedestrian in space.
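The prediction step can be sketched end to end: a constant-velocity update of the 8-d state, then re-projection of the predicted spatial position into a predicted image box. The function names, the dt parameter, and the exact projection are the author's assumptions, reconstructed from the quantities the patent lists ((fx, fy, cx, cy), Hc, (X_pre, Y_pre)); they are not the patent's verbatim implementation.

```python
def predict_state(state, dt=1.0):
    """Constant-velocity prediction I_pre = F * I for
    I = (X, Y, s, r, vx, vy, vs, vr)."""
    X, Y, s, r, vx, vy, vs, vr = state
    return (X + vx * dt, Y + vy * dt, s + vs * dt, r + vr * dt,
            vx, vy, vs, vr)

def predict_box(state_pre, fx, fy, cx, cy, Hc):
    """Project the predicted state back to an image box
    (u_pre1, v_pre1, u_pre2, v_pre2)."""
    Xp, Yp, sp, rp = state_pre[:4]
    w = (sp * rp) ** 0.5            # predicted box width
    h = (sp / rp) ** 0.5            # predicted box height
    u_c = fx * Yp / Xp + cx         # column of the box centre
    v2 = fy * Hc / Xp + cy          # row of the foot point
    return (u_c - w / 2, v2 - h, u_c + w / 2, v2)
```

A pedestrian 5 m ahead moving at 1 m/s forward is thus predicted at 6 m, and the re-projected box shrinks and rises in the image accordingly.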
And step S5, judging through a cost matching matrix whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool.
In one embodiment, after step 5, the method further comprises: and when the ith candidate in the to-be-matched pedestrian information pool is matched with the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
In one embodiment, please refer to fig. 3, fig. 3 is another flow chart of the multi-pedestrian tracking method according to the first embodiment of the present application, in which the following additional steps are added.
The method further comprises the following steps before the step S5:
step S501, obtaining a similarity cost according to the similarity between the ith candidate in the to-be-matched pedestrian information pool and the jth predicted candidate in the pedestrian prediction information pool;
specifically, M candidate persons exist in a pedestrian information pool to be matched, N candidate persons exist in a pedestrian prediction information pool, a matching Cost matrix Cost of the two information pools is constructed, and obviously the Cost is an M × N matrix. Cost [ i ]][j]And representing the matching cost of the ith pedestrian to be matched and the jth predicted pedestrian. The matching cost consists of two parts, a similarity cost similarity and a distance cost dist. Similarity cost is determined by the feature vector f of the ith pedestrian to be matchediAnd the feature vector f of the jth predicted pedestrianjThe calculation formula is as follows:
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
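One plausible form of this appearance term is the cosine distance between the two feature vectors; since the patent's formula survives only as an image, the exact normalization here is an assumption:

```python
import math

def similarity_cost(fi, fj):
    """Cosine-distance appearance cost between feature vectors fi and fj:
    0 for identical directions, up to 2 for opposite directions."""
    dot = sum(a * b for a, b in zip(fi, fj))
    norm_i = math.sqrt(sum(a * a for a in fi))
    norm_j = math.sqrt(sum(b * b for b in fj))
    return 1.0 - dot / (norm_i * norm_j)
```

With a 0.5 threshold, this accepts pairs whose feature vectors point in broadly the same direction and rejects near-orthogonal ones.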
step S502, obtaining distance cost according to a GIOU distance measurement algorithm;
specifically, considering that the frame changes greatly when the pedestrian is close, there is a case where the matching frames do not intersect. Under such weak matching conditions, using an intersection-and-parallel ratio (IOU) based approach may result in a matching failure.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating the states of a pedestrian box according to an embodiment of the present application. In this embodiment, a method based on the GIOU distance measurement is adopted. Consider two boxes A and B, whose intersection is a box C, and let D be the minimum bounding box enclosing A and B; GIOU is calculated as follows:
GIOU = |C|/|A ∪ B| - (|D| - |A ∪ B|)/|D|
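This formula can be sketched directly for axis-aligned boxes in the (u1, v1, u2, v2) convention used above (a generic GIoU implementation, not code from the patent):

```python
def giou(a, b):
    """GIoU of axis-aligned boxes (u1, v1, u2, v2): IoU minus the fraction
    of the minimum bounding box D not covered by the union of A and B."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih                                  # area of C = A ∩ B
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    dw = max(a[2], b[2]) - min(a[0], b[0])
    dh = max(a[3], b[3]) - min(a[1], b[1])
    enclose = dw * dh                                # area of D
    return inter / union - (enclose - union) / enclose
```

Unlike plain IoU, disjoint boxes still receive a graded (negative) score that grows less negative as the boxes approach, which is exactly what makes the weak-matching case workable.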
and S503, obtaining the cost matching matrix according to the similarity cost and the distance cost.
Specifically, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate; 1e4 denotes the cost assigned to a matching pair that exceeds a threshold.
According to the obtained matching cost matrix Cost, the matching problem is converted into the following convex optimization assignment problem, which has a global optimal solution:
Sum_match = Σ_i Σ_j Cost[i][j] · x[i][j], minimized subject to Σ_j x[i][j] ≤ 1 for each i, Σ_i x[i][j] ≤ 1 for each j, and x[i][j] ∈ {0, 1}.
the functional goal of the convex optimization assignment problem is to solve the target minimum Sum _ match, where x [ i ] [ j ] ═ 1 indicates that the ith candidate matches the jth predicted candidate successfully, otherwise, they do not match, and each candidate matches only one predicted candidate. If x [ i ] [ j ] is equal to 1 and Cost [ i ] [ j ] <1e4 indicates that the final matching is successful, putting the detected candidate information into a pedestrian history information pool and updating parameters of the pedestrian history information pool, wherein the parameters comprise: pedestrian feature vector, pedestrian square frame, pedestrian spatial location.
In this embodiment, the image is detected through an SSD target detection algorithm to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian. Furthermore, the application adopts a strategy based on GIOU box distance measurement, which improves the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
Referring to fig. 5, fig. 5 is a schematic view of a multi-pedestrian tracking apparatus according to a second embodiment of the present application, and the present embodiment provides a corresponding multi-pedestrian tracking apparatus based on the above-described method.
The multi-pedestrian tracking device includes: a camera device, a pedestrian information extraction module, a space estimation module, a pedestrian information pool to be matched, a Kalman filtering estimation module and a matching module. The camera device is used for acquiring images; the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image; the space estimation module is used for calculating the relative coordinates of the pedestrians in space according to the internal and external parameters of the camera device and the height of the camera device; the pedestrian information pool to be matched is used for storing the information of all pedestrians in the image, the information of a pedestrian comprising the pedestrian's box and the pedestrian's feature vector; the Kalman filtering estimation module is used for predicting the pedestrian information of the next frame according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool.
Specifically, there are M candidates in the pedestrian information pool to be matched and N candidates in the pedestrian prediction information pool; a matching cost matrix Cost of the two information pools is constructed, which is evidently an M × N matrix. Cost[i][j] represents the matching cost of the ith pedestrian to be matched and the jth predicted pedestrian. The matching cost consists of two parts: a similarity cost and a distance cost. The similarity cost is determined by the feature vector f_i of the ith pedestrian to be matched and the feature vector f_j of the jth predicted pedestrian, and is calculated as follows:
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖)
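A minimal sketch of the similarity cost, under the assumption that it is the cosine distance between the two feature vectors (the function name is illustrative):

```python
import math

def similarity_cost(f_i, f_j):
    """1 - cosine similarity of two feature vectors: identical
    directions give cost 0, orthogonal directions give cost 1."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    return 1.0 - dot / (norm_i * norm_j)

print(similarity_cost([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(similarity_cost([1.0, 0.0], [0.0, 1.0]))  # 1.0
```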
Specifically, considering that the box changes greatly when the pedestrian is close, there are cases where the boxes to be matched do not intersect. Under such weak matching conditions, an approach based on the intersection-over-union (IoU) may cause the matching to fail.
As shown in fig. 4, this embodiment adopts a method based on the GIOU distance measurement. Consider two boxes A and B, their intersection C, and their minimum enclosing box D; the GIOU is calculated as follows:
GIOU(A, B) = |C| / |A ∪ B| − (|D| − |A ∪ B|) / |D|
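For axis-aligned boxes in (u1, v1, u2, v2) form, the GIOU above can be sketched as follows (an illustration, not the patent's code):

```python
def giou(a, b):
    """GIoU of two boxes: IoU minus the fraction of the minimum
    enclosing box D not covered by the union A ∪ B."""
    # intersection C
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # minimum enclosing box D
    d = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (d - union) / d

print(giou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0: identical boxes
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))  # negative: disjoint boxes are still ranked
```

Unlike plain IoU, which is 0 for every disjoint pair, GIoU stays informative when nearby pedestrians' boxes fail to intersect, which is exactly the weak-matching case described above.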
and obtaining the cost matching matrix according to the similarity cost and the distance cost.
Specifically, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate. 1e4 denotes the cost of a matching pair that exceeds its threshold.
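The thresholded combination can be sketched as below; the gating helper and the exact per-term cost forms are assumptions for illustration, with 1e4 as the over-threshold sentinel from the text:

```python
BIG = 1e4  # sentinel cost for a pair that exceeds a threshold

def gate(value, threshold):
    """Keep a cost term only while it is under its threshold."""
    return value if value < threshold else BIG

def match_cost(sim_cost, giou_cost):
    """Cost[i][j] = 2 * Cost_similarity + Cost_GIOU, with per-term
    gating: 0.5 for the similarity cost, 1/3 for the distance cost."""
    return 2 * gate(sim_cost, 0.5) + gate(giou_cost, 1 / 3)

print(match_cost(0.1, 0.2))         # 0.4: both terms under threshold
print(match_cost(0.6, 0.2) >= BIG)  # True: similarity too poor to match
```

Once either term is gated to 1e4, the pair's total cost dominates any feasible assignment, so the solver effectively never selects it.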
According to the obtained matching cost matrix Cost, the matching problem is converted into the following convex optimization assignment problem, for which a global optimal solution exists:
Sum_match = Σ_i Σ_j Cost[i][j] · x[i][j], subject to Σ_j x[i][j] ≤ 1 for each i, Σ_i x[i][j] ≤ 1 for each j, and x[i][j] ∈ {0, 1}
The goal of the convex optimization assignment problem is to minimize Sum_match, where x[i][j] = 1 indicates that the ith candidate matches the jth predicted candidate successfully; otherwise they do not match, and each candidate matches at most one predicted candidate. If x[i][j] = 1 and Cost[i][j] < 1e4, the final matching is successful, and the detected candidate's information is put into the pedestrian history information pool, whose parameters are then updated, including: the pedestrian feature vector, the pedestrian box, and the pedestrian spatial position.
In this embodiment, the multi-pedestrian tracking device is formed by the camera device, the pedestrian information extraction module, the space estimation module, the pedestrian information pool to be matched, the Kalman filtering estimation module and the matching module, yielding a robot detection system for tracking multiple pedestrians. The device can establish the pedestrian matching relation between successive picture sequences, providing underlying algorithm support for subsequent pedestrian trajectory prediction. Meanwhile, a good tracking effect is maintained in scenes where a nearby pedestrian changes greatly, reducing the risk of losing the pedestrian. By adopting a strategy based on the GIOU box distance measurement, the method and device improve the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
Fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Referring to fig. 6, the electronic device 400 includes a memory 410 and a processor 420.
The Processor 420 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 410 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions needed by the processor 420 or other modules of the computer. The permanent storage may be a read-write storage device, and may be a non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the permanent storage. In other embodiments, the permanent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 410 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory) and magnetic and/or optical disks. In some embodiments, the memory 410 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card, etc.), a magnetic floppy disk, or the like. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted by wireless or wired means.
The memory 410 has stored thereon executable code that, when processed by the processor 420, may cause the processor 420 to perform some or all of the methods described above.
The aspects of the present application have been described in detail hereinabove with reference to the accompanying drawings. In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments. Those skilled in the art should also appreciate that the acts and modules referred to in the specification are not necessarily required in the present application. In addition, it can be understood that the steps in the method of the embodiment of the present application may be sequentially adjusted, combined, and deleted according to actual needs, and the modules in the device of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing some or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or electronic device, server, etc.), causes the processor to perform part or all of the various steps of the above-described method according to the present application.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the applications disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (11)

1. A method of multi-pedestrian tracking, comprising the steps of:
step 1, reading an image captured by a camera of a robot;
step 2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image;
step 3, keeping the information of all pedestrians in a pedestrian information pool to be matched;
step 4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
and 5, judging whether the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool through a cost matching matrix.
2. The multi-pedestrian tracking method according to claim 1, further comprising, after step 5: when the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
3. The method according to claim 2, characterized in that the pedestrian information comprises the pedestrian's box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the pedestrian's box size is s and the pedestrian's box proportion is r, (s, r) being given by:
s = (u2 − u1) · (v2 − v1), r = (u2 − u1) / (v2 − v1)
4. The multi-pedestrian tracking method of claim 3, further comprising, after step 2:
according to the internal and external parameters of the camera device and the height of the camera device, obtaining the relative coordinates (x, y) of the pedestrian in the space through a space estimation algorithm;
and according to the positioning coordinates of the robot in the space, converting the relative coordinates (X, Y) of the pedestrian in the space into absolute coordinates (X, Y) of the pedestrian in the space to obtain the spatial position of the pedestrian.
5. The method according to claim 4, wherein predicting the pedestrian information of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically comprises:
predicting the pedestrian information of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian information module I = (X, Y, s, r, v_x, v_y, v_s, v_r);
according to a state transition matrix F and the pedestrian information module I = (X, Y, s, r, v_x, v_y, v_s, v_r), obtaining the predicted information state I_pre of the pedestrian in the next frame, where I_pre = F · I.
6. The multi-pedestrian tracking method of claim 2, further comprising, prior to step 5:
obtaining a similarity cost according to the similarity between the ith candidate in the pedestrian information pool to be matched and the jth predicted candidate in the pedestrian prediction information pool;
obtaining a distance cost according to a GIOU distance measurement algorithm;
and obtaining the cost matching matrix according to the similarity cost and the distance cost.
7. The multi-pedestrian tracking method according to claim 6, wherein the cost matching matrix is given by:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
GIOU(A, B) = |A ∩ B| / |A ∪ B| − (|D| − |A ∪ B|) / |D|, where D is the minimum enclosing box of A and B
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
8. A multi-pedestrian tracking apparatus, comprising:
the camera device is used for acquiring an image;
the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image;
the space estimation module is used for calculating the relative coordinates of the pedestrian in the space according to the internal and external parameters of the camera device and the height of the camera device;
the pedestrian information pool to be matched is used for storing information of all pedestrians in the image, and the information of the pedestrians comprises a square frame of the pedestrians and a feature vector of the pedestrians;
the Kalman filtering estimation module is used for predicting the information of the pedestrian of the next frame according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched is matched with the jth predicted candidate in the pedestrian prediction information pool or not.
9. The multi-pedestrian tracking apparatus of claim 8, wherein the matching module comprises a cost matching matrix, the cost matching matrix function being:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
GIOU(A, B) = |A ∩ B| / |A ∪ B| − (|D| − |A ∪ B|) / |D|, where D is the minimum enclosing box of A and B
wherein Cost_similarity is the similarity cost, Cost_GIOU is the distance cost, 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
10. An electronic device, comprising: a memory; one or more processors; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-7.
11. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the multi-pedestrian tracking method of any one of claims 1-7.
CN202110245128.4A 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium Active CN112862864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110245128.4A CN112862864B (en) 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110245128.4A CN112862864B (en) 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112862864A true CN112862864A (en) 2021-05-28
CN112862864B CN112862864B (en) 2024-07-02


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298867A (en) * 2019-06-21 2019-10-01 江西洪都航空工业集团有限责任公司 A kind of video target tracking method
US20200285845A1 (en) * 2017-09-27 2020-09-10 Nec Corporation Information processing apparatus, control method, and program


Similar Documents

Publication Publication Date Title
CN107907131B (en) positioning system, method and applicable robot
US11400600B2 (en) Mobile robot and method of controlling the same
KR102508843B1 (en) Method and device for the estimation of car egomotion from surround view images
KR101725060B1 (en) Apparatus for recognizing location mobile robot using key point based on gradient and method thereof
KR101776622B1 (en) Apparatus for recognizing location mobile robot using edge based refinement and method thereof
KR101708659B1 (en) Apparatus for recognizing location mobile robot using search based correlative matching and method thereof
US20190332115A1 (en) Method of controlling mobile robot
US8787614B2 (en) System and method building a map
EP2460629B1 (en) Control method for localization and navigation of mobile robot and mobile robot using same
KR101784183B1 (en) APPARATUS FOR RECOGNIZING LOCATION MOBILE ROBOT USING KEY POINT BASED ON ADoG AND METHOD THEREOF
US11436815B2 (en) Method for limiting object detection area in a mobile system equipped with a rotation sensor or a position sensor with an image sensor, and apparatus for performing the same
CN110874100A (en) System and method for autonomous navigation using visual sparse maps
Goedemé et al. Feature based omnidirectional sparse visual path following
US11348276B2 (en) Mobile robot control method
CN108481327A (en) A kind of positioning device, localization method and the robot of enhancing vision
JP7063760B2 (en) Mobile
Carrera et al. Lightweight SLAM and Navigation with a Multi-Camera Rig.
Hakeem et al. Estimating geospatial trajectory of a moving camera
CN111598911B (en) Autonomous line patrol method and device for robot platform and storage medium
CN112862864B (en) Multi-pedestrian tracking method and device, electronic equipment and storage medium
CN112862864A (en) Multi-pedestrian tracking method and device, electronic equipment and storage medium
Kassir et al. Qualitative vision-based navigation based on sloped funnel lane concept
JP7064948B2 (en) Autonomous mobile devices and autonomous mobile systems
US11662739B2 (en) Method, system and apparatus for adaptive ceiling-based localization
Bonin-Font et al. A monocular mobile robot reactive navigation approach based on the inverse perspective transformation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant