CN112862864A - Multi-pedestrian tracking method and device, electronic equipment and storage medium - Google Patents

Multi-pedestrian tracking method and device, electronic equipment and storage medium

Info

Publication number
CN112862864A
CN112862864A
Authority
CN
China
Prior art keywords
pedestrian
information
cost
information pool
pool
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110245128.4A
Other languages
Chinese (zh)
Other versions
CN112862864B (en)
Inventor
Qin Hao (秦豪)
Zhao Ming (赵明)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yogo Robot Co Ltd
Original Assignee
Shanghai Yogo Robot Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yogo Robot Co Ltd filed Critical Shanghai Yogo Robot Co Ltd
Priority to CN202110245128.4A priority Critical patent/CN112862864B/en
Priority claimed from CN202110245128.4A external-priority patent/CN112862864B/en
Publication of CN112862864A publication Critical patent/CN112862864A/en
Application granted granted Critical
Publication of CN112862864B publication Critical patent/CN112862864B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection


Abstract

The application relates to a multi-pedestrian tracking method and device, electronic equipment and a storage medium. An SSD (Single Shot MultiBox Detector) target detection algorithm detects an image to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian.

Description

Multi-pedestrian tracking method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for tracking multiple pedestrians, an electronic device, and a storage medium.
Background
Multi-target tracking is a technology that establishes associations between successive frames of a video target, records the target's historical track, and predicts its possible future trend. Generally, a multi-target tracking algorithm predicts the position where a target may appear in the next frame by means of Kalman filtering, calculates the similarity of each target when matching across frames, and finally establishes the data association.
In the robot industry, multi-target tracking is widely applied. When the intelligent robot moves in an indoor environment and meets moving targets such as pedestrians, the robot needs to predict the moving trend of the pedestrians, and therefore the robot is required to establish association between front and rear frames of a camera picture and record the moving track of the moving target.
However, when the multi-target tracking technology is actually applied, the robot's particular visual angle means that when a pedestrian moves nearby, the pedestrian image captured by the camera changes greatly, and tracking loss can occur.
Disclosure of Invention
In order to overcome the problems in the related art, the application provides a multi-pedestrian tracking method and device, electronic equipment and a storage medium. The method enables a robot detection system to track multiple pedestrians, establishes the pedestrian matching relation between successive image frames, and provides underlying algorithm support for subsequent pedestrian trajectory prediction; meanwhile, it maintains a good tracking effect in scenes where nearby pedestrians vary greatly, reducing the risk of losing track of pedestrians.
The technical scheme for solving the above technical problem is as follows: a multi-pedestrian tracking method, comprising the following steps: step 1, reading an image captured by a camera of a robot; step 2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image; step 3, keeping the information of all pedestrians in a to-be-matched pedestrian information pool; step 4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and step 5, judging through a cost matching matrix whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool.
Preferably, after step 5, the method further comprises: and when the ith candidate in the to-be-matched pedestrian information pool is matched with the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
Preferably, the pedestrian information includes a pedestrian box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the size of the pedestrian's box is s and the proportion of the pedestrian's box is r, with (s, r) given by:
s = (u2 - u1) · (v2 - v1); r = (u2 - u1)/(v2 - v1)
Preferably, after step 2, the method further comprises: obtaining the relative coordinates (x, y) of the pedestrian in space through a space estimation algorithm according to the internal and external parameters of the camera device and the height of the camera device; and converting the relative coordinates (x, y) of the pedestrian in space into absolute coordinates (X, Y) of the pedestrian in space according to the positioning coordinates of the robot in space, to obtain the spatial position of the pedestrian.
Preferably, the predicting of the pedestrian information of the next frame through the Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically includes:
predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian history information pool, to obtain a pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr);
according to a state transition matrix F and the pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr), obtaining the predicted information state I_pre of the pedestrian in the next frame, where I_pre = F · I.
Preferably, before step 5, the method further comprises: obtaining a similarity cost according to the similarity between the ith candidate in the pedestrian information pool to be matched and the jth predicted candidate in the pedestrian prediction information pool; obtaining a distance cost according to a GIOU distance measurement algorithm; and obtaining the cost matching matrix according to the similarity cost and the distance cost.
Preferably, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
Cost[i][j] = 1e4 if Cost_similarity[i][j] > 0.5 or Cost_GIOU[i][j] > 1/3
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
A second aspect of embodiments of the present application provides a multi-pedestrian tracking apparatus, comprising: the camera device is used for acquiring an image; the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image; the space estimation module is used for calculating the relative coordinates of the pedestrian in the space according to the internal and external parameters of the camera device and the height of the camera device; the pedestrian information pool to be matched is used for storing information of all pedestrians in the image, and the information of the pedestrians comprises a square frame of the pedestrians and a feature vector of the pedestrians; the Kalman filtering estimation module is used for predicting the information of the pedestrian of the next frame according to a pedestrian historical information pool to obtain a pedestrian prediction information pool; and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched is matched with the jth predicted candidate in the pedestrian prediction information pool or not.
Preferably, the matching module includes a cost matching matrix, and the cost matching matrix function is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
Cost[i][j] = 1e4 if Cost_similarity[i][j] > 0.5 or Cost_GIOU[i][j] > 1/3
where Cost_similarity is the similarity cost, Cost_GIOU is the distance cost, 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
A third aspect of an embodiment of the present application provides an electronic device, including:
one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing the methods described above.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described above.
The application provides a multi-pedestrian tracking method and device, electronic equipment and a storage medium. An SSD target detection algorithm detects an image to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian. Furthermore, the application adopts a strategy based on GIOU box distance measurement, which improves the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application, as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
FIG. 1 is a schematic flow chart diagram illustrating a multi-pedestrian tracking method according to an embodiment of the present application;
FIG. 2 is a block diagram of a matching loop of a multi-pedestrian tracking method, shown in an embodiment of the present application;
FIG. 3 is another schematic flow diagram of a multi-pedestrian tracking method according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a pedestrian frame according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a multi-pedestrian tracking device shown in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Detailed Description
Preferred embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The robot of embodiments of the present invention may be configured in any suitable shape to perform a particular business function, for example as a delivery robot, a transfer robot, a care robot, and the like. The robot generally includes a housing, a sensor unit, a driving wheel assembly, a memory assembly and a controller. The housing may be substantially circular in shape; in some embodiments, it may be substantially oval, triangular, D-shaped, cylindrical, or otherwise shaped. The sensor unit is used to collect motion parameters of the robot and various data of the environment space. In some embodiments, the sensor unit includes a lidar mounted above the housing at a mounting height greater than the top deck height of the housing; the lidar is configured to detect the distance between the robot and obstacles. In some embodiments, the sensor unit may also include an inertial measurement unit (IMU), a gyroscope, a magnetometer, an accelerometer or speedometer, an optical camera, and so forth. The driving wheel assembly is mounted on the housing and drives the robot to move through various spaces. In some embodiments, the driving wheel assembly includes a left driving wheel, a right driving wheel and an omnidirectional wheel, the left and right driving wheels being mounted on opposite sides of the housing. The left and right driving wheels are configured to be at least partially extendable into and retractable from the bottom of the housing. The omnidirectional wheel is mounted near the front of the bottom of the housing and is a movable caster that can rotate 360 degrees horizontally, so that the robot can steer flexibly. The left driving wheel, the right driving wheel and the omnidirectional wheel form a triangle, which improves the walking stability of the robot.
Of course, in some embodiments, the driving wheel component may also adopt other structures, for example, the omni wheel may be omitted, and only the left driving wheel and the right driving wheel may be left to drive the robot to normally walk. In some embodiments, the robot is further configured with a storage component that is mounted within the receiving slot to accomplish a delivery task or the like. The controller is respectively and electrically connected with the left driving wheel, the right driving wheel, the omnidirectional wheel and the laser radar. The controller is used as a control core of the robot and is used for controlling the robot to walk, retreat and some business logic processing.
In some embodiments, the controller may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a single-chip microcomputer, an ARM (Acorn RISC Machine) processor or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. Also, the controller may be any conventional processor, controller, microcontroller, or state machine. A controller may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration. In some embodiments, during the movement of the robot, the controller employs SLAM (simultaneous localization and mapping) technology to construct a map and localize itself according to the environmental data, so as to move to a target position and complete a delivery task, a cleaning task, and the like. Based on the established map and the robot's position, the controller instructs the robot to completely traverse an environment space through a full-coverage path planning algorithm. For example, during the traversal, the sensor unit acquires an image of a traversal region, which may be an image of the entire traversal region or of a local part of it. The controller generates a map from the image of the traversal area; the map indicates the area that the robot needs to traverse and the coordinate locations of obstacles in the traversal area. After each location or area is traversed, the robot marks it as traversed based on the map.
In addition, since obstacles are marked by coordinates in the map, when the robot passes by, the distance between the robot and an obstacle can be judged according to the coordinate point corresponding to the current position and the coordinate points of the obstacle, allowing the robot to travel around the obstacle. Similarly, after a position or area has been traversed and marked, when the robot next moves to that position or area, it makes a strategy of turning around or stopping the traversal based on the map and the mark of that position or area. It will be appreciated that the controller may also identify traversed locations or areas, or identify obstacles, in a variety of ways to develop a control strategy that meets product needs.
The technical solutions of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1 and 2, fig. 1 is a schematic flow chart of a multi-pedestrian tracking method according to a first embodiment of the present application, and fig. 2 is a block diagram of a matching cycle of the multi-pedestrian tracking method according to the first embodiment of the present application, as shown in fig. 1 and 2, the method includes the following steps:
step S1, reading an image captured by a camera of a robot;
specifically, in this example, the camera mechanism may be a camera, or other devices capable of acquiring an image of an object; in the present embodiment, the robot moves indoors, and the imaging device is configured to capture an image of the indoor environment, where a plurality of objects such as pedestrians are included in the image.
Step S2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image;
specifically, the pedestrian detection module in this embodiment adopts a destination detection algorithm SSD based on a deep neural network, and detects the image by calling the SSD destination detection algorithm, so as to detect all pedestrians in the image, thereby obtaining all pedestrian information in the image.
In one embodiment, the pedestrian information includes a pedestrian box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the size of the pedestrian's box is s, the proportion of the pedestrian's box is r, and (s, r) is given by:
s = (u2 - u1) · (v2 - v1); r = (u2 - u1)/(v2 - v1)
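As a hedged illustration (assuming, as in standard SORT-style trackers, that s is the box area and r the width-to-height ratio; the function names are illustrative, not from the patent), the (s, r) parameterization and its inverse can be sketched as:

```python
# Sketch of the (s, r) box parameterization, assuming s = area and
# r = width / height.
def box_to_sr(u1, v1, u2, v2):
    """Map a pedestrian box (u1, v1, u2, v2) to (size, proportion)."""
    w = u2 - u1          # box width
    h = v2 - v1          # box height
    return w * h, w / h

def sr_to_wh(s, r):
    """Invert the mapping: recover (width, height) from (s, r)."""
    return (s * r) ** 0.5, (s / r) ** 0.5
```

Under this convention the inverse uses w = sqrt(s · r) and h = sqrt(s / r), which is the same recovery of width and height that appears later when the predicted box is reconstructed from the predicted state.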
step S3, keeping the information of all pedestrians in a pedestrian information pool to be matched;
specifically, the pedestrian information pool to be matched includes: a pedestrian characteristic vector, a pedestrian spatial position, and a pedestrian frame. Wherein the pedestrian spatial position is the absolute coordinate of the pedestrian in space. The absolute coordinates of the pedestrian in space are converted from the relative coordinates of the pedestrian in space.
In one embodiment, the following steps are further included after step S2:
step S201, obtaining relative coordinates (x, y) of the pedestrian in the space through a space estimation algorithm according to the internal and external parameters of the camera device and the height of the camera device;
specifically, the calculation formula of the relative coordinates (x, y) of the pedestrian in space is as follows:
x = fy · Hc/(v2 - cy); y = (u_c - cx) · x/fx
wherein (fx, fy, cx, cy) are the internal parameters of the camera device, Hc is the height of the camera device, and (u_c, v2) is the bottom-centre pixel of the pedestrian's box.
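A minimal sketch of this ground-plane back-projection, assuming a pinhole model, a level camera at height Hc, and the bottom-centre pixel of the box as the foot point (the patent's own formula survives only as an image, so these assumptions are the author's reading, not the patent's text):

```python
def pedestrian_relative_xy(u_c, v2, fx, fy, cx, cy, Hc):
    """Back-project the bottom-centre pixel (u_c, v2) of a pedestrian box
    to relative ground-plane coordinates (x, y)."""
    # The foot point lies on the ground plane, Hc below the camera, so
    # v2 - cy = fy * Hc / x  =>  x = fy * Hc / (v2 - cy)
    x = fy * Hc / (v2 - cy)
    # Lateral offset from the optical axis at that depth
    y = (u_c - cx) * x / fx
    return x, y
```

For example, with fx = fy = 500, cx = cy = 320 and Hc = 1.0, a foot point 100 pixels below the principal point gives a forward distance of 5 m.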
Step S202, converting the relative coordinates (x, y) of the pedestrian in space into absolute coordinates (X, Y) of the pedestrian in space according to the positioning coordinates of the robot in space, to obtain the spatial position of the pedestrian.
Step S4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
specifically, the pedestrian history information pool includes: historical pedestrian characteristic vectors, historical pedestrian spatial positions and historical pedestrian frames; the pedestrian prediction information pool includes: predicting a pedestrian feature vector, predicting a pedestrian spatial position, and predicting a pedestrian box.
In one embodiment, the step of predicting the pedestrian information of the next frame through a kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically comprises the following steps:
step S401, according to a pedestrian history information pool, predicting the information of the pedestrian of the next frame through a Kalman filter to obtain a pedestrian information module I (X, Y, S, r, v)x,vy,vs,vr);
Step S402, according to the state transition matrix F and the pedestrian information module I (X, Y, S, r, v)x,vy,vs,vr) Obtaining the predicted information state I of the pedestrian in the next framepreIn which IpreF × I, to obtain
Figure BDA0002963821140000095
In this embodiment, the pedestrian information module I = (X, Y, s, r, vx, vy, vs, vr) is established by predicting the information of the pedestrian of the next frame from the pedestrian history information pool through a Kalman filter, where (X, Y) is the absolute position of the pedestrian in space, s is the size of the pedestrian's box, r is the proportion of the pedestrian's box, (vx, vy) is the moving speed of the pedestrian, and (vs, vr) is the rate of change of the box. The predicted pedestrian box (u_pre1, v_pre1, u_pre2, v_pre2) is calculated as follows:
w = sqrt(s_pre · r_pre);
h = sqrt(s_pre / r_pre);
v_pre2 = fy · Hc / X_pre + cy;
v_pre1 = v_pre2 - h;
u_pre1 = fx · Y_pre / X_pre + cx - w/2;
u_pre2 = fx · Y_pre / X_pre + cx + w/2;
wherein s_pre is the size of the predicted pedestrian box, r_pre is the proportion of the predicted pedestrian box, (fx, fy, cx, cy) are the internal parameters of the camera device, Hc is the height of the camera device, and (X_pre, Y_pre) is the predicted absolute position of the pedestrian in space.
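The prediction step can be sketched end to end: a constant-velocity update of the 8-d state, then re-projection of the predicted spatial position into a predicted image box. The function names, the dt parameter, and the exact projection are the author's assumptions, reconstructed from the quantities the patent lists ((fx, fy, cx, cy), Hc, (X_pre, Y_pre)); they are not the patent's verbatim implementation.

```python
def predict_state(state, dt=1.0):
    """Constant-velocity prediction I_pre = F * I for
    I = (X, Y, s, r, vx, vy, vs, vr)."""
    X, Y, s, r, vx, vy, vs, vr = state
    return (X + vx * dt, Y + vy * dt, s + vs * dt, r + vr * dt,
            vx, vy, vs, vr)

def predict_box(state_pre, fx, fy, cx, cy, Hc):
    """Project the predicted state back to an image box
    (u_pre1, v_pre1, u_pre2, v_pre2)."""
    Xp, Yp, sp, rp = state_pre[:4]
    w = (sp * rp) ** 0.5            # predicted box width
    h = (sp / rp) ** 0.5            # predicted box height
    u_c = fx * Yp / Xp + cx         # column of the box centre
    v2 = fy * Hc / Xp + cy          # row of the foot point
    return (u_c - w / 2, v2 - h, u_c + w / 2, v2)
```

A pedestrian 5 m ahead moving at 1 m/s forward is thus predicted at 6 m, and the re-projected box shrinks and rises in the image accordingly.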
And step S5, judging through a cost matching matrix whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool.
In one embodiment, after step 5, the method further comprises: and when the ith candidate in the to-be-matched pedestrian information pool is matched with the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
In one embodiment, please refer to fig. 3, fig. 3 is another flow chart of the multi-pedestrian tracking method according to the first embodiment of the present application, in which the following additional steps are added.
The method further comprises the following steps before the step S5:
step S501, obtaining a similarity cost according to the similarity between the ith candidate in the to-be-matched pedestrian information pool and the jth predicted candidate in the pedestrian prediction information pool;
specifically, M candidate persons exist in a pedestrian information pool to be matched, N candidate persons exist in a pedestrian prediction information pool, a matching Cost matrix Cost of the two information pools is constructed, and obviously the Cost is an M × N matrix. Cost [ i ]][j]And representing the matching cost of the ith pedestrian to be matched and the jth predicted pedestrian. The matching cost consists of two parts, a similarity cost similarity and a distance cost dist. Similarity cost is determined by the feature vector f of the ith pedestrian to be matchediAnd the feature vector f of the jth predicted pedestrianjThe calculation formula is as follows:
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
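One plausible form of this appearance term is the cosine distance between the two feature vectors; since the patent's formula survives only as an image, the exact normalization here is an assumption:

```python
import math

def similarity_cost(fi, fj):
    """Cosine-distance appearance cost between feature vectors fi and fj:
    0 for identical directions, up to 2 for opposite directions."""
    dot = sum(a * b for a, b in zip(fi, fj))
    norm_i = math.sqrt(sum(a * a for a in fi))
    norm_j = math.sqrt(sum(b * b for b in fj))
    return 1.0 - dot / (norm_i * norm_j)
```

With a 0.5 threshold, this accepts pairs whose feature vectors point in broadly the same direction and rejects near-orthogonal ones.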
step S502, obtaining distance cost according to a GIOU distance measurement algorithm;
specifically, considering that the frame changes greatly when the pedestrian is close, there is a case where the matching frames do not intersect. Under such weak matching conditions, using an intersection-and-parallel ratio (IOU) based approach may result in a matching failure.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating the states of a pedestrian box according to an embodiment of the present application. In this embodiment, a method based on the GIOU distance measurement is adopted. Consider two boxes A and B, whose intersection is a box C, and let D be the minimum bounding box enclosing A and B; GIOU is calculated as follows:
GIOU = |C|/|A ∪ B| - (|D| - |A ∪ B|)/|D|
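This formula can be sketched directly for axis-aligned boxes in the (u1, v1, u2, v2) convention used above (a generic GIoU implementation, not code from the patent):

```python
def giou(a, b):
    """GIoU of axis-aligned boxes (u1, v1, u2, v2): IoU minus the fraction
    of the minimum bounding box D not covered by the union of A and B."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih                                  # area of C = A ∩ B
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    dw = max(a[2], b[2]) - min(a[0], b[0])
    dh = max(a[3], b[3]) - min(a[1], b[1])
    enclose = dw * dh                                # area of D
    return inter / union - (enclose - union) / enclose
```

Unlike plain IoU, disjoint boxes still receive a graded (negative) score that grows less negative as the boxes approach, which is exactly what makes the weak-matching case workable.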
and S503, obtaining the cost matching matrix according to the similarity cost and the distance cost.
Specifically, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 - (f_i · f_j)/(|f_i| |f_j|)
Cost_GIOU[i][j] = 1 - GIOU(i, j)
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate; 1e4 denotes the cost assigned to a matching pair that exceeds a threshold.
According to the obtained matching cost matrix Cost, the matching problem is converted into the following convex optimization assignment problem, which has a global optimal solution:
Sum_match = Σ_i Σ_j Cost[i][j] · x[i][j], minimized subject to Σ_j x[i][j] ≤ 1 for each i, Σ_i x[i][j] ≤ 1 for each j, and x[i][j] ∈ {0, 1}.
the functional goal of the convex optimization assignment problem is to solve the target minimum Sum _ match, where x [ i ] [ j ] ═ 1 indicates that the ith candidate matches the jth predicted candidate successfully, otherwise, they do not match, and each candidate matches only one predicted candidate. If x [ i ] [ j ] is equal to 1 and Cost [ i ] [ j ] <1e4 indicates that the final matching is successful, putting the detected candidate information into a pedestrian history information pool and updating parameters of the pedestrian history information pool, wherein the parameters comprise: pedestrian feature vector, pedestrian square frame, pedestrian spatial location.
In this embodiment, the image is detected through an SSD target detection algorithm to obtain the information of all pedestrians in the image; the information of all pedestrians is kept in a to-be-matched pedestrian information pool; the pedestrian information of the next frame is predicted through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and a cost matching matrix judges whether the ith candidate in the to-be-matched pedestrian information pool matches the jth predicted candidate in the pedestrian prediction information pool. The method thus adopts Kalman-filter modeling based on predicting the pedestrian's spatial position and abandons the prior-art 2D modeling based on box movement, so that the multi-pedestrian tracking method better fits the actual situation of a pedestrian moving in space and avoids losing track of the pedestrian. Furthermore, the application adopts a strategy based on GIOU box distance measurement, which improves the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
Referring to fig. 5, fig. 5 is a schematic view of a multi-pedestrian tracking apparatus according to a second embodiment of the present application, and the present embodiment provides a corresponding multi-pedestrian tracking apparatus based on the above-described method.
The multi-pedestrian tracking device includes: a camera device, a pedestrian information extraction module, a space estimation module, a pedestrian information pool to be matched, a Kalman filtering estimation module and a matching module. The camera device is used for acquiring images; the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image; the space estimation module is used for calculating the relative coordinates of the pedestrians in space according to the internal and external parameters of the camera device and the height of the camera device; the pedestrian information pool to be matched is used for storing the information of all pedestrians in the image, the information of a pedestrian comprising the pedestrian's box and the pedestrian's feature vector; the Kalman filtering estimation module is used for predicting the pedestrian information of the next frame according to a pedestrian history information pool to obtain a pedestrian prediction information pool; and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool.
Specifically, there are M candidates in the pedestrian information pool to be matched and N candidates in the pedestrian prediction information pool; a matching cost matrix Cost of the two information pools is constructed, which is evidently an M × N matrix. Cost[i][j] represents the matching cost of the ith pedestrian to be matched and the jth predicted pedestrian. The matching cost consists of two parts: a similarity cost and a distance cost. The similarity cost is determined by the feature vector f_i of the ith pedestrian to be matched and the feature vector f_j of the jth predicted pedestrian, and is calculated as follows:
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖)
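A minimal sketch of the similarity cost, under the assumption that it is the cosine distance between the two feature vectors (the function name is illustrative):

```python
import math

def similarity_cost(f_i, f_j):
    """1 - cosine similarity of two feature vectors: identical
    directions give cost 0, orthogonal directions give cost 1."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    return 1.0 - dot / (norm_i * norm_j)

print(similarity_cost([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(similarity_cost([1.0, 0.0], [0.0, 1.0]))  # 1.0
```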
Specifically, considering that the box changes greatly when the pedestrian is close, there are cases where the boxes to be matched do not intersect. Under such weak matching conditions, an approach based on the intersection-over-union (IoU) may cause the matching to fail.
As shown in fig. 4, this embodiment adopts a method based on the GIOU distance measurement. Consider two boxes A and B, their intersection C, and their minimum enclosing box D; the GIOU is calculated as follows:
GIOU(A, B) = |C| / |A ∪ B| − (|D| − |A ∪ B|) / |D|
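For axis-aligned boxes in (u1, v1, u2, v2) form, the GIOU above can be sketched as follows (an illustration, not the patent's code):

```python
def giou(a, b):
    """GIoU of two boxes: IoU minus the fraction of the minimum
    enclosing box D not covered by the union A ∪ B."""
    # intersection C
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # minimum enclosing box D
    d = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (d - union) / d

print(giou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0: identical boxes
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))  # negative: disjoint boxes are still ranked
```

Unlike plain IoU, which is 0 for every disjoint pair, GIoU stays informative when nearby pedestrians' boxes fail to intersect, which is exactly the weak-matching case described above.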
and obtaining the cost matching matrix according to the similarity cost and the distance cost.
Specifically, the function of the cost matching matrix is:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate. 1e4 denotes the cost of a matching pair that exceeds its threshold.
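The thresholded combination can be sketched as below; the gating helper and the exact per-term cost forms are assumptions for illustration, with 1e4 as the over-threshold sentinel from the text:

```python
BIG = 1e4  # sentinel cost for a pair that exceeds a threshold

def gate(value, threshold):
    """Keep a cost term only while it is under its threshold."""
    return value if value < threshold else BIG

def match_cost(sim_cost, giou_cost):
    """Cost[i][j] = 2 * Cost_similarity + Cost_GIOU, with per-term
    gating: 0.5 for the similarity cost, 1/3 for the distance cost."""
    return 2 * gate(sim_cost, 0.5) + gate(giou_cost, 1 / 3)

print(match_cost(0.1, 0.2))         # 0.4: both terms under threshold
print(match_cost(0.6, 0.2) >= BIG)  # True: similarity too poor to match
```

Once either term is gated to 1e4, the pair's total cost dominates any feasible assignment, so the solver effectively never selects it.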
According to the obtained matching cost matrix Cost, the matching problem is converted into the following convex optimization assignment problem, for which a global optimal solution exists:
Sum_match = Σ_i Σ_j Cost[i][j] · x[i][j], subject to Σ_j x[i][j] ≤ 1 for each i, Σ_i x[i][j] ≤ 1 for each j, and x[i][j] ∈ {0, 1}
The goal of the convex optimization assignment problem is to minimize Sum_match, where x[i][j] = 1 indicates that the ith candidate matches the jth predicted candidate successfully; otherwise they do not match, and each candidate matches at most one predicted candidate. If x[i][j] = 1 and Cost[i][j] < 1e4, the final matching is successful, and the detected candidate's information is put into the pedestrian history information pool, whose parameters are then updated, including: the pedestrian feature vector, the pedestrian box, and the pedestrian spatial position.
In this embodiment, the multi-pedestrian tracking device is formed by the camera device, the pedestrian information extraction module, the space estimation module, the pedestrian information pool to be matched, the Kalman filtering estimation module and the matching module, yielding a robot detection system for tracking multiple pedestrians. The device can establish the pedestrian matching relation between successive picture sequences, providing underlying algorithm support for subsequent pedestrian trajectory prediction. Meanwhile, a good tracking effect is maintained in scenes where a nearby pedestrian changes greatly, reducing the risk of losing the pedestrian. By adopting a strategy based on the GIOU box distance measurement, the method and device improve the effect of the multi-pedestrian tracking algorithm on nearby pedestrians.
Fig. 6 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Referring to fig. 6, the electronic device 400 includes a memory 410 and a processor 420.
The Processor 420 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 410 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions needed by the processor 420 or other modules of the computer. The permanent storage may be a read-write storage device, and may be a non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the permanent storage. In other embodiments, the permanent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 410 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory) and magnetic and/or optical disks. In some embodiments, the memory 410 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card, etc.), a magnetic floppy disk, or the like. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted by wireless or wired means.
The memory 410 has stored thereon executable code that, when processed by the processor 420, may cause the processor 420 to perform some or all of the methods described above.
The aspects of the present application have been described in detail hereinabove with reference to the accompanying drawings. In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments. Those skilled in the art should also appreciate that the acts and modules referred to in the specification are not necessarily required in the present application. In addition, it can be understood that the steps in the method of the embodiment of the present application may be sequentially adjusted, combined, and deleted according to actual needs, and the modules in the device of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing some or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or electronic device, server, etc.), causes the processor to perform part or all of the various steps of the above-described method according to the present application.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the applications disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (11)

1. A method of multi-pedestrian tracking, comprising the steps of:
step 1, reading an image captured by a camera of a robot;
step 2, detecting the image through an SSD target detection algorithm to obtain the information of all pedestrians in the image;
step 3, keeping the information of all pedestrians in a pedestrian information pool to be matched;
step 4, predicting the information of the pedestrian of the next frame through a Kalman filter according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
and 5, judging whether the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool through a cost matching matrix.
2. The multi-pedestrian tracking method according to claim 1, further comprising, after step 5: when the ith candidate in the pedestrian information pool to be matched matches the jth predicted candidate in the pedestrian prediction information pool, putting the information of the candidate into the pedestrian history information pool, and updating the parameters of the pedestrian history information pool.
3. The method according to claim 2, characterized in that the pedestrian information comprises the pedestrian's box (u1, v1, u2, v2) and a pedestrian feature vector f, wherein the pedestrian's box size is s and the pedestrian's box proportion is r, (s, r) being given by:
s = (u2 − u1) · (v2 − v1), r = (u2 − u1) / (v2 − v1)
4. The multi-pedestrian tracking method of claim 3, further comprising, after step 2:
according to the internal and external parameters of the camera device and the height of the camera device, obtaining the relative coordinates (x, y) of the pedestrian in the space through a space estimation algorithm;
and according to the positioning coordinates of the robot in the space, converting the relative coordinates (X, Y) of the pedestrian in the space into absolute coordinates (X, Y) of the pedestrian in the space to obtain the spatial position of the pedestrian.
5. The method according to claim 4, wherein predicting the pedestrian information of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian prediction information pool specifically comprises:
predicting the pedestrian information of the next frame through a Kalman filter according to a pedestrian history information pool to obtain a pedestrian information module I = (X, Y, s, r, v_x, v_y, v_s, v_r);
according to a state transition matrix F and the pedestrian information module I = (X, Y, s, r, v_x, v_y, v_s, v_r), obtaining the predicted information state I_pre of the pedestrian in the next frame, where I_pre = F · I.
6. The multi-pedestrian tracking method of claim 2, further comprising, prior to step 5:
obtaining a similarity cost according to the similarity between the ith candidate in the pedestrian information pool to be matched and the jth predicted candidate in the pedestrian prediction information pool;
obtaining a distance cost according to a GIOU distance measurement algorithm;
and obtaining the cost matching matrix according to the similarity cost and the distance cost.
7. The multi-pedestrian tracking method according to claim 6, wherein the cost matching matrix is given by:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
GIOU(A, B) = |A ∩ B| / |A ∪ B| − (|D| − |A ∪ B|) / |D|, where D is the minimum enclosing box of A and B
where 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
8. A multi-pedestrian tracking apparatus, comprising:
the camera device is used for acquiring an image;
the pedestrian information extraction module is used for calling an SSD target detection algorithm to detect the image and obtain the information of all pedestrians in the image;
the space estimation module is used for calculating the relative coordinates of the pedestrian in the space according to the internal and external parameters of the camera device and the height of the camera device;
the pedestrian information pool to be matched is used for storing information of all pedestrians in the image, and the information of the pedestrians comprises a square frame of the pedestrians and a feature vector of the pedestrians;
the Kalman filtering estimation module is used for predicting the information of the pedestrian of the next frame according to a pedestrian historical information pool to obtain a pedestrian prediction information pool;
and the matching module is used for judging whether the ith candidate in the pedestrian information pool to be matched is matched with the jth predicted candidate in the pedestrian prediction information pool or not.
9. The multi-pedestrian tracking apparatus of claim 8, wherein the matching module comprises a cost matching matrix, the cost matching matrix function being:
Cost = 2 · Cost_similarity + Cost_GIOU
Cost_similarity[i][j] = 1 − (f_i · f_j) / (‖f_i‖ ‖f_j‖) if this value is below 0.5, and 1e4 otherwise
Cost_GIOU[i][j] = 1 − GIOU(i, j) if this value is below 1/3, and 1e4 otherwise
GIOU(A, B) = |A ∩ B| / |A ∪ B| − (|D| − |A ∪ B|) / |D|, where D is the minimum enclosing box of A and B
wherein Cost_similarity is the similarity cost, Cost_GIOU is the distance cost, 0.5 is the threshold for the similarity cost, 1/3 is the threshold for the distance cost, f_i is the feature vector of the ith candidate, and f_j is the feature vector of the jth predicted candidate.
10. An electronic device, comprising: a memory; one or more processors; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-7.
11. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the multi-pedestrian tracking method of any one of claims 1-7.
CN202110245128.4A 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium Active CN112862864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110245128.4A CN112862864B (en) 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110245128.4A CN112862864B (en) 2021-03-05 Multi-pedestrian tracking method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112862864A true CN112862864A (en) 2021-05-28
CN112862864B CN112862864B (en) 2024-07-02


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298867A (en) * 2019-06-21 2019-10-01 江西洪都航空工业集团有限责任公司 A kind of video target tracking method
US20200285845A1 (en) * 2017-09-27 2020-09-10 Nec Corporation Information processing apparatus, control method, and program


Similar Documents

Publication Publication Date Title
CN107907131B (en) positioning system, method and applicable robot
US11400600B2 (en) Mobile robot and method of controlling the same
KR102508843B1 (en) Method and device for the estimation of car egomotion from surround view images
KR101725060B1 (en) Apparatus for recognizing location mobile robot using key point based on gradient and method thereof
KR101776622B1 (en) Apparatus for recognizing location mobile robot using edge based refinement and method thereof
KR101708659B1 (en) Apparatus for recognizing location mobile robot using search based correlative matching and method thereof
US20190332115A1 (en) Method of controlling mobile robot
US8787614B2 (en) System and method building a map
EP2460629B1 (en) Control method for localization and navigation of mobile robot and mobile robot using same
KR101784183B1 (en) APPARATUS FOR RECOGNIZING LOCATION MOBILE ROBOT USING KEY POINT BASED ON ADoG AND METHOD THEREOF
US11436815B2 (en) Method for limiting object detection area in a mobile system equipped with a rotation sensor or a position sensor with an image sensor, and apparatus for performing the same
CN110874100A (en) System and method for autonomous navigation using visual sparse maps
Goedemé et al. Feature based omnidirectional sparse visual path following
US11348276B2 (en) Mobile robot control method
CN108481327A (en) A kind of positioning device, localization method and the robot of enhancing vision
JP7063760B2 (en) Mobile
Carrera et al. Lightweight SLAM and Navigation with a Multi-Camera Rig.
Hakeem et al. Estimating geospatial trajectory of a moving camera
CN111598911B (en) Autonomous line patrol method and device for robot platform and storage medium
CN112862864B (en) Multi-pedestrian tracking method and device, electronic equipment and storage medium
CN112862864A (en) Multi-pedestrian tracking method and device, electronic equipment and storage medium
Kassir et al. Qualitative vision-based navigation based on sloped funnel lane concept
JP7064948B2 (en) Autonomous mobile devices and autonomous mobile systems
US11662739B2 (en) Method, system and apparatus for adaptive ceiling-based localization
Bonin-Font et al. A monocular mobile robot reactive navigation approach based on the inverse perspective transformation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant