CN114066974A - Target track generation method and device, electronic equipment and medium - Google Patents

Target track generation method and device, electronic equipment and medium

Info

Publication number
CN114066974A
Authority
CN
China
Prior art keywords
target
fusion
image
track
image frame
Prior art date
Legal status
Pending
Application number
CN202111359855.XA
Other languages
Chinese (zh)
Inventor
宋荣 (Song Rong)
刘晓东 (Liu Xiaodong)
Current Assignee
Shanghai Goldway Intelligent Transportation System Co Ltd
Original Assignee
Shanghai Goldway Intelligent Transportation System Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Goldway Intelligent Transportation System Co Ltd filed Critical Shanghai Goldway Intelligent Transportation System Co Ltd
Priority to CN202111359855.XA priority Critical patent/CN114066974A/en
Publication of CN114066974A publication Critical patent/CN114066974A/en
Priority to PCT/CN2022/117505 priority patent/WO2023087860A1/en
Priority to CN202211088454.XA priority patent/CN115908545A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention provide a target track generation method and apparatus, an electronic device, and a medium. The method includes: for the image group acquired by each image acquisition device, acquiring the image position of each target in each image frame of the image group; converting each image position into a position in a world coordinate system according to a preset conversion relation to obtain the corresponding world position; for each image frame set, fusing the world positions of each identical target across the image frames in the set to obtain the target's fusion position; and, for each target, associating the target's fusion positions in the acquisition-time order of the corresponding image frame sets to generate the target's fusion track. The method can generate the tracks of multiple targets and meets the track generation requirements of multiple complex targets in scenes such as urban traffic.

Description

Target track generation method and device, electronic equipment and medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating a target trajectory, an electronic device, and a medium.
Background
Multiple cameras are generally arranged at intersections and similar areas to capture surveillance video of targets such as vehicles and pedestrians. Action tracks of these targets can be generated from the captured video, and whether a target has committed a violation such as running a red light can then be determined from its action track.
The existing method for generating a target's action track mainly comprises the following steps: detecting pedestrians in the surveillance video, constructing a pedestrian re-identification model to match the initial positions of a pedestrian to be identified in the other cameras, and finally obtaining the pedestrian's action track by analyzing the pedestrian's trajectory in the forward and reverse directions.
However, this method can only generate the trajectory of a specific pedestrian and cannot meet the trajectory generation requirements of multiple complex targets in urban traffic scenes.
Disclosure of Invention
The embodiments of the invention aim to provide a target track generation method and apparatus, an electronic device and a medium, so as to generate action tracks for multiple targets such as vehicles and pedestrians at an intersection.
In a first aspect, an embodiment of the present invention provides a method for generating a target trajectory, including:
aiming at an image group acquired by each image acquisition device, acquiring the image position of each target in each image frame in the image group;
converting the image position into a position under a world coordinate system according to a preset conversion relation to obtain a world position corresponding to the image position;
for each image frame set, fusing the world positions of the same targets in each image frame in the image frame set to obtain the fusion positions of the targets; the image frame set is an image set formed by all image frames with synchronous acquisition time;
and for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate a fusion track of the target.
Optionally, the number of image acquisition devices is at least 3;
the fusing the world positions of each identical target in each image frame in the image frame set to obtain the fused position of the target includes:
for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames according to the world position of each target in the image frame set;
solving the similarity matrix according to the Hungarian algorithm; and if the calculated similarity between two targets that are located at corresponding positions in the similarity matrix and belong to different image frames is greater than a preset similarity threshold, determining that the two targets are the same target;
and fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target.
Optionally, the constructing, for any two image frames in the image frame set, a similarity matrix of each object in the two image frames according to the world position of each object in the image frame set includes:
determining a velocity direction of each target in the set of image frames based on the world location of the target;
and for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames based on the world positions and the speed directions of the targets in the two image frames.
Optionally, after the fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target, the method further includes:
for each fusion target, if at least two targets from the same image group exist among the plurality of identical targets corresponding to the fusion target, removing from the at least two targets every target whose distance to the fusion target is not the minimum distance, and updating the fusion position of the fusion target.
Optionally, before the associating, for each target, the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target, and generating the fusion trajectory of the target, the method further includes:
for each image group, generating a single-camera trajectory for each target in the image group based on the plurality of world locations of the target;
for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate a fusion track of the target, including:
for each fusion target, if the current fusion position of the fusion target and the sources of the fusion positions in an existing track both comprise positions in the same single-camera track, determining that the current fusion position of the fusion target is associated with the existing track; the existing track is formed by associating fusion positions of targets whose acquisition times precede that of the current fusion position of the fusion target;
if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any one existing track, and the current fusion position of the fusion target is in a preset motion state, determining that the current fusion position of the fusion target is not related to the existing track;
if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any existing track, and the current fusion position of the fusion target is not in a preset motion state, constructing a correlation matrix between each fusion target in the image frame corresponding to the fusion target and the existing track;
solving the association matrix according to the Hungarian algorithm, and if the correlation between a fusion target and an existing track at corresponding positions in the association matrix is greater than a preset correlation threshold, determining that the current fusion position of the fusion target is associated with the existing track;
and if the fusion positions corresponding to the fusion target are all associated, generating a fusion track of the fusion target based on all the associated fusion positions of the fusion target.
Optionally, after determining that the fusion target is associated with the existing track, the method further includes:
the existing trajectory is updated based on the current fusion location of the fusion target.
Optionally, before the fusing, for each image frame set, the world position of each identical object in each image frame in the image frame set to obtain the fused position of the object, the method further includes:
for each image group, generating a single-camera trajectory for each target in the image group based on the plurality of world locations of the target;
matching the single camera tracks in any two image groups;
selecting any group of matched single-camera tracks, and calculating the distance from the starting position and the ending position of the first single-camera track in the group to each position of the second single-camera track; and calculating the distance from the starting position and the ending position of the second single-camera track in the group to each position of the first single-camera track, respectively;
if a position exists in the second single-camera track, wherein the distance between the second single-camera track and the starting position or the ending position of the first single-camera track is smaller than a preset distance threshold, and a position exists in the first single-camera track, wherein the distance between the first single-camera track and the starting position or the ending position of the second single-camera track is smaller than a preset distance threshold, the group of matched single-camera tracks are reserved; otherwise, deleting the single-camera track of the group;
aiming at each group of reserved matched single-camera tracks, selecting a shortest distance from the starting position and the ending position of a first single-camera track in the group to each position of a second single-camera track and the distance from the starting position and the ending position of the second single-camera track in the group to each position of the first single-camera track, and determining image frames where two positions corresponding to the shortest distance are respectively located as primary synchronous image frames;
and calculating the average time of the acquisition time of each pair of primary synchronous image frames in any two image groups, and taking the image frame with the acquisition time of the average time in the two image groups as the final synchronous image frame of the two image groups.
In a second aspect, an embodiment of the present invention further provides a device for generating a target trajectory, including:
the position acquisition module is used for acquiring the image position of each target in each image frame in the image group aiming at the image group acquired by each image acquisition device;
the position conversion module is used for converting the image position into a position under a world coordinate system according to a preset conversion relation to obtain a world position corresponding to the image position;
the position fusion module is used for fusing the world positions of the same targets in the image frames in each image frame set aiming at each image frame set to obtain the fusion positions of the targets; the image frame set is an image set formed by all image frames with synchronous acquisition time;
and the track generation module is used for associating the fusion positions of the targets according to the acquisition time sequence of the image frame set corresponding to the targets so as to generate the fusion track of the targets.
Optionally, the number of image acquisition devices is at least 3;
the location fusion module includes:
the similarity matrix determination submodule is used for constructing a similarity matrix of each target in any two image frames according to the world position of each target in the image frame set aiming at the two image frames in the image frame set;
the target determining submodule is used for solving the similarity matrix according to the Hungarian algorithm, and if the calculated similarity between two targets that are located at corresponding positions in the similarity matrix and belong to different image frames is greater than a preset similarity threshold, determining that the two targets are the same target;
and the position fusion submodule is used for fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target.
Optionally, the similarity matrix determining submodule is specifically configured to determine a speed direction of each target based on a world position of the target in the image frame set; and for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames based on the world positions and the speed directions of the targets in the two image frames.
Optionally, the apparatus further comprises:
the fusion position updating module is used for, for each fusion target, if at least two targets from the same image group exist among the plurality of identical targets corresponding to the fusion target, removing from the at least two targets every target whose distance to the fusion target is not the minimum distance, and updating the fusion position of the fusion target.
Optionally, the apparatus further comprises:
a single camera trajectory generation module to generate, for each image group, a single camera trajectory for each target in the image group based on the plurality of world locations of the target;
the track generation module is specifically used for determining that the current fusion position of the fusion target is associated with the existing track if the current fusion position of the fusion target and the source of the fusion position in the existing track both comprise positions in the same single-camera track; the existing track is formed by associating the fusion position of each target with the acquisition time before the current fusion position of the fusion target; if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any existing track, and the current fusion position of the fusion target is in a preset motion state, constructing a correlation matrix between each fusion target in the image frame corresponding to the fusion target and the existing track; calculating the association matrix according to a Hungarian algorithm, and if the correlation between the fusion target and the existing track at the corresponding position in the association matrix is larger than a preset correlation threshold value, determining that the current fusion position of the fusion target is associated with the existing track; and if the fusion positions corresponding to the fusion target are all associated, generating a fusion track of the fusion target based on all the associated fusion positions of the fusion target.
Optionally, the apparatus further comprises:
and the track updating module is used for updating the existing track based on the current fusion position of the fusion target.
Optionally, the apparatus further comprises:
a synchronized time determination module to generate, for each image group, a single-camera trajectory for each target in the image group based on the plurality of world locations of the target; match the single-camera trajectories in any two image groups; select any group of matched single-camera trajectories, and calculate the distance from the start position and the end position of the first single-camera trajectory in the group to each position of the second single-camera trajectory, and the distance from the start position and the end position of the second single-camera trajectory in the group to each position of the first single-camera trajectory, respectively; if a position exists in the second single-camera trajectory whose distance to the start position or the end position of the first single-camera trajectory is smaller than a preset distance threshold, and a position exists in the first single-camera trajectory whose distance to the start position or the end position of the second single-camera trajectory is smaller than the preset distance threshold, retain the group of matched single-camera trajectories, and otherwise delete the group; for each retained group of matched single-camera trajectories, select the shortest distance among the distances from the start and end positions of the first single-camera trajectory in the group to each position of the second single-camera trajectory and the distances from the start and end positions of the second single-camera trajectory in the group to each position of the first single-camera trajectory, and determine the image frames in which the two positions corresponding to the shortest distance are respectively located as primary synchronization image frames; and calculate the average time of the acquisition times of each pair of primary synchronization image frames in any two image groups, and take the image frames in the two image groups whose acquisition time is the average time as the final synchronization image frames of the two image groups.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor and the communication interface complete communication between the memory and the processor through the communication bus;
a memory for storing a computer program;
a processor adapted to perform the method steps of any of the above first aspects when executing a program stored in the memory.
In a fourth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method steps of any one of the above first aspects.
The embodiment of the invention has the following beneficial effects:
With the method provided by the embodiment of the invention, for the image group acquired by each image acquisition device, the image position of each target in each image frame of the group is acquired; each image position is converted into a position in a world coordinate system according to a preset conversion relation to obtain the corresponding world position; for each image frame set, the world positions of each identical target across the image frames in the set are fused to obtain the target's fusion position; and, for each target, the target's fusion positions are associated in the acquisition-time order of the corresponding image frame sets to generate the target's fusion track. The method can thus generate the tracks of multiple targets and meet the track generation requirements of multiple complex targets in scenes such as urban traffic.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
To more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in the description of the embodiments and the prior art are briefly introduced below. The drawings in the following description are merely some embodiments of the present invention; those skilled in the art can obtain other embodiments from these drawings.
Fig. 1 is a flowchart of a target trajectory generation method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an automatic calibration method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a location fusion method according to an embodiment of the present invention;
FIG. 4 is a flowchart of generating a fused track of a target according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a trajectory rectification method according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a target track generation apparatus according to an embodiment of the present invention;
fig. 7 is another schematic structural diagram of a target trajectory generation apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a target trajectory generation apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
In order to generate action tracks of multiple targets such as vehicles and pedestrians at an intersection and meet the track generation requirements of multiple complex targets in scenes such as urban traffic, the embodiment of the invention provides a target track generation method, a target track generation device, electronic equipment, a storage medium and a computer program product.
First, a method for generating a target trajectory according to an embodiment of the present invention is described below. The method for generating the target trajectory provided by the embodiment of the present invention may be applied to any electronic device with an image processing function, and is not specifically limited herein.
Fig. 1 is a flowchart of a method for generating a target trajectory according to an embodiment of the present invention, as shown in fig. 1, where the method includes:
s101, aiming at the image group collected by each image collecting device, acquiring the image position of each target in each image frame in the image group.
And S102, converting the image position into a position under a world coordinate system according to a preset conversion relation, and obtaining a world position corresponding to the image position.
S103, for each image frame set, fusing the world positions of the same targets in the image frames in the image frame set to obtain the fused positions of the targets.
The image frame set is an image set formed by all image frames with synchronous acquisition time.
And S104, for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target, and generating a fusion track of the target.
With the method provided by the embodiment of the invention, for the image group acquired by each image acquisition device, the image position of each target in each image frame of the group is acquired; each image position is converted into a position in a world coordinate system according to a preset conversion relation to obtain the corresponding world position; for each image frame set, the world positions of each identical target across the image frames in the set are fused to obtain the target's fusion position; and, for each target, the target's fusion positions are associated in the acquisition-time order of the corresponding image frame sets to generate the target's fusion track. The method can thus generate the tracks of multiple targets and meet the track generation requirements of multiple complex targets in scenes such as urban traffic.
The embodiment of the invention can be applied to scenes such as crossroads and the like. In the embodiment of the invention, a plurality of image acquisition devices are arranged in a specific application scene, the position and the angle of each image acquisition device are different, and each image acquisition device can acquire images at different angles in the application scene. Each image acquisition device can acquire a plurality of frames of images as one image group. One or more targets in the application scene can be included in each frame of image, and the targets can be specifically pedestrians and vehicles in the application scene. The image capturing device may be a camera, a video recorder, or the like.
For each image acquisition device, because the devices are mounted at different angles in the actual application scene, the image position of a target in the images acquired by the device does not coincide with the target's position in the real world. In application scenes such as urban traffic, the target's track needs to be displayed on a high-precision map so that the target can be accurately positioned on the map; therefore, the image position of each target in each image frame needs to be converted into a position in the world coordinate system to obtain the target's world position.
In the embodiment of the present invention, for each image acquisition device, a conversion relationship between the image position of a target captured by the device and the corresponding world position in the world coordinate system may be determined. Specifically, the world position coordinates of a group of designated targets in an image frame captured by the device may be obtained, for example through actual distance measurement or from a high-precision map. Targets with obvious characteristics, such as lane-line stop points, guide arrows and fixed road facilities, may be selected as the designated targets. Then, for each image acquisition device, the coordinate transformation matrix corresponding to the device may be calculated from the world position coordinates of the selected designated targets in the world coordinate system and their image position coordinates in the image frame, using the following formula:
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

where $a_{11}$ to $a_{33}$ are the elements of the coordinate transformation matrix corresponding to the image acquisition device; $x'$, $y'$ and $w'$ are respectively the abscissa, the ordinate and the vertical coordinate of the designated target's world position coordinates; and $u$ and $v$ are respectively the abscissa and the ordinate of the designated target's image position coordinates. Since the plane height direction is normalized before and after the position coordinate conversion, $a_{33} = 1$ and $w' = 1$.
In actually calculating the coordinate transformation matrix corresponding to an image acquisition device, since the device may exhibit some distortion, the matrix needs to be corrected in combination with the device's parameters; specifically, the device's parameters may be multiplied onto the coordinate transformation matrix calculated from a group of designated targets to obtain a new coordinate transformation matrix. In addition, for application scenes with more complex road conditions, a coordinate transformation matrix may be calculated for each of several groups of designated targets in the image frames acquired by the device, and the matrix corresponding to the average of these matrices may be used as the device's coordinate transformation matrix. The combined information of multiple groups of designated targets is more complete and represents the mapping between image position coordinates and world position coordinates more accurately.
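As an illustration only, the matrix estimation described above can be sketched with OpenCV's homography fitting; estimate_transform and average_transform are hypothetical names, and the averaging over several groups of designated targets follows the description in the preceding paragraph.

```python
import numpy as np
import cv2

def estimate_transform(image_pts, world_pts):
    """Fit the 3x3 coordinate transformation matrix of one image
    acquisition device from designated-target correspondences.
    image_pts: list of (u, v); world_pts: list of (x, y)."""
    H, _ = cv2.findHomography(np.float32(image_pts), np.float32(world_pts))
    return H / H[2, 2]  # normalize so that a33 = 1, matching the formula above

def average_transform(groups):
    """For complex road conditions: average the matrices obtained from
    several groups of designated targets, as described in the text."""
    return np.mean([estimate_transform(i, w) for i, w in groups], axis=0)
```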
In the embodiment of the invention, after the coordinate conversion matrix corresponding to each image acquisition device is obtained in advance, the coordinate conversion matrix can be used as a preset conversion relation, and the image position coordinates of each target in the image frame acquired by the image acquisition device are converted into world position coordinates in a world coordinate system. For example, the following formula can be used to convert the image position into a position in a world coordinate system, and a world position corresponding to the image position is obtained:
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

where $a_{11}$ to $a_{33}$ are the elements of the coordinate transformation matrix corresponding to the image acquisition device; $x'$, $y'$ and $w'$ are respectively the abscissa, the ordinate and the vertical coordinate of the world position; and $u$ and $v$ are respectively the abscissa and the ordinate of the image position.
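A minimal sketch of applying the preset conversion relation, assuming a matrix H obtained as above; dividing by w' covers the general case, and under the patent's normalization w' is 1 for points on the ground plane.

```python
import numpy as np

def image_to_world(H, u, v):
    """Convert an image position (u, v) into a world position (x, y)
    using the preset conversion relation H."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w  # normalize by w'
```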
In the embodiment of the invention, image frame synchronization processing needs to be carried out on different image acquisition devices so as to ensure that the positions of all the same targets for subsequent position fusion are at the same time. Specifically, time synchronization correction can be performed on each image acquisition device in a manual correction mode, so that it is ensured that each image acquisition device starts to acquire images at the same time, and the frequency of acquiring the images is the same. In the embodiment of the present invention, a set formed by image frames acquired by different image acquisition devices at the same time may also be determined as an image frame set according to timestamps of the image frames acquired by the image acquisition devices. In a possible implementation manner, an automatic correction manner may also be adopted to perform synchronous correction on each image capturing device, specifically, fig. 2 is a flowchart of the automatic correction manner provided by the embodiment of the present invention, and as shown in fig. 2, the correction method includes:
s201, aiming at each image group, generating a single-camera track of each target based on a plurality of world positions of the target in the image group.
Each image group is formed by a plurality of image frames acquired by the same image acquisition equipment.
Specifically, the single-camera trajectory of the target may be generated using the following steps a1-a 2:
step A1: for each image group, the respective targets in the first image frame in the image group may be regarded as single-camera trajectories, respectively, and single-camera trajectory identifications of the respective single-camera trajectories may be generated.
Step A2: for each target in the next image frame in the image group, the minimum distance between the target and each single-camera track in the previous image frame in the image group can be calculated according to the world position of the target; and if the minimum distance is smaller than a preset threshold value, determining that the target in the image frame is the same as the target in the single-camera track, taking the world position of the target as a track point of the single-camera track, and updating the single-camera track.
The preset threshold may be set according to practical applications, and is not specifically limited herein.
In the embodiment of the invention, the single-camera track of the target can also be generated by adopting multi-target tracking algorithms such as DeepSORT or FairMOT.
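A minimal sketch of steps A1-A2, assuming tracks and detections are lists of 2-D world positions; the greedy nearest-neighbour association and the preset threshold follow the text, while the function name is hypothetical (DeepSORT or FairMOT could replace this in practice, as noted above).

```python
import numpy as np

def update_single_camera_tracks(tracks, detections, dist_thresh):
    """One frame of single-camera track generation: associate each target
    in the current frame with the nearest existing track if it is close
    enough, otherwise start a new single-camera track (steps A1-A2)."""
    for det in detections:
        dists = [np.hypot(det[0] - t[-1][0], det[1] - t[-1][1]) for t in tracks]
        if dists and min(dists) < dist_thresh:
            tracks[int(np.argmin(dists))].append(det)  # same target: extend track
        else:
            tracks.append([det])  # new target: start a new track
    return tracks
```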
S202, single camera tracks in any two image groups are matched.
In this embodiment, if the target has a license plate number, the single-camera tracks in any two image groups can be matched according to the license plate number of the target, and the single-camera tracks of the target having the same license plate number are used as a group of matched single-camera tracks.
In the present embodiment, the matching degree between the single camera trajectories in any two image groups may be calculated from the distance between the single camera trajectories, and a pair of single camera trajectories with the highest matching degree may be set as a set of matched single camera trajectories. For example, for any two image groups, which are respectively an image group a composed of image frames acquired by the image acquisition device 1 and an image group B composed of image frames acquired by the image acquisition device 2, for each single-camera trajectory in the image group a, a cosine similarity of a distance between the single-camera trajectory and each single-camera trajectory in the image group B may be calculated as a matching degree; and then selecting a single camera track with the highest matching degree with the single camera track in the image group B as an initial matching single camera track of the single camera track, wherein the matching degree between the initial matching single camera track of the single camera track and the single camera track reaches a preset matching degree threshold, and the initial matching single camera track of the single camera track and the single camera track can be determined as a group of matched single camera tracks. The preset matching degree threshold may be set according to an actual application scenario, and is not specifically limited herein. In this embodiment, after the matching degree between the preliminarily matched single-camera trajectory of the single-camera trajectory and the single-camera trajectory reaches the preset matching degree threshold, it may be further determined whether the lane number of the lane where the single-camera trajectory is located is consistent with the lane number of the lane where the preliminarily matched single-camera trajectory of the single-camera trajectory is located, and if so, the preliminarily matched single-camera trajectory of the single-camera trajectory and the single-camera trajectory may be determined as a group of matched single-camera trajectories.
A set of matching single camera trajectories includes a first single camera trajectory and a second single camera trajectory from different sets of images.
S203, selecting a group of matched single-camera tracks, and respectively calculating the distance from the starting position and the ending position of the first single-camera track in the group to each position of the second single-camera track; and calculating the distance between the start position and the end position of the second single-camera track in the group to each position of the first single-camera track respectively.
Specifically, the distance of world position coordinates between the positions is calculated in this step.
For example, a single camera trajectory a in image group 1 matches a single camera trajectory B in image group 2, where the single camera trajectory a is a first single camera trajectory and the single camera trajectory B is a second single camera trajectory. In this step, the distance between the start position and the end position of the single-camera trajectory a and each position of the single-camera trajectory B may be calculated, and the distance between the start position and the end position of the single-camera trajectory B and each position of the single-camera trajectory a may be calculated.
S204, if a position exists in the second single-camera track, wherein the distance between the second single-camera track and the starting position or the ending position of the first single-camera track is smaller than a preset distance threshold, and a position exists in the first single-camera track, wherein the distance between the first single-camera track and the starting position or the ending position of the second single-camera track is smaller than a preset distance threshold, the matched single-camera track of the group is reserved; otherwise, the single camera track of the group is deleted.
In the embodiment of the present invention, the preset distance threshold may be set to 1 meter or 2 meters, etc.
For example, if the preset distance threshold is set to 1 meter, then for the matched single-camera track A and single-camera track B, if the distance between the third position in single-camera track B and the start position of single-camera track A is less than 1 meter, and the distance between the fourth position in single-camera track A and the start position of single-camera track B is less than 1 meter, the matched single-camera track A and single-camera track B are retained.
And if the distance between any position in the single-camera track B and the initial position of the single-camera track A is not less than 1 meter, the distance between any position in the single-camera track B and the end position of the single-camera track A is not less than 1 meter, the distance between any position in the single-camera track A and the initial position of the single-camera track B is not less than 1 meter, and the distance between any position in the single-camera track A and the end position of the single-camera track B is not less than 1 meter, deleting the matched single-camera track A and the matched single-camera track B.
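The retention test of S204 can be sketched as follows, assuming each track is a list of (x, y) world positions; should_retain is an illustrative name and the 1-meter threshold from the example is used as the default.

```python
import numpy as np

def should_retain(track_a, track_b, dist_thresh=1.0):
    """Keep a matched pair only if some position of each track lies within
    dist_thresh of the other track's start or end position; otherwise the
    pair of matched single-camera tracks is deleted (S204)."""
    def near_endpoint(src, dst):
        # distance from src's start/end positions to every position of dst
        return any(np.hypot(q[0] - p[0], q[1] - p[1]) < dist_thresh
                   for q in (src[0], src[-1]) for p in dst)
    return near_endpoint(track_a, track_b) and near_endpoint(track_b, track_a)
```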
S205, aiming at each group of the reserved matched single-camera tracks, selecting the shortest distance from the starting position and the ending position of the first single-camera track in the group to each position of the second single-camera track and the distance from the starting position and the ending position of the second single-camera track in the group to each position of the first single-camera track, and determining the image frames of the two positions corresponding to the shortest distance as primary synchronous image frames.
For example, suppose the matched single-camera track A and single-camera track B are retained. Among the distances from the start and end positions of track A to each position of track B, and the distances from the start and end positions of track B to each position of track A, suppose the distance between the start position of track A and the third position of track B is the shortest; then the image frame containing the start position of track A and the image frame containing the third position of track B may be taken as primary synchronization image frames.
S206, calculating the average time of the acquisition time of each pair of primary synchronous image frames in any two image groups, and taking the image frame with the acquisition time being the average time in the two image groups as the final synchronous image frame of the two image groups.
For example, for image group 1 and image group 2, suppose image frame A1 in image group 1 and image frame B1 in image group 2 are a pair of primary synchronization image frames, and image frame A2 in image group 1 and image frame B2 in image group 2 are another pair, with acquisition times $t_{A1}$, $t_{A2}$, $t_{B1}$ and $t_{B2}$ respectively. The average time is $t_{avg} = (t_{A1} + t_{A2} + t_{B1} + t_{B2}) / 4$. The image frame A3 in image group 1 whose acquisition time is $t_{avg}$ and the image frame B3 in image group 2 whose acquisition time is $t_{avg}$ are taken as the final synchronization image frames of the two image groups. Likewise, the frames at the same offset after image frame A3 in image group 1 and after image frame B3 in image group 2 are synchronized in acquisition time.
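Steps S205-S206 can be sketched as below, assuming each track is a list of (frame_index, (x, y)) entries and that times_a and times_b map frame indices to acquisition times; all identifiers are illustrative.

```python
import numpy as np

def primary_sync_frames(track_a, track_b):
    """S205: among the distances from A's start/end positions to every
    position of B and from B's start/end positions to every position of A,
    find the shortest; the frames holding those two positions are a pair
    of primary synchronization image frames."""
    best_d, best_pair = np.inf, None
    for src, dst, flipped in ((track_a, track_b, False), (track_b, track_a, True)):
        for fi, pi in (src[0], src[-1]):            # start and end of src
            for fj, pj in dst:
                d = np.hypot(pi[0] - pj[0], pi[1] - pj[1])
                if d < best_d:
                    best_d = d
                    best_pair = (fj, fi) if flipped else (fi, fj)
    return best_pair                                 # (frame in A, frame in B)

def average_sync_time(primary_pairs, times_a, times_b):
    """S206: average the acquisition times of every pair of primary sync
    frames, e.g. (tA1 + tA2 + tB1 + tB2) / 4 in the worked example."""
    ts = [times_a[fa] for fa, _ in primary_pairs]
    ts += [times_b[fb] for _, fb in primary_pairs]
    return sum(ts) / len(ts)
```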
Based on the method described in fig. 2, image frames with synchronized acquisition time between each image group can be automatically determined.
If multiple groups of matched single-camera tracks exist between two image groups, the frame numbers of the time-synchronized image frames can be calculated from each group of matched tracks, and the average of the frame numbers determined by the multiple groups is used as the time synchronization frame. For example, single-camera track A in image group 1 matches single-camera track B in image group 2, and based on tracks A and B it is determined that the acquisition time of the first image frame in image group 1 is synchronized with that of the third image frame in image group 2; single-camera track C in image group 1 matches single-camera track D in image group 2, and based on tracks C and D it is determined that the acquisition time of the first image frame in image group 1 is synchronized with that of the fifth image frame in image group 2. The average of the matched frame numbers may then be taken, finally giving the first image frame in image group 1 synchronized in acquisition time with the fourth image frame in image group 2.
With the method provided by the embodiment of the invention, for each target, the target's fusion positions are associated in the acquisition-time order of the corresponding image frame sets to generate the target's fusion track. In addition, by synchronizing the image frames acquired by multiple image acquisition devices, the method provides more reliable spatio-temporal information for the fusion of target tracks.
In the embodiment of the present invention, the number of the image capturing devices may be 2, 3, or more.
In a possible implementation manner, fig. 3 is a flowchart of a location fusion method provided by an embodiment of the present invention, and as shown in fig. 3, for an application scenario in which the number of image capturing devices is 3 and more than 3, the fusing the world locations of each identical object in each image frame in the image frame set to obtain a fused location of the object includes:
s301, for any two image frames in the image frame set, according to the world position of each target in the image frame set, constructing a similarity matrix of each target in the two image frames.
In this embodiment of the present invention, for any two image frames in the image frame set, the similarity matrix of each object in the two image frames is constructed according to the world position of each object in the image frame set, specifically, the following steps B1-B2 may be adopted:
step B1, based on the world location of each object in the set of image frames, determines a velocity direction of the object.
Specifically, for each target, a position difference value between a world position corresponding to the image position of the target in the current image frame and a world position corresponding to the image position of the target in the image frame before the current image frame may be calculated, a ratio between the position difference value and the acquisition time difference of two image frames is determined as a speed of the target, and a speed direction of the target is represented based on the positive and negative of the speed.
And step B2, constructing a similarity matrix of each object in the two image frames based on the world positions and the speed directions of each object in the two image frames for any two image frames in the image frame set.
In order to avoid missing the track of the target, in the present embodiment, the targets acquired by every two arbitrary image acquisition devices may be matched first to determine the same target in the targets acquired by the two image acquisition devices.
Specifically, for any two image frames in the image frame set, such as the image frame a and the image frame B, if the image frame a includes m objects and the image frame B includes n objects, the cosine similarity between the velocity vectors of the objects in the image frame a and the image frame B may be calculated according to the world positions of the objects in the image frame a and the image frame B and the velocity directions of the objects in the image frame a and the image frame B, and the cosine similarity is used as the similarity, so as to obtain an m × n similarity matrix.
In the embodiment of the present invention, besides the manner mentioned in the above steps, features such as the targets' appearance, posture and type can also be combined according to the actual application scenario: the appearance similarity, posture similarity and target-type similarity between targets may be determined, and these may be superimposed on the cosine similarity between the targets' velocity vectors to obtain the corresponding m x n similarity matrix. The target type can be pedestrian, vehicle, sign board and the like; the appearance characteristics of a target can be a vehicle's license plate number, a pedestrian's facial features and the like; the posture characteristics of a target can be the target's world position, speed, speed direction and the like.
In the embodiment of the invention, if the target is a vehicle, the license plate number of the target can be acquired, and the similarity between any two targets positioned in different image frames in the two image frames is calculated according to the license plate number.
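Steps B1-B2 can be sketched as follows using only the velocity-direction cosine similarity; the appearance, posture and type similarities described above could be superimposed on the same matrix. Shapes and names are assumptions for illustration.

```python
import numpy as np

def velocity(prev_pos, cur_pos, dt):
    """Step B1: approximate a target's velocity from its world positions
    in the previous and current image frames."""
    return (np.asarray(cur_pos, float) - np.asarray(prev_pos, float)) / dt

def similarity_matrix(vels_a, vels_b):
    """Step B2: m x n matrix of cosine similarities between the velocity
    vectors of the m targets in one frame and the n targets in the other."""
    A = np.asarray(vels_a, float)                  # shape (m, 2)
    B = np.asarray(vels_b, float)                  # shape (n, 2)
    na = np.linalg.norm(A, axis=1, keepdims=True)  # (m, 1)
    nb = np.linalg.norm(B, axis=1, keepdims=True)  # (n, 1)
    return (A @ B.T) / np.clip(na * nb.T, 1e-9, None)
```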
S302, solving the similarity matrix according to the Hungarian algorithm; and if the calculated similarity between two targets located at corresponding positions in the similarity matrix and in different image frames is greater than a preset similarity threshold, determining that the two targets are the same target.
In the embodiment of the invention, after the similarity matrix of the targets in the two image frames is obtained, the similarity matrix can be solved with the Hungarian algorithm to obtain a one-to-one matching relation among the targets. For example, if image frame A1 in the set of image frames includes 5 targets: a1, a2, a3, a4 and a5, and image frame B1 includes 3 targets: b1, b2 and b3, then the similarity matrix of the targets in image frame A1 and image frame B1 is a 5 x 3 matrix, and solving it with the Hungarian algorithm gives the one-to-one matching relation: a1 matched with b2, a2 matched with b1, and a4 matched with b3.
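A minimal sketch of this step, assuming SciPy's linear_sum_assignment for the Hungarian algorithm (it minimizes cost, hence the negated matrix); the threshold check mirrors the text.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_targets(sim, sim_thresh):
    """Solve the similarity matrix with the Hungarian algorithm and keep
    only the pairs whose similarity exceeds the preset threshold (S302)."""
    sim = np.asarray(sim, float)              # e.g. the 5 x 3 matrix above
    rows, cols = linear_sum_assignment(-sim)  # maximize total similarity
    return [(r, c) for r, c in zip(rows, cols) if sim[r, c] > sim_thresh]
```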
In this embodiment of the present invention, for each image group, a single-camera track of each target in the image group may be generated based on a plurality of world positions of the target, and specifically, the single-camera track of the target may be generated by using the above steps a1-a2, which are not described herein again.
For any two image frames in the image frame set, after the similarity matrix of the targets in the two image frames is constructed according to the world position of each target in the set, if a historical association exists between a pair of targets respectively located in the two image frames, the similarity between the two targets can be directly set to 1. For example, if image frame A1 in the set includes 5 targets: a1, a2, a3, a4 and a5, and image frame B1 includes 3 targets: b1, b2 and b3, and the target of the single-camera track to which a1 belongs was matched, in a frame preceding A1, with the target of the single-camera track to which b2 belongs in a frame preceding B1, then the similarity between target a1 and target b2 can be directly set to 1.
After the similarity matrix is solved, if the calculated similarity between two targets located at corresponding positions in the similarity matrix and in different image frames is greater than a preset similarity threshold, the two targets are determined to be the same target. For example, the solution gives a1 matched with b2, a2 matched with b1, and a4 matched with b3; that is, a1 and b2, a2 and b1, and a4 and b3 are at corresponding positions in the similarity matrix. It may then be further determined whether the similarity between a1 and b2, between a2 and b1, and between a4 and b3 is greater than the preset similarity threshold. If the similarity between a1 and b2 is greater than the threshold, the similarity between a2 and b1 is not, and the similarity between a4 and b3 is, then a1 and b2 are determined to be the same target, a4 and b3 are determined to be the same target, and a2 and b1 are not the same target. The preset similarity threshold may be set according to the actual application scenario and is not specifically limited herein.
And S303, fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target.
Specifically, an average value of the world coordinates of the world positions of the same target in the image frame set may be calculated, the average value may be taken as the fused position, and the fused position may be taken as the fusion position of the fusion target corresponding to the same target.
In the step, matching results of any two targets positioned in different image frames in the image frame set are integrated to obtain the fusion positions of the fusion targets corresponding to all the same targets in the image frame set.
In this embodiment, to ensure that each fusion target matches at most one target in the image frames acquired by each image acquisition device, that is, each device's target can appear in only one fusion target, after the world positions of the same targets in the image frame set are fused into a fused position and the fused position is taken as the fusion position of the corresponding fusion target, the method further includes: for each fusion target, if at least two targets from the same image group exist among the identical targets corresponding to the fusion target, removing from the at least two targets every target whose distance to the fusion target is not the minimum, and updating the fusion position of the fusion target. That is, the distances between the at least two targets and the fusion target may be calculated, the target with the smallest distance retained and the others removed, and the average of the world coordinates of the remaining identical targets' world positions calculated as the updated fusion position of the fusion target.
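The fusion of S303 together with the per-device constraint can be sketched as below; members is an assumed representation of the identical targets as (image_group_id, world_position) pairs, and the function name is illustrative.

```python
import numpy as np

def fuse_positions(members):
    """Average the world positions of the identical targets; if several
    targets come from the same image group, keep only the one nearest the
    fusion position, then re-average to update the fusion position."""
    pts = np.asarray([p for _, p in members], dtype=float)
    fused = pts.mean(axis=0)
    keep = {}
    for gid, p in members:
        d = np.linalg.norm(np.asarray(p, dtype=float) - fused)
        if gid not in keep or d < keep[gid][0]:
            keep[gid] = (d, p)  # retain the group's target closest to the fusion
    return np.mean([p for _, p in keep.values()], axis=0)
```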
After the fusion targets are obtained, the features of each fusion target may be updated. Specifically, the features of a fusion target may include: the world coordinates of its fusion position, appearance features, target type, license plate number and the like. The world coordinates of the fusion target are the average of the world coordinates of the corresponding targets in the image acquisition devices that are its fusion sources; the target type and license plate number of the fusion target adopt those of the target closest to the fusion target among the targets in those devices; and the appearance feature of the fusion target is the average of the appearance features of the targets in those devices.
By adopting the method provided by the embodiment of the invention, for each target, the fusion positions of the target are associated according to the acquisition time sequence of the image frame sets corresponding to the target, and the fusion track of the target is generated. In addition, by synchronizing the image frames acquired by multiple image acquisition devices, the method provides more reliable spatio-temporal information for the track fusion of the target. Moreover, position association is realized by combining features such as world coordinates, license plate numbers, movement speed and appearance, so that the association result is more reliable and widely applicable, and the method is not limited to targets of specific categories.
In another possible embodiment, before the fusion positions of each target are associated according to the acquisition time sequence of the corresponding image frame sets to generate the fusion track of the target, a single-camera track of each target in each image group may be generated based on the multiple world positions of that target. Specifically, the single-camera track of the target may be generated through steps a1-a2, which are not described herein again. Also, a single-camera track identifier may be generated for each single-camera track.
Fig. 4 is a flowchart of generating a fusion trajectory of a target according to an embodiment of the present invention, and as shown in fig. 4, the step of associating the fusion positions of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate the fusion trajectory of the target may specifically include:
S401, for each fusion target, if the current fusion position of the fusion target and the sources of the fusion positions in an existing track both include positions in the same single-camera track, determining that the current fusion position of the fusion target is associated with the existing track.
The existing track is formed by associating fusion positions whose acquisition times precede that of the current fusion position of the fusion target.
If the current fusion position of the fusion target is obtained by fusing the group of image frames with the earliest acquisition time among the groups of synchronous image frames, the fusion target can directly serve as an existing track, and a unified fusion track identifier is assigned to that existing track.
Then, in acquisition time order, for each fusion target after the first, if the current fusion position of the fusion target and the sources of the fusion positions in an existing track both include positions in the same single-camera track, it is determined that the current fusion position of the fusion target is associated with that existing track. For example, suppose the sources of the fusion positions in existing track A include: a position in single-camera track 1 of image acquisition device 1, a position in single-camera track 2 of image acquisition device 2, a position in single-camera track 3 of image acquisition device 3, and a position in single-camera track 4 of image acquisition device 4; and the sources of the current fusion position of fusion target a include: a position in single-camera track 1 of image acquisition device 1, a position in single-camera track 5 of image acquisition device 2, a position in single-camera track 6 of image acquisition device 3, and a position in single-camera track 7 of image acquisition device 4. Because existing track A and the current fusion position of fusion target a both include a position in single-camera track 1 of image acquisition device 1, it can be directly determined that the current fusion position of fusion target a is associated with existing track A, and existing track A can be updated based on that current fusion position; that is, the current fusion position of fusion target a becomes a new track point of existing track A, and a new fusion track identifier is assigned to the updated existing track A.
Specifically, in this embodiment, the sources of the current fusion position of the fusion target and of the fusion positions in the existing track may be determined according to the single-camera track identifiers and the fusion track identifiers of the single-camera tracks.
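Under the assumption that every fusion position records the single-camera track identifiers it was fused from, the source-overlap test of step S401 reduces to a set intersection, as in this sketch; the (device, track) identifier pairs are illustrative:

```python
# Sketch of the source-overlap test of S401: a fusion target associates with
# an existing track when their source single-camera track sets intersect.
def shares_single_camera_track(fusion_sources, track_sources):
    """Both arguments: sets of (image_acquisition_device_id, single_camera_track_id)."""
    return bool(fusion_sources & track_sources)

existing_track_A = {(1, 1), (2, 2), (3, 3), (4, 4)}    # cf. the example above
fusion_target_a = {(1, 1), (2, 5), (3, 6), (4, 7)}
print(shares_single_camera_track(fusion_target_a, existing_track_A))  # True
```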
S402, if the current fusion position of the fusion target and the sources of the fusion positions in every existing track do not include positions in the same single-camera track, and the current fusion position of the fusion target is in a preset motion state, determining that the current fusion position of the fusion target is not associated with any existing track.
The preset motion state means that the fusion target is in a state of driving away from the intersection.
In this embodiment, if the single-camera track identifier of an existing track's source is in a lost state and the existing track is in the state of driving away from the intersection, the existing track is prohibited from being associated with any fusion target; if the single-camera track identifier of an existing track's source is in a cancelled state and the existing track is in the state of driving away from the intersection, the existing track is directly cancelled, that is, deleted.
S403, if the current fusion position of the fusion target does not include a position in the same single-camera track as the sources of the fusion positions in any existing track, and the current fusion position of the fusion target is not in the preset motion state, constructing an association matrix between each fusion target in the image frame corresponding to the fusion target and the existing tracks.
If the current fusion position of the fusion target and the sources of the fusion positions in every existing track do not include positions in the same single-camera track, the follow-single-camera-track flag may be set to 0. A flag value of 0 indicates that the existing track associated with the current fusion position of the fusion target need not be determined from the single-camera tracks; a flag value of 1 indicates that the associated existing track must be determined by checking whether the sources of the current fusion position of the fusion target and of the existing track share a single-camera track.
Specifically, the degree of association between each fusion target and each existing track can be calculated from the distance between their world coordinates, and an association matrix between the fusion targets in the corresponding image frame and the existing tracks constructed. Local association processing and global association processing of the fusion targets and the existing tracks are then performed through this association matrix. The local association processing specifically includes: based on the association matrix, the Hungarian algorithm may be used to calculate the association result between the fusion targets and the existing tracks. The global association processing specifically includes: based on the association matrix, the Hungarian algorithm may be used to calculate the association result between the fusion targets and those existing tracks whose source single-camera track identifiers are in a lost state. According to the results of the local and global association, a new fusion track identifier can be assigned to each matched fusion target, and the track features of that fusion target updated; for an unmatched fusion target, if it satisfies the track creation condition, a new track is created for it as an existing track; and an existing track that matches no fusion target is set to a lost state. An unmatched fusion target satisfies the track creation condition if it is obtained by fusing the group of image frames with the earliest acquisition time among the groups of synchronous image frames.
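A compact sketch of this association step (covering S403 and the Hungarian solving of S404) follows; converting the world-coordinate distance d into a degree of association as 1 / (1 + d), and the threshold value, are illustrative assumptions, since the disclosure does not fix the formula:

```python
# Sketch of S403/S404: build an association matrix from world-coordinate
# distances, solve it with the Hungarian algorithm, and separate matched
# pairs, unmatched fusion targets (track-creation candidates) and unmatched
# existing tracks (to be set to the lost state).
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(fusion_xy, track_xy, assoc_threshold=0.2):
    """fusion_xy: (F, 2) fusion positions; track_xy: (T, 2) latest track points."""
    d = np.linalg.norm(fusion_xy[:, None, :] - track_xy[None, :, :], axis=2)
    assoc = 1.0 / (1.0 + d)                       # degree-of-association matrix
    rows, cols = linear_sum_assignment(-assoc)    # maximize total association
    matched = [(f, t) for f, t in zip(rows, cols) if assoc[f, t] > assoc_threshold]
    new_tracks = set(range(len(fusion_xy))) - {f for f, _ in matched}
    lost_tracks = set(range(len(track_xy))) - {t for _, t in matched}
    return matched, new_tracks, lost_tracks
```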
S404, solving the association matrix according to the Hungarian algorithm, and if the calculated degree of association between a fusion target and an existing track at corresponding positions in the association matrix is greater than a preset association degree threshold, determining that the current fusion position of the fusion target is associated with that existing track.
The preset association degree threshold may be set according to the actual application situation and is not specifically limited herein.
In the embodiment of the invention, the association matrix can be solved using the Hungarian algorithm to obtain a one-to-one matching relation between the fusion targets in the corresponding image frame and the existing tracks. For example, if the image frame corresponds to 3 fusion targets, c1, c2 and c3, and the existing tracks include existing track A and existing track B, the association matrix is 3×2; solving it with the Hungarian algorithm yields the one-to-one matching: c1 matches existing track B, and c2 matches existing track A.
If the degree of association between c1 and existing track B is greater than the preset association degree threshold, it may be determined that the current fusion position of fusion target c1 is associated with existing track B. If the degree of association between c2 and existing track A is not greater than the threshold, it may be determined that the current fusion position of fusion target c2 is not associated with existing track A.
After determining that the fusion target is associated with the existing track, the existing track may be updated based on the current fusion position of the fusion target, that is, the current fusion position of the fusion target is used as a new track point of the existing track, and a new fusion track identifier is assigned to the updated existing track.
S405, if the fusion positions corresponding to the fusion target are all associated, generating a fusion track of the fusion target based on all the associated fusion positions of the fusion target.
By adopting the method provided by the embodiment of the invention, for each target, the fusion positions of the target are associated according to the acquisition time sequence of the image frame sets corresponding to the target, and the fusion track of the target is generated. In addition, the fusion track of the target can be generated in real time, so the method provided by the embodiment of the invention is suitable for scenes with high real-time requirements, such as smart intersections. Furthermore, by synchronizing the image frames acquired by multiple image acquisition devices, the method provides more reliable spatio-temporal information for the track fusion of the target. Moreover, position association is realized by combining features such as world coordinates, lane numbers, motion speed and appearance, so that the association result is more reliable and widely applicable, and the method is not limited to specific categories of targets. In addition, the method uses the single-camera tracks for track fusion of the target, which makes the fusion track more reliable and effectively reduces the phenomenon of fusion track identifiers being switched back. Finally, the method does not require manually acquiring a relation matrix of the regions covered by the image acquisition devices, so its use cost is lower and its universality better.
The method provided by the embodiment of the invention can be applied to a real-time scene to generate the fusion track of the target in real time.
In one possible implementation, for scenes that do not require full real-time output, deviation rectification of the fusion track can be realized through track delay rectification, thereby reducing jumps in the fusion track. Fig. 5 is a schematic flow chart of a track rectification method according to an embodiment of the present invention. As shown in fig. 5, for an application scene adopting delay rectification, the track is first cached with a delay, then smoothed, and finally the smoothed track is output. Specifically, the multi-frame fusion track over a fixed period of n frames may be cached, where n may be set to, for example, 10 or 20 frames; the average of the world coordinates of the fusion track over the previous i frames is then calculated successively, with i = 0, 1, 2, …, n; the current smoothed fusion track is then output frame by frame, realizing the rectification and making the output fusion track smoother. For application scenes that do not adopt delay rectification, the fusion track can be output directly.
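A minimal sketch of this delayed smoothing follows, using a rolling buffer of up to n fused positions; the class and method names are hypothetical, n = 10 matches one of the example settings above, and the successive averaging over the first i cached frames can equally be produced by reading the buffer incrementally:

```python
# Sketch of track delay rectification: buffer up to n fused positions of a
# track and output the mean of the buffered world coordinates each frame.
from collections import deque
import numpy as np

class TrackSmoother:
    def __init__(self, n=10):
        self.buffer = deque(maxlen=n)           # delayed cache of track points

    def push(self, world_xy):
        """Add the newest fused position; return the smoothed position."""
        self.buffer.append(np.asarray(world_xy, dtype=float))
        return np.mean(self.buffer, axis=0)
```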
By adopting the method provided by the embodiment of the invention, for each target, the fusion positions of the target are associated according to the acquisition time sequence of the image frame sets corresponding to the target, and the fusion track of the target is generated. In addition, the fusion track of the target can be generated in real time, so the method is suitable for scenes with high real-time requirements, such as smart intersections. Furthermore, by synchronizing the image frames acquired by multiple image acquisition devices, the method provides more reliable spatio-temporal information for the track fusion of the target. Moreover, position association is realized by combining features such as world coordinates, lane numbers, motion speed and appearance, so that the association result is more reliable and widely applicable, and the method is not limited to specific types of targets.
For an application scenario in which a trajectory is generated for a specific target, the method provided by the embodiment of the present invention provides more reliable spatiotemporal information for trajectory fusion of the target by synchronizing image frames acquired by a plurality of image acquisition devices, and realizes position association by combining features such as world coordinates, lane numbers, motion speeds, and appearances, so that the association result is more reliable and the application range is wider.
Corresponding to the target track generation method, the embodiment of the invention also provides a target track generation device. The following describes a target trajectory generation apparatus provided in an embodiment of the present invention. Fig. 6 is a schematic structural diagram of an apparatus for generating a target track according to an embodiment of the present invention, as shown in fig. 6, the apparatus includes:
the position acquisition module 601 is configured to acquire, for an image group acquired by each image acquisition device, an image position where each target in each image frame in the image group is located;
a position conversion module 602, configured to convert the image position into a position in a world coordinate system according to a preset conversion relationship, so as to obtain a world position corresponding to the image position;
a position fusion module 603, configured to fuse, for each image frame set, world positions of each identical target in each image frame in the image frame set to obtain a fusion position of the target; the image frame set is an image set formed by all image frames with synchronous acquisition time;
a track generating module 604, configured to associate, for each target, a fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target, and generate a fusion track of the target.
By adopting the device provided by the embodiment of the invention, the image position of each target in each image frame in the image group is acquired aiming at the image group acquired by each image acquisition device; converting the image position into a position under a world coordinate system according to a preset conversion relation to obtain a world position corresponding to the image position; for each image frame set, fusing the world positions of the same targets in each image frame in the image frame set to obtain the fusion positions of the targets; and for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate a fusion track of the target. The device provided by the embodiment of the invention can generate the tracks of a plurality of targets, and can meet the track generation requirements of a plurality of complex targets in scenes such as urban traffic and the like.
Optionally, the number of image acquisition devices is at least 3;
referring to fig. 7, the location fusion module 603 includes:
the similarity matrix determination submodule 701 is used for constructing a similarity matrix of each target in any two image frames in the image frame set according to the world position of each target in the image frame set;
the target determination submodule 702 is configured to solve the similarity matrix according to the Hungarian algorithm, and if the calculated similarity between two targets located at corresponding positions in the similarity matrix and in different image frames is greater than a preset similarity threshold, determine that the two targets are the same target;
a position fusion sub-module 703, configured to fuse the world positions of the same target in the image frame set to obtain a fused position, and use the fused position as the fusion position of the fusion target corresponding to the same target.
Optionally, the similarity matrix determining submodule 701 is specifically configured to determine a speed direction of each target in the image frame set based on a world position of the target; and for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames based on the world positions and the speed directions of the targets in the two image frames.
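As one way to combine world position and speed direction into a single similarity, as the submodule above describes, consider the following sketch; the 1 / (1 + d) distance term, the cosine term, and the weights are assumptions, since the disclosure does not fix the formula:

```python
# Sketch of a similarity combining world-coordinate distance with
# velocity-direction agreement between two targets of different frames.
import numpy as np

def similarity(p_a, v_a, p_b, v_b, w_pos=0.7, w_dir=0.3):
    pos_sim = 1.0 / (1.0 + np.linalg.norm(np.asarray(p_a) - np.asarray(p_b)))
    cos = np.dot(v_a, v_b) / (np.linalg.norm(v_a) * np.linalg.norm(v_b) + 1e-9)
    dir_sim = 0.5 * (1.0 + cos)        # map cosine from [-1, 1] into [0, 1]
    return w_pos * pos_sim + w_dir * dir_sim
```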
Optionally, referring to fig. 8, the apparatus further includes:
a fusion position updating module 801, configured to, for each fusion target, if at least two targets among the multiple same targets corresponding to the fusion target come from the same image group, remove, from the at least two targets, every target whose distance to the fusion target is not the minimum, and update the fusion position of the fusion target.
Optionally, referring to fig. 8, the apparatus further includes:
a single camera trajectory generation module 802 for generating, for each image group, a single camera trajectory for each target in the image group based on the plurality of world locations of the target;
the track generating module 604 is specifically configured to, for each fusion target, determine that the current fusion position of the fusion target is associated with the existing track if the current fusion position of the fusion target and the source of the fusion position in the existing track both include positions in the same single-camera track; the existing track is formed by associating the fusion position of each target with the acquisition time before the current fusion position of the fusion target; if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any one existing track, and the current fusion position of the fusion target is in a preset motion state, determining that the current fusion position of the fusion target is not related to the existing track; if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any existing track, and the current fusion position of the fusion target is not in a preset motion state, constructing a correlation matrix between each fusion target in the image frame corresponding to the fusion target and the existing track; calculating the association matrix according to a Hungarian algorithm, and if the correlation between the fusion target and the existing track at the corresponding position in the association matrix is larger than a preset correlation threshold value, determining that the current fusion position of the fusion target is associated with the existing track; and if the fusion positions corresponding to the fusion target are all associated, generating a fusion track of the fusion target based on all the associated fusion positions of the fusion target.
Optionally, referring to fig. 8, the apparatus further includes:
a track update module 803, configured to update the existing track based on the current fusion position of the fusion target.
Optionally, referring to fig. 8, the apparatus further includes:
a synchronized time determination module 804 for generating, for each image group, a single camera trajectory for each target in the image group based on the plurality of world locations of the target; matching the single camera tracks in any two image groups; optionally selecting a group of matched single-camera tracks, and respectively calculating the distance from the starting position and the ending position of the first single-camera track in the group to each position of the second single-camera track; and calculating the distance between the start position and the end position of the second single-camera trajectory in the group to each position of the first single-camera trajectory, respectively; if a position exists in the second single-camera track, wherein the distance between the second single-camera track and the starting position or the ending position of the first single-camera track is smaller than a preset distance threshold, and a position exists in the first single-camera track, wherein the distance between the first single-camera track and the starting position or the ending position of the second single-camera track is smaller than a preset distance threshold, the group of matched single-camera tracks are reserved; otherwise, deleting the single-camera track of the group; aiming at each group of reserved matched single-camera tracks, selecting a shortest distance from the starting position and the ending position of a first single-camera track in the group to each position of a second single-camera track and the distance from the starting position and the ending position of the second single-camera track in the group to each position of the first single-camera track, and determining image frames where two positions corresponding to the shortest distance are respectively located as primary synchronous image frames; and calculating the average time of the acquisition time of each pair of primary synchronous image frames in any two image groups, and taking the image frame with the acquisition time of the average time in the two image groups as the final synchronous image frame of the two image groups.
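A sketch of the synchronization computation performed by this module for one pair of matched single-camera tracks follows; each track is assumed to be a list of (timestamp, world_position) pairs, and the function name and distance threshold are illustrative:

```python
# Sketch of the synchronization rule: measure distances from each track's
# start/end position to every position of the other track, reject the pair
# if either direction has no distance under the threshold, and otherwise
# take the overall shortest distance to pick the primary synchronous frames.
import numpy as np

def primary_sync_frames(track_a, track_b, dist_thresh=1.0):
    """track_a, track_b: lists of (timestamp, world_xy)."""
    pts_a = np.array([p for _, p in track_a], dtype=float)
    pts_b = np.array([p for _, p in track_b], dtype=float)

    cands_ab, cands_ba = [], []               # (distance, t_a, t_b) triples
    for ia in (0, len(track_a) - 1):          # start and end of track a
        d = np.linalg.norm(pts_b - pts_a[ia], axis=1)
        ib = int(d.argmin())
        cands_ab.append((float(d[ib]), track_a[ia][0], track_b[ib][0]))
    for ib in (0, len(track_b) - 1):          # start and end of track b
        d = np.linalg.norm(pts_a - pts_b[ib], axis=1)
        ia = int(d.argmin())
        cands_ba.append((float(d[ia]), track_a[ia][0], track_b[ib][0]))

    if min(cands_ab)[0] >= dist_thresh or min(cands_ba)[0] >= dist_thresh:
        return None                           # delete this matched track pair
    _, t_a, t_b = min(cands_ab + cands_ba)    # overall shortest distance
    return t_a, t_b

# The final synchronous frames of the two image groups are those whose
# acquisition time equals the average (t_a + t_b) / 2 of each primary pair.
```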
By adopting the device provided by the embodiment of the invention, for each target, the fusion positions of the target are associated according to the acquisition time sequence of the image frame sets corresponding to the target, and the fusion track of the target is generated. In addition, the fusion track of the target can be generated in real time, so the device provided by the embodiment of the invention is suitable for scenes with high real-time requirements, such as smart intersections. Furthermore, by synchronizing the image frames acquired by multiple image acquisition devices, the device provides more reliable spatio-temporal information for the track fusion of the target. Moreover, position association is realized by combining features such as world coordinates, lane numbers, motion speed and appearance, so that the association result is more reliable and widely applicable, and the device is not limited to specific types of targets. In addition, the device uses the single-camera tracks for track fusion of the target, which makes the fusion track more reliable and effectively reduces the phenomenon of fusion track identifiers being switched back. Finally, the device does not require manually acquiring a relation matrix of the regions covered by the image acquisition devices, so its use cost is lower and its universality better.
An embodiment of the present invention further provides an electronic device, as shown in fig. 9, which includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 complete mutual communication through the communication bus 904,
a memory 903 for storing computer programs;
the processor 901 is configured to implement the steps of the target trajectory generation method according to any of the above embodiments when executing the program stored in the memory 903.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above-mentioned target trajectory generation methods.
In yet another embodiment of the present invention, a computer program product containing instructions is further provided, which when run on a computer causes the computer to execute the method for generating any one of the target trajectories in the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another via a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) connection. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, the electronic device, the computer-readable storage medium, and the computer program product embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for generating a target trajectory, comprising:
aiming at an image group acquired by each image acquisition device, acquiring the image position of each target in each image frame in the image group;
converting the image position into a position under a world coordinate system according to a preset conversion relation to obtain a world position corresponding to the image position;
for each image frame set, fusing the world positions of the same targets in each image frame in the image frame set to obtain the fusion positions of the targets; the image frame set is an image set formed by all image frames with synchronous acquisition time;
and for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate a fusion track of the target.
2. The method of claim 1, wherein the number of image acquisition devices is at least 3;
the fusing the world positions of each identical target in each image frame in the image frame set to obtain the fused position of the target includes:
for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames according to the world position of each target in the image frame set;
resolving the similarity matrix according to a Hungarian algorithm; if the similarity between the two targets which are obtained by calculation and located at the corresponding positions in the similarity matrix and located in different image frames is larger than a preset similarity threshold value, determining that the two targets are the same target;
and fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target.
3. The method of claim 2, wherein for any two image frames in the image frame set, constructing a similarity matrix for each object in the two image frames according to the world position of each object in the image frame set comprises:
determining a velocity direction of each target in the set of image frames based on the world location of the target;
and for any two image frames in the image frame set, constructing a similarity matrix of each target in the two image frames based on the world positions and the speed directions of the targets in the two image frames.
4. The method according to claim 2, wherein after the fusing the world positions of the same object in the image frame set to obtain a fused position and taking the fused position as the fused position of the fused object corresponding to the same object, the method further comprises:
for each fusion target, if at least two targets from the same image group exist in a plurality of same targets corresponding to the fusion target, removing the target with the distance which is not the minimum distance from the fusion target from the at least two targets; and updates the fusion position of the fusion target.
5. The method according to claim 2, further comprising, before the associating, for each object, the fusion position of the object according to the acquisition time sequence of the image frame set corresponding to the object to generate the fusion track of the object, the steps of:
for each image group, generating a single-camera trajectory for each target in the image group based on the plurality of world locations of the target;
for each target, associating the fusion position of the target according to the acquisition time sequence of the image frame set corresponding to the target to generate a fusion track of the target, including:
for each fusion target, if the current fusion position of the fusion target and the source of the fusion position in the existing track both comprise positions in the same single-camera track, determining that the current fusion position of the fusion target is associated with the existing track; the existing track is formed by associating the fusion position of each target with the acquisition time before the current fusion position of the fusion target;
if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any one existing track, and the current fusion position of the fusion target is in a preset motion state, determining that the current fusion position of the fusion target is not related to the existing track;
if the current fusion position of the fusion target does not comprise a position in the same single-camera track with the source of the fusion position in any existing track, and the current fusion position of the fusion target is not in a preset motion state, constructing a correlation matrix between each fusion target in the image frame corresponding to the fusion target and the existing track;
calculating the association matrix according to a Hungarian algorithm, and if the correlation between the fusion target and the existing track at the corresponding position in the association matrix is larger than a preset correlation threshold value, determining that the current fusion position of the fusion target is associated with the existing track;
and if the fusion positions corresponding to the fusion target are all associated, generating a fusion track of the fusion target based on all the associated fusion positions of the fusion target.
6. The method according to claim 1, further comprising, before said fusing, for each image frame set, the world positions of each identical object in the respective image frames in the image frame set to obtain the fused position of the object:
for each image group, generating a single-camera trajectory for each target in the image group based on the plurality of world locations of the target;
matching the single camera tracks in any two image groups;
optionally selecting a group of matched single-camera tracks, and respectively calculating the distance from the starting position and the ending position of the first single-camera track in the group to each position of the second single-camera track; and calculating the distance between the start position and the end position of the second single-camera trajectory in the group to each position of the first single-camera trajectory, respectively;
if a position exists in the second single-camera track, wherein the distance between the second single-camera track and the starting position or the ending position of the first single-camera track is smaller than a preset distance threshold, and a position exists in the first single-camera track, wherein the distance between the first single-camera track and the starting position or the ending position of the second single-camera track is smaller than a preset distance threshold, the group of matched single-camera tracks are reserved; otherwise, deleting the single-camera track of the group;
aiming at each group of reserved matched single-camera tracks, selecting a shortest distance from the starting position and the ending position of a first single-camera track in the group to each position of a second single-camera track and the distance from the starting position and the ending position of the second single-camera track in the group to each position of the first single-camera track, and determining image frames where two positions corresponding to the shortest distance are respectively located as primary synchronous image frames;
and calculating the average time of the acquisition time of each pair of primary synchronous image frames in any two image groups, and taking the image frame with the acquisition time of the average time in the two image groups as the final synchronous image frame of the two image groups.
7. An apparatus for generating a target trajectory, comprising:
the position acquisition module is used for acquiring the image position of each target in each image frame in the image group aiming at the image group acquired by each image acquisition device;
the position conversion module is used for converting the image position into a position under a world coordinate system according to a preset conversion relation to obtain a world position corresponding to the image position;
the position fusion module is used for fusing the world positions of the same targets in the image frames in each image frame set aiming at each image frame set to obtain the fusion positions of the targets; the image frame set is an image set formed by all image frames with synchronous acquisition time;
and the track generation module is used for associating the fusion positions of the targets according to the acquisition time sequence of the image frame set corresponding to the targets so as to generate the fusion track of the targets.
8. The apparatus of claim 7, wherein the number of image capture devices is at least 3;
the location fusion module includes:
the similarity matrix determination submodule is used for constructing a similarity matrix of each target in any two image frames according to the world position of each target in the image frame set aiming at the two image frames in the image frame set;
the target determining submodule is used for calculating the similarity matrix according to the Hungarian algorithm, and if the similarity between two targets which are located at corresponding positions in the similarity matrix and are located in different image frames obtained through calculation is larger than a preset similarity threshold value, the two targets are determined to be the same target;
and the position fusion submodule is used for fusing the world positions of the same target in the image frame set to obtain a fused position, and taking the fused position as the fused position of the fused target corresponding to the same target.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1-6 when executing a program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
CN202111359855.XA 2021-11-17 2021-11-17 Target track generation method and device, electronic equipment and medium Pending CN114066974A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202111359855.XA CN114066974A (en) 2021-11-17 2021-11-17 Target track generation method and device, electronic equipment and medium
PCT/CN2022/117505 WO2023087860A1 (en) 2021-11-17 2022-09-07 Method and apparatus for generating trajectory of target, and electronic device and medium
CN202211088454.XA CN115908545A (en) 2021-11-17 2022-09-07 Target track generation method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111359855.XA CN114066974A (en) 2021-11-17 2021-11-17 Target track generation method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN114066974A true CN114066974A (en) 2022-02-18

Family

ID=80273030

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111359855.XA Pending CN114066974A (en) 2021-11-17 2021-11-17 Target track generation method and device, electronic equipment and medium
CN202211088454.XA Pending CN115908545A (en) 2021-11-17 2022-09-07 Target track generation method and device, electronic equipment and medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202211088454.XA Pending CN115908545A (en) 2021-11-17 2022-09-07 Target track generation method and device, electronic equipment and medium

Country Status (2)

Country Link
CN (2) CN114066974A (en)
WO (1) WO2023087860A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023087860A1 (en) * 2021-11-17 2023-05-25 上海高德威智能交通系统有限公司 Method and apparatus for generating trajectory of target, and electronic device and medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117435934A (en) * 2023-12-22 2024-01-23 中国科学院自动化研究所 Matching method, device and storage medium of moving target track based on bipartite graph

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210276A (en) * 2018-05-15 2019-09-06 腾讯科技(深圳)有限公司 A kind of motion track acquisition methods and its equipment, storage medium, terminal
CN111476827B (en) * 2019-01-24 2024-02-02 曜科智能科技(上海)有限公司 Target tracking method, system, electronic device and storage medium
CN112232279B (en) * 2020-11-04 2023-09-05 杭州海康威视数字技术股份有限公司 Personnel interval detection method and device
CN112070807B (en) * 2020-11-11 2021-02-05 湖北亿咖通科技有限公司 Multi-target tracking method and electronic device
CN112465866B (en) * 2020-11-27 2024-02-02 杭州海康威视数字技术股份有限公司 Multi-target track acquisition method, device, system and storage medium
CN114066974A (en) * 2021-11-17 2022-02-18 上海高德威智能交通系统有限公司 Target track generation method and device, electronic equipment and medium


Also Published As

Publication number Publication date
WO2023087860A1 (en) 2023-05-25
CN115908545A (en) 2023-04-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220218