WO2023050678A1 - Multi-target tracking method and apparatus, and electronic device, storage medium and program - Google Patents


Info

Publication number
WO2023050678A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
target
detection
image
frame
Prior art date
Application number
PCT/CN2022/075415
Other languages
French (fr)
Chinese (zh)
Inventor
李震宇
李昂
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023050678A1 publication Critical patent/WO2023050678A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to a multi-target tracking method, device, electronic equipment, storage medium and program.
  • Multi-target tracking technology is a research hotspot in the field of computer vision.
  • Multi-target tracking refers to the use of a computer to determine the position, size and complete trajectory of each independent moving target of interest in a video sequence. It has a very wide range of applications in vehicle auxiliary systems, military fields and intelligent security fields.
  • Embodiments of the present disclosure at least provide a multi-target tracking method, device, electronic equipment, storage medium and program.
  • an embodiment of the present disclosure provides a multi-target tracking method applied to an electronic device, including:
  • The target tracking result is used to reflect detection results of the first target object in the current frame image and the at least one frame image.
  • Since the extracted appearance features of the first target object can better represent the identity information of the first target object, using this better feature information together with the similarity makes it possible to handle the reappearance of the target after being occluded.
  • the trajectory convergence can also reduce the probability of tracking instability caused by vehicle bumps, obtain more stable multi-target tracking results, and improve the stability of multi-target tracking.
  • an embodiment of the present disclosure provides a multi-target tracking device, which is applied to an electronic device, including:
  • the target detection module is configured to perform target detection on the current frame image, and obtain a first detection result of at least one detected first target object.
  • a feature extraction module configured to extract an appearance feature vector of the first target object.
  • the similarity calculation module is configured to calculate the similarity between the appearance feature vector of the first target object and the appearance feature vectors of each target object detected in at least one frame image before the current frame image.
  • The tracking result determining module is configured to determine a target tracking result for the first target object based on the similarity; the target tracking result is configured to reflect the detection results of the first target object in the current frame image and the multiple frame images.
  • An embodiment of the present disclosure provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the multi-target tracking method as described in the first aspect is executed.
  • An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the multi-target tracking method as described in the first aspect is executed.
  • An embodiment of the present disclosure provides a computer program, including computer-readable code; when the computer-readable code is run in an electronic device, a processor in the electronic device executes any one of the above multi-target tracking methods.
  • FIG. 1 shows a schematic diagram of an execution body of a multi-target tracking method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flow chart of a multi-target tracking method provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a target detection result of a current frame image provided by an embodiment of the present disclosure
  • FIG. 4 shows a flow chart of a method for determining a target tracking result for a first target object provided by an embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of a tracking effect of a first target object provided by an embodiment of the present disclosure
  • FIG. 6 shows a schematic diagram of another tracking effect of a first target object provided by an embodiment of the present disclosure
  • FIG. 7 shows a flow chart of a method for training a re-identification model provided by an embodiment of the present disclosure
  • FIG. 8 shows a schematic diagram of a radar tracking result provided by an embodiment of the present disclosure
  • FIG. 9 shows a flow chart of a method for acquiring an image sample provided by an embodiment of the present disclosure.
  • FIG. 10 shows a schematic diagram of an image sample set provided by an embodiment of the present disclosure
  • FIG. 11 shows a schematic structural diagram of a multi-target tracking device provided by an embodiment of the present disclosure
  • FIG. 12 shows a schematic structural diagram of another multi-target tracking device provided by an embodiment of the present disclosure
  • FIG. 13 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • CNN: Convolutional Neural Network
  • Research has found that occlusions occur frequently during multi-target tracking. When a target is occluded during tracking, the number of detected targets changes, and the track of the occluded target cannot be matched to any detection in the current frame. The tracker must then judge whether the track has disappeared temporarily due to occlusion or has left the detection area and tracking should be stopped, and some tracks of occluded targets are terminated due to misjudgment. After the occlusion ends, the originally tracked target reappears in the detection area; if its original track has already stopped being tracked, a new initial track is generated for the target, resulting in a change of target identity. In addition, when the vehicle is bumping, the distance between detection results of the same target across frames becomes large, which leads to low similarity, data association failure, and target tracking failure.
  • The present disclosure provides a multi-target tracking method, including: performing target detection on the current frame image to obtain a first detection result of at least one detected first target object; extracting the appearance feature vector of the first target object; calculating the similarity between the appearance feature vector of the first target object and the appearance feature vectors of each target object detected in at least one frame image before the current frame image; and, based on the similarity, determining a target tracking result for the first target object, where the target tracking result is used to reflect detection results of the first target object in the current frame image and the at least one frame image.
  • Since the extracted appearance features can better represent the identity information of the first target object, using this better feature information can handle the reappearance of the trajectory after the target is occluded, and can also reduce the probability of unstable tracking caused by vehicle bumps, thereby obtaining a more stable multi-target tracking result and improving the stability of multi-target tracking.
  • the multi-target tracking method provided by the embodiments of the present disclosure can be applied to an automatic driving system to track target objects within the field of view of the vehicle, and accurately obtain object tracking results, which is helpful for developers to design related tracking strategies and alarm strategies.
  • the multi-target tracking method provided by the embodiments of the present disclosure will be introduced in detail.
  • FIG. 1 is a schematic diagram of an execution body of a multi-target tracking method provided by an embodiment of the present disclosure.
  • the execution body of the method is an electronic device, wherein the electronic device may include a terminal and a server.
  • the method can be applied to a terminal.
  • The terminal can be a terminal device as shown in FIG. 1, such as a voice interaction device or a robot, which is not limited here.
  • voice interactive devices include but are not limited to smart speakers and smart home appliances.
  • the method can also be applied to a server, or can be applied to an implementation environment composed of a terminal and a server.
  • The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud storage, big data, and artificial intelligence platforms.
  • the server may communicate with the terminal through a network.
  • a network may include various connection types such as wires, wireless communication links, or fiber optic cables, among others.
  • the multi-target tracking method can also be implemented by software running on a terminal or a server, for example, the multi-target tracking method provided by the embodiments of the present disclosure is implemented by using an application program with a multi-target tracking function.
  • the multi-target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • FIG. 2 is a flowchart of a multi-target tracking method provided by an embodiment of the present disclosure.
  • the multi-target tracking method is applied to an electronic device and executed by the electronic device, and may include the following steps S101 to S104:
  • Step S101 Perform target detection on the current frame image to obtain a first detection result of at least one detected first target object.
  • In an image (for example, each frame image in a video), a closed area that is distinguished from the surrounding environment is often called an object. The process of giving the location of an object in an image is called detection.
  • A trained target detection model or target detection network may be used to detect the target, and other target detection technologies may also be used. The target detection technology can adopt a variety of methods, such as the frame difference method, background subtraction method, optical flow method, directional gradient features, etc., and the initial position of the target can also be marked manually.
  • any other suitable technology that can be used for target detection can also be used, which is not limited here, as long as the target detection can be realized.
  • FIG. 3 is a schematic diagram of a target detection result of the current frame image provided in the embodiment of the present disclosure.
  • The current frame image T can be input to the target detection model for target detection to obtain the first detection result of at least one detected first target object 10. In this embodiment, the first detection result includes three first target objects 10, and the first target objects are vehicles.
  • the number of the first target objects 10 may also be 2 or 4, and the type of the first target objects 10 may also be other types (such as pedestrians), which are not limited here.
  • The first detection result also includes at least one of: the detection frame information of the first target object 10 (as shown by A in FIG. 3), the type of the first target object 10, and the confidence level of the detection result of the first target object 10.
  • In this way, the first detection result also includes at least one of the detection frame information of the first target object, the type of the first target object, and the confidence level of the detection result of the first target object. This makes the content of the first detection result richer, provides more basis for the subsequent similarity calculation, and can improve the accuracy of the similarity calculation.
  • the video to be detected needs to be acquired, and then the current frame image is acquired from the video to be detected according to a preset time interval or frame number interval.
  • the video to be detected is a video or a sequence of video frames to be detected.
  • the video to be detected may be a video or a video stream with a certain video frame length.
  • After the target detection model acquires the video to be detected, it can acquire multiple frames of images to be detected at intervals from the video to be detected. For example, the video to be detected includes M frames of images to be detected, and the target detection model acquires at least one frame of the image to be detected from the M frames according to a preset time interval or every N frames.
  • The frame rate of the video to be detected is generally more than 25 frames per second. If an electronic device (such as a server) performed detection on every frame of the image to be detected, the amount of calculation would be too large, which would overload the server and affect both the processing speed of multi-target tracking and the number of video channels to be detected that can be accessed. Acquiring multiple frames of images to be detected at intervals from the video to be detected can therefore improve the processing speed of multi-target tracking in the video to be detected and increase the number of video channels that can be processed.
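The interval acquisition described above can be sketched as follows; this is a minimal illustration assuming a fixed frame-number interval, and the helper name is not from the patent:

```python
def sample_frames(frames, frame_step=5):
    """Pick every `frame_step`-th frame from a video's frame sequence.

    `frames` is the full list of M frames to be detected; sampling at an
    interval reduces the detection workload per video channel.
    """
    return frames[::frame_step]

# Example: a 25-frame "video", sampled every 5 frames.
video = list(range(25))
sampled = sample_frames(video, frame_step=5)
print(sampled)  # → [0, 5, 10, 15, 20]
```

A time-based interval works the same way, replacing the index step with a timestamp comparison.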
  • Step S102 Extracting an appearance feature vector of the first target object.
  • the previously obtained image to be detected may be input to a pre-trained re-identification model to extract appearance features of the first target object to obtain an appearance feature vector.
  • the extracted appearance features are all 64-dimensional vectors.
  • the training method of the re-identification model will be described in detail later.
  • other methods may also be used to extract the appearance feature vector of the first target object, which is not limited here.
  • Step S103 Calculate the similarity between the appearance feature vector of the first target object and the appearance feature vectors of each target object detected in at least one frame image before the current frame image.
  • The three first target objects 10 in FIG. 3 are taken as an example for illustration. The three first target objects 10 detected in the current frame image (also called the subsequent frame image in the table) are assigned codes D0, D1 and D2 respectively.
  • For convenience of description, one frame is taken from the multiple frame images before the current frame (also called the previous frame image in the table) as an example to calculate the similarity, and the target objects in the previous frame image are assigned codes Q0, Q1 and Q2 respectively.
  • The calculation results of the similarity are shown in Table 1 below:
  • the first target object D0 in the current frame image has the highest similarity with the target object Q0 in the previous frame image of the current frame image, and the similarity calculation result is 0.82;
  • The first target object D1 in the current frame image has the highest similarity with the target object Q1 in the previous frame image of the current frame image, and the similarity calculation result is 0.91;
  • The first target object D2 in the current frame image has the highest similarity with the target object Q2 in the previous frame image of the current frame image, and the similarity calculation result is 0.92.
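The patent does not fix a particular similarity metric for Step S103; assuming cosine similarity over the appearance feature vectors (shown here with short toy vectors rather than the 64-dimensional ones), the calculation can be sketched as:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional stand-ins for the 64-dimensional appearance vectors
# of a current-frame detection (d0) and a previous-frame track (q0).
d0 = [1.0, 0.0, 0.5, 0.2]
q0 = [0.9, 0.1, 0.4, 0.3]
print(round(cosine_similarity(d0, q0), 2))
```

Each current-frame vector is compared against every previous-frame vector, producing the pairwise similarity matrix summarized in Table 1.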
  • In order to reduce the impact of environmental mutations on the appearance features, a cache can be used to store the appearance features of the same object over the latest multiple frames (such as 100 frames), and the maximum similarity can be used as the calculation result, which can improve the reliability of object tracking. That is, a temporary storage strategy is used to store the appearance features of the same object over the most recent multiple frames. This strategy reduces the probability of similarity jumps caused by appearance mutations, thereby obtaining better multi-target tracking results.
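The temporary storage strategy above can be sketched as follows; the 100-frame cache size follows the text, while the helper names and the toy scalar similarity function are illustrative:

```python
from collections import defaultdict, deque

# Cache of recent appearance features per tracked object ID
# (the latest 100 frames, as suggested in the text).
feature_cache = defaultdict(lambda: deque(maxlen=100))

def update_cache(track_id, feature):
    """Store the newest appearance feature for a track, evicting the oldest."""
    feature_cache[track_id].append(feature)

def max_similarity(track_id, feature, sim_fn):
    """Similarity of a new detection to a track: the maximum over
    all cached appearance features of that track."""
    cached = feature_cache[track_id]
    if not cached:
        return 0.0
    return max(sim_fn(feature, f) for f in cached)

# Toy usage with a trivial similarity function on scalar "features".
sim = lambda a, b: 1.0 - abs(a - b)
update_cache("Q0", 0.5)
update_cache("Q0", 0.8)   # appearance changed between frames
print(max_similarity("Q0", 0.75, sim))  # best match against the cache
```

Taking the maximum over cached features means one appearance mutation in a single frame cannot collapse the similarity for the whole track.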
  • Step S104 Based on the similarity, determine a target tracking result for the first target object; the target tracking result is used to reflect the detection results of the first target object in the current frame image and the at least one frame image.
  • FIG. 4 is a flow chart of a method for determining a target tracking result for a first target object provided by an embodiment of the present disclosure. Determining the target tracking result for the first target object based on the similarity may include the following steps S1041 to S1042:
  • Step S1041 Based on the similarity, match the first detection result with the detection results of each target object in the at least one frame of image, and determine the detection result of the first target object in the at least one frame of image that matches the first target object.
  • matching also known as data association
  • the Hungarian algorithm may be used to match the first detection result with the detection results of each target object in the at least one frame of image, so that the matching accuracy may be improved.
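The data association in Step S1041 amounts to an optimal one-to-one assignment maximizing total similarity. The sketch below brute-forces the assignment for a tiny matrix; in practice the Hungarian algorithm (e.g. scipy.optimize.linear_sum_assignment on the negated matrix) solves it in polynomial time. The diagonal values follow the example above (0.82, 0.91, 0.92); the off-diagonal values are invented for illustration:

```python
from itertools import permutations

def best_assignment(sim):
    """Optimal one-to-one matching between current detections (rows)
    and previous-frame tracks (columns), maximizing total similarity.

    Brute force over permutations, usable only for tiny matrices; the
    Hungarian algorithm gives the same result in polynomial time.
    """
    n = len(sim)
    best, best_total = None, float("-inf")
    for perm in permutations(range(n)):
        total = sum(sim[i][perm[i]] for i in range(n))
        if total > best_total:
            best, best_total = perm, total
    return list(best), best_total

# Similarity matrix for D0..D2 (rows) vs Q0..Q2 (columns). Only the
# diagonal maxima come from the text; off-diagonals are made up.
sim = [
    [0.82, 0.10, 0.05],
    [0.15, 0.91, 0.20],
    [0.08, 0.12, 0.92],
]
match, total = best_assignment(sim)
print(match)  # → [0, 1, 2]: D0->Q0, D1->Q1, D2->Q2
```

Here the assignment recovers exactly the per-row maxima described above, confirming each current detection inherits the identity of its matched track.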
  • Step S1042 Determine a target tracking result for the first target object according to the detection result of the first target object in the at least one frame of images and the first detection result.
  • In this way, the extracted appearance features can better represent the identity information of the first target object 10, wherein the identifier of the detection frame of one first target object 10 is id1, and the identifier of the detection frame of another first target object 10 is id2.
  • FIG. 7 is a flow chart of a method for training a re-identification model provided by an embodiment of the present disclosure.
  • the above-mentioned re-identification model can be obtained through training using the following method. Specifically, when training the re-identification model, the following steps S1021 to S1023 may be included:
  • Step S1021 Obtain an image sample set, the image sample set includes a plurality of image samples and annotation information of the image samples, and the annotation information is used to indicate the image samples corresponding to the same target object.
  • the image samples are taken in an automatic driving scene.
  • the trained re-identification model can better adapt to the driving scene, improving the accuracy of model recognition and the adaptability in the driving scene.
  • Step S1022 Based on the image sample set, train the re-identification model to be trained to obtain the re-identification model.
  • The re-identification model to be trained is trained to obtain the re-identification model, and the re-identification model is used to extract the appearance feature vector of the first target object, so that the extraction accuracy of the appearance feature vector of the first target object can be improved.
  • the basic network to be trained can be determined according to specific needs.
  • A depthwise-separable convolutional network such as a MobileNetV2 network can be selected, and a lightweight convolutional neural network such as a MobileNetV1 network can also be selected, which is not limited here. In this way, by using a lightweight convolutional neural network as the basic training network, the recognition efficiency of the trained re-identification model can be improved, and real-time performance is stronger.
  • the image samples in the image sample set can be respectively input to the basic network for feature extraction, and then the basic network is trained according to the classification result and loss function to obtain a re-identification model.
  • the specific model training method is similar to the existing model training method.
  • The tracking results of the lidar can be used to build the image sample set. As shown in FIG. 8, multiple detection frames can be obtained through lidar tracking, such as point cloud detection frame 1, point cloud detection frame 2, point cloud detection frame 17, point cloud detection frame 18, and point cloud detection frame 19. However, as shown in FIG. 8, some point cloud detection frames (such as point cloud detection frame 19 or point cloud detection frame 17) are occluded, and the lidar tracking frame does not fit the target object (such as a vehicle) well. The image samples in the image sample set should not contain occluded objects, and the detection frame is required to fit the target object; otherwise unnecessary noise will be introduced, resulting in poor re-identification model training results.
  • the method for acquiring an image sample set includes the following steps S10211 to S10214:
  • Step S10211 Acquire candidate images captured by the camera, perform target detection on the candidate images, and obtain a second detection result indicating at least one detected second target object.
  • Step S10212 Obtain point cloud data collected synchronously with the candidate image for the same scene, detect the point cloud data, and obtain at least one point cloud detection frame.
  • Step S10213 Determine the intersection-over-union ratio IOU between the detection frame of the second target object and the point cloud detection frame in the second detection result.
  • Step S10214 If the IOU between any point cloud detection frame and the detection frame of the second target object is greater than a preset threshold, determine the image sample based on the candidate image.
  • In this way, by filtering the lidar multi-target tracking results to obtain image samples, the acquisition accuracy of the image samples can be improved, thereby improving the recognition accuracy of the trained re-identification model.
  • The candidate image collected by the camera can be obtained, the point cloud data collected synchronously with the candidate image for the same scene can be obtained, and the point cloud data can be detected to obtain at least one point cloud detection frame. Then the second detection result obtained by detecting the candidate image is used to filter the lidar multi-target tracking result. That is, after obtaining the second detection result, the intersection-over-union ratio IOU between the detection frame of the second target object in the second detection result and the point cloud detection frame is determined, and only candidate images with an IOU greater than a preset threshold (such as 0.7) are kept.
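Steps S10213 and S10214 can be sketched as follows, assuming axis-aligned detection frames given as (x1, y1, x2, y2); the function names are illustrative:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def keep_candidate(image_boxes, cloud_boxes, threshold=0.7):
    """Keep a candidate image if any image detection frame overlaps
    any point cloud detection frame with IOU above the threshold."""
    return any(iou(a, b) > threshold
               for a in image_boxes for b in cloud_boxes)

# A well-fitted pair (high IOU) passes; a poorly fitted pair does not.
print(keep_candidate([(0, 0, 10, 10)], [(1, 1, 10, 10)]))   # → True
print(keep_candidate([(0, 0, 10, 10)], [(6, 6, 16, 16)]))   # → False
```

The 0.7 threshold follows the example in the text; tightening it keeps only samples whose lidar frames fit the object closely.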
  • The position information of the detection frame of the second target object in the image sample can also be determined based on the position information of the point cloud detection frame; that is, the ID of the detection frame of the second target object is set to the ID of the lidar point cloud detection frame. Afterwards, the part corresponding to the detection frame of the second target object is cropped out from the candidate image and classified according to the ID.
  • FIG. 10 is a schematic diagram of an image sample set provided by an embodiment of the present disclosure.
  • The target object M corresponds to multiple image samples, but the IDs of the image samples corresponding to the target object M are the same, all 001, that is, the annotation information is the same; the target object N corresponds to multiple image samples, but the IDs of the image samples corresponding to the target object N are the same, all 002, that is, the annotation information is the same.
  • Since the target object M is different from the target object N, the IDs of the image samples corresponding to the target object M and those corresponding to the target object N are different.
  • Determining the image sample based on the candidate image includes: cropping the partial image corresponding to the detection frame of the second target object from the candidate image to obtain the image sample.
  • In this way, the image sample is obtained by cropping the partial image corresponding to the detection frame of the second target object from the candidate image, which can reduce unnecessary noise introduced into the image sample.
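The cropping step can be sketched as follows, assuming a row-major 2-D image array and an (x1, y1, x2, y2) detection frame:

```python
def crop_detection(image, box):
    """Crop the part of the image inside the detection frame.

    `image` is a row-major 2-D array (list of rows) and `box` is
    (x1, y1, x2, y2) in pixel coordinates with the origin at top-left.
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

# 4x4 toy "image" whose pixels encode their own (x, y) coordinates.
image = [[(x, y) for x in range(4)] for y in range(4)]
sample = crop_detection(image, (1, 1, 3, 3))
print(sample)  # → [[(1, 1), (2, 1)], [(1, 2), (2, 2)]]
```

With an actual image library the same operation is a single array slice, e.g. `image[y1:y2, x1:x2]` on a NumPy array.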
  • The embodiment of the present disclosure also provides a multi-target tracking device corresponding to the multi-target tracking method. Since the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above-mentioned multi-target tracking method, the implementation of the device may refer to the implementation of the method.
  • FIG. 11 is a schematic diagram of a multi-target tracking device 500 provided by an embodiment of the present disclosure. The device is applied to an electronic device and includes:
  • the target detection module 501 is configured to perform target detection on the current frame image, and obtain a first detection result of at least one detected first target object.
  • the feature extraction module 502 is configured to extract an appearance feature vector of the first target object.
  • the similarity calculation module 503 is configured to calculate the similarity between the appearance feature vector of the first target object and the appearance feature vectors of each target object detected in at least one frame image before the current frame image.
  • The tracking result determining module 504 is configured to determine a target tracking result for the first target object based on the similarity; the target tracking result is configured to reflect the detection results of the first target object in the current frame image and the at least one frame of images.
  • the tracking result determining module 504 is specifically configured as:
  • The first detection result further includes at least one of: the detection frame information of the first target object, the type of the first target object, and the confidence level of the detection result of the first target object.
  • the device further includes a model training module 505, and the model training module 505 is configured to:
  • the image sample set includes a plurality of image samples and annotation information of the image samples, the annotation information is configured to indicate image samples corresponding to the same target object;
  • the re-identification model to be trained is trained to obtain the re-identification model.
  • the image sample is taken in a driving scene.
  • model training module 505 is specifically configured as:
  • the image sample is determined based on the candidate image when there is an IOU between any point cloud detection frame and the detection frame of the second target object that is greater than a preset threshold.
  • model training module 505 is specifically configured as:
  • the target detection module 501 is further configured to:
  • the current frame image is acquired from the video to be detected according to a preset time interval or frame number interval.
  • an embodiment of the present disclosure also provides an electronic device.
  • FIG. 13 is a schematic structural diagram of an electronic device 700 provided by an embodiment of the present disclosure; the electronic device includes a processor 701, a memory 702, and a bus 703.
  • The memory 702 is used to store execution instructions and includes an internal memory 7021 and an external memory 7022. The internal memory 7021, also called memory, is used to temporarily store calculation data in the processor 701 and data exchanged with an external memory 7022 such as a hard disk; the processor 701 exchanges data with the external memory 7022 through the internal memory 7021.
  • the memory 702 is specifically used to store application program codes for executing the solutions of the embodiments of the present disclosure, and the execution is controlled by the processor 701 . That is, when the electronic device 700 is running, the processor 701 communicates with the memory 702 through the bus 703, so that the processor 701 executes the application program code stored in the memory 702, and then executes the method described in any of the foregoing embodiments.
  • the memory 702 may be, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), etc.
  • the processor 701 may be an integrated circuit chip with signal processing capability.
  • the above-mentioned processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the structure shown in the embodiment of the present disclosure does not constitute a specific limitation on the electronic device 700 .
  • the electronic device 700 may include more or fewer components than shown in the illustration, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the multi-target tracking method in the foregoing method embodiments are executed.
  • the storage medium may be a volatile computer-readable storage medium or a non-volatile computer-readable storage medium.
  • the embodiment of the present disclosure also provides a computer program product. The computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the multi-target tracking method in the above method embodiments; for details, refer to the foregoing method embodiments.
  • the above computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium or a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solutions of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.
  • Embodiments of the present disclosure provide a multi-target tracking method and apparatus, an electronic device, and a storage medium.
  • the multi-target tracking method includes: performing target detection on a current frame image to obtain a first detection result of at least one detected first target object; extracting an appearance feature vector of the first target object; calculating a similarity between the appearance feature vector of the first target object and appearance feature vectors of target objects detected in at least one frame image before the current frame image; and determining, based on the similarity, a target tracking result for the first target object, where the target tracking result is used to reflect detection results of the first target object in the current frame image and the multiple frames of images.
  • the embodiments of the present disclosure can improve the stability and precision of multi-target tracking.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Provided in the present disclosure are a multi-target tracking method and apparatus, and an electronic device and a storage medium. The multi-target tracking method comprises: performing target detection on the current frame of image, so as to obtain a first detection result of at least one detected first target object; extracting an appearance feature vector of the first target object; calculating the similarity between the appearance feature vector of the first target object and an appearance feature vector of each target object detected in at least one frame of image before the current frame of image; and determining a target tracking result for the first target object on the basis of the similarity, wherein the target tracking result is used for reflecting detection results of the first target object in the current frame of image and a plurality of frames of images. By means of the embodiments of the present disclosure, the stability and precision of multi-target tracking can be improved.

Description

Multi-target tracking method, apparatus, electronic device, storage medium and program
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202111165457.4, filed on September 30, 2021 by Shanghai Shangtang Lingang Intelligent Technology Co., Ltd. and entitled "Multi-target tracking method, apparatus, electronic device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to a multi-target tracking method and apparatus, an electronic device, a storage medium and a program.
Background
Multi-target tracking is a research hotspot in the field of computer vision. Multi-target tracking refers to using a computer to determine, in a video sequence, the position and size of each independent moving target of interest that has certain salient visual features, as well as the complete motion trajectory of each target. It has very wide applications in vehicle-mounted driver-assistance systems, the military field and the intelligent security field.
However, overlapping targets are frequently encountered in multi-target tracking tasks. After a tracked target overlaps with other targets, its tracking trajectory may be matched incorrectly, which leads to poor stability when tracking multiple targets.
Summary
Embodiments of the present disclosure provide at least a multi-target tracking method and apparatus, an electronic device, a storage medium and a program.
In a first aspect, an embodiment of the present disclosure provides a multi-target tracking method, applied to an electronic device, including:
performing target detection on a current frame image to obtain a first detection result of at least one detected first target object;
extracting an appearance feature vector of the first target object;
calculating a similarity between the appearance feature vector of the first target object and appearance feature vectors of target objects detected in at least one frame image before the current frame image;
determining, based on the similarity, a target tracking result for the first target object, where the target tracking result is used to reflect detection results of the first target object in the current frame image and the at least one frame image.
In the embodiments of the present disclosure, since the extracted appearance features better represent the identity information of the first target object, this richer feature information and the similarity can be used to stitch together the trajectory of a target that reappears after being occluded, and can also reduce the probability of unstable tracking caused by vehicle bumps, thereby yielding smoother multi-target tracking results and improving the stability of multi-target tracking.
In a second aspect, an embodiment of the present disclosure provides a multi-target tracking apparatus, applied to an electronic device, including:
a target detection module, configured to perform target detection on a current frame image to obtain a first detection result of at least one detected first target object;
a feature extraction module, configured to extract an appearance feature vector of the first target object;
a similarity calculation module, configured to calculate a similarity between the appearance feature vector of the first target object and appearance feature vectors of target objects detected in at least one frame image before the current frame image;
a tracking result determination module, configured to determine, based on the similarity, a target tracking result for the first target object, where the target tracking result is configured to reflect detection results of the first target object in the current frame image and the multiple frames of images.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a processor, a memory and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the multi-target tracking method described in the first aspect is performed.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the multi-target tracking method described in the first aspect is performed.
In a fifth aspect, an embodiment of the present disclosure provides a computer program including computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device executes steps for implementing any one of the above multi-target tracking methods.
To make the above objects, features and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
FIG. 1 shows a schematic diagram of an execution subject of a multi-target tracking method provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of a multi-target tracking method provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of a result of performing target detection on a current frame image provided by an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a method for determining a target tracking result for a first target object provided by an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of a tracking effect of a first target object provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of another tracking effect of a first target object provided by an embodiment of the present disclosure;
FIG. 7 shows a flowchart of a method for training a re-identification model provided by an embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of a radar tracking result provided by an embodiment of the present disclosure;
FIG. 9 shows a flowchart of a method for acquiring image samples provided by an embodiment of the present disclosure;
FIG. 10 shows a schematic diagram of an image sample set provided by an embodiment of the present disclosure;
FIG. 11 shows a schematic structural diagram of a multi-target tracking apparatus provided by an embodiment of the present disclosure;
FIG. 12 shows a schematic structural diagram of another multi-target tracking apparatus provided by an embodiment of the present disclosure;
FIG. 13 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The term "and/or" herein merely describes an association relationship and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the term "at least one" herein indicates any one of multiple items or any combination of at least two of multiple items. For example, including at least one of A, B and C may indicate including any one or more elements selected from the set consisting of A, B and C.
With the frequent occurrence of traffic accidents, the safety problems they cause have received widespread attention from society. Facing the increasingly severe traffic safety situation, developing intelligent driver-assistance systems has become an urgent requirement of the current automotive industry, and the forward collision warning system is the most important part of such systems. Vehicle detection and tracking algorithms play a vital role in intelligent driver-assistance systems. For vehicles driving on expressways, the tracking results are required to be updated accurately in real time; however, traditional machine-vision methods often struggle to meet these requirements in terms of both speed and accuracy.
Owing to the development and application of convolutional neural networks (Convolutional Neural Networks, CNN), many tasks in the field of computer vision have advanced greatly. With the development of deep learning technology, multi-target tracking algorithms based on convolutional neural networks have achieved certain breakthroughs, giving them tracking accuracy far higher than that of traditional multi-target tracking methods.
However, research has found that frequent occlusions occur during multi-target tracking. When a target is occluded during tracking, the number of detected targets changes, and the trajectory of the occluded tracked target cannot be matched to a detected target in the current frame. It cannot be determined whether the trajectory has disappeared temporarily due to occlusion or has left the detection area and tracking should be stopped, so some occluded trajectories are terminated due to misjudgment. After the occlusion ends, the originally tracked target reappears in the detection area; if its original trajectory has already stopped being tracked, a new initial trajectory is generated for the target, causing the target identity to change. In addition, when the vehicle bumps, the distance between detection results of the same target becomes large, which in turn leads to low similarity, data association failure and target tracking failure.
The present disclosure provides a multi-target tracking method, including: performing target detection on a current frame image to obtain a first detection result of at least one detected first target object; extracting an appearance feature vector of the first target object; calculating a similarity between the appearance feature vector of the first target object and appearance feature vectors of target objects detected in at least one frame image before the current frame image; and determining, based on the similarity, a target tracking result for the first target object, where the target tracking result is used to reflect detection results of the first target object in the current frame image and the multiple frames of images.
In the embodiments of the present disclosure, since the extracted appearance features better represent the identity information of the first target object, this richer feature information can be used to stitch together the trajectory of a target that reappears after being occluded, and can also reduce the probability of unstable tracking caused by vehicle bumps, thereby yielding smoother multi-target tracking results and improving the stability of multi-target tracking.
The multi-target tracking method provided by the embodiments of the present disclosure can be applied to an automated driving system to track target objects within the field of view of a vehicle and accurately obtain tracking results of the objects, which helps developers design related tracking strategies and alarm strategies. The multi-target tracking method provided by the embodiments of the present disclosure is introduced in detail below.
Referring to FIG. 1, which is a schematic diagram of an execution subject of the multi-target tracking method provided by an embodiment of the present disclosure, the execution subject of the method is an electronic device, where the electronic device may include a terminal and a server. For example, the method may be applied to a terminal, which may be the terminal device shown in FIG. 1, including but not limited to a tablet computer, a notebook computer, a palmtop computer, a mobile phone, a voice interaction device, a personal computer (PC), a vehicle and a robot, which is not limited here.
The voice interaction device includes but is not limited to a smart speaker, a smart home appliance and the like. The method may also be applied to a server, or to an implementation environment composed of a terminal and a server. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud storage, big data and artificial intelligence platforms.
It should be understood that, in some implementations, the server may communicate with the terminal through a network. The network may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
In addition, the multi-target tracking method may also be implemented by software running on a terminal or a server, for example, by an application program having a multi-target tracking function. In some possible implementations, the multi-target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 2, which is a flowchart of the multi-target tracking method provided by an embodiment of the present disclosure, the multi-target tracking method is applied to an electronic device and executed by the electronic device, and may include the following steps S101 to S104:
Step S101: perform target detection on a current frame image to obtain a first detection result of at least one detected first target object.
In an example, in an image (for example, each frame image in a video), a closed region distinguished from the surrounding environment is often called a target. The process of giving the position of a target in an image is called detection. For example, a trained target detection model (or target detection network) can be used to detect the positions and category information of multiple tracked targets in the current frame image.
In some implementations, a target detection technique may also be used to detect targets. The target detection technique may use a variety of methods, such as the frame difference method, background subtraction, optical flow, or directional gradient features, and the initial position of a target may also be annotated manually. Of course, any other suitable technique for target detection may be used, which is not limited here, as long as target detection can be achieved.
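The frame difference method mentioned above can be sketched as follows. This is a minimal, illustrative example and not part of the disclosed embodiments; the function names, the threshold value and the synthetic frames are assumptions chosen purely for the sketch:

```python
import numpy as np

def frame_difference_detect(prev_frame: np.ndarray, curr_frame: np.ndarray,
                            threshold: int = 25) -> np.ndarray:
    """Return a binary mask of pixels that changed between two grayscale frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

def bounding_box(mask: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) of the changed region, or None if empty."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic example: a 10x10 bright square "moves" into view on a dark background.
prev = np.zeros((100, 100), dtype=np.uint8)
curr = prev.copy()
curr[40:50, 60:70] = 200  # the moving target appears here

mask = frame_difference_detect(prev, curr)
print(bounding_box(mask))  # → (60, 40, 69, 49)
```

In practice this simple differencing is sensitive to camera motion, which is one reason the embodiments prefer a trained detection model.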
Referring to FIG. 3, which is a schematic diagram of a result of performing target detection on the current frame image provided in an embodiment of the present disclosure: in an example, the current frame image T may be input into the target detection model for target detection to obtain a first detection result of at least one detected first target object 10. In this implementation, the first detection result includes three first target objects 10, and the first target objects are vehicles. In other implementations, the number of first target objects 10 may also be 2 or 4, and the type of the first target object 10 may also be another type (such as a pedestrian), which is not limited here.
In some implementations, the first detection result further includes at least one of detection frame information of the first target object 10 (as shown by A in FIG. 3), the type of the first target object 10, and the confidence of the detection result of the first target object 10.
In the embodiments of the present disclosure, since the first detection result further includes at least one of the detection frame information of the first target object, the type of the first target object and the confidence of the detection result of the first target object, the content output in the first detection result is richer, which provides more basis for the subsequent similarity calculation and can thus improve the accuracy of the similarity calculation.
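As an illustrative sketch of the richer first detection result described above (detection frame information, type and confidence), one possible data structure is shown below; the field names, coordinate convention and sample values are hypothetical and not taken from the disclosure:

```python
from dataclasses import dataclass

@dataclass
class DetectionResult:
    """One first detection result, as described in the embodiment:
    detection frame coordinates, object category, and detection confidence."""
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels
    category: str       # e.g. "vehicle" or "pedestrian"
    confidence: float   # detector score in [0, 1]

# Three detected vehicles, matching the example of FIG. 3 (values are made up).
detections = [
    DetectionResult((60, 40, 120, 90), "vehicle", 0.97),
    DetectionResult((200, 50, 260, 100), "vehicle", 0.91),
    DetectionResult((330, 45, 395, 95), "vehicle", 0.88),
]
print(len(detections), detections[0].category)  # → 3 vehicle
```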
It should be understood that before target detection is performed on the current frame image, the video to be detected needs to be acquired, and then the current frame image is acquired from the video to be detected according to a preset time interval or frame number interval. The video to be detected is a video or a sequence of video frames to be detected; for example, it may be a video or a video stream of a certain number of frames.
In some implementations, after acquiring the video to be detected, the target detection model may acquire multiple frames of images to be detected from the video at intervals. For example, the video to be detected includes M frames of images to be detected, and the target detection model acquires at least one frame of the image to be detected from the M frames according to a preset time interval or every N frames.
In the embodiments of the present disclosure, after acquiring the video to be detected, the electronic device acquires multiple frames of images to be detected from the video at intervals, which can improve the processing speed of multi-target tracking in the video to be detected and increase the number of channels of videos to be detected that can be processed.
It should be understood that the frame rate of the video to be detected is generally more than 25 frames per second. If an electronic device (such as a server) performs detection on every frame of the image to be detected, the amount of computation is too large, which overloads the server and affects the processing speed of multi-target tracking and the number of channels of videos to be detected that can be accessed. In this embodiment, after acquiring the video to be detected, the electronic device acquires multiple frames of images to be detected from the video at intervals, which can improve the processing speed of multi-target tracking in the video to be detected and increase the number of channels of videos to be detected that can be processed.
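The interval-based frame sampling described above can be sketched as follows; the helper name and the example interval are illustrative assumptions, not part of the disclosure:

```python
def sample_frames(num_frames: int, frame_interval: int):
    """Return the indices of frames selected from a video,
    taking one frame every `frame_interval` frames."""
    return list(range(0, num_frames, frame_interval))

# A 25 fps stream processed every 5th frame reduces the per-second
# detection workload from 25 images to 5 images.
print(sample_frames(25, 5))  # → [0, 5, 10, 15, 20]
```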
Step S102: extract an appearance feature vector of the first target object.
In an example, the image to be detected (the current frame image) obtained above may be input into a pre-trained re-identification model to extract appearance features of the first target object, obtaining an appearance feature vector. In this implementation, the extracted appearance features are all 64-dimensional vectors. The training method of the re-identification model will be described in detail later. Of course, other methods may also be used to extract the appearance feature vector of the first target object, which is not limited here.
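The appearance feature extraction step can be sketched as follows. The embodiment uses a trained re-identification model; here a fixed random projection stands in for that model purely to illustrate producing a 64-dimensional, L2-normalised vector per detected target, so the function names and the projection itself are assumptions:

```python
import numpy as np

EMBED_DIM = 64  # the embodiment above extracts 64-dimensional appearance features

def crop_detection(image: np.ndarray, box: tuple) -> np.ndarray:
    """Crop a detected target from the frame; box = (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]

def embed_appearance(crop: np.ndarray) -> np.ndarray:
    """Stand-in for the trained re-identification model: map a crop to a
    64-dimensional, L2-normalised appearance feature vector. A fixed random
    projection of the flattened crop is used purely as a placeholder."""
    rng = np.random.default_rng(0)  # fixed seed, so the "weights" are shared across calls
    flat = np.resize(crop.astype(np.float32) / 255.0, 256)  # flatten/pad to 256 values
    w = rng.standard_normal((EMBED_DIM, 256))
    vec = w @ flat
    return vec / (np.linalg.norm(vec) + 1e-12)

frame = np.random.default_rng(1).integers(0, 255, (100, 100), dtype=np.uint8)
feature = embed_appearance(crop_detection(frame, (60, 40, 70, 50)))
print(feature.shape)  # → (64,)
```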
Step S103: Calculate the similarity between the appearance feature vector of the first target object and the appearance feature vector of each target object detected in at least one frame image preceding the current frame image.
In an example, the three first target objects 10 in FIG. 3 are used for illustration. To show the similarity calculation process clearly, in this embodiment, the three first target objects 10 detected in the current frame image (also called the subsequent frame image in the table) are assigned the codes D0, D1 and D2, respectively. For ease of description, one frame (also called the previous frame image in the table) is taken from the multiple frames preceding the current frame as an example for the similarity calculation, and the target objects in that previous frame image are assigned the codes Q0, Q1 and Q2, respectively. The similarity calculation results are shown in Table 1 below:
Table 1
Figure PCTCN2022075415-appb-000001
As can be seen from the table above, after the similarity calculation, the first target object D0 in the current frame image has the highest similarity with the target object Q0 in the previous frame image, with a calculated similarity of 0.82; the first target object D1 in the current frame image has the highest similarity with the target object Q1 in the previous frame image, with a calculated similarity of 0.91; and the first target object D2 in the current frame image has the highest similarity with the target object Q2 in the previous frame image, with a calculated similarity of 0.92.
It should be understood that, for brevity, only two frames of images are used for illustration. In actual operation, to reduce the impact of abrupt environmental changes on the appearance features, a cache may be used to store the appearance features of the same object over the most recent multiple frames (for example, 100 frames), and the maximum similarity may be used as the calculation result, which improves the reliability of target tracking. That is, a caching strategy stores the appearance features of the same object over the most recent frames; this strategy reduces the probability of similarity jumps caused by abrupt appearance changes, thereby yielding better multi-target tracking results.
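A minimal sketch of this caching strategy follows. The cosine-similarity metric, the 2-dimensional toy vectors, and the class layout are assumptions for illustration; the disclosure does not fix a specific similarity measure, only that the maximum similarity over the cached recent frames is used.

```python
import collections
import math

def cosine_similarity(a, b):
    """Cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class FeatureCache:
    """Keep the appearance features of each tracked object over the most
    recent frames, and score a new detection by the MAXIMUM similarity
    against any cached feature of that object, so that one abrupt
    appearance change does not cause a similarity jump."""

    def __init__(self, max_frames=100):
        self._cache = collections.defaultdict(
            lambda: collections.deque(maxlen=max_frames))

    def add(self, track_id, feature):
        self._cache[track_id].append(feature)

    def max_similarity(self, track_id, feature):
        feats = self._cache[track_id]
        if not feats:
            return 0.0
        return max(cosine_similarity(f, feature) for f in feats)

cache = FeatureCache(max_frames=100)
cache.add("Q0", [1.0, 0.0])  # older cached appearance
cache.add("Q0", [0.6, 0.8])  # appearance after a lighting change
# The new detection matches the second cached feature exactly, so the
# maximum over the cache is taken rather than the (lower) first match.
score = cache.max_similarity("Q0", [0.6, 0.8])
```

The bounded `deque` automatically evicts features older than `max_frames`, matching the "most recent multiple frames" behavior described above.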
Step S104: Based on the similarity, determine a target tracking result for the first target object; the target tracking result is used to reflect the detection results of the first target object in the current frame image and in the at least one frame image.
In some implementations, referring to FIG. 4, which is a flowchart of a method for determining a target tracking result for the first target object provided by an embodiment of the present disclosure, determining the target tracking result for the first target object based on the similarity may include the following steps S1041 to S1042:
Step S1041: Based on the similarity, match the first detection result with the detection result of each target object in the at least one frame image, and determine the detection result of the first target object in the at least one frame image that matches the first target object.
In an example, matching, also called data association, is a typical processing method frequently used in multi-target tracking tasks to solve the matching problem between targets. In the embodiments of the present disclosure, the Hungarian algorithm may be used to match the first detection result with the detection result of each target object in the at least one frame image, which improves matching accuracy.
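The association step above can be sketched on the 3x3 example from Table 1. Note the hedges: for clarity the sketch enumerates all permutations, whereas the Hungarian algorithm (e.g. `scipy.optimize.linear_sum_assignment` on a negated similarity matrix) solves the same problem in polynomial time; the off-diagonal similarity values below are made up for illustration, only the diagonal follows Table 1.

```python
import itertools

def best_assignment(similarity):
    """Find the one-to-one assignment of detections to tracks that
    maximizes total similarity, by exhaustive enumeration.

    Illustration only: on large inputs the Hungarian algorithm solves
    this assignment problem in polynomial time instead.
    """
    n = len(similarity)
    best_perm, best_score = None, float("-inf")
    for perm in itertools.permutations(range(n)):
        score = sum(similarity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm  # best_perm[i] = track index matched to detection i

# Rows: detections D0..D2 in the current frame; columns: tracks Q0..Q2.
sim = [
    [0.82, 0.10, 0.05],  # D0
    [0.15, 0.91, 0.20],  # D1
    [0.05, 0.12, 0.92],  # D2
]
assignment = best_assignment(sim)  # expected: D0->Q0, D1->Q1, D2->Q2
```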
Step S1042: Determine the target tracking result for the first target object according to the detection result of the first target object in the at least one frame image and the first detection result.
In an example, referring to FIG. 5 and FIG. 6 together, in the embodiments of the present disclosure, because the appearance feature vector of the first target object 10 is extracted based on the pre-trained re-identification model, the extracted appearance features better represent the identity information of the first target object 10, wherein the identifier of the detection frame of one first target object 10 is id1, and the identifier of the detection frame of another first target object 10 is id2.
As can be seen in FIG. 5 and FIG. 6, even when occlusion occurs, the identity information of the different first target objects can still be recognized well. Using this better feature information can not only handle the track-reconnection problem when a first target object 10 reappears after being occluded, but also reduce the probability of tracking instability caused by vehicle jolting, thereby producing smoother multi-target tracking results and improving the stability of multi-target tracking.
In the embodiments of the present disclosure, by matching the first detection result one-to-one with the detection result of each target object in the at least one frame image, not only can the tracking result of the first target object be obtained, but the accuracy of determining the tracking result can also be improved.
In some implementations, referring to FIG. 7, which is a flowchart of a method for training a re-identification model provided by an embodiment of the present disclosure, the re-identification model mentioned above may be obtained by training as follows. Specifically, training the re-identification model may include the following steps S1021 to S1023:
Step S1021: Obtain an image sample set, where the image sample set includes a plurality of image samples and annotation information of the image samples, and the annotation information is used to indicate image samples corresponding to the same target object.
It should be understood that because most existing images are captured from the overhead perspective of surveillance cameras, the captured images do not match autonomous driving scenes. Therefore, to realize the re-identification capability of the re-identification model, in some implementations, the image samples are captured in an autonomous driving scene.
In the embodiments of the present disclosure, because the image samples are captured in a driving scene, the trained re-identification model can adapt well to driving scenes, which improves the recognition accuracy of the model and its adaptability in driving scenes.
Step S1022: Train the re-identification model to be trained based on the image sample set to obtain the re-identification model.
In the embodiments of the present disclosure, the re-identification model to be trained is trained based on the image sample set to obtain the re-identification model, and the re-identification model is used to extract the appearance feature vector of the first target object, which improves the extraction accuracy of the appearance feature vector of the first target object.
In some implementations, the base network to be trained may be determined according to specific requirements. In this embodiment, a depthwise separable convolutional network (such as the MobileNetV2 network) is selected as the backbone network; of course, a lightweight convolutional neural network such as the MobileNetV1 network may also be selected, which is not limited here. By adopting a lightweight convolutional neural network as the base training network, the recognition efficiency of the trained re-identification model can be improved, with better real-time performance.
In an example, after the base network is determined, the image samples in the image sample set may be separately input into the base network for feature extraction, and the base network is then trained according to the classification results and a loss function to obtain the re-identification model. The specific model training method is similar to existing model training methods.
It should be understood that, to improve the recognition accuracy of the re-identification model, the tracking results of a lidar may be used to build the image sample set. Referring to FIG. 8, multiple detection frames can be obtained through lidar tracking, such as point cloud detection frame 1, point cloud detection frame 2, point cloud detection frame 17, point cloud detection frame 18 and point cloud detection frame 19. However, as shown in FIG. 8, some point cloud detection frames (such as point cloud detection frame 19 or point cloud detection frame 17) may be occluded, and the lidar tracking frame may not fit the target object (such as a vehicle) closely, whereas the image samples in the image sample set should not contain occluded objects and require detection frames that fit the target object closely; otherwise, unnecessary noise will be introduced and the training results of the re-identification model will deteriorate.
In some implementations, referring to FIG. 9, the method for obtaining the image sample set includes the following steps S10211 to S10214:
Step S10211: Obtain a candidate image captured by a camera, and perform target detection on the candidate image to obtain a second detection result indicating at least one detected second target object.
Step S10212: Obtain point cloud data collected synchronously with the candidate image for the same scene, and detect the point cloud data to obtain at least one point cloud detection frame.
Step S10213: Determine the intersection over union (IOU) between the detection frame of the second target object in the second detection result and the point cloud detection frame.
Step S10214: In a case where the IOU between any point cloud detection frame and the detection frame of the second target object is greater than a preset threshold, determine the image sample based on the candidate image.
In the embodiments of the present disclosure, by combining the candidate image captured by the camera with the point cloud data collected by the lidar, that is, by using the second detection result obtained by detecting the candidate image to filter the lidar multi-target tracking results to obtain the image samples, the acquisition accuracy of the image samples can be improved, thereby improving the recognition accuracy of the trained re-identification model.
In an example, a candidate image captured by the camera may be obtained, point cloud data collected synchronously with the candidate image for the same scene may be obtained, and the point cloud data may be detected to obtain at least one point cloud detection frame; then the second detection result obtained by detecting the candidate image is used to filter the lidar multi-target tracking results. That is, after the second detection result is obtained, the IOU between the detection frame of the second target object in the second detection result and the point cloud detection frame is determined, and only candidate images whose IOU is greater than a preset threshold (such as 0.7) are retained.
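The IOU filtering in steps S10213 and S10214 can be sketched as follows. The sketch assumes the point cloud detection frames have already been projected into the image plane and that boxes use the (x1, y1, x2, y2) corner format; both are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def keep_candidate(camera_box, cloud_boxes, threshold=0.7):
    """Retain the candidate image only if some point cloud detection frame
    overlaps the camera detection frame with IOU above the threshold."""
    return any(iou(camera_box, b) > threshold for b in cloud_boxes)

# One point cloud frame coincides with the camera frame, so the
# candidate passes the 0.7 threshold and is kept.
keep = keep_candidate((0, 0, 10, 10), [(0, 0, 10, 10), (20, 20, 30, 30)])
```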
In addition, the position information of the detection frame of the second target object in the image sample may also be determined based on the position information of the point cloud detection frame; that is, the ID of the detection frame of the second target object is set to the ID of the lidar point cloud detection frame. Afterwards, the part corresponding to the detection frame of the second target object is cropped from the candidate image and classified according to the ID.
In an example, referring to FIG. 10, which is a schematic diagram of an image sample set provided by an embodiment of the present disclosure, the target object M corresponds to multiple image samples, but the image samples corresponding to the target object M share the same ID, 001, that is, the same annotation information. Similarly, the target object N corresponds to multiple image samples, and the image samples corresponding to the target object N share the same ID, 002, that is, the same annotation information. However, because the target object M and the target object N are different objects, the IDs of the image samples corresponding to the target object M differ from those corresponding to the target object N.
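The grouping of cropped samples by their inherited lidar track ID can be sketched as follows; the file names are hypothetical placeholders, and the IDs 001 and 002 follow the FIG. 10 example.

```python
from collections import defaultdict

def group_samples_by_id(crops):
    """Group cropped image samples by the track ID inherited from the
    lidar point cloud detection frame, so that all crops of one
    physical object share one label (annotation information)."""
    groups = defaultdict(list)
    for track_id, crop in crops:
        groups[track_id].append(crop)
    return dict(groups)

# Two crops of object M (ID 001) and one crop of object N (ID 002);
# the file names are made up for illustration.
dataset = group_samples_by_id([
    ("001", "M_frame12.png"),
    ("001", "M_frame37.png"),
    ("002", "N_frame12.png"),
])
```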
It should be understood that, in the method steps of the above implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
In some implementations, determining the image sample based on the candidate image includes: cropping, from the candidate image, the partial image corresponding to the detection frame of the second target object to obtain the image sample.
In the embodiments of the present disclosure, the image sample is obtained by cropping the partial image corresponding to the detection frame of the second target object from the candidate image, which reduces the unnecessary noise introduced into the image sample.
Based on the same technical concept, the embodiments of the present disclosure further provide a multi-target tracking apparatus corresponding to the multi-target tracking method. Because the problem-solving principle of the apparatus in the embodiments of the present disclosure is similar to that of the above multi-target tracking method, the implementation of the apparatus may refer to the implementation of the method.
Referring to FIG. 11, which is a schematic diagram of a multi-target tracking apparatus 500 provided by an embodiment of the present disclosure, the apparatus is applied to an electronic device and includes:
a target detection module 501, configured to perform target detection on a current frame image to obtain a first detection result of at least one detected first target object;
a feature extraction module 502, configured to extract an appearance feature vector of the first target object;
a similarity calculation module 503, configured to calculate the similarity between the appearance feature vector of the first target object and the appearance feature vector of each target object detected in at least one frame image preceding the current frame image; and
a tracking result determination module 504, configured to determine, based on the similarity, a target tracking result for the first target object, where the target tracking result is configured to reflect the detection results of the first target object in the current frame image and in the at least one frame image.
In a possible implementation, the tracking result determination module 504 is specifically configured to:
based on the similarity, match the first detection result with the detection result of each target object in the at least one frame image, and determine the detection result of the first target object in the at least one frame image that matches the first target object; and
determine the target tracking result for the first target object according to the detection result of the first target object in the at least one frame image and the first detection result.
In a possible implementation, the first detection result further includes at least one of detection frame information of the first target object, a category of the first target object, and a confidence of the detection result of the first target object.
In a possible implementation, referring to FIG. 12, the apparatus further includes a model training module 505, and the model training module 505 is configured to:
obtain an image sample set, where the image sample set includes a plurality of image samples and annotation information of the image samples, and the annotation information is configured to indicate image samples corresponding to the same target object; and
train the re-identification model to be trained based on the image sample set to obtain the re-identification model.
In a possible implementation, the image samples are captured in a driving scene.
In a possible implementation, the model training module 505 is specifically configured to:
obtain a candidate image captured by a camera, and perform target detection on the candidate image to obtain a second detection result indicating at least one detected second target object;
obtain point cloud data collected synchronously with the candidate image for the same scene, and detect the point cloud data to obtain at least one point cloud detection frame;
determine the intersection over union (IOU) between the detection frame of the second target object in the second detection result and the point cloud detection frame; and
in a case where the IOU between any point cloud detection frame and the detection frame of the second target object is greater than a preset threshold, determine the image sample based on the candidate image.
In a possible implementation, the model training module 505 is specifically configured to:
crop, from the candidate image, the partial image corresponding to the detection frame of the second target object to obtain the image sample.
In a possible implementation, the target detection module 501 is further configured to:
acquire the current frame image from the video to be detected at a preset time interval or frame number interval.
For descriptions of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the above method embodiments, and details are not repeated here.
Based on the same technical concept, the embodiments of the present disclosure further provide an electronic device. Referring to FIG. 13, which is a schematic structural diagram of an electronic device 700 provided by an embodiment of the present disclosure, the electronic device includes a processor 701, a memory 702 and a bus 703. The memory 702 is used to store execution instructions and includes an internal memory 7021 and an external memory 7022; the internal memory 7021, also called internal storage, is used to temporarily store operation data in the processor 701 and data exchanged with the external memory 7022 such as a hard disk, and the processor 701 exchanges data with the external memory 7022 through the internal memory 7021.
In the embodiments of the present disclosure, the memory 702 is specifically used to store the application program code for executing the solutions of the embodiments of the present disclosure, and execution is controlled by the processor 701. That is, when the electronic device 700 runs, the processor 701 communicates with the memory 702 through the bus 703, so that the processor 701 executes the application program code stored in the memory 702, thereby executing the method described in any of the foregoing embodiments.
The memory 702 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like.
The processor 701 may be an integrated circuit chip with signal processing capability. The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps and logical block diagrams disclosed in the embodiments of the present disclosure. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
It should be understood that the structure shown in the embodiments of the present disclosure does not constitute a specific limitation on the electronic device 700. In other embodiments of the present disclosure, the electronic device 700 may include more or fewer components than shown, or combine certain components, or split certain components, or have a different component arrangement. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored, where the computer program, when run by a processor, executes the steps of the multi-target tracking method in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure further provide a computer program product carrying program code, where the instructions included in the program code may be used to execute the steps of the multi-target tracking method in the above method embodiments; for details, refer to the above method embodiments.
In a possible implementation, the above computer program product may be specifically implemented by hardware, software, or a combination thereof. For example, the computer program product is embodied as a computer storage medium or a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system and apparatus described above, reference may be made to the corresponding processes in the foregoing method embodiments. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical function division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, apparatuses or units, and may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above embodiments are merely specific implementations of the present disclosure, used to illustrate rather than limit the technical solutions of the present disclosure, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some of the technical features therein; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
Embodiments of the present disclosure provide a multi-target tracking method and apparatus, an electronic device, and a storage medium. The multi-target tracking method includes: performing target detection on a current frame image to obtain a first detection result of at least one detected first target object; extracting an appearance feature vector of the first target object; calculating the similarity between the appearance feature vector of the first target object and the appearance feature vectors of target objects detected in at least one frame image preceding the current frame image; and determining, based on the similarity, a target tracking result for the first target object, where the target tracking result reflects the detection results of the first target object in the current frame image and the aforementioned frames. Embodiments of the present disclosure can improve the stability and precision of multi-target tracking.
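As a non-authoritative sketch (not part of the disclosure; `detect`, `extract`, and `match` are illustrative stand-ins for the detector, re-identification model, and similarity matcher), one frame of the summarized pipeline could look like:

```python
def track_frame(frame, detect, extract, match, tracks):
    """One step of the summarized pipeline: detect objects in the current
    frame, extract an appearance feature vector per detection, and
    associate each detection with earlier tracks by feature similarity.

    `tracks` maps track id -> last known appearance feature vector."""
    results = []
    for detection in detect(frame):           # first detection results
        feature = extract(frame, detection)   # appearance feature vector
        track_id = match(feature, tracks)     # similarity-based association
        if track_id is None:                  # no match: start a new track
            track_id = max(tracks, default=0) + 1
        tracks[track_id] = feature            # keep the newest feature
        results.append((track_id, detection))
    return results
```

Unmatched detections open new tracks, so the same function covers both continued and newly appearing objects.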

Claims (11)

  1. A multi-target tracking method, applied to an electronic device, comprising:
    performing target detection on a current frame image to obtain a first detection result of at least one detected first target object;
    extracting an appearance feature vector of the first target object;
    calculating the similarity between the appearance feature vector of the first target object and the appearance feature vectors of target objects detected in at least one frame image preceding the current frame image; and
    determining, based on the similarity, a target tracking result for the first target object, wherein the target tracking result reflects the detection results of the first target object in the current frame image and the at least one frame image.
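The claim leaves the similarity measure open; a common choice for comparing appearance feature vectors is cosine similarity. A minimal illustrative sketch (not from the disclosure):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two appearance feature vectors:
    1.0 for identical directions, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```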
  2. The method according to claim 1, wherein determining the target tracking result for the first target object based on the similarity comprises:
    matching, based on the similarity, the first detection result with the detection results of the target objects in the at least one frame image, and determining the detection result in the at least one frame image that matches the first target object; and
    determining the target tracking result for the first target object according to the detection result of the first target object in the at least one frame image and the first detection result.
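The matching step in claim 2 amounts to an assignment problem between current detections and detections from earlier frames. The disclosure does not specify the matcher; one simple greedy association over a precomputed similarity matrix (illustrative only) could be:

```python
def greedy_match(similarity_matrix, threshold=0.5):
    """Greedily associate current detections (rows) with detections from
    earlier frames (columns), highest similarity first.

    Returns a dict {row_index: column_index} for pairs whose similarity
    is at least `threshold`; each row and column is used at most once."""
    pairs = sorted(
        ((s, r, c)
         for r, row in enumerate(similarity_matrix)
         for c, s in enumerate(row)),
        reverse=True)
    matched, used_rows, used_cols = {}, set(), set()
    for s, r, c in pairs:
        if s < threshold or r in used_rows or c in used_cols:
            continue
        matched[r] = c
        used_rows.add(r)
        used_cols.add(c)
    return matched
```

An optimal alternative to the greedy pass would be the Hungarian algorithm, but the greedy version is enough to show the shape of the step.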
  3. The method according to claim 1 or 2, wherein the first detection result further includes at least one of: detection frame information of the first target object, a type of the first target object, and a confidence of the detection result of the first target object.
  4. The method according to any one of claims 1 to 3, wherein a re-identification model is used to extract the appearance feature vector of the first target object, the re-identification model being trained as follows:
    obtaining an image sample set, the image sample set including a plurality of image samples and annotation information of the image samples, the annotation information indicating image samples corresponding to a same target object; and
    training, based on the image sample set, a re-identification model to be trained, to obtain the re-identification model.
  5. The method according to claim 4, wherein the image samples are captured in a driving scene.
  6. The method according to claim 4 or 5, wherein the image samples are obtained according to the following steps:
    acquiring a candidate image collected by a camera, and performing target detection on the candidate image to obtain a second detection result indicating at least one detected second target object;
    acquiring point cloud data collected, synchronously with the candidate image, for the same scene, and detecting the point cloud data to obtain at least one point cloud detection frame;
    determining an intersection-over-union (IOU) between the detection frame of the second target object in the second detection result and the point cloud detection frame; and
    determining the image sample based on the candidate image in a case where the IOU between any point cloud detection frame and the detection frame of the second target object is greater than a preset threshold.
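For axis-aligned 2D boxes in (x1, y1, x2, y2) form, the intersection-over-union in claim 6 is conventionally computed as below (illustrative sketch, assuming the point cloud detection frame has already been projected onto the image plane):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    inter = max(0.0, inter_w) * max(0.0, inter_h)  # 0 when boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A candidate image would be kept whenever `iou(...)` for some pair of frames exceeds the preset threshold.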
  7. The method according to claim 6, wherein determining the image sample based on the candidate image comprises:
    cropping, from the candidate image, the partial image corresponding to the detection frame of the second target object, to obtain the image sample.
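For an image stored as rows of pixels and an integer (x1, y1, x2, y2) box, the cropping step of claim 7 reduces to slicing (illustrative sketch, not from the disclosure):

```python
def crop_detection(image, box):
    """Return the sub-image covered by a detection box (x1, y1, x2, y2),
    where `image` is a list of pixel rows indexed as image[y][x]."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]
```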
  8. The method according to any one of claims 1 to 7, wherein, before performing target detection on the current frame image based on a target detection model, the method further comprises:
    acquiring the current frame image from a video to be detected at a preset time interval or frame-number interval.
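Sampling by frame-number interval, as in claim 8, is plain stride slicing over the frame sequence (illustrative):

```python
def sample_frames(frames, step):
    """Keep every `step`-th frame of the video to be detected."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return frames[::step]
```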
  9. A multi-target tracking apparatus, applied to an electronic device, comprising:
    a target detection module, configured to perform target detection on a current frame image to obtain a first detection result of at least one detected first target object;
    a feature extraction module, configured to extract an appearance feature vector of the first target object;
    a similarity calculation module, configured to calculate the similarity between the appearance feature vector of the first target object and the appearance feature vectors of target objects detected in at least one frame image preceding the current frame image; and
    a tracking result determination module, configured to determine, based on the similarity, a target tracking result for the first target object, wherein the target tracking result reflects the detection results of the first target object in the current frame image and the at least one frame image.
  10. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory via the bus, and the machine-readable instructions, when executed by the processor, perform the multi-target tracking method according to any one of claims 1 to 8.
  11. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when run by a processor, performs the multi-target tracking method according to any one of claims 1 to 8.
PCT/CN2022/075415 2021-09-30 2022-02-07 Multi-target tracking method and apparatus, and electronic device, storage medium and program WO2023050678A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111165457.4A CN113822910A (en) 2021-09-30 2021-09-30 Multi-target tracking method and device, electronic equipment and storage medium
CN202111165457.4 2021-09-30

Publications (1)

Publication Number Publication Date
WO2023050678A1 true WO2023050678A1 (en) 2023-04-06

Family

ID=78919964

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075415 WO2023050678A1 (en) 2021-09-30 2022-02-07 Multi-target tracking method and apparatus, and electronic device, storage medium and program

Country Status (2)

Country Link
CN (1) CN113822910A (en)
WO (1) WO2023050678A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665177A (en) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium
CN116935446A (en) * 2023-09-12 2023-10-24 深圳须弥云图空间科技有限公司 Pedestrian re-recognition method and device, electronic equipment and storage medium
CN116958203A (en) * 2023-08-01 2023-10-27 北京知存科技有限公司 Image processing method and device, electronic equipment and storage medium
CN117557599A (en) * 2024-01-12 2024-02-13 上海仙工智能科技有限公司 3D moving object tracking method and system and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113822910A (en) * 2021-09-30 2021-12-21 上海商汤临港智能科技有限公司 Multi-target tracking method and device, electronic equipment and storage medium
CN114549584A (en) * 2022-01-28 2022-05-27 北京百度网讯科技有限公司 Information processing method and device, electronic equipment and storage medium
TWI790957B (en) * 2022-04-06 2023-01-21 淡江大學學校財團法人淡江大學 A high-speed data association method for multi-object tracking
CN114937265B (en) * 2022-07-25 2022-10-28 深圳市商汤科技有限公司 Point cloud detection method, model training method, device, equipment and storage medium
CN116052062B (en) * 2023-03-07 2023-06-16 深圳爱莫科技有限公司 Robust tobacco display image processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038423A (en) * 2017-04-20 2017-08-11 常州智行科技有限公司 Real-time vehicle detection and tracking method
CN109325967A (en) * 2018-09-14 2019-02-12 腾讯科技(深圳)有限公司 Method for tracking target, device, medium and equipment
CN109859245A (en) * 2019-01-22 2019-06-07 深圳大学 Multi-object tracking method and apparatus for video objects, and storage medium
CN113822910A (en) * 2021-09-30 2021-12-21 上海商汤临港智能科技有限公司 Multi-target tracking method and device, electronic equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107945198B (en) * 2016-10-13 2021-02-23 北京百度网讯科技有限公司 Method and device for marking point cloud data
CN111291714A (en) * 2020-02-27 2020-06-16 同济大学 Vehicle detection method based on monocular vision and laser radar fusion
CN111369590A (en) * 2020-02-27 2020-07-03 北京三快在线科技有限公司 Multi-target tracking method and device, storage medium and electronic equipment
CN112102364A (en) * 2020-09-22 2020-12-18 广州华多网络科技有限公司 Target tracking method and device, electronic equipment and storage medium
CN112561963A (en) * 2020-12-18 2021-03-26 北京百度网讯科技有限公司 Target tracking method and device, road side equipment and storage medium
CN113158909B (en) * 2021-04-25 2023-06-27 中国科学院自动化研究所 Behavior recognition light-weight method, system and equipment based on multi-target tracking
CN113449632B (en) * 2021-06-28 2023-04-07 重庆长安汽车股份有限公司 Vision and radar perception algorithm optimization method and system based on fusion perception and automobile

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665177A (en) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium
CN116665177B (en) * 2023-07-31 2023-10-13 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium
CN116958203A (en) * 2023-08-01 2023-10-27 北京知存科技有限公司 Image processing method and device, electronic equipment and storage medium
CN116935446A (en) * 2023-09-12 2023-10-24 深圳须弥云图空间科技有限公司 Pedestrian re-recognition method and device, electronic equipment and storage medium
CN116935446B (en) * 2023-09-12 2024-02-20 深圳须弥云图空间科技有限公司 Pedestrian re-recognition method and device, electronic equipment and storage medium
CN117557599A (en) * 2024-01-12 2024-02-13 上海仙工智能科技有限公司 3D moving object tracking method and system and storage medium
CN117557599B (en) * 2024-01-12 2024-04-09 上海仙工智能科技有限公司 3D moving object tracking method and system and storage medium

Also Published As

Publication number Publication date
CN113822910A (en) 2021-12-21

Similar Documents

Publication Publication Date Title
WO2023050678A1 (en) Multi-target tracking method and apparatus, and electronic device, storage medium and program
US10360436B2 (en) Object detection device
Son et al. Robust multi-lane detection and tracking using adaptive threshold and lane classification
CN110619658B (en) Object tracking method, object tracking device and electronic equipment
JP2018523877A (en) System and method for object tracking
CN112997190B (en) License plate recognition method and device and electronic equipment
CN112614187A (en) Loop detection method, device, terminal equipment and readable storage medium
CN115641359B (en) Method, device, electronic equipment and medium for determining movement track of object
WO2023273344A1 (en) Vehicle line crossing recognition method and apparatus, electronic device, and storage medium
CN114049383A (en) Multi-target tracking method and device and readable storage medium
Specker et al. Improving multi-target multi-camera tracking by track refinement and completion
CN117593685B (en) Method and device for constructing true value data and storage medium
CN113674317B (en) Vehicle tracking method and device for high-level video
Al Mamun et al. Efficient lane marking detection using deep learning technique with differential and cross-entropy loss.
JP2023539643A (en) Identification of critical scenarios for vehicle confirmation and validation
CN112163521A (en) Vehicle driving behavior identification method, device and equipment
CN115908498B (en) Multi-target tracking method and device based on category optimal matching
CN110543818A (en) Traffic light tracking method, device, medium and equipment based on weight graph matching
Wu et al. Camera-based clear path detection
US20230008015A1 (en) Sensor fusion architecture for low-latency accurate road user detection
Sadik et al. Vehicles detection and tracking in advanced & automated driving systems: Limitations and challenges
KR101936108B1 (en) Method and apparatus for detecting traffic sign
Tseng et al. Efficient vehicle counting based on time-spatial images by neural networks
KR102172849B1 (en) Detecting system for approaching vehicle in video and method thereof
CN117456407B (en) Multi-target image tracking method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22874081

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE