WO2022188663A1 - Target detection method and apparatus (Procédé et appareil de détection de cible) - Google Patents

Target detection method and apparatus (Procédé et appareil de détection de cible)

Info

Publication number
WO2022188663A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
point cloud
image
tracking trajectory
target tracking
Prior art date
Application number
PCT/CN2022/078611
Other languages
English (en)
Chinese (zh)
Inventor
吴家俊
梁振宝
周伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2022188663A1 publication Critical patent/WO2022188663A1/fr

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/003Reconstruction from projections, e.g. tomography
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Definitions

  • the embodiments of the present application relate to the field of intelligent driving, and in particular, to a target detection method and device.
  • Most current object detection methods are based on a single type of sensor, such as only relying on lidar to obtain point clouds or only relying on cameras to obtain images.
  • the point cloud can provide the three-dimensional information of the target and can better overcome the problem of mutual occlusion of the target, but the point cloud is relatively sparse, and the recognition rate of the target features is not high.
  • images have richer information, but images are greatly affected by lighting, weather, etc., and the reliability of detection and tracking is poor.
  • the image only has two-dimensional plane information, and information about occluded targets cannot be obtained, which easily causes targets to be lost or errors to occur.
  • Embodiments of the present application provide a target detection method and device, so as to improve the accuracy and real-time performance of target detection.
  • in a first aspect, an embodiment of the present application provides a target detection method. The method includes: acquiring a point cloud from a three-dimensional scanning device and an image from a vision sensor; inputting the point cloud, together with the three-dimensional space position of the predicted target of at least one target tracking trajectory in the point cloud, into a target detection model for processing to obtain the three-dimensional space position of at least one first target, wherein the target detection model is obtained by training based on multiple point cloud samples carrying the three-dimensional space positions of predicted targets corresponding to known target tracking trajectories, and on the three-dimensional space position detection results of multiple targets corresponding one-to-one to the multiple point cloud samples; determining the two-dimensional space position of at least one second target in the image according to the projection of the three-dimensional space position of the at least one first target in the image and the two-dimensional space position of the predicted target of the at least one target tracking trajectory in the image; and determining the three-dimensional space position of the at least one second target in the point cloud according to the projection of the two-dimensional space position of the at least one second target in the point cloud.
  • in the above method, a target tracking trajectory feedback mechanism is added: when performing target detection in the point cloud and the image, more attention is paid to the areas of the point cloud and the image where the predicted targets of the target tracking trajectories are located, which can effectively reduce missed detections and improve the accuracy of target detection.
  • in a possible design, the method further includes: matching the at least one target tracking trajectory with the at least one second target according to the target feature corresponding to the at least one target tracking trajectory and the target feature of the at least one second target; and associating the matched target tracking trajectory with the second target.
  • the target features include one or more of the following: position, size, speed, direction, category, number of point cloud points, numerical distribution of coordinates in each direction of point cloud, distribution of point cloud reflection intensity, appearance feature, depth features, etc.
  • the detected target can be associated with the existing target tracking trajectory based on the target feature, which is conducive to obtaining a complete target tracking trajectory and predicting the position where the target will appear at the next moment.
  • the method further includes: for the second target that is not matched to the target tracking trajectory, establishing a target tracking trajectory corresponding to the second target.
  • a new ID can be given to the target, and a target tracking trajectory corresponding to the target can be established, which is conducive to tracking all the targets that appear.
  • in a possible design, the method further includes: for a target tracking trajectory that is not matched to any second target, associating the target tracking trajectory with the predicted target of the target tracking trajectory in the point cloud and/or the image.
  • in the above design, associating an unmatched target tracking trajectory with its predicted target in the point cloud and/or the image helps avoid the situation where, due to a missed detection, the same target ends up corresponding to multiple target tracking trajectories, and improves the reliability of target tracking.
  • in a possible design, before associating the target tracking trajectory with its predicted target in the point cloud and/or the image, the method further includes: when the number of times the target tracking trajectory has been associated with the predicted target is greater than or equal to a first threshold, deleting the target tracking trajectory.
  • deleting target tracking trajectories whose corresponding target has repeatedly not been detected in the acquired point clouds and/or images helps save processing resources.
  • in a possible design, the method further includes: acquiring a calibration object point cloud from the three-dimensional scanning device and a calibration object image from the vision sensor; and determining the projection matrix between the point cloud coordinate system and the image coordinate system according to the three-dimensional coordinates of multiple calibration points of the calibration object in the calibration object point cloud and their two-dimensional coordinates in the calibration object image.
  • in the above design, the three-dimensional scanning device and the vision sensor can be jointly calibrated using the calibration object, and the projection matrix between the point cloud coordinate system and the image coordinate system (also called the pixel coordinate system) can be determined, which facilitates fusing the target detection results of the point cloud and the image and improves the accuracy of target detection.
  • in a second aspect, an embodiment of the present application provides a target detection apparatus. The apparatus has the function of implementing the method of the first aspect or any possible design of the first aspect, and the function can be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more units (modules) corresponding to the above functions, such as an acquisition unit and a processing unit.
  • in a third aspect, an embodiment of the present application provides a target detection apparatus, including at least one processor and an interface, where the processor is configured to call and run a computer program from the interface, and when the processor executes the computer program, the method described in the first aspect or any possible design of the first aspect can be implemented.
  • an embodiment of the present application provides a terminal, where the terminal includes the device described in the second aspect above.
  • the terminal may be a vehicle-mounted device, a vehicle, a monitoring controller, an unmanned aerial vehicle, a robot, a roadside unit, or the like.
  • the terminal may also be a smart device that needs to perform target detection or tracking, such as smart home and smart manufacturing.
  • an embodiment of the present application provides a chip system, the chip system includes: a processor and an interface, the processor is configured to call and run a computer program from the interface, and when the processor executes the computer program , the method described in the first aspect or any possible design of the first aspect can be implemented.
  • an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for executing the method described in the first aspect or any possible design of the first aspect.
  • an embodiment of the present application further provides a computer program product, including a computer program or instructions, which, when executed, can implement the method described in the first aspect or any possible design of the first aspect.
  • FIG. 1 is a schematic diagram of a target detection system provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a target detection process provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an intelligent driving scenario provided by an embodiment of the present application.
  • FIG. 4 is one of the schematic diagrams of a target detection solution based on multi-sensor fusion provided by an embodiment of the present application
  • FIG. 5 is the second schematic diagram of the target detection solution based on multi-sensor fusion provided by the embodiment of the present application.
  • FIG. 6 is a third schematic diagram of a target detection solution based on multi-sensor fusion provided by an embodiment of the present application.
  • FIG. 7 is a schematic process diagram of a target detection method provided by an embodiment of the present application.
  • FIG. 8 is one of the schematic diagrams of the target detection apparatus provided by the embodiment of the present application.
  • FIG. 9 is a second schematic diagram of a target detection apparatus provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of a target detection system provided by an embodiment of this application, including a data preprocessing module, a joint calibration module, a point cloud detection module, an image region of interest acquisition module, a point cloud domain prediction module, an image domain prediction module, a prediction decision module, a data association module, and a trajectory management module.
  • the data preprocessing module is mainly used to filter point clouds, remove ground points, and perform distortion correction on images.
  • the joint calibration module is mainly used to jointly calibrate the 3D scanning device and the vision sensor based on the point cloud and image they obtain, and to obtain the projection matrix between the point cloud coordinate system and the image coordinate system.
  • the point cloud detection module is mainly used to input the point cloud obtained at the current moment, together with the fed-back result of target tracking trajectory management (such as the three-dimensional space position of the predicted target of at least one target tracking trajectory in the point cloud obtained at the current moment), into a trained target detection model (such as a deep neural network model) to obtain the target detection results.
  • the image region of interest acquisition module is mainly used to project the target detection results obtained from the point cloud into the image using the projection matrix, and to combine them with the fed-back result of target tracking trajectory management (such as the two-dimensional space position of the predicted target of at least one target tracking trajectory in the image obtained at the current moment) to obtain the regions of interest.
  • the prediction decision module is mainly used to back-project the target detection result of the image into the point cloud and compare it with the target detection result of the point cloud, so as to decide a more accurate target detection result.
  • the data association module is mainly used to associate and match the target detection result after the prediction decision with the target tracking trajectories.
  • the trajectory management module is mainly used to manage and update all target tracking trajectories according to the data association results.
  • the point cloud domain prediction module is mainly used to predict, based on the updated target tracking trajectories, the three-dimensional space position of the predicted target of each target tracking trajectory in the point cloud obtained at the next moment.
  • the image domain prediction module is mainly used to predict, based on the updated target tracking trajectories, the two-dimensional space position of the predicted target of each target tracking trajectory in the image obtained at the next moment.
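  • as a purely illustrative sketch of how these modules could cooperate in one processing cycle, the per-frame loop with trajectory feedback might look as follows; all module names and interfaces (preprocess, detect_in_point_cloud, get_image_roi, decide, associate, update_trajectories, predict_next) are assumptions introduced for illustration and are not the implementation of this application.

```python
# Hypothetical per-frame processing loop of the target detection system in FIG. 1.
# Every module interface used here is an assumption for illustration only.

def process_frame(point_cloud, image, trajectories, modules, projection_matrix):
    # Data preprocessing: filter the point cloud / undistort the image.
    point_cloud, image = modules.preprocess(point_cloud, image)

    # Point cloud detection with trajectory feedback: the predicted 3D positions
    # of the existing trajectories are fed into the detection model together
    # with the point cloud.
    predicted_3d = [t.predicted_3d_position for t in trajectories]
    first_targets_3d = modules.detect_in_point_cloud(point_cloud, predicted_3d)

    # Image region-of-interest acquisition: project the point cloud detections
    # into the image and combine them with the predicted 2D positions.
    predicted_2d = [t.predicted_2d_position for t in trajectories]
    second_targets_2d = modules.get_image_roi(
        first_targets_3d, predicted_2d, projection_matrix)

    # Prediction decision: back-project the image detections into the point
    # cloud and decide the final 3D detection result.
    second_targets_3d = modules.decide(
        second_targets_2d, first_targets_3d, projection_matrix)

    # Data association and trajectory management.
    matches = modules.associate(trajectories, second_targets_3d)
    trajectories = modules.update_trajectories(trajectories, matches)

    # Point cloud / image domain prediction for the next frame (the feedback).
    for t in trajectories:
        t.predicted_3d_position, t.predicted_2d_position = modules.predict_next(t)
    return second_targets_3d, trajectories
```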
  • the structure of the target detection system illustrated in the embodiments of the present application does not constitute a specific limitation on the target detection system.
  • the target detection system may include more or fewer modules than shown, some modules may be combined, some modules may be split, or the modules may be arranged differently.
  • the target detection solution provided in the embodiment of the present application can be applied to a terminal to which the target detection system shown in FIG. 1 is applied, and the terminal can be a vehicle-mounted device, a vehicle, a monitoring controller, an unmanned aerial vehicle, a robot, a roadside unit ( Road side unit, RSU) and other equipment, suitable for monitoring, intelligent driving, drone navigation, robot travel and other scenarios.
  • a terminal to which the target detection system shown in FIG. 1 is applied in an intelligent driving scenario is used as an example for description.
  • a terminal (such as vehicle A) can obtain point clouds and images of the surrounding environment through the three-dimensional scanning device(s) and visual sensor(s) installed on the terminal, so that vehicles in the surrounding environment (such as vehicle B and vehicle C), pedestrians, bicycles (not shown in the figure), trees (not shown in the figure) and other objects can be detected and tracked.
  • the target detection solutions based on multi-sensor fusion mainly include the following:
  • the first scheme uses a deep convolutional neural network to detect the three-dimensional spatial position of the target and extract point cloud features after obtaining the point cloud from the lidar; it then acquires an image from a monocular camera, projects the 3D boundary of the target detected from the point cloud into the image, and uses a deep convolutional neural network to extract image features of the projected area.
  • the bipartite graph matching relationship between the target and the target tracking trajectory is combined with the Kalman filter to estimate the state of the target tracking trajectory, so as to achieve the tracking of the target in the point cloud.
  • this scheme uses deep networks for feature extraction in both images and point clouds, which consumes more resources, has low computational efficiency and poor practicability; moreover, once there is a missed detection in the point cloud obtained from the lidar, the missed target cannot be recovered through the image, resulting in low accuracy.
  • the second scheme first uses deep learning algorithms to obtain target detection information in the collected images and point clouds.
  • specifically, this scheme applies a deep learning image target detection algorithm to the image to obtain the category of the two-dimensional (2D) detection frame, the pixel coordinates of its center point, and the length and width of the target in the image; it applies a deep learning point cloud target detection algorithm to the point cloud to obtain the category of the three-dimensional (3D) detection frame, the spatial coordinates of its center point, and the length, width and height of the target in the point cloud.
  • then, the Hungarian algorithm is used to optimally match the detection frames of targets in the images and point clouds obtained at adjacent moments to achieve target tracking, and target tracking trajectories are established for the image and the point cloud respectively.
  • this scheme uses deep learning algorithms for feature extraction in both images and point clouds, which consumes more resources and has poor real-time performance; in addition, there is no true tracking algorithm, and matching detection frames of adjacent moments only by distance is error-prone.
  • the third scheme, as shown in Figure 6, collects the point cloud of the target, filters the collected point cloud, and outputs the object point data after filtering out the ground points; it maps the obtained object point data to generate a range image and a reflection intensity image, and performs point cloud segmentation and clustering on the object point data according to the range image, the reflection intensity image and echo intensity information to obtain a plurality of point cloud regions; target point cloud regions of suspected targets are then screened out from the point cloud regions, feature extraction is performed on each target point cloud region, and the extracted feature vectors are used to classify and identify the targets, so as to obtain the first target detection result.
  • the purpose of this application is to provide a target detection solution.
  • in the embodiments of the present application, the target detection result in the point cloud is corrected by the target detection result in the image, and a target tracking trajectory feedback mechanism is used to reduce the missed detection rate and improve the accuracy and real-time performance of target detection.
  • Point cloud: the set of point data on the surface of an object scanned by the 3D scanning device can be called a point cloud.
  • a point cloud is a collection of vectors in a three-dimensional coordinate system. These vectors are usually expressed in the form of x, y, z three-dimensional coordinates, and are generally used to represent the outer surface shape of an object. Not only that, in addition to the geometric position information represented by (x, y, z), the point cloud can also represent the RGB color, gray value, depth, intensity of the object's reflective surface, etc. of a point.
  • the point cloud coordinate system involved in the embodiments of the present application is the three-dimensional (x, y, z) coordinate system where the point cloud points in the point cloud are located.
  • Image coordinate system: also known as the pixel coordinate system, it is usually a two-dimensional coordinate system established with the upper left corner of the image as the origin, and its unit is the pixel.
  • the two coordinate axes of the image coordinate system are denoted u and v.
  • the coordinates of a point in the image coordinate system can be identified as (u, v).
  • Corner points are points with particularly prominent attributes in a certain aspect, and refer to representative and robust points in point clouds and images, such as the intersection of two sides.
  • Region of interest: in image processing, the area to be processed, outlined from the image being processed in the form of a box, circle, ellipse, irregular polygon, etc., is called the region of interest.
  • the region of interest may be considered as a region in an image where a target exists.
  • FIG. 7 is a schematic diagram of a target detection method provided by an embodiment of the present application, and the method includes:
  • S701 The terminal acquires the point cloud from the three-dimensional scanning device and the image from the vision sensor.
  • the three-dimensional scanning device can be a lidar, a millimeter-wave radar, a depth camera, etc.
  • the visual sensor can be a monocular camera, a multi-eye camera, and the like.
  • At least one three-dimensional scanning device and at least one visual sensor may be installed on the terminal, and the terminal may scan objects around the terminal (or in a certain direction, such as the direction of travel) through the three-dimensional scanning device, and collect The point cloud of objects around the terminal (or in a certain direction); it is also possible to scan the objects around the terminal (or in a certain direction) through the vision sensor, and collect images of the objects around the terminal (or in a certain direction).
  • the point cloud may be a collection of point cloud points, and the information of each point cloud point in the collection includes the three-dimensional coordinates (x, y, z) of the point cloud point.
  • the information of each point cloud point can also include information such as laser reflection intensity or millimeter wave reflection intensity.
  • when acquiring the point cloud from the 3D scanning device and the image from the vision sensor, the terminal can also obtain the acquisition times of the point cloud and the image from the 3D scanning device and the vision sensor, so that the point clouds and images obtained from the 3D scanning device and the vision sensor can be time-aligned according to their acquisition times, ensuring that each set of point cloud and image used for target detection has the same acquisition time.
  • the terminal may further perform data preprocessing operations on the point cloud and/or the image. For example, the terminal can filter the point cloud and remove the ground point cloud points to reduce the data volume of the point cloud and improve target detection efficiency; it can also correct the barrel distortion or pincushion distortion in the collected image based on the internal and external parameters of the vision sensor (usually provided by the vision sensor manufacturer).
  • the terminal can remove the point cloud points that meet the above conditions in the above point cloud according to the pre-given conditions that the point cloud points belonging to the ground should meet (for example, the z-coordinate of the point cloud point is less than a certain threshold), The point cloud points on the ground are filtered out, thereby reducing the data volume of the point cloud and improving the efficiency of target detection.
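  • a minimal sketch of this ground-point filtering step is shown below, assuming the point cloud is an N×4 array of (x, y, z, intensity) values and that ground points can be approximated by a simple height threshold; the threshold value used here is purely illustrative and not a value specified by this application.

```python
import numpy as np

def remove_ground_points(point_cloud: np.ndarray, z_threshold: float = -1.5) -> np.ndarray:
    """Drop point cloud points whose z coordinate is below a given threshold.

    point_cloud: (N, 4) array of (x, y, z, intensity) values; the threshold is
    an illustrative assumption, not a value given by this application.
    """
    keep = point_cloud[:, 2] >= z_threshold  # keep points at or above the threshold
    return point_cloud[keep]

# Example: a tiny synthetic cloud with two ground points and one object point.
cloud = np.array([[1.0, 0.0, -1.7, 0.2],
                  [2.0, 1.0, -1.6, 0.1],
                  [2.5, 1.0,  0.4, 0.9]])
print(remove_ground_points(cloud).shape)  # (1, 4)
```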
  • S702 The terminal inputs the point cloud and the three-dimensional space position of the target predicted in the point cloud and the at least one target tracking trajectory into the target detection model for processing, and obtains the three-dimensional space position of at least one first target.
  • the three-dimensional space position of the target includes information such as center point coordinates, length, width and height, which can also be called a three-dimensional detection box or a three-dimensional bounding box (3D BBox).
  • the target detection model is obtained by training based on multiple point cloud samples carrying the three-dimensional spatial positions of the predicted targets corresponding to known target tracking trajectories, and on the three-dimensional spatial position detection results of multiple targets corresponding one-to-one to the multiple point cloud samples.
  • a target tracking trajectory corresponds to a target, and the target tracking trajectory records information of the target, such as an identity document (ID), target features, existence time, the three-dimensional space position in each frame of point cloud in which the target exists, and the two-dimensional space position in each frame of image in which the target exists.
  • the target can be tracked in the point cloud by the Kalman algorithm or the like: according to the three-dimensional space position of the target in each frame of point cloud recorded in the target tracking trajectory corresponding to the target, the three-dimensional space position where the target will appear in the next frame of point cloud (that is, the point cloud collected at the next moment) can be predicted, which is the three-dimensional space position of the predicted target of the target tracking trajectory in the next frame of point cloud; similarly, the target can be tracked in the image: according to the two-dimensional space position of the target in each frame of image recorded in the target tracking trajectory corresponding to the target, the two-dimensional space position where the target will appear in the next frame of image (that is, the image collected at the next moment) can be predicted by the optical flow algorithm or the like, which is the two-dimensional space position of the predicted target of the target tracking trajectory in the next frame of image.
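  • as a minimal sketch of how a Kalman filter could predict the three-dimensional space position of a tracked target in the next frame of point cloud, the constant-velocity model below can be used; the state layout, noise values and time step are assumptions for illustration and are not specified by this application.

```python
import numpy as np

class ConstantVelocityKalman3D:
    """Tracks (x, y, z, vx, vy, vz); predicts where the target appears next."""

    def __init__(self, initial_position, dt=0.1):
        self.x = np.hstack([initial_position, np.zeros(3)])  # state vector
        self.P = np.eye(6)                                   # state covariance
        self.F = np.eye(6)                                   # transition matrix
        self.F[:3, 3:] = dt * np.eye(3)                      # position += velocity * dt
        self.Q = 0.01 * np.eye(6)                            # process noise (assumed)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])    # only position is observed
        self.R = 0.1 * np.eye(3)                             # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]  # predicted 3D position for the next frame

    def update(self, measured_position):
        y = measured_position - self.H @ self.x              # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)             # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

# Example: predict, update with a detected center point, then predict again.
kf = ConstantVelocityKalman3D(np.array([10.0, 2.0, 0.5]))
kf.predict()
kf.update(np.array([10.5, 2.0, 0.5]))
print(kf.predict())
```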
  • in the currently acquired point cloud, the probability that a target appears in the location area where the predicted target of an existing target tracking trajectory is located is significantly higher than the probability that it appears in other location areas of the point cloud.
  • the terminal can predict the three-dimensional space position of the target in the point cloud by processing the point cloud and at least one target tracking trajectory by the target detection model.
  • the target detection model can be obtained by a training device through training based on multiple point cloud samples, maintained in a sample set, that carry the three-dimensional spatial positions of predicted targets of known target tracking trajectories, and on the three-dimensional space position detection results of the targets corresponding one-to-one to the multiple point cloud samples.
  • in the training process, the training device can add, to each point cloud sample, a three-dimensional space position label vector (such as a label vector containing center point coordinates, length, width, height and other information) according to the three-dimensional space position of the real target corresponding to that point cloud sample; when a point cloud sample corresponds to multiple targets, multiple label vectors corresponding one-to-one to the multiple targets can be added to the point cloud sample; the spatial position label vectors can also exist in the form of a matrix.
  • during training, the training device can input a point cloud sample, together with the three-dimensional space positions of the predicted targets of the target tracking trajectory (or trajectories) corresponding to that point cloud sample, into the target detection model for processing, obtain the predicted values of the three-dimensional space positions of the target (or targets) output by the target detection model, calculate the loss of the target detection model through a loss function according to the output predicted values and the three-dimensional space position label vectors of the real targets corresponding to the point cloud sample, and adjust the parameters of the target detection model according to the loss.
  • the training process of the target detection model is thus the process of reducing the loss as much as possible.
  • the target detection model is continuously trained through the point cloud samples in the sample set. When the loss is reduced to a preset range, the trained target detection model can be obtained.
  • the target detection model may be a deep neural network or the like.
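  • a minimal training-loop sketch for such a model is given below; the toy network architecture, the way the feedback is encoded (the predicted 3D positions are simply concatenated as extra input features) and the plain L1 regression loss are all assumptions introduced for illustration, not the training method of this application.

```python
import torch
import torch.nn as nn

class ToyDetectionModel(nn.Module):
    """Illustrative regressor: point cloud features + predicted positions -> 3D box.

    The real target detection model may be any deep neural network; this toy
    model only shows how the trajectory-feedback input can be consumed.
    """
    def __init__(self, cloud_feat_dim=256, num_predicted=8):
        super().__init__()
        in_dim = cloud_feat_dim + num_predicted * 3  # cloud features + feedback
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 7))  # (cx, cy, cz, length, width, height, yaw)

    def forward(self, cloud_features, predicted_positions):
        x = torch.cat([cloud_features, predicted_positions.flatten(1)], dim=1)
        return self.mlp(x)

model = ToyDetectionModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

# Synthetic stand-ins for: point cloud sample features, predicted target
# positions of the known target tracking trajectories, and ground-truth labels.
cloud_features = torch.randn(4, 256)
predicted_positions = torch.randn(4, 8, 3)
labels = torch.randn(4, 7)

for step in range(100):  # reduce the loss as far as possible
    pred = model(cloud_features, predicted_positions)
    loss = loss_fn(pred, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```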
  • the point cloud samples in the training set can be obtained by pre-sampling, for example by pre-collecting point cloud samples through the terminal, recording the three-dimensional spatial positions of the predicted targets of the target tracking trajectory (or trajectories) in the collected point cloud samples, and at the same time annotating the three-dimensional spatial positions of the real targets existing in the point cloud samples.
  • the above training device can be a personal computer (PC), a notebook computer, a server, etc., or the terminal itself; if the training device and the terminal are not the same device, after the training device has completed training the target detection model, the trained target detection model can be imported into the terminal, so that the terminal can detect the first target in the acquired point cloud.
  • S703 The terminal determines the two-dimensional space position of at least one second target in the image according to the projection of the three-dimensional space position of the at least one first target in the image and the two-dimensional space position of the predicted target of the at least one target tracking trajectory in the image.
  • through the projection matrix between the point cloud coordinate system and the image coordinate system, a three-dimensional space position in the point cloud can be projected into the image to obtain a two-dimensional space position in the image; conversely, a two-dimensional space position in the image can be projected into the point cloud to obtain a three-dimensional space position in the point cloud.
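  • assuming the standard pinhole model, in which a homogeneous point cloud coordinate [x, y, z, 1]ᵀ maps to homogeneous pixel coordinates through the 3×4 projection matrix M (so that [u, v, 1]ᵀ is proportional to M·[x, y, z, 1]ᵀ), the projection of 3D positions into the image can be sketched as follows; the matrix values below are placeholders, not calibration results of this application.

```python
import numpy as np

def project_points(points_3d: np.ndarray, M: np.ndarray) -> np.ndarray:
    """Project (N, 3) point cloud coordinates into (N, 2) pixel coordinates.

    M is the 3x4 projection matrix from the point cloud coordinate system to
    the image coordinate system, as obtained from the joint calibration.
    """
    homogeneous = np.hstack([points_3d, np.ones((points_3d.shape[0], 1))])  # (N, 4)
    uvw = (M @ homogeneous.T).T                                             # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]  # divide by the scale factor to get (u, v)

# Placeholder projection matrix (illustrative values only).
M = np.array([[700.0, 0.0, 640.0, 0.0],
              [0.0, 700.0, 360.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
corners = np.array([[2.0, 1.0, 10.0], [2.5, 1.0, 10.0]])  # e.g. 3D box corners
print(project_points(corners, M))
```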
  • for the determination of the projection matrix, several calibration objects (such as a three-dimensional carton with multiple edges and corners) can be placed in advance in the common field of view of the 3D scanning device and the vision sensor, and a calibration object point cloud and a calibration object image are collected by the 3D scanning device and the vision sensor; multiple calibration points (such as the corners of the three-dimensional carton) are selected in the collected calibration object point cloud and calibration object image, and the three-dimensional coordinates of the multiple calibration points in the calibration object point cloud and their two-dimensional coordinates in the calibration object image are obtained; the projection matrix between the point cloud coordinate system and the image coordinate system can then be solved according to the three-dimensional coordinates of the multiple calibration points in the calibration object point cloud and their two-dimensional coordinates in the calibration object image.
  • K is the internal parameter matrix of the vision sensor; the internal parameter matrix of the vision sensor is fixed after the sensor leaves the factory and is usually provided by the manufacturer or obtained through a calibration algorithm.
  • [R, T] is the external parameter matrix of the vision sensor.
  • according to the three-dimensional coordinates of the calibration points in the calibration object point cloud and their two-dimensional coordinates in the calibration object image, the projection matrix M from the point cloud coordinate system to the image coordinate system can be solved.
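  • one common way to solve such a projection matrix from 3D–2D calibration point correspondences is the direct linear transform (DLT), sketched below; this is given only as an illustrative example and is not necessarily the solving method used in this application.

```python
import numpy as np

def solve_projection_matrix(points_3d: np.ndarray, points_2d: np.ndarray) -> np.ndarray:
    """Estimate the 3x4 projection matrix M from >= 6 point correspondences.

    points_3d: (N, 3) calibration point coordinates in the point cloud.
    points_2d: (N, 2) corresponding pixel coordinates in the image.
    """
    rows = []
    for (x, y, z), (u, v) in zip(points_3d, points_2d):
        rows.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z, -u])
        rows.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z, -v])
    A = np.asarray(rows)
    # The solution (up to scale) is the right singular vector with the smallest
    # singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

# Example: recover a known matrix from synthetic correspondences.
M_true = np.array([[700.0, 0, 640, 0], [0, 700, 360, 0], [0, 0, 1, 0]])
pts_3d = np.random.rand(8, 3) * 5 + np.array([0.0, 0.0, 5.0])
homog = np.hstack([pts_3d, np.ones((8, 1))])
uvw = (M_true @ homog.T).T
pts_2d = uvw[:, :2] / uvw[:, 2:3]
M_est = solve_projection_matrix(pts_3d, pts_2d)
print(np.allclose(M_est / M_est[2, 2], M_true, atol=1e-4))
```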
  • in some embodiments, when detecting the second target in the image, the terminal may also add feedback on the predicted targets of the target tracking trajectories: both the two-dimensional space positions obtained by projecting the at least one first target into the image and the two-dimensional space positions of the predicted targets of the at least one target tracking trajectory in the image are treated as targets, and these projected and predicted two-dimensional space positions are output as the two-dimensional space positions of the at least one second target.
  • S704 The terminal determines the three-dimensional space position of the at least one second target in the point cloud according to the projection of the two-dimensional space position of the at least one second target in the point cloud.
  • the terminal projects the two-dimensional spatial position of the at least one second target in the image into the point cloud to obtain the three-dimensional spatial position of the at least one second target in the point cloud, and obtains the final target detection result output of the point cloud.
  • the features of the second target may include target features in a three-dimensional space position in the point cloud and target features in a two-dimensional space position in the image.
  • the target features of the three-dimensional space position in the point cloud may include position (such as center point coordinates), size (such as length, width and height), speed, direction, category, number of point cloud points, coordinate value distribution in each direction of the point cloud, point cloud reflection intensity distribution (such as a point cloud reflection intensity histogram), depth features, etc.
  • the target features of the two-dimensional space position in the image may include position (such as center point coordinates), size (such as length and width), speed, direction, category, appearance features (such as an image color histogram or a histogram of oriented gradients), etc.
  • a target tracking trajectory corresponds to a target, and the target tracking trajectory records the information of the target, such as ID, target features, existence time, the three-dimensional space position in each frame of point cloud in which the target exists, and the two-dimensional space position in each frame of image in which the target exists. In order to keep tracking the same target, in some embodiments the terminal can match the at least one target tracking trajectory with the at least one second target according to the target feature corresponding to the existing at least one target tracking trajectory and the detected target feature of the at least one second target, and associate a second target matched to a target tracking trajectory with that target tracking trajectory, so as to extend the existing target tracking trajectory.
  • specifically, the matching degree (or similarity) between the target features of the at least one target tracking trajectory and the target features of the at least one second target can be used as a cost matrix, and the Hungarian algorithm can be used to perform globally optimal matching between the at least one target tracking trajectory and the at least one second target.
  • the Hungarian algorithm is a combinatorial optimization algorithm that solves the task assignment problem in polynomial time.
  • when calculating the similarity between the target features of a target tracking trajectory and the target features of a second target, the terminal may consider one or more of the following target features: position (in the point cloud and/or in the image), size (in the point cloud and/or in the image), speed (in the point cloud and/or in the image), direction (in the point cloud and/or in the image), category (in the point cloud and/or in the image), number of point cloud points, numerical distribution of coordinates in each direction of the point cloud, point cloud reflection intensity distribution, appearance features, depth features, etc.; when multiple target features are considered, different target features can be assigned different weights, and the weights sum to 1.
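  • a minimal sketch of this association step is shown below, using scipy's linear_sum_assignment as the Hungarian solver; the particular features, the similarity measure and the gating threshold are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_features, target_features, weights, min_similarity=0.3):
    """Match target tracking trajectories to detected second targets.

    track_features / target_features: (T, F) and (D, F) arrays of feature
    values (e.g. position, size, ...); weights: (F,) weights summing to 1.
    Returns matched (track_idx, target_idx) pairs, unmatched track indices,
    and unmatched target indices.
    """
    # Per-feature similarity in (0, 1], combined with the given weights.
    diff = np.abs(track_features[:, None, :] - target_features[None, :, :])
    similarity = (weights * (1.0 / (1.0 + diff))).sum(axis=2)  # (T, D)

    # Hungarian algorithm on the cost matrix (negated similarity).
    rows, cols = linear_sum_assignment(-similarity)
    matches = [(r, c) for r, c in zip(rows, cols) if similarity[r, c] >= min_similarity]

    matched_tracks = {r for r, _ in matches}
    matched_targets = {c for _, c in matches}
    unmatched_tracks = [t for t in range(len(track_features)) if t not in matched_tracks]
    unmatched_targets = [d for d in range(len(target_features)) if d not in matched_targets]
    return matches, unmatched_tracks, unmatched_targets

# Example: two trajectories and three detections described by (x, y, length).
tracks = np.array([[10.0, 2.0, 4.5], [30.0, -1.0, 1.8]])
dets = np.array([[10.4, 2.1, 4.4], [50.0, 9.0, 12.0], [29.5, -0.8, 1.9]])
w = np.array([0.4, 0.4, 0.2])
print(associate(tracks, dets, w))
```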
  • for a second target that is not matched to any target tracking trajectory, the terminal can assign a new target tracking trajectory ID to the target and create a new target tracking trajectory.
  • for a target tracking trajectory that is not matched to any second target, the terminal can associate the target tracking trajectory with the predicted target of the target tracking trajectory in the point cloud and/or the image, so as to extend the target tracking trajectory and avoid the situation where, due to a missed detection, the same target ends up corresponding to multiple target tracking trajectories.
  • in some embodiments, before associating a target tracking trajectory that is not matched to any second target with its predicted target in the point cloud and/or the image, if the number of times the target tracking trajectory has already been associated with the predicted target is greater than or equal to the first threshold, the terminal deletes the target tracking trajectory.
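  • the trajectory management described above could be sketched as follows; the Track structure, the counters and the concrete value of the first threshold are paraphrased assumptions for illustration, not the management scheme of this application.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count()  # simple generator of new target tracking trajectory IDs

@dataclass
class Track:
    track_id: int
    positions_3d: list = field(default_factory=list)  # 3D position per point cloud frame
    positions_2d: list = field(default_factory=list)  # 2D position per image frame
    miss_count: int = 0  # consecutive frames in which only the predicted target was used

def manage_tracks(tracks, matches, unmatched_track_idx, unmatched_target_idx,
                  targets_3d, targets_2d, predicted_3d, predicted_2d,
                  first_threshold=3):
    """Update all target tracking trajectories after data association (sketch)."""
    # 1. Matched trajectories are extended with the detected second targets.
    for t_idx, d_idx in matches:
        tracks[t_idx].positions_3d.append(targets_3d[d_idx])
        tracks[t_idx].positions_2d.append(targets_2d[d_idx])
        tracks[t_idx].miss_count = 0

    # 2. Trajectories not matched to any second target: delete them once they
    #    have relied on the predicted target too many times; otherwise associate
    #    them with their predicted target in the point cloud / image.
    deleted = set()
    for t_idx in unmatched_track_idx:
        if tracks[t_idx].miss_count >= first_threshold:
            deleted.add(t_idx)
            continue
        tracks[t_idx].positions_3d.append(predicted_3d[t_idx])
        tracks[t_idx].positions_2d.append(predicted_2d[t_idx])
        tracks[t_idx].miss_count += 1
    tracks = [t for i, t in enumerate(tracks) if i not in deleted]

    # 3. Second targets not matched to any trajectory start a new trajectory.
    for d_idx in unmatched_target_idx:
        tracks.append(Track(next(_next_id),
                            [targets_3d[d_idx]], [targets_2d[d_idx]]))
    return tracks
```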
  • the apparatus may include corresponding hardware structures and/or software modules for performing each function.
  • the present application can be implemented in hardware or a combination of hardware and computer software with the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • FIG. 8 shows a possible exemplary block diagram of the target detection apparatus involved in the embodiment of the present application, and the target detection apparatus 800 may exist in the form of a software module or a hardware module.
  • the target detection apparatus 800 may include: an acquisition unit 803 and a processing unit 802 .
  • the device may be a chip.
  • the apparatus 800 may further include a storage unit 801 for storing program codes and/or data of the apparatus 800 .
  • the acquiring unit 803 is configured to acquire the point cloud from the three-dimensional scanning device and the image from the vision sensor;
  • the processing unit 802 is configured to input the point cloud, together with the three-dimensional space position of the predicted target of at least one target tracking trajectory in the point cloud, into the target detection model for processing, and obtain the three-dimensional space position of at least one first target, wherein the target detection model is obtained by training based on multiple point cloud samples carrying the three-dimensional spatial positions of the predicted targets corresponding to known target tracking trajectories, and on the three-dimensional spatial position detection results of the multiple targets corresponding one-to-one to the multiple point cloud samples;
  • the processing unit 802 is further configured to determine the two-dimensional spatial position of at least one second target in the image according to the projection of the three-dimensional spatial position of the at least one first target in the image and the two-dimensional spatial position of the predicted target of the at least one target tracking trajectory in the image;
  • the processing unit 802 is further configured to determine the three-dimensional spatial position of the at least one second target in the point cloud according to the projection of the two-dimensional spatial position of the at least one second target in the point cloud.
  • the processing unit 802 is further configured to, according to the target feature corresponding to the at least one target tracking track and the target feature of the at least one second target, perform the tracking of the at least one target tracking track and the The at least one second target is matched; the matched target tracking trajectory is associated with the second target.
  • the processing unit 802 is further configured to establish a target tracking trajectory corresponding to the second target for the second target that is not matched to the target tracking trajectory.
  • the processing unit 802 is further configured to, for a target tracking trajectory that is not matched to any second target, associate the target tracking trajectory with the predicted target of the target tracking trajectory in the point cloud and/or the image.
  • the processing unit 802 is further configured to, before associating a target tracking trajectory that is not matched to any second target with its predicted target in the point cloud and/or the image, delete the target tracking trajectory when the number of times the target tracking trajectory has been associated with the predicted target is greater than or equal to a first threshold.
  • the target features include one or more of the following: position, length, width, height, speed, direction, category, number of point cloud points, coordinate value distribution in each direction of the point cloud, and point cloud reflection Intensity distribution, appearance features, depth features.
  • the acquiring unit 803 is further configured to acquire the calibration object point cloud from the three-dimensional scanning device and the calibration object image from the vision sensor;
  • the processing unit 802 is further configured to determine the projection matrix between the point cloud coordinate system and the image coordinate system according to the three-dimensional coordinates of multiple calibration points of the calibration object in the calibration object point cloud and their two-dimensional coordinates in the calibration object image.
  • an embodiment of the present application further provides a target detection apparatus 900 .
  • the target detection apparatus 900 includes at least one processor 902 and an interface circuit. Further, the apparatus further includes at least one memory 901 , and the at least one memory 901 is connected to the processor 902 .
  • the interface circuit is used to provide input and output of data and/or information for the at least one processor.
  • the memory 901 is used to store the computer-executed instructions.
  • the processor 902 executes the computer-executed instructions stored in the memory 901, so that the target detection device 900 can realize the above-mentioned target detection method.
  • a computer-readable storage medium on which a program or an instruction is stored, and when the program or instruction is executed, the target detection method in the above method embodiment can be executed.
  • a computer program product including an instruction is provided, and when the instruction is executed, the target detection method in the above method embodiment can be executed.
  • a chip is provided.
  • the chip can be coupled with a memory and is used to call a computer program product stored in the memory to implement the target detection method in the above method embodiments.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the field of intelligent driving. A target detection method and apparatus are described, which are used to improve the accuracy and real-time performance of target detection. The method comprises: acquiring a point cloud from a three-dimensional scanning device and an image from a vision sensor; inputting into a target detection model the point cloud and the three-dimensional spatial position of the predicted target of at least one target tracking trajectory in the point cloud, and processing them so as to obtain the three-dimensional spatial position of at least one first target; determining the two-dimensional spatial position of at least one second target in the image according to the projection of the three-dimensional spatial position of the at least one first target in the image and the two-dimensional spatial position of the predicted target of the at least one target tracking trajectory in the image; and determining the three-dimensional spatial position of the at least one second target in the point cloud according to the projection of the two-dimensional spatial position of the at least one second target in the point cloud.
PCT/CN2022/078611 2021-03-09 2022-03-01 Procédé et appareil de détection de cible WO2022188663A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110256851.2 2021-03-09
CN202110256851.2A CN115049700A (zh) 2021-03-09 2021-03-09 一种目标检测方法及装置

Publications (1)

Publication Number Publication Date
WO2022188663A1 true WO2022188663A1 (fr) 2022-09-15

Family

ID=83156444

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/078611 WO2022188663A1 (fr) 2021-03-09 2022-03-01 Procédé et appareil de détection de cible

Country Status (2)

Country Link
CN (1) CN115049700A (fr)
WO (1) WO2022188663A1 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830079A (zh) * 2023-02-15 2023-03-21 天翼交通科技有限公司 交通参与者的轨迹追踪方法、装置及介质
CN115965824A (zh) * 2023-03-01 2023-04-14 安徽蔚来智驾科技有限公司 点云数据标注方法、点云目标检测方法、设备及存储介质
CN116071231A (zh) * 2022-12-16 2023-05-05 群滨智造科技(苏州)有限公司 眼镜框的点油墨工艺轨迹的生成方法、装置、设备及介质
CN116430338A (zh) * 2023-03-20 2023-07-14 北京中科创益科技有限公司 一种移动目标的追踪方法、系统及设备
CN116952988A (zh) * 2023-09-21 2023-10-27 斯德拉马机械(太仓)有限公司 一种用于ecu产品的2d线扫检测方法及系统
CN117252992A (zh) * 2023-11-13 2023-12-19 整数智能信息技术(杭州)有限责任公司 基于时序数据的4d道路场景标注方法及装置、电子设备
CN117523379A (zh) * 2023-11-20 2024-02-06 广东海洋大学 基于ai的水下摄影目标定位方法及系统
CN117576166A (zh) * 2024-01-15 2024-02-20 浙江华是科技股份有限公司 基于相机和低帧率激光雷达的目标跟踪方法及系统

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664790B (zh) * 2023-07-26 2023-11-17 昆明人为峰科技有限公司 基于无人机测绘的三维地形分析系统及方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871129A (zh) * 2016-09-27 2018-04-03 北京百度网讯科技有限公司 用于处理点云数据的方法和装置
CN110675431A (zh) * 2019-10-08 2020-01-10 中国人民解放军军事科学院国防科技创新研究院 一种融合图像和激光点云的三维多目标跟踪方法
US20200160542A1 (en) * 2018-11-15 2020-05-21 Toyota Research Institute, Inc. Systems and methods for registering 3d data with 2d image data
CN111709923A (zh) * 2020-06-10 2020-09-25 中国第一汽车股份有限公司 一种三维物体检测方法、装置、计算机设备和存储介质
CN112102409A (zh) * 2020-09-21 2020-12-18 杭州海康威视数字技术股份有限公司 目标检测方法、装置、设备及存储介质
CN112270272A (zh) * 2020-10-31 2021-01-26 武汉中海庭数据技术有限公司 高精度地图制作中道路路口提取方法及系统
US20210043002A1 (en) * 2018-09-11 2021-02-11 Tencent Technology (Shenzhen) Company Limited Object annotation method and apparatus, movement control method and apparatus, device, and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871129A (zh) * 2016-09-27 2018-04-03 北京百度网讯科技有限公司 用于处理点云数据的方法和装置
US20210043002A1 (en) * 2018-09-11 2021-02-11 Tencent Technology (Shenzhen) Company Limited Object annotation method and apparatus, movement control method and apparatus, device, and storage medium
US20200160542A1 (en) * 2018-11-15 2020-05-21 Toyota Research Institute, Inc. Systems and methods for registering 3d data with 2d image data
CN110675431A (zh) * 2019-10-08 2020-01-10 中国人民解放军军事科学院国防科技创新研究院 一种融合图像和激光点云的三维多目标跟踪方法
CN111709923A (zh) * 2020-06-10 2020-09-25 中国第一汽车股份有限公司 一种三维物体检测方法、装置、计算机设备和存储介质
CN112102409A (zh) * 2020-09-21 2020-12-18 杭州海康威视数字技术股份有限公司 目标检测方法、装置、设备及存储介质
CN112270272A (zh) * 2020-10-31 2021-01-26 武汉中海庭数据技术有限公司 高精度地图制作中道路路口提取方法及系统

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071231A (zh) * 2022-12-16 2023-05-05 群滨智造科技(苏州)有限公司 眼镜框的点油墨工艺轨迹的生成方法、装置、设备及介质
CN116071231B (zh) * 2022-12-16 2023-12-29 群滨智造科技(苏州)有限公司 眼镜框的点油墨工艺轨迹的生成方法、装置、设备及介质
CN115830079A (zh) * 2023-02-15 2023-03-21 天翼交通科技有限公司 交通参与者的轨迹追踪方法、装置及介质
CN115965824A (zh) * 2023-03-01 2023-04-14 安徽蔚来智驾科技有限公司 点云数据标注方法、点云目标检测方法、设备及存储介质
CN116430338B (zh) * 2023-03-20 2024-05-10 北京中科创益科技有限公司 一种移动目标的追踪方法、系统及设备
CN116430338A (zh) * 2023-03-20 2023-07-14 北京中科创益科技有限公司 一种移动目标的追踪方法、系统及设备
CN116952988A (zh) * 2023-09-21 2023-10-27 斯德拉马机械(太仓)有限公司 一种用于ecu产品的2d线扫检测方法及系统
CN116952988B (zh) * 2023-09-21 2023-12-08 斯德拉马机械(太仓)有限公司 一种用于ecu产品的2d线扫检测方法及系统
CN117252992A (zh) * 2023-11-13 2023-12-19 整数智能信息技术(杭州)有限责任公司 基于时序数据的4d道路场景标注方法及装置、电子设备
CN117252992B (zh) * 2023-11-13 2024-02-23 整数智能信息技术(杭州)有限责任公司 基于时序数据的4d道路场景标注方法及装置、电子设备
CN117523379B (zh) * 2023-11-20 2024-04-30 广东海洋大学 基于ai的水下摄影目标定位方法及系统
CN117523379A (zh) * 2023-11-20 2024-02-06 广东海洋大学 基于ai的水下摄影目标定位方法及系统
CN117576166A (zh) * 2024-01-15 2024-02-20 浙江华是科技股份有限公司 基于相机和低帧率激光雷达的目标跟踪方法及系统
CN117576166B (zh) * 2024-01-15 2024-04-30 浙江华是科技股份有限公司 基于相机和低帧率激光雷达的目标跟踪方法及系统

Also Published As

Publication number Publication date
CN115049700A (zh) 2022-09-13

Similar Documents

Publication Publication Date Title
WO2022188663A1 (fr) Procédé et appareil de détection de cible
CN111337941B (zh) 一种基于稀疏激光雷达数据的动态障碍物追踪方法
WO2020043041A1 (fr) Procédé et dispositif de partitionnement de données en nuage de points, support de stockage, et dispositif électronique
CN114842438B (zh) 用于自动驾驶汽车的地形检测方法、系统及可读存储介质
CN111665842B (zh) 一种基于语义信息融合的室内slam建图方法及系统
CN111563415B (zh) 一种基于双目视觉的三维目标检测系统及方法
CN111260683A (zh) 一种三维点云数据的目标检测与跟踪方法及其装置
CN110674705B (zh) 基于多线激光雷达的小型障碍物检测方法及装置
CN113192091A (zh) 一种基于激光雷达与相机融合的远距离目标感知方法
CN113936198A (zh) 低线束激光雷达与相机融合方法、存储介质及装置
CN111213153A (zh) 目标物体运动状态检测方法、设备及存储介质
CN114454875A (zh) 一种基于强化学习的城市道路自动泊车方法及系统
CN115376109B (zh) 障碍物检测方法、障碍物检测装置以及存储介质
CN113345008A (zh) 一种考虑轮式机器人位姿估计的激光雷达动态障碍物检测方法
CN114972968A (zh) 基于多重神经网络的托盘识别和位姿估计方法
Choe et al. Fast point cloud segmentation for an intelligent vehicle using sweeping 2D laser scanners
CN116109601A (zh) 一种基于三维激光雷达点云的实时目标检测方法
CN115861968A (zh) 一种基于实时点云数据的动态障碍物剔除方法
CN114998276A (zh) 一种基于三维点云的机器人动态障碍物实时检测方法
CN115201849A (zh) 一种基于矢量地图的室内建图方法
CN115100741A (zh) 一种点云行人距离风险检测方法、系统、设备和介质
Zhao et al. Omni-Directional Obstacle Detection for Vehicles Based on Depth Camera
CN116385997A (zh) 一种车载障碍物精确感知方法、系统及存储介质
CN115685237A (zh) 视锥与几何约束相结合的多模态三维目标检测方法及系统
CN116863325A (zh) 一种用于多个目标检测的方法和相关产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22766189

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22766189

Country of ref document: EP

Kind code of ref document: A1