WO2021134285A1 - Image tracking processing method and apparatus, and computer device and storage medium - Google Patents

Info

Publication number
WO2021134285A1
WO2021134285A1 PCT/CN2019/130077 CN2019130077W WO2021134285A1 WO 2021134285 A1 WO2021134285 A1 WO 2021134285A1 CN 2019130077 W CN2019130077 W CN 2019130077W WO 2021134285 A1 WO2021134285 A1 WO 2021134285A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
standard
point cloud
cloud data
area
Prior art date
Application number
PCT/CN2019/130077
Other languages
French (fr)
Chinese (zh)
Inventor
许双杰
何明
叶茂盛
邹晓艺
吴伟
许家妙
曹通易
Original Assignee
深圳元戎启行科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳元戎启行科技有限公司
Priority to CN201980037486.7A (CN113490965A)
Priority to PCT/CN2019/130077 (WO2021134285A1)
Publication of WO2021134285A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments

Definitions

  • This application relates to an image tracking processing method and apparatus, a computer device, a storage medium, and a vehicle.
  • Visual tracking refers to the use of computer technology to extract, identify, and track targets to obtain information such as the location of the target for subsequent processing and analysis.
  • Visual tracking technology can be used in many application scenarios. For example, it can be applied to fields such as autonomous driving and assisted driving.
  • Visual tracking technology usually performs target tracking based on images captured by cameras and other devices.
  • The inventor realized that when target tracking is based on captured images, the tracking result is easily affected by image quality. Under the influence of factors such as changes in ambient lighting and target movement speed, image quality degrades, which in turn lowers the accuracy and robustness of the target tracking results.
  • An image tracking processing method is provided.
  • An image tracking processing method including:
  • the target tracking area corresponding to the point cloud data of the current frame is determined according to the candidate area tag.
  • An image tracking processing device including:
  • Point cloud acquisition module for acquiring point cloud data of the current frame
  • the preprocessing module is used to preprocess the point cloud data of the current frame to generate a projection image
  • the standard image acquisition module is used to acquire the standard area image corresponding to the standard frame point cloud data
  • The target tracking module is used to call the target tracking model and obtain the candidate area label corresponding to the candidate area based on the projection image and the standard area image, and to determine the target tracking area corresponding to the current frame point cloud data according to the candidate area label.
  • A computer device including a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
  • the target tracking area corresponding to the point cloud data of the current frame is determined according to the candidate area tag.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the target tracking area corresponding to the point cloud data of the current frame is determined according to the candidate area tag.
  • A vehicle configured to execute the steps of the above-mentioned image tracking processing method.
  • Fig. 1 is an application scene diagram of an image tracking processing method according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of an image tracking processing method according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of the step of obtaining a standard detection area corresponding to a standard frame image according to one or more embodiments.
  • Fig. 4 is a block diagram of an image tracking processing device according to one or more embodiments.
  • Fig. 5 is a block diagram of a computer device according to one or more embodiments.
  • the image tracking processing method provided in this application can be applied to a variety of application environments. For example, it can be applied to the application environment of automatic driving as shown in FIG. 1, and it can include a laser sensor 102 and a computer device 104.
  • the computer device 104 can communicate with the laser sensor 102 according to the connection established with the laser sensor 102.
  • a wired connection or a wireless connection can be established between the laser sensor 102 and the computer device 104.
  • The laser sensor 102 can collect multi-frame point cloud data of the surrounding environment.
  • The computer device 104 can acquire the current frame point cloud data collected by the laser sensor 102.
  • The computer device 104 can also acquire preset current frame point cloud data.
  • the computer device 104 preprocesses the point cloud data of the current frame, generates a projection image, and obtains a standard area image corresponding to the standard frame point cloud data.
  • the computer device 104 calls the target tracking model, and obtains the candidate region label corresponding to the candidate region based on the projection image and the standard region image.
  • the computer device 104 determines the target tracking area corresponding to the point cloud data of the current frame according to the candidate area tag.
  • the laser sensor 102 may be a laser sensor carried by an automatic driving device, and may specifically include a laser radar, a laser scanner, and the like.
  • an image tracking processing method is provided. Taking the method applied to the computer device 104 in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Obtain the point cloud data of the current frame.
  • The laser sensor may be mounted on a device capable of autonomous driving. For example, it can be carried by an unmanned vehicle, or by a vehicle that includes an autonomous driving model.
  • Laser sensors can be used to collect environmental data within the visual range. Specifically, the laser sensor can emit a detection signal, such as a laser beam. The laser sensor compares the signal reflected by the object in the environment with the detection signal to obtain the surrounding environment data.
  • the environmental data collected by the laser sensor may specifically be point cloud data. Point cloud data refers to a collection of point data corresponding to multiple points on the surface of the object in the scanning environment recorded in the form of points. Among them, multiple specifically may refer to two or more than two.
  • the laser sensor can collect according to a preset frequency to obtain multi-frame point cloud data. The preset frequency may be preset according to actual needs, for example, it may be specifically set to 50 frames per second.
  • the point cloud data may be three-dimensional point cloud data, and each frame of point cloud data may include point data corresponding to multiple points.
  • the point data may specifically include at least one of three-dimensional coordinates, laser reflection intensity, and color information corresponding to the point.
  • The three-dimensional coordinates may be the coordinates of the point in a Cartesian coordinate system, specifically the point's horizontal-axis (x), longitudinal-axis (y), and vertical-axis (z) coordinates.
  • The Cartesian coordinate system is a three-dimensional space coordinate system established with the location of the laser sensor as the origin.
  • The three-dimensional space coordinate system includes a horizontal axis (x axis), a longitudinal axis (y axis), and a vertical axis (z axis).
  • the three-dimensional space coordinate system established with the position of the laser sensor as the origin satisfies the right-hand rule.
  • Computer equipment can obtain point cloud data. Specifically, the computer device may obtain the collected point cloud data in real time every time the laser sensor collects one frame of point cloud data, or may obtain the collected multi-frame point cloud data after the laser sensor collects the multi-frame point cloud data.
  • the computer equipment can follow the time sequence of the point cloud data collected by the laser sensor, and perform target tracking based on the multi-frame point cloud data in turn.
  • the computer device may record the point cloud data that has started or is in the process of target tracking as the point cloud data of the current frame.
  • Targets can include living or non-living objects in the surrounding environment.
  • the target can be moving or stationary.
  • the target may specifically include at least one of pedestrians, roadblocks, vehicles, and buildings.
  • Following the order in which the point cloud data is collected, the point cloud data of the current frame can be recorded as the point cloud data of the previous frame, and the point cloud data of the next frame can then be obtained and recorded as the point cloud data of the current frame.
  • Step 204 Preprocessing the point cloud data of the current frame to generate a projection image.
  • the computer device may preprocess the acquired point cloud data of the current frame, and the preprocessing may include at least one of multiple processing methods. Specifically, the preprocessing performed by the computer device on the point cloud data of the current frame may specifically include at least one of processing methods such as data cleaning, point cloud segmentation, and point cloud projection.
  • the computer equipment generates a projection image from point cloud data with a large number of discrete point data, which effectively reduces the amount of data calculation and saves the computing resources of the computer equipment.
  • the method for the computer device to preprocess the point cloud data of the current frame may include point cloud projection.
  • the computer device may obtain point data corresponding to multiple points in the point cloud data of the current frame, and extract the three-dimensional coordinates corresponding to the points from the point data.
  • the computer device can project the points in the point cloud data of the current frame onto a plane according to the three-dimensional coordinates of the points, and record the image formed by the points projected on the plane as the projected image.
  • the generated projection image is a two-dimensional image.
  • The computer device can project the points in the point cloud data of the current frame onto the x-y plane spanned by the horizontal (x) axis and the longitudinal (y) axis to obtain a top view of the point cloud, and can record this top view as the projection image.
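  • The top-view projection described above can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `project_to_top_view`, the coordinate ranges, the cell size, and the choice of a binary occupancy image are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the sensor coordinate system


def project_to_top_view(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (-40.0, 40.0),  # assumed extent, in metres
    y_range: Tuple[float, float] = (-40.0, 40.0),  # assumed extent, in metres
    cell_size: float = 0.5,                        # assumed pixel size, in metres
) -> List[List[int]]:
    """Project points onto the x-y plane and return a 2D occupancy image."""
    width = int((x_range[1] - x_range[0]) / cell_size)
    height = int((y_range[1] - y_range[0]) / cell_size)
    image = [[0] * width for _ in range(height)]
    for x, y, _z in points:  # the z coordinate is dropped by the projection
        col = math.floor((x - x_range[0]) / cell_size)
        row = math.floor((y - y_range[0]) / cell_size)
        if 0 <= row < height and 0 <= col < width:
            image[row][col] = 1  # mark the pixel as occupied
    return image
```

  • Points that fall outside the assumed ranges are simply skipped; a real pipeline might instead store height or reflection intensity in each pixel.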
  • the way for the computer equipment to preprocess the point cloud data of the current frame may also include data cleaning and point cloud projection.
  • The computer device can clean the point cloud data of the current frame by removing abnormal point data from the point data it contains, thereby avoiding interference from abnormal point data on target tracking and ensuring the accuracy of the tracking results.
  • the computer device can perform point cloud projection according to the cleaned current frame point cloud data to obtain a projected image generated after projection.
  • the way for the computer equipment to preprocess the point cloud data of the current frame may also include point cloud segmentation and point cloud projection.
  • the computer device may divide the point cloud data of the current frame into multiple sub-point clouds according to the point data, and generate a segmentation threshold corresponding to the sub-point cloud based on the point data included in each sub-point cloud.
  • the computer device can segment the points in the corresponding sub-point cloud according to the segmentation threshold, and count the segmentation results corresponding to the multiple sub-point clouds to obtain the ground point set and the non-ground point set corresponding to the point cloud data of the current frame.
  • the computer equipment can project the points in the non-ground point set to generate a projected image.
  • the way for the computer device to preprocess the point cloud data of the current frame may also include data cleaning, point cloud segmentation, and point cloud projection.
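  • The point cloud segmentation step (split into sub-point clouds, derive a threshold per sub-point cloud, separate ground from non-ground points) might look like the sketch below. The text does not say how sub-point clouds are formed or how the threshold is computed, so the x-y cell partition, the "lowest z plus margin" rule, and the name `split_ground` are assumptions for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def split_ground(
    points: Sequence[Point],
    cell_size: float = 5.0,  # assumed side length of each sub-point-cloud cell
    margin: float = 0.2,     # assumed height tolerance above the lowest point
) -> Tuple[List[Point], List[Point]]:
    """Return (ground_points, non_ground_points)."""
    # Group points into sub-point clouds by the x-y cell they fall in.
    sub_clouds: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for p in points:
        key = (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))
        sub_clouds[key].append(p)

    ground: List[Point] = []
    non_ground: List[Point] = []
    for sub in sub_clouds.values():
        # Assumed threshold rule: lowest z in the sub-point cloud plus a margin.
        threshold = min(p[2] for p in sub) + margin
        for p in sub:
            (ground if p[2] <= threshold else non_ground).append(p)
    return ground, non_ground
```

  • Only the non-ground set would then be passed on to the projection step.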
  • Step 206 Obtain a standard area image corresponding to the standard frame point cloud data.
  • the standard frame point cloud data can be used as a reference basis for target tracking, and the computer device can perform target tracking on the current frame point cloud data based on the standard frame point cloud data.
  • the standard frame point cloud data can be one of a variety of point cloud data.
  • the standard frame point cloud data may be a frame of point cloud data determined by the user from multiple frames of point cloud data according to actual needs, or may be the first frame of point cloud data in the multiple frames of point cloud data collected by the laser sensor.
  • the computer equipment can obtain the standard area image corresponding to the standard frame point cloud data.
  • the standard frame point cloud data may correspond to one or more standard area images, and the standard area image refers to an image corresponding to the area where the target is located in the standard frame point cloud data.
  • the standard area image can be an image of various shapes. For example, the standard area image can be rectangular or circular.
  • the standard area image may be a part of the standard image corresponding to the standard frame point cloud data, and the standard image may be obtained after point cloud projection is performed according to the standard frame point cloud data.
  • the computer device can obtain the standard area image corresponding to the standard frame point cloud data in a variety of ways. Specifically, the computer device can detect the standard frame point cloud data to obtain a standard area image corresponding to the standard frame point cloud data.
  • the standard area image can also be preset by the user according to actual needs. For example, the computer device can receive the target to be tracked selected by the user in advance, and determine the standard area image corresponding to the target to be tracked.
  • the computer equipment can obtain the standard area image corresponding to the standard frame point cloud data.
  • Step 208 Call the target tracking model, and obtain the candidate area label corresponding to the candidate area based on the projection image and the standard area image.
  • the computer device can call the target tracking model, and perform tracking processing on the projected image according to the target tracking model to obtain the tracking area corresponding to the point cloud data of the current frame.
  • the target tracking model can be pre-configured in the computer device.
  • The target tracking model can be one of a variety of deep learning models. For example, it may be a convolutional neural network model, a deep belief network model, and so on.
  • the target tracking model may be obtained after training the deep learning model according to the point cloud image samples.
  • The computer device can input the projection image generated by preprocessing and the standard area image corresponding to the standard frame point cloud data into the target tracking model, and process the projection image and the standard area image through the target tracking model to obtain the candidate area labels corresponding to the candidate areas output by the target tracking model.
  • A candidate area refers to an area where the target may be located in the projected image, and it may specifically include the location, range, and shape of that area.
  • A candidate area label refers to the label corresponding to a candidate area, and each candidate area label is uniquely associated with its candidate area.
  • the candidate area label may include the area confidence or probability value of the candidate area belonging to the real area of the target.
  • Step 210 Determine the target tracking area corresponding to the point cloud data of the current frame according to the candidate area tag.
  • the computer device can obtain the candidate area tags corresponding to the multiple candidate areas, and determine the target tracking area corresponding to the point cloud data of the current frame according to the candidate area tags, so as to achieve target tracking.
  • the target tracking area refers to the location area of the target in the point cloud data of the current frame estimated through tracking processing, and the target tracking area may be a target frame corresponding to the target.
  • The computer device may use one of a variety of algorithms to determine the target tracking area. For example, the computer device can use a maximum-value algorithm to compare the multiple candidate area labels with each other and determine the candidate area whose label has the highest area confidence as the target tracking area corresponding to the point cloud data of the current frame.
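  • A minimal sketch of the maximum-value selection, assuming each candidate is represented as a pair of a rectangular area and its area confidence (the representation and the name `select_by_max_confidence` are hypothetical):

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def select_by_max_confidence(candidates: List[Tuple[Box, float]]) -> Box:
    """Return the candidate area whose label carries the highest confidence."""
    if not candidates:
        raise ValueError("no candidate areas to choose from")
    box, _confidence = max(candidates, key=lambda item: item[1])
    return box
```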
  • The computer device may also use a non-maximum suppression (NMS) algorithm to filter the candidate area labels.
  • Using the non-maximum suppression algorithm, the computer device may screen the multiple candidate areas according to their area confidence, removing the unselected candidate areas in each round until the screening ends.
  • the computer device can determine the candidate area corresponding to the selected candidate area label as the target tracking area corresponding to the point cloud data of the current frame, which effectively improves the accuracy of determining the target tracking area from multiple candidate areas.
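  • The screening by non-maximum suppression can be illustrated with the standard greedy form of the algorithm. The rectangular box format and the IoU threshold of 0.5 are assumptions for this sketch; the text does not specify them.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of the boxes kept, in order of decreasing confidence."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # highest remaining confidence
        keep.append(best)
        # Drop every remaining box that overlaps the kept one too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

  • Each round keeps the most confident remaining candidate and discards the candidates that overlap it, which matches the "remove unselected candidate areas until the screening ends" description.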
  • the computer device preprocesses the acquired point cloud data of the current frame, generates a projection image, and tracks the projection image.
  • the calculation amount of the computer equipment is effectively reduced, and the calculation resources of the computer equipment are saved.
  • The target tracking model is called to process the standard area image corresponding to the standard frame point cloud data together with the projection image, yielding the candidate area label corresponding to each candidate area; the target tracking area is then determined according to the candidate area labels, so that target tracking is realized based on the current frame point cloud data.
  • the point cloud data collected by the laser sensor is not easily affected by factors such as environmental lighting changes and target movement speed, which effectively improves the accuracy and robustness of target tracking.
  • The step of preprocessing the point cloud data of the current frame to generate a projection image includes: obtaining a target tracking task; obtaining a corresponding image plane according to the target tracking task; and projecting the points in the current frame point cloud data onto the image plane to obtain the projected image.
  • the computer equipment can acquire the target tracking task, and the target tracking task can be used to instruct the computer equipment and the laser sensor to track the target.
  • the target tracking task can be triggered according to the user's operating instructions, or it can be automatically generated by the computer equipment according to actual needs.
  • Target tracking tasks can carry tracking task types.
  • the tracking task type refers to the task type corresponding to the target tracking task, and the target tracking task can correspond to one of a variety of task types.
  • the tracking task type can be used to represent multiple tracking scenarios. In different tracking scenarios, the requirements for point cloud projection can be different, and the tracking task type of the target tracking task can also be different.
  • the computer device can obtain the image plane corresponding to the tracking task type according to the tracking task type. The image plane is used to project the point cloud data of the current frame to generate a projected image. In different tracking scenarios, the computer equipment can determine different planes as image planes.
  • For example, in one tracking scenario the computer device can determine the horizontal plane where the laser sensor is located, that is, the x-y plane formed by the horizontal (x) axis and the longitudinal (y) axis of the spatial coordinate system, as the image plane.
  • In this case, the vertical (z) coordinate in the three-dimensional coordinates of the points is not considered.
  • In another tracking scenario, the computer device can determine the vertical plane corresponding to the laser sensor, that is, the y-z plane formed by the longitudinal (y) axis and the vertical (z) axis of the spatial coordinate system, as the image plane, in which case the horizontal (x) coordinate in the three-dimensional coordinates of the points is not considered.
  • the computer device can project multiple points in the point cloud data of the current frame, project the multiple points into the image plane, and obtain multiple projection points in the image plane.
  • the computer device can record the images corresponding to multiple projection points in the image plane as the projected image, and the projected image is a two-dimensional image.
  • the computer device can track according to the generated projection image to obtain a two-dimensional target tracking area in the projection image.
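  • Choosing the image plane from the tracking task type can be sketched as a lookup followed by dropping the unused coordinate. The task type names `"top_view"` and `"side_view"` and the function `project_for_task` are hypothetical; the text only states that different task types map to different planes.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z)

# Hypothetical task types mapped to the indices of the two coordinates kept.
IMAGE_PLANES: Dict[str, Tuple[int, int]] = {
    "top_view": (0, 1),   # x-y plane: the z coordinate is ignored
    "side_view": (1, 2),  # y-z plane: the x coordinate is ignored
}


def project_for_task(points: Sequence[Point], task_type: str) -> List[Tuple[float, float]]:
    """Project 3D points onto the image plane associated with the task type."""
    first, second = IMAGE_PLANES[task_type]
    return [(p[first], p[second]) for p in points]
```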
  • the computer device may obtain multiple image planes, and respectively project the points in the point cloud data of the current frame to the multiple image planes to obtain multiple projection images.
  • the computer device can separately track multiple projection images to obtain target tracking areas corresponding to the multiple projection images. It can be understood that the target tracking area determined in the two-dimensional projection image is also two-dimensional.
  • The computer device can combine the target tracking areas corresponding to the multiple projection images to generate a three-dimensional target tracking area corresponding to the point cloud data of the current frame. This determines the position and size of the tracked target in three-dimensional space more accurately, which helps the computer device analyze and control automatic driving according to the three-dimensional target tracking area.
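  • The text does not say how the two-dimensional areas are combined. One possible approach, shown purely as an assumption, is to take the x extent from an x-y view, the z extent from a y-z view, and the overlap of the two y extents:

```python
from typing import Tuple

Box2D = Tuple[float, float, float, float]                # (a_min, b_min, a_max, b_max)
Box3D = Tuple[float, float, float, float, float, float]  # (x_min, y_min, z_min, x_max, y_max, z_max)


def combine_views(xy_box: Box2D, yz_box: Box2D) -> Box3D:
    """Merge an x-y tracking area and a y-z tracking area into one 3D box.

    xy_box is (x_min, y_min, x_max, y_max); yz_box is (y_min, z_min, y_max, z_max).
    """
    x_min, y_min_top, x_max, y_max_top = xy_box
    y_min_side, z_min, y_max_side, z_max = yz_box
    # The y axis is seen in both views, so keep only the range they agree on.
    y_min = max(y_min_top, y_min_side)
    y_max = min(y_max_top, y_max_side)
    if y_min > y_max:
        raise ValueError("the two views do not overlap along the y axis")
    return (x_min, y_min, z_min, x_max, y_max, z_max)
```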
  • the computer device can determine the corresponding image plane according to the target tracking task, and project the points in the point cloud data of the current frame to the image plane corresponding to the target tracking task to obtain the projected image.
  • the dimensionality reduction reduces the data volume of the point cloud data of the current frame.
  • The computer device performs target tracking according to the generated projection image and can make use of the image features in the projection image. Compared with the traditional approach of applying Kalman filtering to point cloud data, this effectively improves the accuracy of target tracking based on point cloud data.
  • The step of obtaining the standard area image corresponding to the standard frame point cloud data includes: generating a standard frame image according to the standard frame point cloud data; obtaining the standard detection area corresponding to the standard frame image; and cropping, from the standard frame image, the standard area image that matches the standard detection area.
  • Computer equipment can obtain standard frame point cloud data.
  • the standard frame point cloud data can be a frame of point cloud data where the user determines the target from the multi-frame point cloud data according to actual needs, or it can be the first frame of point cloud data in the multi-frame point cloud data collected by the laser sensor.
  • the computer device may use various methods to generate a standard frame image based on the standard frame point cloud data.
  • the computer device can project the points in the standard frame point cloud data, and determine the image obtained by the projection as the standard frame image.
  • the way that the computer device projects the standard frame image according to the standard frame point cloud data may be similar to the way of generating the projection image according to the current frame point cloud data in the above embodiment, so it will not be repeated here.
  • The computer device can also obtain the point data included in the standard frame point cloud data, encode the points according to the point data to obtain the point feature corresponding to each of the multiple points, and generate a feature map according to those point features.
  • The computer device can record the feature map generated from the standard frame point cloud data as the standard frame image.
  • the computer device can obtain the standard detection area corresponding to the standard frame image.
  • the standard detection area can be used to indicate the area where the target is located in the standard frame image, and it can be a part of the area range in the standard frame image.
  • the standard detection area can be detected by computer equipment based on standard frame point cloud data. Specifically, the computer device can perform target detection based on the standard frame point cloud data to obtain the standard detection area.
  • the computer device can also generate a standard frame image based on the standard frame point cloud data, and then perform target detection based on the standard frame image to obtain a standard detection area.
  • the standard detection area may specifically include the position, range, and area shape of the target in the standard frame point cloud data.
  • the computer device can obtain one standard detection area corresponding to the standard frame image, and can also obtain multiple corresponding standard detection areas.
  • the computer equipment can intercept the standard area image in the standard frame image according to the standard detection area corresponding to the standard frame image to obtain the standard area image corresponding to the standard detection area.
  • The standard area image may include the target to be tracked, and the cropped standard area image matches the size and shape of the standard detection area.
  • the computer device generates a standard frame image according to the standard frame point cloud data, acquires a standard detection area corresponding to the standard frame image, and intercepts a standard area image matching the standard detection area from the standard frame image.
  • the computer equipment can use the intercepted standard area image as the basis of target tracking, and perform target tracking on the projected image. By generating the image, the depth characteristics of the point cloud data are used, which effectively improves the accuracy of target tracking.
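  • Cropping the standard area image out of the standard frame image can be sketched as below, assuming a rectangular standard detection area given in pixel coordinates and an image stored as a list of rows (both assumptions; the text also allows other shapes such as circles).

```python
from typing import List, Tuple

Image = List[List[float]]
Region = Tuple[int, int, int, int]  # assumed: (row_min, col_min, row_max, col_max), max exclusive


def crop_region(image: Image, region: Region) -> Image:
    """Return the part of the image covered by the region, clipped to the image bounds."""
    row_min, col_min, row_max, col_max = region
    n_rows = len(image)
    n_cols = len(image[0]) if image else 0
    row_min, col_min = max(row_min, 0), max(col_min, 0)
    row_max, col_max = min(row_max, n_rows), min(col_max, n_cols)
    return [row[col_min:col_max] for row in image[row_min:row_max]]
```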
  • the step of obtaining the standard detection area corresponding to the standard frame image includes:
  • Step 302 Perform rasterization processing on the standard frame point cloud data to obtain multiple rasters.
  • Step 304 Extract point features corresponding to the standard frame point cloud data in the multiple rasters to generate a point feature matrix.
  • Step 306 Invoke the target detection model, and input the point feature matrix into the target detection model to obtain the point cloud detection area corresponding to the standard frame point cloud data.
  • Step 308 Determine a standard detection area corresponding to the standard frame image according to the point cloud detection area.
  • the computer equipment can detect the target according to the standard frame point cloud data, and obtain the standard detection area corresponding to the target.
  • the computer device may perform rasterization processing on the standard frame point cloud data, and divide the three-dimensional space corresponding to the standard frame point cloud data into multiple grids.
  • the computer device can determine the grid to which the point belongs according to the three-dimensional coordinates of the point in the standard frame point cloud data.
  • the computer equipment can count the point data corresponding to the points in each grid, perform feature extraction on the points in each grid, and obtain the point features corresponding to the points.
  • the computer device can call the feature extraction model to extract the point features in the grid.
  • the feature extraction model can be obtained after training through a large number of point cloud samples and point feature samples.
  • the feature extraction model can be one of a variety of neural network models.
  • the feature extraction model may be a convolutional neural network model, and specifically may be a PointNet model.
  • the computer device can input the point data in each grid to the feature extraction model, and calculate the point data through the feature extraction model to obtain the point features output by the feature extraction model.
  • the computer equipment can count the point features corresponding to multiple points in the grid to generate a point feature matrix.
  • the point feature matrix can be a three-dimensional matrix.
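  • Rasterization and the resulting three-dimensional point feature matrix can be illustrated with a simplified sketch. Two assumptions are made here: the space is divided only along x and y, and hand-computed statistics (point count, mean z, mean intensity) stand in for the learned point features that a feature extraction model such as PointNet would produce.

```python
import math
from typing import List, Sequence, Tuple

PointData = Tuple[float, float, float, float]  # (x, y, z, reflection intensity)


def build_point_feature_matrix(
    points: Sequence[PointData],
    x_range: Tuple[float, float],
    y_range: Tuple[float, float],
    cell_size: float,
) -> List[List[List[float]]]:
    """Return a rows x cols x 3 matrix of [count, mean z, mean intensity] per grid cell."""
    cols = int((x_range[1] - x_range[0]) / cell_size)
    rows = int((y_range[1] - y_range[0]) / cell_size)
    sums = [[[0, 0.0, 0.0] for _ in range(cols)] for _ in range(rows)]
    for x, y, z, intensity in points:
        # Determine the grid cell the point belongs to from its coordinates.
        col = math.floor((x - x_range[0]) / cell_size)
        row = math.floor((y - y_range[0]) / cell_size)
        if 0 <= row < rows and 0 <= col < cols:
            cell = sums[row][col]
            cell[0] += 1
            cell[1] += z
            cell[2] += intensity
    # Turn the accumulated sums into per-cell features (placeholder for learned features).
    return [
        [[float(n), sz / n if n else 0.0, si / n if n else 0.0] for n, sz, si in row_cells]
        for row_cells in sums
    ]
```

  • The resulting matrix has the shape a 2D detector expects as input, which is why a grid layout of this kind is convenient before calling the target detection model.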
  • the computer equipment can call the target detection model, and detect the target in the standard frame point cloud data through the target detection model.
  • the target detection model may be pre-trained and configured in the computer device.
  • The target detection model may be obtained by training a convolutional neural network (CNN) model, and may specifically be, for example, a YOLO model or a Mask R-CNN model.
  • the computer device can input the generated point feature matrix to the target detection model, and calculate the point feature matrix through the target detection model to obtain the detection area output by the target detection model.
  • the computer equipment can de-rasterize the detection area output by the target detection model to obtain the point cloud detection area corresponding to the standard frame point cloud data.
  • the computer device can determine the standard detection area corresponding to the standard frame image according to the point cloud detection area. Specifically, the computer device can project the point cloud detection area onto the image plane onto which the standard frame point cloud data was projected to generate the standard frame image, obtaining the standard detection area corresponding to the standard frame image.
  • when the computer device performs detection based on the standard frame image, the two-dimensional detection area corresponding to the standard frame image can be obtained.
  • the computer equipment can directly record the two-dimensional detection area corresponding to the standard frame image as the standard detection area corresponding to the standard frame image.
  • the computer device can call the target detection model to detect the point feature matrix corresponding to the standard frame point cloud data and obtain the standard detection area corresponding to the standard frame image, so that the computer device can track the target in the current frame point cloud data based on the standard detection area, effectively improving the accuracy of target tracking.
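As an illustrative sketch only, projecting a point cloud detection area onto the image plane could be done as below. The example assumes the image plane is the x-y plane (a top view), that the detection area is given as the eight corners of a box, and that one pixel covers `resolution` metres; the function name and all parameters are hypothetical.

```python
import numpy as np

def box3d_to_image_region(corners, resolution=0.5, origin=(0.0, 0.0)):
    """Project the 8 corners (8, 3) of a 3-D detection box onto the x-y plane
    and return the enclosing pixel rectangle (u_min, v_min, u_max, v_max)."""
    # Drop z, shift by the image origin, and convert metres to pixel indices.
    uv = np.floor((corners[:, :2] - np.asarray(origin)) / resolution).astype(int)
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    return int(u_min), int(v_min), int(u_max), int(v_max)

# Box spanning x in [1, 3], y in [2, 5], z in [0, 1.5].
corners = np.array([[x, y, z]
                    for x in (1.0, 3.0)
                    for y in (2.0, 5.0)
                    for z in (0.0, 1.5)])
region = box3d_to_image_region(corners)
```

The returned rectangle plays the role of the standard detection area, and the standard area image would be the crop of the standard frame image inside it.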
  • calling the target tracking model and obtaining the candidate area label corresponding to the candidate area based on the projected image and the standard area image includes: extracting the current image feature corresponding to the projected image and the standard image feature corresponding to the standard area image; inputting the current image features and standard image features into the target tracking model; and filtering the current image features and standard image features based on the target tracking model to obtain candidate region labels corresponding to multiple candidate regions output by the target tracking model.
  • the computer device can perform feature extraction on the projection image corresponding to the current frame point cloud data and the standard region image corresponding to the standard frame point cloud data to obtain the current image feature corresponding to the projection image and the standard image feature corresponding to the standard region image.
  • the computer device may extract the image features of the projected image and the standard area image sequentially in a single thread, or it may extract them in parallel in multiple threads.
  • the computer equipment can call the image feature model, perform feature extraction on the projection image and the standard area image, and obtain the current image features and standard image features output by the image feature model.
  • the image feature model may be a two-dimensional convolutional neural network model.
  • when the computer device extracts image features in parallel in multiple threads, it can obtain the twin (Siamese) network model corresponding to the image feature model and extract the features of the projected image and the standard area image in parallel.
  • the computer device can input the extracted current image features and standard image features into the target tracking model.
  • the target tracking model can be one of a variety of convolutional neural network models.
  • the target tracking model may specifically include a SiamMask model, a Siamese RPN (Region Proposal Network) model, etc.
  • the computer equipment can process the current image features and standard image features based on the target tracking model.
  • the target tracking model can perform convolution filtering on the current image feature and the standard image feature, and compare the current image feature with the standard image feature respectively to obtain the candidate region labels corresponding to the multiple candidate regions output by the target tracking model.
  • the avatar image of the candidate area corresponds to the standard area image.
  • when the standard frame point cloud data corresponds to multiple standard area images, the computer device can obtain the twin network models corresponding to the multiple target tracking models and perform operations on the standard image features corresponding to the multiple standard area images, to obtain candidate regions corresponding to the multiple standard region images.
  • the computer device calls the target tracking model to perform operations on the current image feature corresponding to the projected image and the standard image feature corresponding to the standard area image, obtaining candidate area labels corresponding to multiple candidate areas. This makes full use of the image features of the image corresponding to the point cloud data and determines the multiple candidate regions through a deep learning model. Compared with tracking by Kalman filtering on the point cloud, the accuracy of target tracking is effectively improved.
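As an illustrative sketch only, the comparison of the current image feature with the standard image feature can be pictured as a cross-correlation, which is the matching operation used by Siamese trackers such as Siamese RPN and SiamMask. The random features and the function name below are assumptions for the example; a real model would correlate learned feature maps and then pass the response through further layers to produce candidate regions and labels.

```python
import numpy as np

def cross_correlate(search, template):
    """Slide a (C, h, w) template feature over a (C, H, W) search feature and
    return the (H - h + 1, W - w + 1) response map."""
    _, H, W = search.shape
    _, h, w = template.shape
    out = np.zeros((H - h + 1, W - w + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Inner product between the template and the window at (i, j).
            out[i, j] = np.sum(search[:, i:i + h, j:j + w] * template)
    return out

rng = np.random.default_rng(0)
template = rng.normal(size=(2, 3, 3)).astype(np.float32)  # standard image feature
search = np.zeros((2, 8, 8), dtype=np.float32)            # current image feature
search[:, 4:7, 2:5] = template                            # plant the target at row 4, col 2
response = cross_correlate(search, template)
peak = np.unravel_index(np.argmax(response), response.shape)
```

The peak of the response map marks the window of the current image feature that best matches the standard image feature, which is where a candidate region would be proposed.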
  • the step of filtering the current image features and standard image features based on the target tracking model includes: obtaining a historical feature matrix; adjusting the standard image features according to the historical feature matrix; and performing filtering processing according to the adjusted image features.
  • before calling the target tracking model to perform operations on standard image features and current image features, the computer device can also obtain a historical feature matrix.
  • the historical feature matrix refers to a feature matrix generated by a computer device based on historical image features corresponding to historical target images in historical point cloud data.
  • the historical point cloud data may include the point cloud data including the target collected by the laser sensor before the point cloud data of the current frame.
  • the historical feature matrix may be generated from image features corresponding to the target in multiple frames of historical point cloud data, and the historical feature matrix and historical point cloud data may be stored in a memory corresponding to the computer device.
  • the computer device can record the current frame point cloud data as historical point cloud data after finishing target tracking on the current frame point cloud data.
  • the computer device can adjust the historical feature matrix according to the image characteristics of the target tracking area corresponding to the current frame point cloud data, and continuously adjust the historical feature matrix corresponding to the target, which effectively improves the accuracy and robustness of the historical feature matrix corresponding to the target.
  • the computer device can adjust the standard image feature according to the acquired historical feature matrix. Specifically, the computer device can perform convolution processing on the historical feature matrix and the standard image feature through the target tracking model to obtain the adjusted image feature. The computer device may perform convolution filtering according to the adjusted image feature and the current image feature to obtain candidate region labels corresponding to multiple candidate regions.
  • the computer device can obtain the historical feature matrix corresponding to the target to adjust the standard image features, and perform filtering processing according to the adjusted image features to obtain candidate region labels corresponding to multiple candidate regions.
  • the standard image features are adjusted through the historical feature matrix corresponding to the target.
  • the adjusted image features can more accurately reflect the characteristics of the target in the image. By drawing on multiple frames of historical point cloud data, the accuracy and robustness of target tracking are effectively improved.
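As an illustrative sketch only, the two roles of the historical feature matrix (being updated after each tracked frame, and adjusting the standard image feature before filtering) can be shown with simple arithmetic. The example deliberately substitutes an exponential moving average and a linear blend for the learned convolution described above; the function names, `momentum`, and `weight` are hypothetical.

```python
import numpy as np

def update_history(history, tracked_feature, momentum=0.9):
    """Fold the feature of the latest target tracking area into the history."""
    return momentum * history + (1.0 - momentum) * tracked_feature

def adjust_standard(standard_feature, history, weight=0.5):
    """Blend the standard feature with the history
    (a stand-in for the learned convolution in the tracking model)."""
    return (1.0 - weight) * standard_feature + weight * history

standard = np.ones((2, 3, 3))
history = standard.copy()
# Three tracked frames in which the target's feature has drifted to 2.0.
for _ in range(3):
    history = update_history(history, np.full((2, 3, 3), 2.0))
adjusted = adjust_standard(standard, history)
```

The adjusted feature moves gradually toward the target's recent appearance, which illustrates why using several historical frames can make matching less sensitive to any single frame.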
  • an image tracking processing device including: a point cloud acquisition module 402, a preprocessing module 404, a standard image acquisition module 406, and a target tracking module 408, wherein:
  • the point cloud acquisition module 402 is used to acquire the point cloud data of the current frame.
  • the preprocessing module 404 is used to preprocess the point cloud data of the current frame to generate a projection image.
  • the standard image acquisition module 406 is used to acquire the standard area image corresponding to the standard frame point cloud data.
  • the target tracking module 408 is used to call the target tracking model, obtain the candidate area label corresponding to the candidate area based on the projection image and the standard area image; determine the target tracking area corresponding to the point cloud data of the current frame according to the candidate area label.
  • the preprocessing module 404 is also used to obtain a target tracking task; obtain a corresponding image plane according to the target tracking task; project a point in the point cloud data of the current frame onto the image plane to obtain a projected image.
  • the above-mentioned standard image acquisition module 406 is further configured to generate a standard frame image according to the standard frame point cloud data; acquire the standard detection area corresponding to the standard frame image; and crop from the standard frame image the standard area image that matches the standard detection area.
  • the above-mentioned standard image acquisition module 406 is also used to rasterize the standard frame point cloud data to obtain multiple grids; extract point features corresponding to the standard frame point cloud data in the multiple grids to generate a point feature matrix; call the target detection model and input the point feature matrix to the target detection model to obtain the point cloud detection area corresponding to the standard frame point cloud data; and determine the standard detection area corresponding to the standard frame image according to the point cloud detection area.
  • the target tracking module 408 is also used to extract the current image features corresponding to the projected image and the standard image features corresponding to the standard area image; input the current image features and standard image features into the target tracking model; and filter the current image features and standard image features based on the target tracking model to obtain candidate region labels corresponding to multiple candidate regions output by the target tracking model.
  • the target tracking module 408 is also used to obtain a historical feature matrix; adjust the standard image features according to the historical feature matrix, and perform filtering processing according to the adjusted image features.
  • the candidate area label includes the area confidence, and the target tracking module 408 is also used to screen the multiple candidate areas according to the area confidence and determine the selected candidate area as the target tracking area corresponding to the point cloud data of the current frame.
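As an illustrative sketch only, screening candidate areas by area confidence might be written as follows. The `CandidateLabel` structure, the minimum-confidence threshold, and the choice of keeping the single highest-scoring area are assumptions for the example; the text above only states that the candidate areas are screened according to the area confidence.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass
class CandidateLabel:
    box: Tuple[int, int, int, int]  # (u_min, v_min, u_max, v_max) in pixels
    confidence: float               # area confidence output by the tracking model

def select_tracking_region(labels: Sequence[CandidateLabel],
                           min_confidence: float = 0.5) -> Optional[CandidateLabel]:
    """Drop low-confidence candidates, then keep the most confident one."""
    kept = [lab for lab in labels if lab.confidence >= min_confidence]
    return max(kept, key=lambda lab: lab.confidence, default=None)

labels = [
    CandidateLabel((2, 4, 6, 10), 0.62),
    CandidateLabel((3, 4, 7, 10), 0.91),
    CandidateLabel((9, 1, 12, 5), 0.18),
]
tracked = select_tracking_region(labels)
```

Returning `None` when no candidate passes the threshold lets the caller decide how to handle a frame in which the target is lost.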
  • Each module in the above-mentioned image tracking processing device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in, or be independent of, the processor in the computer equipment in the form of hardware, or can be stored in the memory of the computer equipment in the form of software, so that the processor can call them and execute the operations corresponding to each module.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 5.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store image tracking processing data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • when executed by the processor, the computer-readable instructions implement an image tracking processing method.
  • FIG. 5 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • a specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
  • a computer device is provided, including a memory and one or more processors; the memory stores computer-readable instructions which, when executed by the one or more processors, implement the steps in the above method embodiments.
  • one or more non-volatile computer-readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, the one or more processors implement the steps in the above method embodiments.
  • a vehicle is provided.
  • the vehicle may specifically include self-driving vehicles, electric vehicles, bicycles, and aircraft.
  • the vehicle includes the above-mentioned computer equipment and can execute the steps in the above-mentioned image tracking processing method embodiment.
  • the embodiments and implementation objects created by the present invention are not limited to autonomous vehicles, electric vehicles, bicycles, aircrafts, robots, etc., but also include simulation devices and test equipment related to these devices.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image tracking processing method, comprising: acquiring point cloud data of a current frame; pre-processing the point cloud data of the current frame to generate a projected image; acquiring a standard region image corresponding to point cloud data of a standard frame; calling a target tracking model, and acquiring a candidate region tag corresponding to a candidate region on the basis of the projected image and the standard region image; and determining a target tracking region corresponding to the point cloud data of the current frame according to the candidate region tag.

Description

Image tracking processing method, device, computer equipment and storage medium
Technical field
This application relates to an image tracking processing method, device, computer equipment, storage medium, and vehicle.
Background
Visual tracking refers to the use of computer technology to extract, identify, and track targets, obtaining information such as the target's location for subsequent processing and analysis. With the development of computer technology, visual tracking can be applied in many scenarios, for example in autonomous driving, assisted driving, and related fields.
Traditionally, visual tracking is performed on images captured by cameras and similar devices. However, the inventor realized that when target tracking is based on captured images, the tracking result is easily affected by image quality. Under the influence of factors such as changes in ambient lighting and the speed of target motion, image quality is low, which in turn lowers the accuracy and robustness of the target tracking result.
Summary of the invention
According to various embodiments disclosed in this application, an image tracking processing method, device, computer equipment, storage medium, and vehicle are provided.
An image tracking processing method, including:
acquiring point cloud data of a current frame;
preprocessing the point cloud data of the current frame to generate a projected image;
acquiring a standard area image corresponding to standard frame point cloud data;
calling a target tracking model, and obtaining a candidate area label corresponding to a candidate area based on the projected image and the standard area image; and
determining a target tracking area corresponding to the point cloud data of the current frame according to the candidate area label.
An image tracking processing device, including:
a point cloud acquisition module, used to acquire point cloud data of a current frame;
a preprocessing module, used to preprocess the point cloud data of the current frame to generate a projected image;
a standard image acquisition module, used to acquire a standard area image corresponding to standard frame point cloud data; and
a target tracking module, used to call a target tracking model, obtain a candidate area label corresponding to a candidate area based on the projected image and the standard area image, and determine a target tracking area corresponding to the point cloud data of the current frame according to the candidate area label.
A computer device, including a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring point cloud data of a current frame;
preprocessing the point cloud data of the current frame to generate a projected image;
acquiring a standard area image corresponding to standard frame point cloud data;
calling a target tracking model, and obtaining a candidate area label corresponding to a candidate area based on the projected image and the standard area image; and
determining a target tracking area corresponding to the point cloud data of the current frame according to the candidate area label.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring point cloud data of a current frame;
preprocessing the point cloud data of the current frame to generate a projected image;
acquiring a standard area image corresponding to standard frame point cloud data;
calling a target tracking model, and obtaining a candidate area label corresponding to a candidate area based on the projected image and the standard area image; and
determining a target tracking area corresponding to the point cloud data of the current frame according to the candidate area label.
A vehicle that executes the steps of the above image tracking processing method.
The details of one or more embodiments of this application are set forth in the drawings and description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings used in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is an application scenario diagram of the image tracking processing method according to one or more embodiments.
FIG. 2 is a schematic flowchart of the image tracking processing method according to one or more embodiments.
FIG. 3 is a schematic flowchart of the step of obtaining the standard detection area corresponding to the standard frame image according to one or more embodiments.
FIG. 4 is a block diagram of the image tracking processing device according to one or more embodiments.
FIG. 5 is a block diagram of the computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
The image tracking processing method provided in this application can be applied in a variety of application environments. For example, it can be applied in the autonomous driving environment shown in FIG. 1, which may include a laser sensor 102 and a computer device 104. The computer device 104 can communicate with the laser sensor 102 over a connection established between them, which may be wired or wireless. The laser sensor 102 can collect multiple frames of point cloud data of the surrounding environment. The computer device 104 can acquire the current frame point cloud data collected by the laser sensor 102, and can also acquire preset current frame point cloud data. The computer device 104 preprocesses the current frame point cloud data to generate a projected image, and acquires the standard area image corresponding to the standard frame point cloud data. The computer device 104 calls the target tracking model and obtains the candidate area label corresponding to a candidate area based on the projected image and the standard area image. The computer device 104 then determines the target tracking area corresponding to the current frame point cloud data according to the candidate area label. The laser sensor 102 may be a laser sensor mounted on an autonomous driving device, and may specifically include a lidar, a laser scanner, and the like.
In one of the embodiments, as shown in FIG. 2, an image tracking processing method is provided. Taking the application of the method to the computer device 104 in FIG. 1 as an example, the method includes the following steps:
Step 202: acquire the point cloud data of the current frame.
The laser sensor may be mounted on a device capable of autonomous driving, for example an unmanned vehicle or a vehicle that includes an autonomous driving model. The laser sensor can be used to collect environmental data within its field of view. Specifically, the laser sensor can emit a detection signal, such as a laser beam, and compare the signal reflected back by objects in the environment with the detection signal to obtain data about the surrounding environment. The environmental data collected by the laser sensor may specifically be point cloud data. Point cloud data records objects in the scanned environment in the form of points; it is the set of point data corresponding to multiple points on the surfaces of those objects, where "multiple" means two or more. The laser sensor can collect data at a preset frequency to obtain multiple frames of point cloud data. The preset frequency can be set in advance according to actual needs, for example 50 frames per second.
The point cloud data may be three-dimensional, and each frame of point cloud data may include point data for each of multiple points. The point data may specifically include at least one of the point's three-dimensional coordinates, laser reflection intensity, and color information. The three-dimensional coordinates may be the point's coordinates in a Cartesian coordinate system, namely its horizontal-axis, longitudinal-axis, and vertical-axis coordinates. The Cartesian coordinate system is a three-dimensional spatial coordinate system with the position of the laser sensor as its origin, comprising a horizontal axis (x axis), a longitudinal axis (y axis), and a vertical axis (z axis), and it satisfies the right-hand rule.
The computer device can acquire the point cloud data. Specifically, it can acquire each frame in real time as the laser sensor collects it, or acquire multiple frames after the laser sensor has collected them. The computer device can perform target tracking on the frames one by one, in the order in which the laser sensor collected them. The computer device may record the point cloud data on which target tracking is starting or in progress as the current frame point cloud data. A target may be a living or non-living object in the surrounding environment, and may be moving or stationary. For example, targets may include at least one of pedestrians, roadblocks, vehicles, and buildings. It can be understood that when the computer device finishes tracking the current frame and starts tracking the next frame, it can, following the collection order, record the current frame point cloud data as the previous frame point cloud data and record the newly acquired next frame as the current frame point cloud data.
Step 204: preprocess the point cloud data of the current frame to generate a projected image.
The computer device can preprocess the acquired current frame point cloud data, using at least one of several processing methods such as data cleaning, point cloud segmentation, and point cloud projection. By turning point cloud data that contains a large number of discrete points into a projected image, the computer device effectively reduces the amount of computation and saves computing resources.
For example, the preprocessing performed by the computer device on the current frame point cloud data may include point cloud projection. Specifically, the computer device can obtain the point data corresponding to each of the multiple points in the current frame point cloud data and extract each point's three-dimensional coordinates from the point data. Using these coordinates, the computer device can project the points of the current frame point cloud data onto a plane and record the image formed by the projected points as the projected image. The projected image is two-dimensional. For example, the computer device can project the points onto the x-y plane spanned by the horizontal and longitudinal axes to obtain a top view of the point cloud, and record that top view as the projected image.
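As an illustrative sketch only, the top-view projection described above can be written in a few lines. The coordinate ranges, the pixel resolution, the binary occupancy encoding, and the function name `project_to_bev` are assumptions for the example and are not specified by the application.

```python
import numpy as np

def project_to_bev(points, x_range=(0.0, 4.0), y_range=(0.0, 4.0), resolution=1.0):
    """Project points (N, 3) onto the x-y plane as a binary top-view image."""
    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)
    image = np.zeros((h, w), dtype=np.uint8)
    # Discard z and convert the x and y coordinates to pixel indices.
    u = np.floor((points[:, 0] - x_range[0]) / resolution).astype(int)
    v = np.floor((points[:, 1] - y_range[0]) / resolution).astype(int)
    ok = (u >= 0) & (u < h) & (v >= 0) & (v < w)
    image[u[ok], v[ok]] = 255  # mark every pixel that contains a point
    return image

points = np.array([[0.5, 1.5, 0.2], [3.2, 3.9, 1.0], [9.0, 9.0, 0.0]])
projected = project_to_bev(points)
```

Points outside the chosen ranges are simply dropped here; other encodings, such as storing the maximum height or the reflection intensity per pixel, would fit the same structure.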
The preprocessing may also include data cleaning followed by point cloud projection. Specifically, the computer device can clean the current frame point cloud data by removing abnormal point data from the point data it contains, which avoids interference from abnormal points during target tracking and helps ensure the accuracy of the tracking result. The computer device can then perform point cloud projection on the cleaned current frame point cloud data to obtain the projected image.
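As an illustrative sketch only, one simple form of data cleaning is shown below. The application does not define what counts as abnormal point data; the example assumes that points with non-finite coordinates and points farther than `max_range` from the sensor are abnormal, and both the criteria and the function name are hypothetical.

```python
import numpy as np

def clean_points(points, max_range=100.0):
    """Remove points with NaN or infinite values and points beyond max_range."""
    finite = np.isfinite(points).all(axis=1)
    points = points[finite]
    # Distance from the sensor, which sits at the origin of the coordinate system.
    in_range = np.linalg.norm(points[:, :3], axis=1) <= max_range
    return points[in_range]

raw = np.array([
    [1.0, 2.0, 0.5],
    [np.nan, 0.0, 0.0],   # invalid return
    [500.0, 0.0, 0.0],    # implausibly far away
    [-3.0, 4.0, 1.0],
])
cleaned = clean_points(raw)
```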
The manner in which the computer device preprocesses the current frame point cloud data may also include point cloud segmentation and point cloud projection. Specifically, the computer device may divide the current frame point cloud data into multiple sub-point clouds according to the point data, and generate a segmentation threshold for each sub-point cloud based on the point data it contains. The computer device may segment the points in each sub-point cloud according to the corresponding segmentation threshold, and aggregate the segmentation results of the multiple sub-point clouds to obtain the ground point set and the non-ground point set of the current frame point cloud data. The computer device may project the points in the non-ground point set to generate the projection image. Segmenting the point cloud data removes the interference of ground points with target tracking, which helps ensure the accuracy of the tracking result. In one of the embodiments, the manner in which the computer device preprocesses the current frame point cloud data may also include data cleaning, point cloud segmentation, and point cloud projection.
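The segmentation into ground and non-ground points can be sketched as below. The description does not specify how the sub-point clouds are formed or how the threshold is computed, so this sketch assumes square x-y cells as sub-point clouds and a threshold equal to the lowest z value in the cell plus a hypothetical `margin`.

```python
import math
from collections import defaultdict

def segment_ground(points, cell_size=1.0, margin=0.2):
    """Split points into (ground, non_ground) using a per-cell height threshold.

    Sub-point clouds are x-y cells of side cell_size (an assumption).
    The threshold of a cell is its minimum z plus margin (an assumption).
    """
    cells = defaultdict(list)
    for p in points:
        key = (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))
        cells[key].append(p)

    ground, non_ground = [], []
    for cell_points in cells.values():
        threshold = min(p[2] for p in cell_points) + margin
        for p in cell_points:
            if p[2] <= threshold:
                ground.append(p)
            else:
                non_ground.append(p)
    return ground, non_ground
```

Only the `non_ground` list would then be passed on to the projection step.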
Step 206: Obtain a standard area image corresponding to the standard frame point cloud data.
The standard frame point cloud data serves as the reference for target tracking: the computer device may perform target tracking on the current frame point cloud data based on the standard frame point cloud data. The standard frame point cloud data may be chosen in several ways. For example, it may be a frame of point cloud data that the user selects from multiple frames according to actual needs, or it may be the first frame among the multiple frames of point cloud data collected by the laser sensor.
The computer device may obtain the standard area image corresponding to the standard frame point cloud data. The standard frame point cloud data may correspond to one or more standard area images. A standard area image is the image corresponding to the area in which a target is located in the standard frame point cloud data. The standard area image may take various shapes; for example, it may be rectangular or circular. The standard area image may be a part of the standard image corresponding to the standard frame point cloud data, and the standard image may be obtained by performing point cloud projection on the standard frame point cloud data.
The computer device may obtain the standard area image corresponding to the standard frame point cloud data in various ways. Specifically, the computer device may perform detection on the standard frame point cloud data to obtain the corresponding standard area image. The standard area image may also be preset by the user according to actual needs. For example, the computer device may receive the target that the user has selected in advance for tracking, and determine the standard area image corresponding to that target. The computer device may then obtain the preset standard area image corresponding to the standard frame point cloud data.
Step 208: Call the target tracking model, and obtain the candidate area labels corresponding to the candidate areas based on the projection image and the standard area image.
The computer device may call the target tracking model and use it to perform tracking processing on the projection image, so as to obtain the tracking area corresponding to the current frame point cloud data. The target tracking model may be pre-configured in the computer device. The target tracking model may be one of a variety of deep learning models, for example one of a variety of convolutional neural network models, deep belief network models, and so on. The target tracking model may be obtained by training a deep learning model on point cloud image samples.
The computer device may input the projection image generated by preprocessing, together with the standard area image corresponding to the standard frame point cloud data, into the target tracking model. The target tracking model operates on the projection image and the standard area image, and the computer device obtains the candidate area labels, corresponding to the candidate areas, that the model outputs. A candidate area is an area of the projection image in which the target may be located; it may specifically include the possible position of the target and the extent and shape of the area. A candidate area label is the label attached to a candidate area, and each candidate area label is uniquely associated with one candidate area. A candidate area label may include an area confidence or probability value indicating how likely the candidate area is to be the real area of the target.
Step 210: Determine the target tracking area corresponding to the current frame point cloud data according to the candidate area labels.
The computer device may obtain the candidate area labels corresponding to multiple candidate areas, and determine the target tracking area corresponding to the current frame point cloud data according to these labels, thereby achieving target tracking. The target tracking area is the location area of the target in the current frame point cloud data as estimated by the tracking processing, and it may be the target box corresponding to the target. Specifically, the computer device may use one of a variety of algorithms to determine the target tracking area. For example, the computer device may use a maximum value algorithm: it compares the multiple candidate area labels with one another, and takes the candidate area whose label has the highest area confidence as the target tracking area corresponding to the current frame point cloud data.
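The maximum value selection described above amounts to an arg-max over the area confidences. A minimal sketch, assuming each candidate is represented as a hypothetical `(box, confidence)` pair:

```python
def select_target_region(candidates):
    """Return the box of the candidate with the highest confidence.

    candidates: list of (box, confidence) pairs; this representation
    is an assumption made for illustration. Returns None if empty.
    """
    if not candidates:
        return None
    best_box, _best_conf = max(candidates, key=lambda c: c[1])
    return best_box
```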
In one of the embodiments, the computer device may also use a non-maximum suppression (NMS) algorithm to filter the candidate area labels. Specifically, following the non-maximum suppression algorithm, the computer device may filter the multiple candidate areas over several rounds according to their area confidence, removing the unselected candidate areas in each round until the filtering ends. The computer device may take the candidate area corresponding to the remaining candidate area label as the target tracking area corresponding to the current frame point cloud data, which effectively improves the accuracy of determining the target tracking area from multiple candidate areas.
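A common greedy form of non-maximum suppression is sketched below for axis-aligned boxes. The box format `(x_min, y_min, x_max, y_max)`, the use of intersection-over-union as the overlap measure, and the `iou_threshold` default are assumptions; the description only states that filtering is performed by area confidence.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, in order of decreasing score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Remove remaining boxes that overlap the selected box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

When a single target is tracked, the first index returned identifies the candidate with the highest confidence.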
In this embodiment, the computer device preprocesses the acquired current frame point cloud data to generate a projection image, and performs tracking on the projection image. Converting the current frame point cloud data, which contains a large amount of discrete point data, into a projection image effectively reduces the amount of computation required of the computer device and saves its computing resources. The target tracking model is called to process the standard area image corresponding to the standard frame point cloud data together with the projection image, yielding the candidate area labels corresponding to the candidate areas; the target tracking area is then determined according to the candidate area labels, so that the target is tracked based on the current frame point cloud data. Compared with the traditional approach of tracking targets in camera images, the point cloud data collected by a laser sensor is less susceptible to factors such as changes in ambient lighting and the speed of target movement, which effectively improves the accuracy and robustness of target tracking.
In one of the embodiments, the step of preprocessing the current frame point cloud data to generate a projection image includes: obtaining a target tracking task; obtaining the corresponding image plane according to the target tracking task; and projecting the points in the current frame point cloud data onto the image plane to obtain the projection image.
The computer device may obtain a target tracking task, which may be used to instruct the computer device and the laser sensor to perform target tracking. The target tracking task may be triggered by a user's operating instruction, or generated automatically by the computer device according to actual needs. The target tracking task may carry a tracking task type. The tracking task type is the task type to which the target tracking task belongs, and a target tracking task may correspond to one of a variety of task types.
The tracking task type may be used to represent a variety of tracking scenarios. Different tracking scenarios may place different requirements on point cloud projection, and the tracking task type of the target tracking task may differ accordingly. The computer device may obtain the image plane corresponding to the tracking task type. The image plane is the plane onto which the current frame point cloud data is projected to generate the projection image. In different tracking scenarios, the computer device may select different planes as the image plane.
For example, when a vehicle equipped with the laser sensor is driving on a level road, the distribution of targets in the horizontal plane of the vehicle needs to be determined. The computer device may take the horizontal plane in which the laser sensor lies, that is, the x-y plane formed by the x-axis and the y-axis of the spatial coordinate system, as the image plane, disregarding the z coordinate of each point. When the vehicle equipped with the laser sensor is driving on an uphill or downhill route, the computer device may take the vertical plane corresponding to the laser sensor, that is, the y-z plane formed by the y-axis and the z-axis of the spatial coordinate system, as the image plane, disregarding the x coordinate of each point.
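The choice of image plane by tracking task type can be sketched as a lookup from task type to the pair of coordinate axes that are retained. The task type names `"level_road"` and `"slope"` and the function name are hypothetical labels introduced for this illustration only.

```python
# Indices of the coordinates kept for each (hypothetical) task type:
# 0 = x, 1 = y, 2 = z.
PLANE_AXES = {
    "level_road": (0, 1),  # x-y plane, z coordinate dropped
    "slope": (1, 2),       # y-z plane, x coordinate dropped
}

def project_for_task(points, task_type):
    """Project 3D points onto the image plane selected by task_type."""
    a, b = PLANE_AXES[task_type]
    return [(p[a], p[b]) for p in points]
```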
The computer device may project the multiple points in the current frame point cloud data onto the image plane, obtaining multiple projected points in the image plane. The computer device may record the image formed by the multiple projected points in the image plane as the projection image, which is a two-dimensional image. The computer device may perform tracking on the generated projection image to obtain a two-dimensional target tracking area in the projection image.
In one of the embodiments, the computer device may obtain multiple image planes and project the points in the current frame point cloud data onto each of them, obtaining multiple projection images. The computer device may perform tracking processing on each projection image separately, obtaining the target tracking area corresponding to each projection image. It can be understood that a target tracking area determined in a two-dimensional projection image is also two-dimensional. The computer device may combine the target tracking areas corresponding to the multiple projection images to generate a three-dimensional target tracking area corresponding to the current frame point cloud data. This determines the position and size of the tracked target in three-dimensional space more accurately, and helps the computer device analyze and control automatic driving according to the three-dimensional target tracking area.
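The description does not state how the two-dimensional areas are combined. One possible approach, assuming axis-aligned boxes from an x-y projection and a y-z projection, is sketched below: x comes from the x-y box, z from the y-z box, and the shared y extent is taken as the union of the two y intervals. The box formats and this combination rule are assumptions.

```python
def combine_boxes(box_xy, box_yz):
    """Merge two 2D tracking boxes into one axis-aligned 3D box.

    box_xy: (x_min, y_min, x_max, y_max) from the x-y projection.
    box_yz: (y_min, z_min, y_max, z_max) from the y-z projection.
    Returns (x_min, y_min, z_min, x_max, y_max, z_max).
    """
    x_min, y_min_a, x_max, y_max_a = box_xy
    y_min_b, z_min, y_max_b, z_max = box_yz
    # Use the union of the two y intervals as a conservative estimate.
    y_min = min(y_min_a, y_min_b)
    y_max = max(y_max_a, y_max_b)
    return (x_min, y_min, z_min, x_max, y_max, z_max)
```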
In this embodiment, the computer device may determine the corresponding image plane according to the target tracking task, and project the points in the current frame point cloud data onto that image plane to obtain the projection image. This reduces the dimensionality of the current frame point cloud data and therefore its data volume. The computer device performs target tracking on the generated projection image and can make use of the image features in the projection image. Compared with the traditional approach of applying Kalman filtering to point cloud data to achieve target tracking, this effectively improves the accuracy of target tracking based on point cloud data.
In one of the embodiments, the step of obtaining the standard area image corresponding to the standard frame point cloud data includes: generating a standard frame image according to the standard frame point cloud data; obtaining the standard detection area corresponding to the standard frame image; and cropping, from the standard frame image, the standard area image that matches the standard detection area.
The computer device may obtain the standard frame point cloud data. The standard frame point cloud data may be a frame of point cloud data containing the target that the user selects from multiple frames according to actual needs, or it may be the first frame among the multiple frames of point cloud data collected by the laser sensor.
Specifically, the computer device may generate the standard frame image from the standard frame point cloud data in various ways. For example, the computer device may project the points in the standard frame point cloud data and take the image obtained by projection as the standard frame image. The manner of projecting the standard frame point cloud data to obtain the standard frame image may be similar to the manner of generating the projection image from the current frame point cloud data in the above embodiments, and is not repeated here. Alternatively, the computer device may obtain the point data included in the standard frame point cloud data, encode the points according to the point data to obtain the point feature corresponding to each point, and generate a feature map from these point features. The computer device may record the feature map generated from the standard frame point cloud data as the standard frame image.
The computer device may obtain the standard detection area corresponding to the standard frame image. The standard detection area indicates the area in which the target is located in the standard frame image, and may cover a part of the standard frame image. The standard detection area may be obtained by the computer device through detection on the standard frame point cloud data. Specifically, the computer device may perform target detection on the standard frame point cloud data to obtain the standard detection area. Alternatively, after generating the standard frame image from the standard frame point cloud data, the computer device may perform target detection on the standard frame image to obtain the standard detection area. The standard detection area may specifically include the position of the target in the standard frame point cloud data, the extent of the target, and the shape of the area. The computer device may obtain a single standard detection area corresponding to the standard frame image, or multiple standard detection areas.
According to the standard detection area corresponding to the standard frame image, the computer device may crop the standard frame image to obtain the standard area image corresponding to the standard detection area. The standard area image may include the target to be tracked, and the cropped standard area image matches the size and shape of the standard detection area.
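For a rectangular standard detection area, the cropping step reduces to slicing the image. A minimal sketch, assuming the image is a list of pixel rows and the area is given as hypothetical pixel bounds `(col_min, row_min, col_max, row_max)` with exclusive upper bounds:

```python
def crop_region(image, box):
    """Crop a rectangular area out of a 2D image stored as a list of rows.

    box: (col_min, row_min, col_max, row_max), upper bounds exclusive
    (this layout is an assumption made for illustration).
    """
    col_min, row_min, col_max, row_max = box
    return [row[col_min:col_max] for row in image[row_min:row_max]]
```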
In this embodiment, the computer device generates a standard frame image according to the standard frame point cloud data, obtains the standard detection area corresponding to the standard frame image, and crops from the standard frame image the standard area image that matches the standard detection area. The computer device may use the cropped standard area image as the basis for target tracking and perform target tracking on the projection image. Generating images in this way makes use of the depth features of the point cloud data, which effectively improves the accuracy of target tracking.
In one of the embodiments, as shown in FIG. 3, the step of obtaining the standard detection area corresponding to the standard frame image includes:
Step 302: Rasterize the standard frame point cloud data to obtain multiple grids.
Step 304: Extract the point features corresponding to the standard frame point cloud data in the multiple grids, and generate a point feature matrix.
Step 306: Call the target detection model, and input the point feature matrix into the target detection model to obtain the point cloud detection area corresponding to the standard frame point cloud data.
Step 308: Determine the standard detection area corresponding to the standard frame image according to the point cloud detection area.
The computer device may detect the target according to the standard frame point cloud data to obtain the standard detection area corresponding to the target. Specifically, the computer device may rasterize the standard frame point cloud data, dividing the three-dimensional space corresponding to the standard frame point cloud data into multiple grids. The computer device may determine the grid to which each point belongs according to the three-dimensional coordinates of the point in the standard frame point cloud data.
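Assigning each point to a grid can be sketched as integer division of its coordinates by the grid size. The names `rasterize`, `origin`, and `cell_size`, and the use of a dictionary keyed by grid index, are assumptions made for this illustration.

```python
import math
from collections import defaultdict

def rasterize(points, origin, cell_size):
    """Group 3D points by the grid cell that contains them.

    origin: (x, y, z) of the corner from which grid indices are counted.
    cell_size: (dx, dy, dz) edge lengths of one grid cell.
    Returns a dict mapping (i, j, k) grid indices to lists of points.
    """
    grids = defaultdict(list)
    for p in points:
        index = tuple(
            math.floor((p[axis] - origin[axis]) / cell_size[axis])
            for axis in range(3)
        )
        grids[index].append(p)
    return dict(grids)
```

Only non-empty grids appear in the result, which keeps the structure sparse for typical laser sensor data.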
The computer device may collect the point data corresponding to the points in each grid, and perform feature extraction on the points in each grid to obtain the corresponding point features. Specifically, the computer device may call a feature extraction model to extract the point features in the grids. The feature extraction model may be obtained by training on a large number of point cloud samples and point feature samples. The feature extraction model may be one of a variety of neural network models. For example, the feature extraction model may be a convolutional neural network model, and specifically may be a PointNet model. The computer device may input the point data in each grid into the feature extraction model, which operates on the point data, and obtain the point features output by the feature extraction model. The computer device may collect the point features corresponding to the multiple points in the grids to generate the point feature matrix. The point feature matrix may be a three-dimensional matrix.
The computer device may call the target detection model to detect the target in the standard frame point cloud data. The target detection model may be trained in advance and configured in the computer device. The target detection model may be obtained by training a convolutional neural network (CNN) model, and may specifically include one of a YOLO model, a Mask RCNN model, and the like. The computer device may input the generated point feature matrix into the target detection model, which operates on the point feature matrix, and obtain the detection area output by the target detection model. The computer device may de-rasterize the detection area output by the target detection model to obtain the point cloud detection area corresponding to the standard frame point cloud data.
Because the point cloud detection area is a three-dimensional detection area corresponding to the standard frame point cloud data, the computer device may determine the standard detection area corresponding to the standard frame image from the point cloud detection area. Specifically, the computer device may project the point cloud detection area onto the corresponding image plane, in the same way that the standard frame point cloud data is projected to generate the standard frame image, to obtain the standard detection area corresponding to the standard frame image.
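For an axis-aligned three-dimensional detection area, projecting onto a coordinate plane amounts to dropping one axis. A minimal sketch, assuming the hypothetical box layout `(x_min, y_min, z_min, x_max, y_max, z_max)`:

```python
def project_box(box3d, axes=(0, 1)):
    """Project an axis-aligned 3D box onto a coordinate plane.

    box3d: (x_min, y_min, z_min, x_max, y_max, z_max) (assumed layout).
    axes: indices of the two axes kept, e.g. (0, 1) for the x-y plane.
    Returns (a_min, b_min, a_max, b_max) in the chosen plane.
    """
    mins, maxs = box3d[:3], box3d[3:]
    a, b = axes
    return (mins[a], mins[b], maxs[a], maxs[b])
```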
In one of the embodiments, when the computer device performs detection on the standard frame image, it obtains a two-dimensional detection area corresponding to the standard frame image. The computer device may directly record this two-dimensional detection area as the standard detection area corresponding to the standard frame image.
In this embodiment, the computer device may call the target detection model to perform detection on the point feature matrix corresponding to the standard frame point cloud data and obtain the standard detection area corresponding to the standard frame image, so that the computer device can track the target in the current frame point cloud data based on the standard detection area, which effectively improves the accuracy of target tracking.
In one of the embodiments, the step of calling the target tracking model and obtaining the candidate area labels corresponding to the candidate areas based on the projection image and the standard area image includes: extracting the current image feature corresponding to the projection image and the standard image feature corresponding to the standard area image; inputting the current image feature and the standard image feature into the target tracking model; and filtering the current image feature and the standard image feature based on the target tracking model to obtain the candidate area labels, output by the target tracking model, corresponding to multiple candidate areas.
The computer device may perform feature extraction on the projection image corresponding to the current frame point cloud data and on the standard area image corresponding to the standard frame point cloud data, obtaining the current image feature corresponding to the projection image and the standard image feature corresponding to the standard area image. Specifically, the computer device may extract the image features of the projection image and the standard area image sequentially in a single thread, or in parallel in multiple threads. The computer device may call an image feature model to perform feature extraction on the projection image and the standard area image, and obtain the current image feature and the standard image feature output by the image feature model. The image feature model may be a two-dimensional convolutional neural network model. In one of the embodiments, when the computer device extracts image features in parallel in multiple threads, it may obtain a Siamese network model corresponding to the image feature model and extract the features of the projection image and the standard area image in parallel.
The computer device may input the extracted current image feature and standard image feature into the target tracking model. The target tracking model may be one of a variety of convolutional neural network models. For example, the target tracking model may specifically include a SiamMask model, a Siamese RPN (Region Proposal Network) model, and the like. The computer device may process the current image feature and the standard image feature based on the target tracking model. Specifically, the target tracking model may perform convolution filtering on the current image feature and the standard image feature, comparing the current image feature with the standard image feature, to obtain the candidate area label corresponding to each of the multiple candidate areas output by the target tracking model. The image of each candidate area corresponds to the standard area image. In one of the embodiments, when the standard frame point cloud data corresponds to multiple standard area images, the computer device may obtain Siamese network models corresponding to multiple target tracking models and operate on the standard image features corresponding to the multiple standard area images, obtaining candidate areas corresponding to the multiple standard area images.
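The core comparison in Siamese trackers is a cross-correlation in which the standard image feature slides over the current image feature as a filter kernel. The sketch below shows only that operation on single-channel 2D features in plain Python; it is not SiamMask or Siamese RPN, and the function name and feature layout are assumptions.

```python
def cross_correlate(search, template):
    """Slide template over search and return the 2D response map.

    search, template: 2D lists of numbers (single-channel features).
    response[r][c] is the sum of element-wise products of template
    and the window of search whose top-left corner is (r, c).
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for r in range(sh - th + 1):
        row = []
        for c in range(sw - tw + 1):
            score = sum(
                search[r + i][c + j] * template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            row.append(score)
        response.append(row)
    return response
```

Each position in the response map can be read as a candidate area whose score plays the role of the area confidence; a trained model would additionally regress the extent and shape of each area.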
In this embodiment, the computer device calls the target tracking model to operate on the current image feature corresponding to the projection image and the standard image feature corresponding to the standard area image, obtaining the candidate area labels corresponding to multiple candidate areas. This makes full use of the image features of the images corresponding to the point cloud data, and determines the multiple candidate areas through a deep learning model. Compared with tracking by applying Kalman filtering to the point cloud, this effectively improves the accuracy of target tracking.
In one of the embodiments, the step of filtering the current image feature and the standard image feature based on the target tracking model includes: obtaining a historical feature matrix; adjusting the standard image feature according to the historical feature matrix; and performing the filtering according to the adjusted image feature.
Before calling the target tracking model to operate on the standard image feature and the current image feature, the computer device may also obtain a historical feature matrix. The historical feature matrix is a feature matrix that the computer device generates from the historical image features corresponding to the historical target images in the historical point cloud data. The historical point cloud data may include point cloud data containing the target that the laser sensor collected before the current frame point cloud data. The historical feature matrix may be generated from the image features corresponding to the target in multiple frames of historical point cloud data. The historical feature matrix and the historical point cloud data may be stored in a memory corresponding to the computer device.
It can be understood that, after finishing target tracking on the current frame point cloud data, the computer device may record the current frame point cloud data as historical point cloud data. The computer device may adjust the historical feature matrix according to the image features of the target tracking area corresponding to the current frame point cloud data. Continuously adjusting the historical feature matrix corresponding to the target in this way effectively improves its accuracy and robustness.
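The running update of the historical feature matrix can be illustrated with a minimal sketch. The description does not specify the update rule, so the exponential moving average used here, the blend factor `alpha`, and the function name `update_history` are all assumptions made for illustration only.

```python
def update_history(history, current, alpha=0.1):
    """Blend the current frame's target features into the historical feature matrix.

    `history` and `current` are equally shaped 2-D lists of floats.
    `alpha` is a hypothetical blend factor: larger values forget old frames faster.
    """
    same_shape = len(history) == len(current) and all(
        len(h_row) == len(c_row) for h_row, c_row in zip(history, current)
    )
    if not same_shape:
        raise ValueError("feature matrices must have the same shape")
    # Exponential moving average, applied element by element.
    return [
        [(1.0 - alpha) * h + alpha * c for h, c in zip(h_row, c_row)]
        for h_row, c_row in zip(history, current)
    ]


# After tracking finishes on a frame, fold its target features into the history.
history = [[1.0, 0.0], [0.0, 1.0]]
current = [[0.0, 1.0], [1.0, 0.0]]
history = update_history(history, current, alpha=0.5)
```

A moving average is only one possible choice; it keeps a single matrix in memory rather than storing every historical frame.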
The computer device may adjust the standard image features according to the obtained historical feature matrix. Specifically, the computer device may use the target tracking model to convolve the historical feature matrix with the standard image features to obtain adjusted image features. The computer device may then perform convolution filtering on the adjusted image features and the current image features to obtain the candidate area labels corresponding to multiple candidate areas.
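The two-stage operation described above (adjust the standard features, then filter them against the current features) can be sketched as follows. This is not the learned model from the application: the adjustment is replaced here by a plain element-wise average as a stand-in, and the filtering is an ordinary sliding-window cross-correlation. The function names are hypothetical.

```python
def adjust_template(template, history):
    """Stand-in for the learned adjustment: element-wise average of two equally shaped maps."""
    return [
        [(t + h) / 2.0 for t, h in zip(t_row, h_row)]
        for t_row, h_row in zip(template, history)
    ]


def cross_correlate(search, template):
    """Slide `template` over `search` (valid positions only) and return the response map."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    response = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            score = sum(
                search[y + i][x + j] * template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def best_candidate(response):
    """Return (row, col, score) of the strongest response in the map."""
    return max(
        ((y, x, s) for y, row in enumerate(response) for x, s in enumerate(row)),
        key=lambda candidate: candidate[2],
    )


# Toy feature maps: the target occupies the 2x2 block starting at row 1, column 2.
standard_features = [[1.0, 1.0], [1.0, 1.0]]
historical_matrix = [[1.0, 1.0], [1.0, 1.0]]
current_features = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
adjusted = adjust_template(standard_features, historical_matrix)
response = cross_correlate(current_features, adjusted)
```

Each entry of `response` can be read as a score for one candidate area; a real tracking model would typically also regress a box for each position.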
In this embodiment, the computer device may obtain the historical feature matrix corresponding to the target, adjust the standard image features with it, and perform filtering according to the adjusted image features to obtain candidate area labels for multiple candidate areas. Because the standard image features are adjusted by the historical feature matrix corresponding to the target, the adjusted image features reflect the target's features in the image more accurately. Drawing on multiple frames of point cloud data from the historical period effectively improves the accuracy and robustness of target tracking.
It should be understood that, although the steps in the flowcharts of FIGS. 2-3 are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 4, an image tracking processing apparatus is provided, including a point cloud acquisition module 402, a preprocessing module 404, a standard image acquisition module 406, and a target tracking module 408, wherein:
The point cloud acquisition module 402 is configured to obtain current frame point cloud data.
The preprocessing module 404 is configured to preprocess the current frame point cloud data to generate a projected image.
The standard image acquisition module 406 is configured to obtain a standard area image corresponding to standard frame point cloud data.
The target tracking module 408 is configured to call the target tracking model, obtain candidate area labels corresponding to candidate areas based on the projected image and the standard area image, and determine the target tracking area corresponding to the current frame point cloud data according to the candidate area labels.
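The way the four modules hand data to one another can be sketched as a small composition. The class name, field names, and the `step` method are hypothetical; each module is represented by an injected callable so that the sketch stays independent of any particular sensor or model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ImageTrackingApparatus:
    """Hypothetical wiring of modules 402, 404, 406 and 408."""

    acquire_point_cloud: Callable[[], Any]       # module 402
    preprocess: Callable[[Any], Any]             # module 404
    acquire_standard_image: Callable[[], Any]    # module 406
    track: Callable[[Any, Any], Any]             # module 408

    def step(self) -> Any:
        """Process one frame and return the target tracking area."""
        cloud = self.acquire_point_cloud()
        projected = self.preprocess(cloud)
        standard = self.acquire_standard_image()
        return self.track(projected, standard)


# Toy stand-ins that only show the order in which data flows between modules.
apparatus = ImageTrackingApparatus(
    acquire_point_cloud=lambda: [1, 2],
    preprocess=lambda cloud: sum(cloud),
    acquire_standard_image=lambda: 10,
    track=lambda projected, standard: projected + standard,
)
result = apparatus.step()
```

Injecting the modules as callables mirrors the statement that each module may be realized in software, hardware, or a combination of the two.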
In one of the embodiments, the preprocessing module 404 is further configured to obtain a target tracking task, obtain the image plane corresponding to the target tracking task, and project the points in the current frame point cloud data onto the image plane to obtain the projected image.
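A minimal sketch of task-dependent projection follows. The task names, the `TASK_PLANES` table, and the choice to store the maximum value per pixel are assumptions for illustration; the application does not state which planes or pixel encodings are used.

```python
import math

# Hypothetical task-to-plane table: (u_axis, v_axis, value_axis) index into (x, y, z).
TASK_PLANES = {
    "bird_eye_view": (0, 1, 2),  # image spans x/y, each pixel stores height z
    "front_view": (1, 2, 0),     # image spans y/z, each pixel stores forward distance x
}


def project_points(points, task, u_range, v_range, resolution):
    """Project 3-D points onto the image plane selected by `task`.

    Each pixel keeps the largest value seen among the points that land in it.
    Empty pixels stay at 0.0, so negative values are not preserved.
    """
    u_axis, v_axis, value_axis = TASK_PLANES[task]
    width = int(round((u_range[1] - u_range[0]) / resolution))
    height = int(round((v_range[1] - v_range[0]) / resolution))
    image = [[0.0] * width for _ in range(height)]
    for point in points:
        # floor (not int) so that points just below the range get a negative index.
        col = math.floor((point[u_axis] - u_range[0]) / resolution)
        row = math.floor((point[v_axis] - v_range[0]) / resolution)
        if 0 <= col < width and 0 <= row < height:
            image[row][col] = max(image[row][col], point[value_axis])
    return image


points = [(0.5, 1.5, 2.0), (0.7, 1.2, 3.0), (9.0, 9.0, 1.0), (-0.5, 0.5, 1.0)]
bev = project_points(points, "bird_eye_view", (0.0, 2.0), (0.0, 2.0), 1.0)
```

Points outside the configured ranges are dropped rather than clamped to the border, which avoids smearing distant returns onto the image edge.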
In one of the embodiments, the standard image acquisition module 406 is further configured to generate a standard frame image from the standard frame point cloud data, obtain the standard detection area corresponding to the standard frame image, and crop from the standard frame image a standard area image that matches the standard detection area.
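Cropping the standard area image can be sketched as a simple slice, assuming the standard detection area is an axis-aligned box in pixel coordinates. The `(top, left, bottom, right)` layout and the function name `crop_region` are assumptions.

```python
def crop_region(image, box):
    """Cut a (top, left, bottom, right) box out of a row-major image.

    `bottom` and `right` are exclusive. The box is clipped to the image bounds.
    """
    top, left, bottom, right = box
    top, left = max(top, 0), max(left, 0)
    bottom, right = min(bottom, len(image)), min(right, len(image[0]))
    return [row[left:right] for row in image[top:bottom]]


standard_frame_image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
standard_area_image = crop_region(standard_frame_image, (1, 1, 3, 3))
```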
In one of the embodiments, the standard image acquisition module 406 is further configured to rasterize the standard frame point cloud data to obtain multiple grids; extract the point features corresponding to the standard frame point cloud data in the grids to generate a point feature matrix; call a target detection model and input the point feature matrix into it to obtain the point cloud detection area corresponding to the standard frame point cloud data; and determine the standard detection area corresponding to the standard frame image according to the point cloud detection area.
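The rasterization and point-feature steps can be sketched as below. The per-grid features chosen here (point count, mean height, maximum height) and the cell size are assumptions; the application does not list the features it extracts, and the target detection model itself is not shown.

```python
import math
from collections import defaultdict


def rasterize(points, cell_size=1.0):
    """Group (x, y, z) points by the 2-D grid cell they fall into."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y, z))
    return cells


def point_feature_matrix(cells):
    """Build one row per occupied cell: [cell_x, cell_y, point_count, mean_z, max_z]."""
    rows = []
    for (cx, cy), pts in sorted(cells.items()):
        heights = [p[2] for p in pts]
        rows.append([cx, cy, len(pts), sum(heights) / len(heights), max(heights)])
    return rows


standard_frame_points = [(0.2, 0.3, 1.0), (0.8, 0.1, 3.0), (1.5, 0.5, 2.0)]
features = point_feature_matrix(rasterize(standard_frame_points))
```

The resulting matrix is what would be passed to a detection model; only occupied cells produce rows, which keeps the matrix small for sparse point clouds.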
In one of the embodiments, the target tracking module 408 is further configured to extract the current image features corresponding to the projected image and the standard image features corresponding to the standard area image; input the current image features and the standard image features into the target tracking model; and filter the current image features and the standard image features based on the target tracking model to obtain the candidate area labels, output by the target tracking model, that correspond to multiple candidate areas.
In one of the embodiments, the target tracking module 408 is further configured to obtain a historical feature matrix, adjust the standard image features according to the historical feature matrix, and perform the filtering according to the adjusted image features.
In one of the embodiments, the candidate area label includes an area confidence, and the target tracking module 408 is further configured to screen the multiple candidate areas according to the area confidence and to take the screened-out candidate area as the target tracking area corresponding to the current frame point cloud data.
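Screening by area confidence can be sketched as a threshold followed by picking the highest-scoring survivor. The dictionary layout of a candidate, the threshold value, and the function name are assumptions; the application only states that candidates are screened according to their confidence.

```python
def select_tracking_area(candidates, min_confidence=0.5):
    """Return the most confident candidate at or above the threshold, or None."""
    kept = [c for c in candidates if c["confidence"] >= min_confidence]
    if not kept:
        return None
    return max(kept, key=lambda c: c["confidence"])


candidates = [
    {"box": (10, 10, 30, 30), "confidence": 0.42},
    {"box": (12, 11, 32, 31), "confidence": 0.91},
    {"box": (50, 60, 70, 80), "confidence": 0.67},
]
tracking_area = select_tracking_area(candidates)
```

Returning `None` when nothing passes the threshold lets the caller decide how to handle a lost target, for example by reusing the previous tracking area.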
For specific limitations of the image tracking processing apparatus, refer to the limitations of the image tracking processing method above, which are not repeated here. Each module in the above image tracking processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores image tracking processing data. The network interface of the computer device communicates with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement an image tracking processing method.
Those skilled in the art can understand that the structure shown in FIG. 5 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one of the embodiments, a computer device is provided, including a memory and one or more processors. The memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to implement the steps in the above method embodiments.
In one of the embodiments, one or more non-volatile computer-readable storage media storing computer-readable instructions are provided. When executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the steps in the above method embodiments.
In one of the embodiments, a vehicle is provided. The vehicle may specifically be an autonomous vehicle, an electric vehicle, a bicycle, an aircraft, or the like. The vehicle includes the above computer device and can execute the steps in the above embodiments of the image tracking processing method.
The embodiments and implementation objects of the present invention are not limited to autonomous vehicles, electric vehicles, bicycles, aircraft, robots, and the like. They also extend to simulation devices and test equipment related to these devices.
A person of ordinary skill in the art can understand that all or part of the processes in the above method embodiments may be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. An image tracking processing method, comprising:
    obtaining current frame point cloud data;
    preprocessing the current frame point cloud data to generate a projected image;
    obtaining a standard area image corresponding to standard frame point cloud data;
    calling a target tracking model to obtain, based on the projected image and the standard area image, candidate area labels corresponding to candidate areas; and
    determining, according to the candidate area labels, a target tracking area corresponding to the current frame point cloud data.
  2. The method according to claim 1, wherein preprocessing the current frame point cloud data to generate a projected image comprises:
    obtaining a target tracking task;
    obtaining a corresponding image plane according to the target tracking task; and
    projecting the points in the current frame point cloud data onto the image plane to obtain the projected image.
  3. The method according to claim 1, wherein obtaining the standard area image corresponding to the standard frame point cloud data comprises:
    generating a standard frame image according to the standard frame point cloud data;
    obtaining a standard detection area corresponding to the standard frame image; and
    cropping, from the standard frame image, a standard area image that matches the standard detection area.
  4. The method according to claim 3, wherein obtaining the standard detection area corresponding to the standard frame image comprises:
    rasterizing the standard frame point cloud data to obtain multiple grids;
    extracting point features corresponding to the standard frame point cloud data in the multiple grids to generate a point feature matrix;
    calling a target detection model and inputting the point feature matrix into the target detection model to obtain a point cloud detection area corresponding to the standard frame point cloud data; and
    determining the standard detection area corresponding to the standard frame image according to the point cloud detection area.
  5. The method according to claim 1, wherein calling the target tracking model to obtain, based on the projected image and the standard area image, the candidate area labels corresponding to the candidate areas comprises:
    extracting current image features corresponding to the projected image and standard image features corresponding to the standard area image;
    inputting the current image features and the standard image features into the target tracking model; and
    filtering the current image features and the standard image features based on the target tracking model to obtain candidate area labels, output by the target tracking model, corresponding to multiple candidate areas.
  6. The method according to claim 5, wherein filtering the current image features and the standard image features based on the target tracking model comprises:
    obtaining a historical feature matrix; and
    adjusting the standard image features according to the historical feature matrix, and performing the filtering according to the adjusted image features.
  7. The method according to claim 1, wherein the candidate area labels include an area confidence, and determining, according to the candidate area labels, the target tracking area corresponding to the current frame point cloud data comprises:
    screening the multiple candidate areas according to the area confidence; and
    determining the screened-out candidate area as the target tracking area corresponding to the current frame point cloud data.
  8. An image tracking processing apparatus, comprising:
    a point cloud acquisition module, configured to obtain current frame point cloud data;
    a preprocessing module, configured to preprocess the current frame point cloud data to generate a projected image;
    a standard image acquisition module, configured to obtain a standard area image corresponding to standard frame point cloud data; and
    a target tracking module, configured to call a target tracking model, obtain candidate area labels corresponding to candidate areas based on the projected image and the standard area image, and determine a target tracking area corresponding to the current frame point cloud data according to the candidate area labels.
  9. The apparatus according to claim 8, wherein the preprocessing module is further configured to obtain a target tracking task; obtain a corresponding image plane according to the target tracking task; and project the points in the current frame point cloud data onto the image plane to obtain the projected image.
  10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    obtaining current frame point cloud data;
    preprocessing the current frame point cloud data to generate a projected image;
    obtaining a standard area image corresponding to standard frame point cloud data;
    calling a target tracking model to obtain, based on the projected image and the standard area image, candidate area labels corresponding to candidate areas; and
    determining, according to the candidate area labels, a target tracking area corresponding to the current frame point cloud data.
  11. The computer device according to claim 10, wherein the processor further performs the following steps when executing the computer-readable instructions:
    obtaining a target tracking task;
    obtaining a corresponding image plane according to the target tracking task; and
    projecting the points in the current frame point cloud data onto the image plane to obtain the projected image.
  12. The computer device according to claim 10, wherein the processor further performs the following steps when executing the computer-readable instructions:
    generating a standard frame image according to the standard frame point cloud data;
    obtaining a standard detection area corresponding to the standard frame image; and
    cropping, from the standard frame image, a standard area image that matches the standard detection area.
  13. The computer device according to claim 12, wherein the processor further performs the following steps when executing the computer-readable instructions:
    rasterizing the standard frame point cloud data to obtain multiple grids;
    extracting point features corresponding to the standard frame point cloud data in the multiple grids to generate a point feature matrix;
    calling a target detection model and inputting the point feature matrix into the target detection model to obtain a point cloud detection area corresponding to the standard frame point cloud data; and
    determining the standard detection area corresponding to the standard frame image according to the point cloud detection area.
  14. The computer device according to claim 10, wherein the processor further performs the following steps when executing the computer-readable instructions:
    extracting current image features corresponding to the projected image and standard image features corresponding to the standard area image;
    inputting the current image features and the standard image features into the target tracking model; and
    filtering the current image features and the standard image features based on the target tracking model to obtain candidate area labels, output by the target tracking model, corresponding to multiple candidate areas.
  15. The computer device according to claim 14, wherein the processor further performs the following steps when executing the computer-readable instructions:
    obtaining a historical feature matrix; and
    adjusting the standard image features according to the historical feature matrix, and performing the filtering according to the adjusted image features.
  16. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining current frame point cloud data;
    preprocessing the current frame point cloud data to generate a projected image;
    obtaining a standard area image corresponding to standard frame point cloud data;
    calling a target tracking model to obtain, based on the projected image and the standard area image, candidate area labels corresponding to candidate areas; and
    determining, according to the candidate area labels, a target tracking area corresponding to the current frame point cloud data.
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    obtaining a target tracking task;
    obtaining a corresponding image plane according to the target tracking task; and
    projecting the points in the current frame point cloud data onto the image plane to obtain the projected image.
  18. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    generating a standard frame image according to the standard frame point cloud data;
    obtaining a standard detection area corresponding to the standard frame image; and
    cropping, from the standard frame image, a standard area image that matches the standard detection area.
  19. The storage medium according to claim 18, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    rasterizing the standard frame point cloud data to obtain multiple grids;
    extracting point features corresponding to the standard frame point cloud data in the multiple grids to generate a point feature matrix;
    calling a target detection model and inputting the point feature matrix into the target detection model to obtain a point cloud detection area corresponding to the standard frame point cloud data; and
    determining the standard detection area corresponding to the standard frame image according to the point cloud detection area.
  20. A vehicle configured to execute the image tracking processing method according to any one of claims 1-7.
PCT/CN2019/130077 2019-12-30 2019-12-30 Image tracking processing method and apparatus, and computer device and storage medium WO2021134285A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980037486.7A CN113490965A (en) 2019-12-30 2019-12-30 Image tracking processing method and device, computer equipment and storage medium
PCT/CN2019/130077 WO2021134285A1 (en) 2019-12-30 2019-12-30 Image tracking processing method and apparatus, and computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130077 WO2021134285A1 (en) 2019-12-30 2019-12-30 Image tracking processing method and apparatus, and computer device and storage medium

Publications (1)

Publication Number Publication Date
WO2021134285A1 true WO2021134285A1 (en) 2021-07-08

Family

ID=76686180

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130077 WO2021134285A1 (en) 2019-12-30 2019-12-30 Image tracking processing method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113490965A (en)
WO (1) WO2021134285A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516687A (en) * 2021-07-09 2021-10-19 东软睿驰汽车技术(沈阳)有限公司 Target tracking method, device, equipment and storage medium
CN113744236A (en) * 2021-08-30 2021-12-03 阿里巴巴达摩院(杭州)科技有限公司 Loop detection method, device, storage medium and computer program product
CN114022520A (en) * 2021-10-12 2022-02-08 山西大学 Robot target tracking method based on Kalman filtering and twin network
CN114383502A (en) * 2021-12-29 2022-04-22 国能铁路装备有限责任公司 Method and device for measuring wear amount of fittings of bogie and measuring equipment
CN115035492A (en) * 2022-06-21 2022-09-09 苏州浪潮智能科技有限公司 Vehicle identification method, device, equipment and storage medium
CN115063445A (en) * 2022-08-18 2022-09-16 南昌工程学院 Target tracking method and system based on multi-scale hierarchical feature representation
CN115131694A (en) * 2021-12-02 2022-09-30 北京工商大学 Target tracking method and system based on twin network and YOLO target detection model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096516A (en) * 2016-06-01 2016-11-09 常州漫道罗孚特网络科技有限公司 The method and device that a kind of objective is followed the tracks of
CN109949347B (en) * 2019-03-15 2021-09-17 百度在线网络技术(北京)有限公司 Human body tracking method, device, system, electronic equipment and storage medium
CN110246159B (en) * 2019-06-14 2023-03-28 湖南大学 3D target motion analysis method based on vision and radar information fusion

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516687A (en) * 2021-07-09 2021-10-19 东软睿驰汽车技术(沈阳)有限公司 Target tracking method, device, equipment and storage medium
CN113744236A (en) * 2021-08-30 2021-12-03 阿里巴巴达摩院(杭州)科技有限公司 Loop detection method, device, storage medium and computer program product
CN113744236B (en) * 2021-08-30 2024-05-24 阿里巴巴达摩院(杭州)科技有限公司 Loop detection method, device, storage medium and computer program product
CN114022520A (en) * 2021-10-12 2022-02-08 山西大学 Robot target tracking method based on Kalman filtering and twin network
CN114022520B (en) * 2021-10-12 2024-05-28 山西大学 Robot target tracking method based on Kalman filtering and twin network
CN115131694A (en) * 2021-12-02 2022-09-30 北京工商大学 Target tracking method and system based on twin network and YOLO target detection model
CN114383502A (en) * 2021-12-29 2022-04-22 国能铁路装备有限责任公司 Method and device for measuring wear amount of fittings of bogie and measuring equipment
CN115035492A (en) * 2022-06-21 2022-09-09 苏州浪潮智能科技有限公司 Vehicle identification method, device, equipment and storage medium
CN115035492B (en) * 2022-06-21 2024-01-23 苏州浪潮智能科技有限公司 Vehicle identification method, device, equipment and storage medium
CN115063445A (en) * 2022-08-18 2022-09-16 南昌工程学院 Target tracking method and system based on multi-scale hierarchical feature representation
CN115063445B (en) * 2022-08-18 2022-11-08 南昌工程学院 Target tracking method and system based on multi-scale hierarchical feature representation

Also Published As

Publication number Publication date
CN113490965A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN111160302B (en) Obstacle information identification method and device based on automatic driving environment
WO2021134285A1 (en) Image tracking processing method and apparatus, and computer device and storage medium
CN111191600B (en) Obstacle detection method, obstacle detection device, computer device, and storage medium
CN110163904B (en) Object labeling method, movement control method, device, equipment and storage medium
WO2021134296A1 (en) Obstacle detection method and apparatus, and computer device and storage medium
US20210042929A1 (en) Three-dimensional object detection method and system based on weighted channel features of a point cloud
CN108297115B (en) Autonomous repositioning method for robot
JP6464337B2 (en) Traffic camera calibration update using scene analysis
WO2022099530A1 (en) Motion segmentation method and apparatus for point cloud data, computer device and storage medium
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
WO2021134258A1 (en) Point cloud-based target tracking method and apparatus, computer device and storage medium
JP2012221456A (en) Object identification device and program
CN111626314B (en) Classification method and device for point cloud data, computer equipment and storage medium
CN110298281B (en) Video structuring method and device, electronic equipment and storage medium
CN115049700A (en) Target detection method and device
CN109840463B (en) Lane line identification method and device
CN111354022B (en) Target Tracking Method and System Based on Kernel Correlation Filtering
CN110262487B (en) Obstacle detection method, terminal and computer readable storage medium
WO2021056501A1 (en) Feature point extraction method, movable platform and storage medium
WO2022133770A1 (en) Method for generating point cloud normal vector, apparatus, computer device, and storage medium
CN111382637A (en) Pedestrian detection tracking method, device, terminal equipment and medium
CN114692720A (en) Image classification method, device, equipment and storage medium based on aerial view
CN109492521B (en) Face positioning method and robot
WO2021114775A1 (en) Object detection method, object detection device, terminal device, and medium
CN114639159A (en) Moving pedestrian detection method, electronic device and robot

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19958406; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19958406; Country of ref document: EP; Kind code of ref document: A1)