CN110570457A - Three-dimensional object detection and tracking method based on stream data - Google Patents

Three-dimensional object detection and tracking method based on stream data

Info

Publication number
CN110570457A
CN110570457A (application CN201910725207.8A; granted as CN110570457B)
Authority
CN
China
Prior art keywords
frame
dimensional
frames
feature
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910725207.8A
Other languages
Chinese (zh)
Other versions
CN110570457B (en)
Inventor
黄凯
郭叙森
古剑锋
郭思璐
杨铖章
许子潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201910725207.8A priority Critical patent/CN110570457B/en
Publication of CN110570457A publication Critical patent/CN110570457A/en
Application granted granted Critical
Publication of CN110570457B publication Critical patent/CN110570457B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Abstract

The invention relates to the field of three-dimensional object detection and tracking, and in particular to a three-dimensional object detection and tracking method based on stream data. Key frame data consisting of point cloud data and image data is input and preprocessed, and features are extracted to obtain feature maps. The feature maps are then input into a candidate-box extraction network to obtain candidate boxes; using the candidate boxes, feature blocks are cropped from the feature maps and from the correlation feature maps and fed into regression networks to obtain the three-dimensional boxes of the detected objects and their offsets. The boxes for the frames between key frames are obtained by interpolation, and the targets in all frames are associated to obtain the tracking result. Since only key frames need to be detected, the invention accelerates stream-data detection, meets the real-time requirement of the autonomous-driving environment, and offers good stability; at the same time, point cloud information and image information are fused so that their strengths and weaknesses complement each other, improving the accuracy of object detection.

Description

Three-dimensional object detection and tracking method based on stream data
Technical Field
The invention relates to the field of three-dimensional target detection and tracking, in particular to a three-dimensional object detection and tracking method based on stream data.
Background
At present, visual perception tasks for autonomous driving are mainly divided into image-based, point-cloud-based, and image/point-cloud-fusion-based approaches, specifically:
1. Image-based methods, represented mainly by Mono3D and 3DOP. Because image data contains no depth information, additional hand-designed three-dimensional features must be added. However, RGB data alone and such special hand-crafted features make it hard for a neural network to learn 3D spatial information effectively, and they also limit the extensibility of this scheme. Furthermore, manual feature extraction is generally time-consuming, so such methods currently offer limited accuracy and progress slowly.
2. Point-cloud-based methods, which can be subdivided into three sub-branches:
① Object detection is performed directly on the point cloud using a 3D CNN, as in 3D FCN and Vote3Deep. These methods first structure the point cloud data (typically into three-dimensional voxels) and then extract features with three-dimensional convolutions. Because the point cloud is very sparse and the convolution must be carried out in three dimensions, the detection process is extremely time-consuming. In addition, the high cost limits the size of the receptive field, so a traditional 3D CNN cannot learn local features at different scales well. ② Network structures designed specifically for point clouds. For example, VoxelNet divides the point cloud into structural units such as voxels and extracts features only on non-empty units. Recently, with models such as PointNet, PointNet++, PointCNN, PointSIFT, OctNet and Dynamic Graph CNN, the focus of research has shifted to learning spatial geometric representations more effectively from unordered point cloud data. Taking PointNet as an example, that work proposes the concept of a symmetric function based on the permutation invariance and rotation invariance of point cloud data: by fitting a symmetric function with fully connected layers and a pooling layer, point cloud features can be extracted efficiently (a minimal sketch of this idea appears after this list). However, because such methods use fully connected layers, all points generally have to be processed, so the speed in large scenes (with very many points) still needs improvement. ③ Work represented by PIXOR, FaF and Complex-YOLO projects the point cloud onto a plane, such as a front view or a bird's eye view. The projection loses information in one dimension, but since almost all objects in an autonomous-driving scene lie on roughly the same plane, the influence of this loss on the detection result is very small. This approach reduces the 3D CNN to a 2D CNN, lowering the space and time complexity of the algorithm and making real-time detection possible. However, because of the sparsity of the point cloud, projected targets contain few points, so the feature information is insufficient; the effect is particularly unsatisfactory for small targets and distant objects.
3. Methods based on image and point cloud fusion. These methods fuse the rich texture information of the image with the depth information of the point cloud; representative work includes MV3D, Fusing BEV & FV, AVOD and F-PointNet. The first three project the point cloud onto one or several planes and optionally add hand-designed features, which are then fused with the RGB image. MV3D fuses at deep network layers, while Fusing BEV & FV argues that fusing before the RPN yields better detection; both need extra modules to fuse the data, which slows the model down and makes real-time operation hard to achieve. By reducing the hand-designed input features, AVOD achieves a degree of real-time performance. F-PointNet, on the other hand, first runs 2D object detection on the image to obtain a 2D bounding box, then projects that box into three-dimensional space to obtain the corresponding viewing frustum, and finally uses PointNet to segment the point cloud inside the frustum to obtain the target's three-dimensional bounding box. Its drawbacks are that the accuracy is bounded by the 2D detection stage and that it handles occlusion and similar conditions poorly.
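As a concrete illustration of the symmetric-function idea described in sub-branch ② above, the following is a minimal PyTorch-style sketch of a shared per-point MLP followed by max-pooling aggregation. It is only a sketch of the general principle, not PointNet itself or any network from this patent; the class name and layer widths are assumptions.

```python
import torch
import torch.nn as nn

class TinyPointFeature(nn.Module):
    """Illustrative symmetric function: shared per-point MLP + max pooling.

    Max pooling over the point dimension makes the output invariant to the
    order of the input points. Layer widths are arbitrary assumptions.
    """
    def __init__(self, in_dim: int = 3, feat_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3) -> per-point features (batch, N, feat_dim)
        per_point = self.mlp(points)
        # symmetric aggregation: max over the point dimension
        global_feat, _ = per_point.max(dim=1)          # (batch, feat_dim)
        return global_feat

# usage: a batch of two clouds with 1024 XYZ points each
cloud = torch.randn(2, 1024, 3)
print(TinyPointFeature()(cloud).shape)                 # torch.Size([2, 128])
```

Because max pooling ignores the order of its inputs, the resulting global feature is unchanged under any permutation of the points, which is exactly the invariance the symmetric function is meant to provide.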
Disclosure of Invention
To overcome the poor detection accuracy and poor real-time performance of the prior art, the invention provides a three-dimensional object detection and tracking method based on stream data that can accurately detect and localize three-dimensional objects, improve detection speed, and achieve real-time detection.
To solve the above technical problems, the invention adopts the following technical scheme. The method for detecting and tracking three-dimensional objects based on stream data comprises the following steps:
Step one: inputting key frame data consisting of point cloud data and image data for the preceding and following frames, preprocessing the data, and projecting the point cloud data in the top-view direction to form a bird's eye view (BEV) map;
Step two: performing feature extraction on the two key frames from step one to obtain feature maps, and inputting the extracted feature maps into a candidate-box extraction module to obtain a candidate-box set for each of the two key frames;
Step three: using the candidate boxes to crop feature blocks from the feature maps and resize them, then inputting the blocks into a classification network and a box-regression network to obtain each object's category and three-dimensional box position;
Step four: performing a correlation operation on the features of the two key frames extracted in step two to obtain correlation feature maps, using the candidate boxes to crop feature blocks from the correlation feature maps and resize them, then inputting the blocks into a regression network to obtain the offsets of the objects' three-dimensional boxes between the two key frames;
Step five: obtaining the objects' three-dimensional boxes in the frames lying between the two key frames by interpolation, using the key-frame three-dimensional boxes and their corresponding offsets, thereby obtaining three-dimensional detection results for the objects in all frames;
Step six: according to the detection results, associating the objects across all frames to obtain the tracking result.
Preferably, in step one, the image data is normalized and then cropped to 1200x360 px; for the point cloud data, points within the range [-40, 40] x [0, 70] x [0, 2.5] m are taken, and points falling outside the image range are removed. Normalization subtracts the image mean from each pixel value and then divides by the image standard deviation.
Preferably, in step two, the feature extractor is based on the VGG16 structure with a feature pyramid added; features are extracted from the two key frames to obtain a point cloud feature map and an image feature map respectively. The two feature maps are input into an RPN (Region Proposal Network) for prediction, yielding a number of three-dimensional candidate boxes.
Preferably, in step two, non-maximum suppression is applied to the candidate boxes to obtain K extraction boxes. The RPN candidate boxes and the final predicted boxes are very dense and contain many overlapping boxes; non-maximum suppression screens out a representative subset.
Preferably, in step three, the extraction boxes crop the corresponding feature blocks from the point cloud feature map and the image feature map respectively; after the blocks are resized to the same size, multi-view fusion is performed, and classification and regression by a fully connected network yield each object's three-dimensional box.
Preferably, in step four, the convolutional cross-correlation features of the image feature maps and of the point cloud feature maps of the preceding and following key frames are computed to obtain correlation feature maps C_img^{t,t+τ} and C_pc^{t,t+τ}. The extraction boxes crop the corresponding feature blocks from C_img^{t,t+τ} and C_pc^{t,t+τ}, the blocks are resized to the same size, and the features of the two views are fused to obtain a fused feature map; the fused feature map is input into a fully connected network to obtain the offsets of the objects' three-dimensional boxes between the two key frames.
Preferably, in steps three and four, non-maximum suppression is applied to the three-dimensional boxes and box offsets for screening, which reduces the number of boxes and the computational burden.
Preferably, in step six, after the detection results for all frames are obtained, the three-dimensional boxes of different frames are associated using the IOU Tracker algorithm; specifically, an overlap threshold is set, and if the overlap of the objects' three-dimensional boxes in two consecutive frames exceeds the threshold they are judged to be the same object, otherwise they are not.
Compared with the prior art, the beneficial effects are: 1. The invention exploits the information redundancy between frames of stream data: object detection is performed only on key frames, and the detection boxes of the other frames are generated by interpolation. This accelerates stream-data detection with little impact on detection accuracy, and solves the problem that existing three-dimensional detection networks take too long on continuous scene data to meet the real-time requirement of the autonomous-driving environment.
2. The three-dimensional object detection method provided by the invention is highly accurate, because point cloud information and image information are fused so that their strengths and weaknesses complement each other. Compared with object detection using images alone, the method incorporates the depth data of the point cloud and can handle object occlusion; compared with three-dimensional detection based on the point cloud alone, it incorporates the rich texture information of the image, compensating for the information loss caused by the sparsity of the point cloud, and effectively reduces the miss rate, especially for distant and small objects that have almost no point cloud data.
3. The proposed method is more robust. On the one hand, only key frames are processed and the results for non-key frames are obtained by interpolating the key-frame results, so the detection boxes vary continuously between key frames and the tracks are stable; on the other hand, the method fuses point cloud and image data, so it performs well in a variety of scenes.
4. The proposed network is end-to-end, which helps the optimization find a globally optimal solution, simplifies training, and keeps the overall framework simple.
Drawings
FIG. 1 is a flow chart of the present invention;
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
The same or similar reference numerals in the drawings of the embodiments of the invention denote the same or similar components. In the description of the invention, terms such as "upper", "lower", "left", "right", "long" and "short" that indicate orientations or positional relationships are based on the orientations shown in the drawings; they are used only for convenience and simplicity of description and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation. These terms are therefore illustrative only and are not to be construed as limiting the patent; their specific meanings can be understood by those skilled in the art according to the specific situation.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
Examples
Fig. 1 shows an embodiment of a method for detecting and tracking a three-dimensional object based on stream data, comprising the following steps:
Step one: inputting key frame data consisting of point cloud data and image data for the preceding and following frames, and preprocessing the data. The image data is normalized and then cropped to 1200x360 px; for the point cloud data, points within the range [-40, 40] x [0, 70] x [0, 2.5] m are taken, and points falling outside the image range are removed. Normalization subtracts the image mean from each pixel value and then divides by the image standard deviation. The [-40, 40] x [0, 70] x [0, 2.5] m space is rasterized into an 800x700x5 three-dimensional tensor, i.e. each element of the tensor corresponds to a small cuboid of 0.1x0.1x0.5 m; the value of the element is the maximum height of all points inside the cuboid, or 0 if the cuboid contains no points. To account for the differing numbers of points in different cuboids, a density channel is added whose value is min(1.0, log(N+1)/log 16), so the final BEV map has size 800x700x6. Because the objects are moving and the coordinate-system origins differ, the data of the preceding and following key frames are transformed into the same coordinate system using the IMU data.
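The BEV rasterization described above can be summarized in a short NumPy sketch; the function name is hypothetical, and mapping the five height channels to 0.5 m slices of the z range is an interpretation of the 0.1x0.1x0.5 m cuboids mentioned in the text, not a quoted implementation.

```python
import numpy as np

def make_bev(points: np.ndarray) -> np.ndarray:
    """Rasterize a LiDAR cloud into an 800x700x6 BEV map (hypothetical helper).

    points: (N, 3) array of (x, y, z) in metres. Ranges follow the text:
    x in [-40, 40], y in [0, 70], z in [0, 2.5], cell size 0.1 x 0.1 x 0.5 m.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    keep = (x >= -40) & (x < 40) & (y >= 0) & (y < 70) & (z >= 0) & (z < 2.5)
    x, y, z = x[keep], y[keep], z[keep]

    xi = ((x + 40) / 0.1).astype(int)        # 0 .. 799
    yi = (y / 0.1).astype(int)               # 0 .. 699
    zi = (z / 0.5).astype(int)               # 0 .. 4

    bev = np.zeros((800, 700, 6), dtype=np.float32)
    # height channels: maximum point height inside each 0.1 x 0.1 x 0.5 m cuboid
    np.maximum.at(bev, (xi, yi, zi), z)
    # density channel: min(1, log(N + 1) / log 16) over each 0.1 x 0.1 m column
    counts = np.zeros((800, 700), dtype=np.float32)
    np.add.at(counts, (xi, yi), 1.0)
    bev[:, :, 5] = np.minimum(1.0, np.log(counts + 1.0) / np.log(16.0))
    return bev
```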
Step two: performing feature extraction on the two key frames from step one. The feature extractor is based on the VGG16 structure with a feature pyramid added, and produces a point cloud feature map and an image feature map. The two feature maps are input into an RPN (Region Proposal Network) for prediction, yielding a number of three-dimensional candidate boxes. Non-maximum suppression is then applied to the candidate boxes to obtain K extraction boxes: the RPN candidates and the final predicted boxes are very dense with many overlapping boxes, and non-maximum suppression keeps only representative ones. The threshold of the algorithm is set to 0.8.
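For reference, a minimal sketch of the non-maximum suppression step mentioned above, written for axis-aligned top-view boxes; the patent applies the same idea to the three-dimensional candidate boxes, so the box parameterization and helper name here are simplifying assumptions.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.8) -> list:
    """Greedy NMS on axis-aligned boxes (x1, y1, x2, y2); returns kept indices."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of the chosen box with the remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        # keep only boxes that do not overlap the chosen box too much
        order = rest[iou <= iou_thresh]
    return keep
```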
Step three: using the candidate boxes to crop feature blocks from the feature maps and resize them, then feeding the blocks into a classification network and a box-regression network to obtain each object's category and three-dimensional box position. Here the candidate boxes are the extraction boxes remaining after screening; each extraction box crops the corresponding feature blocks from the point cloud feature map and the image feature map, and the blocks are resized to the same size. After multi-view fusion, the feature blocks are classified and regressed by a fully connected network to obtain the objects' categories and three-dimensional box positions.
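A minimal PyTorch-style sketch of the crop-resize-fuse-and-regress step described above; the bilinear resize to a 7x7 block, element-wise averaging as the fusion operator, the layer widths, and the 7-value box encoding are all illustrative assumptions rather than details stated in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionHead(nn.Module):
    """Fuse a BEV feature block and an image feature block, then classify and regress."""
    def __init__(self, channels: int = 256, num_classes: int = 2):
        super().__init__()
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(channels * 7 * 7, 512), nn.ReLU())
        self.cls = nn.Linear(512, num_classes)   # object category
        self.box = nn.Linear(512, 7)             # assumed (x, y, z, l, w, h, yaw) encoding

    def forward(self, bev_block: torch.Tensor, img_block: torch.Tensor):
        # resize both cropped blocks to the same spatial size, then fuse by averaging
        bev = F.interpolate(bev_block, size=(7, 7), mode="bilinear", align_corners=False)
        img = F.interpolate(img_block, size=(7, 7), mode="bilinear", align_corners=False)
        fused = 0.5 * (bev + img)
        h = self.fc(fused)
        return self.cls(h), self.box(h)

# usage with dummy cropped blocks of different sizes (a batch of 4 proposals)
head = FusionHead()
cls_logits, boxes = head(torch.randn(4, 256, 10, 12), torch.randn(4, 256, 14, 9))
print(cls_logits.shape, boxes.shape)   # torch.Size([4, 2]) torch.Size([4, 7])
```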
Step four: computing the cross-correlation features of the two key frames extracted in step two to obtain correlation feature maps, using the candidate boxes to crop feature blocks from the correlation feature maps and resize them, then feeding the blocks into a regression network to obtain the offsets of the objects' three-dimensional boxes between the two key frames.
Specifically, the convolutional cross-correlation features of the image feature maps and of the point cloud feature maps are computed to obtain correlation feature maps C_img^{t,t+τ} and C_pc^{t,t+τ}. The extraction boxes crop the corresponding feature blocks from C_img^{t,t+τ} and C_pc^{t,t+τ}, the blocks are resized to the same size, and the features of the two views are fused to obtain a fused feature map; the fused feature map is input into a fully connected network to obtain the offsets of the objects' three-dimensional boxes between the two key frames.
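One possible reading of the convolutional cross-correlation above is a local correlation layer that compares each position of the frame-t feature map with a small displacement window in the frame-(t+τ) feature map, as sketched below; the window size, normalization, and function name are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def correlation(feat_t: torch.Tensor, feat_t2: torch.Tensor, max_disp: int = 4) -> torch.Tensor:
    """Local cross-correlation between two feature maps of shape (B, C, H, W).

    For each displacement (dx, dy) in [-max_disp, max_disp], the channel-wise
    dot product of feat_t with the shifted feat_t2 becomes one output channel,
    giving a (B, (2*max_disp+1)**2, H, W) correlation volume.
    """
    b, c, h, w = feat_t.shape
    padded = F.pad(feat_t2, [max_disp] * 4)            # pad left/right/top/bottom
    out = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = padded[:, :, dy:dy + h, dx:dx + w]
            out.append((feat_t * shifted).sum(dim=1, keepdim=True) / c)
    return torch.cat(out, dim=1)

# usage: correlation volume between the key frames t and t+τ
c_vol = correlation(torch.randn(1, 64, 50, 44), torch.randn(1, 64, 50, 44))
print(c_vol.shape)   # torch.Size([1, 81, 50, 44])
```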
Step five: obtaining the objects' three-dimensional boxes for the frames lying between the two key frames by interpolation, using the key-frame three-dimensional boxes and their corresponding offsets, thereby obtaining three-dimensional detection results for the objects in all frames.
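A minimal sketch of how the interpolation of non-key-frame boxes might look, linearly blending each key-frame box toward its predicted offset; the (x, y, z, l, w, h, yaw) box layout and the uniform time spacing are assumptions, since the text only states that interpolation is used.

```python
import numpy as np

def interpolate_boxes(box_t: np.ndarray, offset: np.ndarray, num_between: int) -> np.ndarray:
    """Linearly interpolate 3D boxes between key frames t and t+τ.

    box_t:  (7,) box at key frame t, assumed (x, y, z, l, w, h, yaw).
    offset: (7,) predicted change of the box from frame t to frame t+τ.
    Returns an array of shape (num_between, 7), one box per in-between frame.
    """
    alphas = np.arange(1, num_between + 1) / (num_between + 1)
    return box_t[None, :] + alphas[:, None] * offset[None, :]

# usage: two frames lie between the key frames
box_t = np.array([12.0, 3.5, 0.8, 4.2, 1.8, 1.6, 0.1])
offset = np.array([1.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.02])
print(interpolate_boxes(box_t, offset, num_between=2))
```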
Step six: according to the detection results, associating the three-dimensional boxes of different frames using the IOU Tracker algorithm; specifically, an overlap threshold is set, and if the overlap of the objects' three-dimensional boxes in two consecutive frames exceeds the threshold they are judged to be the same object, otherwise they are not.
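A simplified sketch of the IoU-based association described above, matching detections frame by frame against active tracks and starting a new track for every unmatched detection; the greedy matching strategy, the axis-aligned top-view IoU in place of full 3D overlap, and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np

def bev_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Axis-aligned top-view IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def associate(frames: list, iou_thresh: float = 0.5) -> list:
    """Greedy IoU tracker: frames is a list of (N_i, 4) arrays of boxes per frame."""
    tracks = []                               # each track: list of (frame_idx, box)
    for f_idx, boxes in enumerate(frames):
        used = set()
        for track in tracks:
            last_idx, last_box = track[-1]
            if last_idx != f_idx - 1:
                continue                      # track was not seen in the previous frame
            # pick the best still-unmatched detection for this track
            best, best_iou = None, iou_thresh
            for d, box in enumerate(boxes):
                if d not in used and bev_iou(last_box, box) >= best_iou:
                    best, best_iou = d, bev_iou(last_box, box)
            if best is not None:
                track.append((f_idx, boxes[best]))
                used.add(best)
        # unmatched detections start new tracks
        for d, box in enumerate(boxes):
            if d not in used:
                tracks.append([(f_idx, box)])
    return tracks
```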
In addition, in steps three and four, the three-dimensional boxes and box offsets are screened by non-maximum suppression with a threshold of 0.63, which reduces the number of boxes and the computational burden.
It should be understood that the above embodiment is merely an example given to illustrate the invention clearly and is not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments exhaustively. Any modification, equivalent replacement or improvement made within the spirit and principle of the invention shall fall within the protection scope of the claims.

Claims (8)

1. A three-dimensional object detection and tracking method based on stream data, characterized by comprising the following steps:
Step one: inputting key frame data consisting of point cloud data and image data for the preceding and following frames, preprocessing the data, and projecting the point cloud data in the top-view direction to form a bird's eye view (BEV) map;
Step two: performing feature extraction on the two key frames from step one to obtain feature maps, and inputting the extracted feature maps into a candidate-box extraction module to obtain a candidate-box set for each of the two key frames;
Step three: using the candidate boxes to crop feature blocks from the feature maps and resize them, then inputting the blocks into a classification network and a box-regression network to obtain each object's category and three-dimensional box position;
Step four: performing a correlation operation on the features of the two key frames extracted in step two to obtain correlation feature maps, using the candidate boxes to crop feature blocks from the correlation feature maps and resize them, then inputting the blocks into a regression network to obtain the offsets of the objects' three-dimensional boxes between the two key frames;
Step five: obtaining the objects' three-dimensional boxes in the frames lying between the two key frames by interpolation, using the key-frame three-dimensional boxes and their corresponding offsets, thereby obtaining three-dimensional detection results for the objects in all frames;
Step six: according to the detection results, associating the objects across all frames to obtain the tracking result.
2. The three-dimensional object detection and tracking method based on stream data according to claim 1, wherein in step one the image data is normalized and then cropped to 1200x360 px; for the point cloud data, points within the range [-40, 40] x [0, 70] x [0, 2.5] m are taken, and points falling outside the image range are removed.
3. The three-dimensional object detection and tracking method based on stream data according to claim 2, wherein in step two the feature extraction is based on the VGG16 structure with a feature pyramid added, and features are extracted from the two key frames to obtain a point cloud feature map and an image feature map respectively; the two feature maps are input into an RPN for prediction to obtain a number of three-dimensional candidate boxes.
4. The three-dimensional object detection and tracking method based on stream data according to claim 3, wherein in step two non-maximum suppression is applied to the candidate boxes to obtain K extraction boxes.
5. The three-dimensional object detection and tracking method based on stream data according to claim 4, wherein in step three the extraction boxes crop the corresponding feature blocks from the point cloud feature map and the image feature map respectively; after the blocks are resized to the same size, multi-view fusion is performed, and classification and regression by a fully connected network yield each object's three-dimensional box.
6. The three-dimensional object detection and tracking method based on stream data according to claim 5, wherein in step four the convolutional cross-correlation features of the image feature maps and of the point cloud feature maps of the preceding and following key frames are computed to obtain correlation feature maps C_img^{t,t+τ} and C_pc^{t,t+τ}; the extraction boxes crop the corresponding feature blocks from C_img^{t,t+τ} and C_pc^{t,t+τ}, the blocks are resized to the same size and the features of the two views are fused to obtain a fused feature map C_fusion^{t,t+τ}; the fused feature map is input into a fully connected network to obtain the offsets of the objects' three-dimensional boxes between the two key frames.
7. The three-dimensional object detection and tracking method based on stream data according to claim 6, wherein in steps three and four non-maximum suppression is used to screen the three-dimensional boxes and box offsets.
8. The three-dimensional object detection and tracking method based on stream data according to claim 1, wherein in step six, after the detection results for all frames are obtained, the three-dimensional boxes of different frames are associated using the IOU Tracker algorithm; specifically, an overlap threshold is set, and if the overlap of the objects' three-dimensional boxes in two consecutive frames exceeds the threshold they are judged to be the same object, otherwise they are not.
CN201910725207.8A 2019-08-07 2019-08-07 Three-dimensional object detection and tracking method based on stream data Active CN110570457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910725207.8A CN110570457B (en) 2019-08-07 2019-08-07 Three-dimensional object detection and tracking method based on stream data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910725207.8A CN110570457B (en) 2019-08-07 2019-08-07 Three-dimensional object detection and tracking method based on stream data

Publications (2)

Publication Number Publication Date
CN110570457A true CN110570457A (en) 2019-12-13
CN110570457B CN110570457B (en) 2023-01-06

Family

ID=68774806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910725207.8A Active CN110570457B (en) 2019-08-07 2019-08-07 Three-dimensional object detection and tracking method based on stream data

Country Status (1)

Country Link
CN (1) CN110570457B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209840A (en) * 2019-12-31 2020-05-29 浙江大学 3D target detection method based on multi-sensor data fusion
CN111209825A (en) * 2019-12-31 2020-05-29 武汉中海庭数据技术有限公司 Method and device for dynamic target 3D detection
CN111222441A (en) * 2019-12-31 2020-06-02 深圳市人工智能与机器人研究院 Point cloud target detection and blind area target detection method and system based on vehicle-road cooperation
CN111461221A (en) * 2020-04-01 2020-07-28 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-source sensor fusion target detection method and system for automatic driving
CN111814674A (en) * 2020-07-08 2020-10-23 上海雪湖科技有限公司 Non-maximum suppression method of point cloud network based on FPGA
CN112365600A (en) * 2020-11-10 2021-02-12 中山大学 Three-dimensional object detection method
WO2021134285A1 (en) * 2019-12-30 2021-07-08 深圳元戎启行科技有限公司 Image tracking processing method and apparatus, and computer device and storage medium
CN115496977A (en) * 2022-09-14 2022-12-20 北京化工大学 Target detection method and device based on multi-mode sequence data fusion


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006072495A (en) * 2004-08-31 2006-03-16 Fuji Heavy Ind Ltd Three-dimensional object monitoring device
US20070208872A1 (en) * 2006-03-03 2007-09-06 Hon Hai Precision Industry Co., Ltd. System and method for processing streaming data
US20100045665A1 (en) * 2007-01-22 2010-02-25 Total Immersion Method and device for creating at least two key frames corresponding to a three-dimensional object
KR101763921B1 (en) * 2016-10-21 2017-08-01 (주)플럭스플래닛 Method and system for contents streaming
CN109146929A (en) * 2018-07-05 2019-01-04 中山大学 A kind of object identification and method for registering based under event triggering camera and three-dimensional laser radar emerging system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陆峰 (LU Feng) et al.: "基于多传感器数据融合的障碍物检测与跟踪" [Obstacle detection and tracking based on multi-sensor data fusion], 《军事交通学院学报》 [Journal of Military Transportation University] *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021134285A1 (en) * 2019-12-30 2021-07-08 深圳元戎启行科技有限公司 Image tracking processing method and apparatus, and computer device and storage medium
CN111209840A (en) * 2019-12-31 2020-05-29 浙江大学 3D target detection method based on multi-sensor data fusion
CN111209825A (en) * 2019-12-31 2020-05-29 武汉中海庭数据技术有限公司 Method and device for dynamic target 3D detection
CN111222441A (en) * 2019-12-31 2020-06-02 深圳市人工智能与机器人研究院 Point cloud target detection and blind area target detection method and system based on vehicle-road cooperation
CN111209840B (en) * 2019-12-31 2022-02-18 浙江大学 3D target detection method based on multi-sensor data fusion
CN111209825B (en) * 2019-12-31 2022-07-01 武汉中海庭数据技术有限公司 Method and device for dynamic target 3D detection
CN111222441B (en) * 2019-12-31 2024-04-23 深圳市人工智能与机器人研究院 Point cloud target detection and blind area target detection method and system based on vehicle-road cooperation
CN111461221A (en) * 2020-04-01 2020-07-28 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-source sensor fusion target detection method and system for automatic driving
CN111814674A (en) * 2020-07-08 2020-10-23 上海雪湖科技有限公司 Non-maximum suppression method of point cloud network based on FPGA
CN112365600A (en) * 2020-11-10 2021-02-12 中山大学 Three-dimensional object detection method
CN112365600B (en) * 2020-11-10 2023-11-24 中山大学 Three-dimensional object detection method
CN115496977A (en) * 2022-09-14 2022-12-20 北京化工大学 Target detection method and device based on multi-mode sequence data fusion

Also Published As

Publication number Publication date
CN110570457B (en) 2023-01-06

Similar Documents

Publication Publication Date Title
CN110570457B (en) Three-dimensional object detection and tracking method based on stream data
US10719940B2 (en) Target tracking method and device oriented to airborne-based monitoring scenarios
CN112435325B (en) VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method
CN110688905B (en) Three-dimensional object detection and tracking method based on key frame
CN108648161B (en) Binocular vision obstacle detection system and method of asymmetric kernel convolution neural network
CN111832655B (en) Multi-scale three-dimensional target detection method based on characteristic pyramid network
CN107204010A (en) A kind of monocular image depth estimation method and system
CN108648194B (en) Three-dimensional target identification segmentation and pose measurement method and device based on CAD model
US10679369B2 (en) System and method for object recognition using depth mapping
KR100560464B1 (en) Multi-view display system with viewpoint adaptation
CN111160291B (en) Human eye detection method based on depth information and CNN
CN110827312B (en) Learning method based on cooperative visual attention neural network
CN111340922A (en) Positioning and mapping method and electronic equipment
CN103020606A (en) Pedestrian detection method based on spatio-temporal context information
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN110706269A (en) Binocular vision SLAM-based dynamic scene dense modeling method
CN110599522A (en) Method for detecting and removing dynamic target in video sequence
CN112085031A (en) Target detection method and system
CN112446882A (en) Robust visual SLAM method based on deep learning in dynamic scene
CN106651921B (en) Motion detection method and method for avoiding and tracking moving target
CN116051980B (en) Building identification method, system, electronic equipment and medium based on oblique photography
CN113920254B (en) Monocular RGB (Red Green blue) -based indoor three-dimensional reconstruction method and system thereof
CN114648639B (en) Target vehicle detection method, system and device
WO2023030062A1 (en) Flight control method and apparatus for unmanned aerial vehicle, and device, medium and program
KR20160039447A (en) Spatial analysis system using stereo camera.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant