CN109784315B - Tracking detection method, device and system for 3D obstacle and computer storage medium

Info

Publication number
CN109784315B
Authority
CN
China
Prior art keywords
obstacle
feature vector
current
tracking detection
image
Prior art date
Legal status
Active
Application number
CN201910126019.3A
Other languages
Chinese (zh)
Other versions
CN109784315A (en)
Inventor
Du Xinxin (杜新新)
Current Assignee
Suzhou Fengtu Intelligent Technology Co., Ltd.
Original Assignee
Suzhou Fengtu Intelligent Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Suzhou Fengtu Intelligent Technology Co., Ltd.
Priority to CN201910126019.3A
Publication of CN109784315A
Application granted
Publication of CN109784315B

Abstract

The invention provides a tracking detection method, apparatus, and system for 3D obstacles, and a computer storage medium. The tracking detection method comprises the following steps: determining a second 2D feature vector corresponding to the 2D image region of each current obstacle in the 2D image, the current obstacles being detected from the 3D point cloud and 2D image of the current frame; comparing each second 2D feature vector with each first 2D feature vector in an obstacle feature vector set to obtain a plurality of difference feature vectors, wherein the obstacle feature vector set stores first 2D feature vectors characterizing previously detected obstacles; performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each indicating the probability that a current obstacle and a previous obstacle are the same obstacle; and determining the correspondence between the current obstacles and the previous obstacles according to the probability values to realize obstacle tracking.

Description

Tracking detection method, device and system for 3D obstacle and computer storage medium
Technical Field
The present invention relates to obstacle tracking detection technology, and in particular to a tracking detection method for 3D obstacles, a tracking detection apparatus for 3D obstacles, a tracking detection system for 3D obstacles, and a computer storage medium.
Background
Existing obstacle detection technology mainly performs 2D obstacle detection based on a camera, or 3D obstacle detection based on a 3D lidar alone.
In autonomous vehicle applications, a 2D bounding box provides only limited information to the planning unit and the decision unit; for an autonomous vehicle, decision making also requires detailed and accurate 3D information about each vehicle, including its size, driving direction, and position relative to the ego vehicle. Furthermore, although 2D image-based deep learning techniques have demonstrated high detection accuracy in vehicle obstacle detection applications, they cannot support speed estimation, which is essential both for planning algorithms that look ahead in time and for obstacle tracking detection techniques.
Cameras and LiDAR (Light Detection and Ranging) scanners are the two most commonly used sensors in autonomous vehicle sensing systems. Due to perspective distortion, a camera alone cannot acquire the accurate 3D information an autonomous driving system needs. Even with stereo camera systems, depth estimation from the acquired images still does not reach a satisfactory level of performance.
A typical 64-beam lidar can easily generate more than 100,000 points per scan, yielding accurate 3D information including vehicle size, driving direction, and the relative positions of other vehicles and the ego vehicle. However, as the detection space expands, the size and resolution of the required lidar point cloud grow with the third power of its extent. Owing to limits on memory and computation time, it is infeasible to apply a search algorithm or convolution operation over the entire point cloud, which greatly limits tracking accuracy and leads to missed and false detections. Thus, a major challenge in processing lidar point clouds is to preserve the accuracy of 3D spatial patterns and information while reducing the computational burden.
In summary, there is a need in the art for an obstacle tracking detection technique that can efficiently acquire high-quality 3D spatial patterns and information to improve the obstacle tracking detection efficiency and accuracy of an autonomous vehicle.
Disclosure of Invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
In order to efficiently acquire high-quality 3D spatial patterns and information to improve the efficiency and accuracy of tracking and detecting obstacles of an autonomous vehicle, the present invention provides a tracking and detecting method for a 3D obstacle, a tracking and detecting apparatus for a 3D obstacle, a tracking and detecting system for a 3D obstacle, and a computer storage medium.
The tracking detection method for 3D obstacles provided by the invention is used to track detected obstacles and comprises the following steps:
determining at least one second 2D feature vector corresponding to the 2D image region of at least one current obstacle in the 2D image, wherein the at least one current obstacle is detected from the 3D point cloud and 2D image of the current frame;
comparing each of the at least one second 2D feature vector with each of first 2D feature vectors in a set of obstacle feature vectors to obtain a plurality of difference feature vectors, wherein the set of obstacle feature vectors stores first 2D feature vectors characterizing at least one previously detected obstacle;
performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle and a previous obstacle are the same obstacle; and
determining a correspondence between the at least one current obstacle and the at least one previous obstacle according to the plurality of probability values to enable obstacle tracking.
Preferably, in the tracking detection method provided by the present invention, determining the at least one second 2D feature vector may include: respectively performing feature extraction on the 2D image regions of the at least one current obstacle in the 2D image to generate at least one corresponding 2D feature vector as the at least one second 2D feature vector.
Preferably, in the tracking detection method provided by the present invention, the performing feature extraction may further include: performing, at an image overall depth feature layer of a deep learning framework for 2D obstacle recognition, an ROI pooling operation for each 2D image region to generate the at least one second 2D feature vector.
Optionally, in the tracking detection method provided by the present invention, the method may further include:
inputting at least one 2D feature vector extracted from the 2D image regions in the 2D image into a convolutional layer and its associated linear rectification layer, and then into a fully connected layer and its associated linear rectification layer, to perform calculations that generate at least one enhanced 2D feature vector as the at least one second 2D feature vector.
Optionally, in the tracking detection method provided by the present invention, performing a deep learning calculation on the plurality of difference feature vectors may include: inputting each difference feature vector into two fully connected layers to perform calculations that yield the probability values corresponding to the difference feature vectors.
Optionally, in the above tracking detection method provided by the present invention, the determining a correspondence between the at least one current obstacle and the at least one previous obstacle according to the plurality of probability values may include:
for each current obstacle, matching it with the previous obstacle having the highest probability value above the threshold; and
regarding the matched current obstacle and previous obstacle as the same obstacle, updating the first 2D feature vector corresponding to that obstacle in the obstacle feature vector set to the corresponding second 2D feature vector, and adding the second 2D feature vectors of newly identified current obstacles to the obstacle feature vector set.
Preferably, in the above tracking detection method provided by the present invention, regarding the matched current obstacle and previous obstacle as the same obstacle may specifically include:
performing 3D position information confirmation on each pair of successfully matched current and previous obstacles, and in response to the confirmation passing, regarding the matched current obstacle and previous obstacle as the same obstacle; otherwise, adding the second 2D feature vector of the current obstacle that fails confirmation to the obstacle feature vector set.
Preferably, in the above tracking detection method provided by the present invention, the performing 3D position information confirmation on each pair of the current obstacle and the previous obstacle that are successfully matched may include:
determining a spatial range according to the position, moving speed, and potential turning of the previous obstacle, the position, moving speed, and potential turning of the observation point, and the time difference; the confirmation passes in response to the current obstacle lying within the spatial range, and fails otherwise.
Optionally, in the tracking detection method provided by the present invention, the method may further include:
determining the speed of an obstacle from the position change and time difference between the current obstacle and the previous obstacle regarded as the same obstacle.
According to another aspect of the present invention, there is also provided herein a tracking detection apparatus for a 3D obstacle.
The tracking detection device according to the present invention is configured to perform tracking on a detected obstacle, and the tracking detection device includes:
a memory storing first 2D feature vectors characterizing at least one previously detected obstacle; and
a processor coupled to the memory, the processor configured to:
determining at least one second 2D feature vector corresponding to the 2D image region of at least one current obstacle in the 2D image, wherein the at least one current obstacle is detected from the 3D point cloud and 2D image of the current frame;
comparing each of the at least one second 2D feature vector to each first 2D feature vector in the set of obstacle feature vectors to obtain a plurality of difference feature vectors;
performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle and a previous obstacle are the same obstacle;
and
determining a correspondence between the at least one current obstacle and the at least one previous obstacle according to the plurality of probability values to enable obstacle tracking.
Preferably, in the above tracking detection apparatus provided by the present invention, the processor may be further configured to:
respectively performing feature extraction on the 2D image regions of the at least one current obstacle in the 2D image to generate at least one corresponding 2D feature vector as the at least one second 2D feature vector.
Preferably, in the above tracking detection apparatus provided by the present invention, the processor may be further configured to:
performing, at an image overall depth feature layer of a deep learning framework for 2D obstacle recognition, an ROI pooling operation for each 2D image region to generate the at least one second 2D feature vector.
Optionally, in the tracking detection apparatus provided in the present invention, the processor may be further configured to:
inputting at least one 2D feature vector extracted from the 2D image regions in the 2D image into a convolutional layer and its associated linear rectification layer, and then into a fully connected layer and its associated linear rectification layer, to perform calculations that generate at least one enhanced 2D feature vector as the at least one second 2D feature vector.
Optionally, in the tracking detection apparatus provided in the present invention, the processor may be further configured to:
performing a deep learning calculation on the plurality of difference feature vectors comprises: inputting each difference feature vector into two fully connected layers to perform calculations that yield the probability values corresponding to the difference feature vectors.
Optionally, in the tracking detection apparatus provided in the present invention, the processor may be further configured to:
for each current obstacle, matching it with the previous obstacle having the highest probability value above the threshold; and
regarding the matched current obstacle and previous obstacle as the same obstacle, updating the first 2D feature vector corresponding to that obstacle in the obstacle feature vector set to the corresponding second 2D feature vector, and adding the second 2D feature vectors of newly identified current obstacles to the obstacle feature vector set.
Preferably, in the above tracking detection apparatus provided by the present invention, the processor may be further configured to:
performing 3D position information confirmation on each pair of successfully matched current and previous obstacles, and in response to the confirmation passing, regarding the matched current obstacle and previous obstacle as the same obstacle; otherwise, adding the second 2D feature vector of the current obstacle that fails confirmation to the obstacle feature vector set.
Preferably, in the above tracking detection apparatus provided by the present invention, the processor may be further configured to:
determining a spatial range according to the position, moving speed, and potential turning of the previous obstacle, the position, moving speed, and potential turning of the observation point, and the time difference; the confirmation passes in response to the current obstacle lying within the spatial range, and fails otherwise.
Optionally, in the tracking detection apparatus provided in the present invention, the processor may be further configured to:
determining the speed of an obstacle from the position change and time difference between the current obstacle and the previous obstacle regarded as the same obstacle.
According to another aspect of the present invention, there is also provided herein a tracking detection system for 3D obstacles.
The tracking detection system provided by the invention comprises:
an image capturing device for acquiring a 2D image;
a point cloud data capturing device for acquiring a 3D point cloud; and
any of the above tracking detection devices.
According to another aspect of the present invention, there is also provided a computer storage medium having a computer program stored thereon, which when executed by a processor, may implement the steps of any of the above-mentioned methods for tracking and detecting 3D obstacles.
Drawings
The above features and advantages of the present disclosure will be better understood upon reading the detailed description of embodiments of the disclosure in conjunction with the following drawings. In the drawings, components are not necessarily drawn to scale, and components having similar relative characteristics or features may have the same or similar reference numerals.
Fig. 1 is a schematic flow chart of a tracking detection method for a 3D obstacle according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a method for determining the spatial range R according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a tracking detection apparatus for a 3D obstacle according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a tracking detection system for a 3D obstacle according to an embodiment of the present invention.
Reference numerals
101-104: steps of the tracking detection method for 3D obstacles;
30: tracking detection apparatus for 3D obstacles;
31: memory;
32: processor;
40: tracking detection system for 3D obstacles;
41: image capturing device;
42: point cloud data capturing device.
Detailed Description
The following description of the embodiments of the present invention is provided for illustrative purposes, and other advantages and effects of the present invention will become apparent to those skilled in the art from the present disclosure. While the invention will be described in connection with the preferred embodiments, there is no intent to limit its features to those embodiments. On the contrary, the invention is described in connection with the embodiments for the purpose of covering alternatives or modifications that may be extended based on the claims of the present invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention; the invention may, however, be practiced without these particulars. Moreover, some specific details are omitted from the description so as not to obscure the focus of the invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; as a mechanical or electrical connection; as a direct connection or an indirect connection through an intermediate medium; or as internal communication between two elements. The specific meanings of these terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Additionally, the terms "upper," "lower," "left," "right," "top," "bottom," "horizontal," "vertical" and the like used in the following description are to be understood as referring to the orientations illustrated in the associated drawings. These relative terms are used for convenience of description only and do not imply that the described apparatus must be constructed or operated in a particular orientation, and therefore should not be construed as limiting the invention.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, regions, layers and/or sections, these elements, regions, layers and/or sections should not be limited by these terms, but rather are used to distinguish one element, region, layer and/or section from another element, region, layer and/or section. Thus, a first component, region, layer or section discussed below could be termed a second component, region, layer or section without departing from some embodiments of the present invention.
In order to be able to efficiently acquire high-quality 3D spatial patterns and information to improve obstacle tracking detection efficiency and accuracy of an autonomous vehicle, embodiments of a tracking detection method for a 3D obstacle, embodiments of a tracking detection apparatus for a 3D obstacle, embodiments of a tracking detection system for a 3D obstacle, and embodiments of a computer storage medium are provided.
As shown in fig. 1, the tracking detection method for a 3D obstacle provided by this embodiment may be used to perform tracking on a detected obstacle, and the tracking detection method may include:
101: determining at least one second 2D feature vector corresponding to the 2D image region of at least one current obstacle in the 2D image, the at least one current obstacle being detected from the 3D point cloud and 2D image of the current frame.
In autonomous vehicle applications, the obstacles may include, but are not limited to, other vehicles on the road. While the autonomous vehicle is driving, the relative distance between vehicle obstacles and the ego vehicle changes constantly, so these obstacles must be detected accurately and efficiently and tracked in real time to support driving decisions.
The 2D image of an obstacle may be captured with a camera. The 3D point cloud can be obtained by scanning the surroundings of the vehicle with a lidar while the camera takes the picture. From the relative pose between the lidar and the camera and the camera's internal parameters (focal length, optical center, etc.), the bounding box of each obstacle scanned by the lidar can be mapped to a corresponding image region in the 2D image.
From a frame of 3D point cloud and its corresponding 2D image, a person skilled in the art can, by existing or future technical means, accurately and efficiently detect each obstacle around the ego vehicle at the current moment, together with detailed and accurate 3D information including but not limited to vehicle size, driving direction, and relative position. By mapping the 3D point cloud into the 2D image, the corresponding 2D image region of each detected obstacle can be determined. For example, each vertex of an obstacle's bounding box in the 3D point cloud may be projected into the 2D image to determine the contour of its 2D image region.
Those skilled in the art will appreciate that the above-described manner of mapping the 3D point cloud into the 2D image is only one specific approach to determining the 2D image region. In other embodiments, a person skilled in the art may determine the 2D image area in other ways.
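As one illustration of such a projection, the sketch below maps the eight vertices of a 3D bounding box into the image plane with a pinhole camera model and takes the enclosing rectangle as the 2D image region. It is a minimal sketch under assumed calibration inputs (an intrinsic matrix K and lidar-to-camera extrinsics R, t); the names and shapes are illustrative, not prescribed by the patent.

import numpy as np

def project_box_to_image(box_corners, K, R, t):
    """box_corners: (8, 3) vertices of a 3D bounding box in lidar coordinates.
    K: (3, 3) camera intrinsics; R: (3, 3) and t: (3,) lidar-to-camera pose.
    Returns (x_min, y_min, x_max, y_max) of the enclosing 2D image region."""
    cam_pts = box_corners @ R.T + t        # transform into camera coordinates
    cam_pts = cam_pts[cam_pts[:, 2] > 0]   # keep vertices in front of the camera
    pix = cam_pts @ K.T                    # pinhole projection
    pix = pix[:, :2] / pix[:, 2:3]         # normalize by depth to pixel coordinates
    x_min, y_min = pix.min(axis=0)
    x_max, y_max = pix.max(axis=0)
    return x_min, y_min, x_max, y_max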
A person skilled in the art may determine the at least one second 2D feature vector by performing feature extraction on the 2D image region of each detected current obstacle in the 2D image, generating a corresponding number of 2D feature vectors. A 2D feature vector may be a multi-dimensional vector used to characterize the relevant information of a vehicle obstacle in the 2D image.
The performing of the feature extraction may include: at the image overall depth feature layer of the deep learning framework for 2D obstacle identification (e.g., the conv5_3 layer in the Faster R-CNN deep learning framework), performing an ROI (region of interest) pooling operation for each 2D image region to generate the second 2D feature vectors.
As an example, the configuration parameters of the Faster R-CNN framework described above are shown in Table 1.
Table 1: parameter settings of the Faster R-CNN framework (reproduced only as images in the source document).
The image overall depth feature layer may be a deep learning layer, which is a technical means known to those skilled in the art and is described in detail in the Caffe framework documentation (http://caffe.berkeleyvision.org/).
Those skilled in the art will appreciate that the conv5_3 layer described above is only one specific image overall depth feature layer. In other embodiments, corresponding to other deep learning frameworks such as the Fast R-CNN and MS-CNN frameworks, those skilled in the art may likewise perform the ROI pooling operation on each 2D image region at the overall image depth feature layer to generate the 2D feature vector corresponding to each region.
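As a minimal sketch of this pooling step, assuming a Faster R-CNN-style backbone whose conv5_3 map has 512 channels and a feature stride of 16 (typical of a VGG16 backbone, and an assumption here rather than a value fixed by the patent), torchvision's ROI pooling can turn each 2D image region into one second 2D feature vector:

import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 512, 38, 50)        # conv5_3 output for one image
boxes = torch.tensor([[ 40.,  80., 200., 240.],  # 2D image regions (x1, y1, x2, y2)
                      [300., 100., 460., 300.]]) # in pixel coordinates

pooled = roi_pool(feature_map, [boxes], output_size=(7, 7),
                  spatial_scale=1.0 / 16)        # map pixel boxes onto the feature map
second_2d_vectors = pooled.flatten(start_dim=1)  # one feature vector per obstacle region
print(second_2d_vectors.shape)                   # torch.Size([2, 25088])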
Alternatively, a person skilled in the art may also input the 2D feature vectors extracted from the 2D image regions in the 2D image into a convolutional layer and its associated linear rectification layer (channels: 512, padding: 1, kernel: 3), followed by a fully connected layer and its associated linear rectification layer (channels: 256), to compute a corresponding number of enhanced 2D feature vectors to serve as the second 2D feature vectors. The enhanced 2D feature vectors may encode additional features useful for object tracking.
A convolutional layer may be composed of several convolution units, and the parameters of each convolution unit may be optimized by a back-propagation algorithm. The purpose of the convolution operation includes, but is not limited to, extracting different features of the input. A first convolutional layer may extract only low-level features (at the level of edges, lines, corners, and so on), while a convolutional neural network composed of more convolutional layers can iteratively extract more complex features from those low-level features.
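A minimal sketch of this optional enhancement branch, using the channel, padding, and kernel settings quoted above (the 7×7 pooled input size is carried over from the ROI pooling sketch and is an assumption):

import torch
import torch.nn as nn

class EnhancedFeatureBranch(nn.Module):
    def __init__(self, in_channels=512, pooled_size=7, out_dim=256):
        super().__init__()
        # convolutional layer (channels: 512, kernel: 3, padding: 1) + ReLU
        self.conv = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)
        # fully connected layer (channels: 256) + ReLU
        self.fc = nn.Linear(512 * pooled_size * pooled_size, out_dim)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, pooled):            # pooled: (N, 512, 7, 7) region features
        x = self.relu(self.conv(pooled))  # convolution + linear rectification
        x = x.flatten(start_dim=1)
        return self.relu(self.fc(x))      # (N, 256) enhanced 2D feature vectors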
102: comparing each of the at least one second 2D feature vector with each of the first 2D feature vectors in the set of obstacle feature vectors to obtain a plurality of difference feature vectors, wherein the set of obstacle feature vectors stores first 2D feature vectors characterizing at least one previously detected obstacle.
As described above, since the relative distance between the obstacle and the vehicle changes constantly during the driving of the autonomous vehicle, it is necessary to accurately and efficiently detect the obstacle and perform real-time tracking to make a driving decision.
A person skilled in the art may store the second 2D feature vectors of the obstacles detected in each previous frame's 3D point cloud and 2D image into the obstacle feature vector set as first 2D feature vectors.
By comparing (for example, subtracting) the second 2D feature vectors of the obstacles detected in the current frame with the first 2D feature vectors in the obstacle feature vector set, a corresponding number of difference feature vectors can be obtained, representing how the obstacle-related information changed over the time difference between the two frames of 3D point cloud and 2D image. A difference feature vector may be a multi-dimensional vector of the same dimensionality as the 2D feature vectors.
Those skilled in the art will understand that the obstacle feature vector set may contain the first 2D feature vectors of multiple different obstacles. When an obstacle detected in a subsequent frame's 3D point cloud and 2D image is determined to be the same as an obstacle already in the set, that obstacle's second 2D feature vector from the subsequent frame is stored in the set, updating its first 2D feature vector for comparison against obstacles detected in still later frames.
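A minimal sketch of the comparison step, taking subtraction (the example given above) as the comparison and assuming the first and second 2D feature vectors share one dimensionality:

import torch

def pairwise_differences(second_vecs, first_vecs):
    """second_vecs: (M, D) vectors of current obstacles; first_vecs: (N, D)
    vectors in the obstacle feature vector set.
    Returns an (M, N, D) tensor holding one difference feature vector per
    (current obstacle, previous obstacle) pair."""
    return second_vecs.unsqueeze(1) - first_vecs.unsqueeze(0)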
103: performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle is the same obstacle as a previous obstacle.
The above-described deep learning calculation may include inputting each difference feature vector into two fully connected layers (first layer: 256 channels; second layer: 2 channels) to perform a calculation that yields the probability value corresponding to that difference feature vector. Through this deep learning calculation, the probability that each current obstacle is the same as each previous obstacle can be obtained.
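A minimal sketch of that calculation, with the two fully connected layers sized as quoted above; the input dimensionality D and the softmax used to read the result off as a probability are assumptions:

import torch
import torch.nn as nn

class SameObstacleHead(nn.Module):
    def __init__(self, in_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, 256)  # first fully connected layer, channel 256
        self.fc2 = nn.Linear(256, 2)       # second fully connected layer, channel 2
        self.relu = nn.ReLU(inplace=True)

    def forward(self, diff):               # diff: (M, N, D) difference feature vectors
        logits = self.fc2(self.relu(self.fc1(diff)))
        # probability that each (current, previous) pair is the same obstacle
        return logits.softmax(dim=-1)[..., 1]   # shape (M, N)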
104: determining the correspondence between at least one current obstacle and at least one previous obstacle according to the plurality of probability values so as to realize obstacle tracking.
By presetting a threshold for deciding whether a current obstacle and a previous obstacle are the same obstacle, a person skilled in the art can, for each current obstacle, match it with the previous obstacle having the maximum probability value above the threshold. The matched current and previous obstacles are regarded as the same obstacle, and the corresponding first 2D feature vector in the obstacle feature vector set is updated to the corresponding second 2D feature vector; the second 2D feature vectors of unmatched, i.e. newly identified, current obstacles are added to the set.
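A minimal sketch of this matching rule; the threshold value 0.5 is an illustrative assumption. Current obstacles absent from the returned mapping are the newly identified ones whose second 2D feature vectors are added to the set:

import numpy as np

def match_obstacles(prob, threshold=0.5):
    """prob: (M, N) array of same-obstacle probabilities.
    Returns {current_index: previous_index} for every current obstacle whose
    highest probability value exceeds the threshold."""
    matches = {}
    for i in range(prob.shape[0]):
        j = int(np.argmax(prob[i]))  # previous obstacle with the highest probability
        if prob[i, j] > threshold:
            matches[i] = j
    return matches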
Preferably, a person skilled in the art may further perform 3D position information confirmation on each pair of successfully matched current and previous obstacles. In response to the confirmation passing, the current obstacle and the previous obstacle are regarded as the same obstacle; otherwise, the second 2D feature vector of the current obstacle that fails confirmation is added to the obstacle feature vector set as a new obstacle.
The performing of the 3D position information confirmation on each pair of the current obstacle and the previous obstacle that are successfully matched may include:
a spatial range is determined based on the position, moving speed, and potential turning of the previous obstacle, the position, moving speed, and potential turning of the observation point, and the time difference. In response to the current obstacle lying within this spatial range, the obstacle is considered able to have reached the new position within the time difference, and the confirmation passes; otherwise, the obstacle is considered unable to have reached the new position within the time difference, and the confirmation fails.
Specifically, the above-mentioned spatial range R may be determined by a method as shown in fig. 2.
As shown in fig. 2, assume that the observation point (the ego vehicle) is initially at point O, and that the previous obstacle is at point A, traveling along its body direction (toward B or C) at a maximum speed of 135 km/h.
From this maximum speed and the time difference, the farthest points B and C it could reach can be found. If the observation point also travels at its maximum speed along the positive z-axis during this time difference, the region where the obstacle may be located at the current time, relative to the observation point, can be represented by the polygon BCDE.
The obstacle may also turn left or right within the time difference rather than travel straight along its body direction. To compensate for such turning, the lines BC and DE can each be translated outward by a distance d (0.5 meters) to yield B'C' and D'E'. The smallest rectangle that can cover B'C'D'E' is then B″C″D″E″.
Similarly, the ego vehicle itself may turn. To compensate for this, the lines B″C″ and D″E″ may be rotated by an angle θ (0.05) around point O to obtain the lines FG and KH, thereby determining the spatial range R.
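The sketch below approximates this construction with plain polygon arithmetic: the reachable segment along the obstacle's body direction is widened by the turn allowance d and swept by the rotation allowance θ about the observation point O, and the current obstacle is tested against the resulting region. Shapely is used only for convenience; the buffered-segment simplification, and treating 0.05 as radians, are assumptions rather than the patent's prescription.

import numpy as np
from shapely.geometry import Point, LineString
from shapely.affinity import rotate

def within_spatial_range(prev_pos, heading, speed_max, dt, current_pos,
                         d=0.5, theta=0.05):
    """prev_pos: (x, y) of point A; heading: obstacle body direction;
    speed_max in m/s; dt: time difference in s; theta in radians."""
    a = np.asarray(prev_pos, dtype=float)
    h = np.asarray(heading, dtype=float)
    h /= np.linalg.norm(h)
    reach = speed_max * dt
    segment = LineString([tuple(a - reach * h), tuple(a + reach * h)])  # B--C
    region = segment.buffer(d)                      # widen by the turn allowance d
    swept = region
    for ang in (-np.degrees(theta), np.degrees(theta)):
        swept = swept.union(rotate(region, ang, origin=(0.0, 0.0)))  # sweep about O
    return swept.contains(Point(current_pos[0], current_pos[1]))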
It will be appreciated by those skilled in the art that the method of determining the spatial range R as illustrated in fig. 2 is only one embodiment. In other embodiments, a person skilled in the art may determine the above spatial range R in other ways.
Optionally, a person skilled in the art may further determine the speed of an obstacle from the position change and time difference between the current obstacle and the previous obstacle regarded as the same obstacle, so as to support planning algorithms that look ahead in time as well as obstacle tracking detection.
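A minimal sketch of this estimate; positions are assumed to be in meters and the time difference in seconds:

import numpy as np

def estimate_velocity(prev_pos, curr_pos, dt):
    """Velocity vector of an obstacle regarded as the same across two frames."""
    return (np.asarray(curr_pos) - np.asarray(prev_pos)) / dt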
While, for simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as, in accordance with one or more embodiments, some acts may occur in different orders and/or concurrently with other acts that are shown and described herein or that are not shown and described herein.
According to another aspect of the present invention, there is also provided herein embodiments of a tracking detection apparatus for 3D obstacles.
As shown in fig. 3, the tracking detection apparatus 30 provided in this embodiment may be used to perform tracking on a detected obstacle. The tracking detection apparatus 30 may include a memory 31 and a processor 32 coupled to the memory 31.
The memory may store first 2D feature vectors characterizing at least one previously detected obstacle, so that they can be compared with the second 2D feature vectors of at least one current obstacle to determine whether the two are the same obstacle.
The processor 32 may be configured to determine at least one second 2D feature vector corresponding to a 2D image region of the at least one current obstacle in the 2D image, which is detected from the 3D point cloud of the current frame and the 2D image; comparing each of the at least one second 2D feature vector with each of the first 2D feature vectors in the set of obstacle feature vectors to obtain a plurality of difference feature vectors; performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle and a previous obstacle are the same obstacle; and determining a corresponding relation between at least one current obstacle and at least one previous obstacle according to the probability values so as to realize obstacle tracking.
Those skilled in the art will appreciate that the above-described configuration of the processor 32 is but one specific approach to implementing a tracking detection method for 3D obstacles. In other embodiments, the processor 32 may be further configured to implement any of the above tracking detection methods for 3D obstacles.
According to another aspect of the present invention, there is also provided herein an embodiment of a tracking detection system 40 for 3D obstacles.
As shown in fig. 4, the tracking detection system 40 may include an image capturing device 41 for acquiring a 2D image; a point cloud data capturing device 42 for acquiring a 3D point cloud; and any of the tracking detection apparatuses 30 described above. The image capturing device 41 may include, but is not limited to, a still camera or a video camera. The point cloud data capturing device 42 may include, but is not limited to, a lidar.
According to another aspect of the present invention, there is also provided herein an embodiment of a computer storage medium.
The computer storage medium has a computer program stored thereon. The computer program, when executed by a processor, may implement the steps of any of the above-described methods for tracking detection of 3D obstacles.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The processors described herein may be implemented using electronic hardware, computer software, or any combination thereof. Whether such processors are implemented as hardware or software depends upon the particular application and the overall design constraints imposed on the system. As an example, a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented with a microprocessor, a microcontroller, a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a state machine, gated logic, discrete hardware circuitry, and other suitable processing components configured to perform the various functions described throughout this disclosure. The functionality of a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented in software executed by a microprocessor, microcontroller, DSP, or other suitable platform.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk (disk) and disc (disc), as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks (disks) usually reproduce data magnetically, while discs (discs) reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (20)

1. A tracking detection method for a 3D obstacle for performing tracking on a detected obstacle, the tracking detection method comprising:
determining at least one second 2D feature vector corresponding to the 2D image region of at least one current obstacle in the 2D image, wherein the at least one current obstacle is detected from the 3D point cloud and 2D image of the current frame;
comparing each of the at least one second 2D feature vector with each first 2D feature vector in a set of obstacle feature vectors to obtain a plurality of difference feature vectors, wherein the set of obstacle feature vectors stores first 2D feature vectors characterizing at least one previously detected obstacle;
performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle and a previous obstacle are the same obstacle; and
determining a correspondence between the at least one current obstacle and the at least one previous obstacle according to the plurality of probability values to achieve obstacle tracking, specifically comprising:
for each current obstacle, matching the current obstacle with the previous obstacle having the highest probability value above a threshold; and performing 3D position information confirmation on each pair of successfully matched current and previous obstacles, and regarding the matched current obstacle and previous obstacle as the same obstacle in response to the confirmation passing.
2. The tracking detection method of claim 1, wherein determining the at least one second 2D feature vector comprises respectively performing feature extraction on the 2D image regions of the at least one current obstacle in the 2D image to generate at least one corresponding 2D feature vector as the at least one second 2D feature vector.
3. The tracking detection method of claim 2, wherein said performing feature extraction comprises performing an ROI pooling operation, at an image overall depth feature layer of a deep learning framework for 2D obstacle recognition, for each 2D image region to generate the at least one second 2D feature vector.
4. The tracking detection method of claim 2, further comprising:
inputting at least one 2D feature vector extracted from a 2D image region in the 2D image into a convolutional layer and its associated linear rectifying layer and a fully-connected layer and its associated linear rectifying layer to perform a calculation to generate at least one enhanced 2D feature vector as the at least one second 2D feature vector.
5. The tracking detection method of claim 1, wherein performing a deep learning computation on the plurality of difference feature vectors comprises inputting each difference feature vector into two fully connected layers to perform a computation to obtain the plurality of probability values corresponding to the plurality of difference feature vectors.
6. The tracking detection method as claimed in claim 1, wherein said determining a correspondence between said at least one current obstacle and said at least one previous obstacle according to said plurality of probability values further comprises:
updating the first 2D feature vector corresponding to the same obstacle in the obstacle feature vector set to the corresponding second 2D feature vector, and adding the second 2D feature vector of a newly identified current obstacle to the obstacle feature vector set.
7. The tracking detection method of claim 1, wherein in response to the 3D location information confirming failure, a second 2D feature vector of the current obstacle failing to confirm is added to the set of obstacle feature vectors.
8. The tracking detection method of claim 7, wherein said performing 3D position information validation for each pair of current and previous obstacles that successfully matched comprises:
determining a spatial range according to the position, moving speed, and potential turning of the previous obstacle, the position, moving speed, and potential turning of the observation point, and the time difference, wherein the confirmation passes in response to the current obstacle lying within the spatial range, and fails otherwise.
9. The tracking detection method according to claim 6, further comprising:
determining the speed of the obstacle from the position change and the time difference between the current obstacle and the previous obstacle regarded as the same obstacle.
10. A tracking detection apparatus for a 3D obstacle for performing tracking on a detected obstacle, the tracking detection apparatus comprising:
a memory having stored therein a first 2D feature vector characterizing at least one previous obstacle previously detected; and
a processor coupled to the memory, the processor configured to:
determining at least one second 2D feature vector corresponding to the 2D image region of at least one current obstacle in the 2D image, wherein the at least one current obstacle is detected from the 3D point cloud and 2D image of the current frame;
comparing each of the at least one second 2D feature vector to each first 2D feature vector in the set of obstacle feature vectors to obtain a plurality of difference feature vectors;
performing a deep learning calculation on the plurality of difference feature vectors to generate a plurality of corresponding probability values, each probability value indicating a probability that a current obstacle and a previous obstacle are the same obstacle; and
determining a correspondence between the at least one current obstacle and the at least one previous obstacle according to the plurality of probability values to achieve obstacle tracking, specifically comprising:
for each current obstacle, matching the current obstacle with the previous obstacle having the highest probability value above a threshold; and performing 3D position information confirmation on each pair of successfully matched current and previous obstacles, and regarding the matched current obstacle and previous obstacle as the same obstacle in response to the confirmation passing.
11. The tracking detection apparatus of claim 10, wherein the processor is further configured to:
respectively performing feature extraction on the 2D image regions of the at least one current obstacle in the 2D image to generate at least one corresponding 2D feature vector as the at least one second 2D feature vector.
12. The tracking detection apparatus of claim 11, wherein the processor is further configured to:
performing an ROI pooling operation, at an image overall depth feature layer of a deep learning framework for 2D obstacle recognition, for each 2D image region to generate the at least one second 2D feature vector.
13. The tracking detection apparatus of claim 11, wherein the processor is further configured to:
inputting at least one 2D feature vector extracted from a 2D image region in the 2D image into a convolutional layer and its associated linear rectifying layer and a fully-connected layer and its associated linear rectifying layer to perform a calculation to generate at least one enhanced 2D feature vector as the at least one second 2D feature vector.
14. The tracking detection apparatus of claim 10, wherein the processor is further configured to:
performing a deep learning computation on the plurality of difference feature vectors comprises inputting each difference feature vector into two fully-connected layers to perform a computation to obtain the plurality of probability values corresponding to the plurality of difference feature vectors.
15. The tracking detection apparatus of claim 10, wherein the processor is further configured to:
updating the first 2D feature vector corresponding to the same obstacle in the obstacle feature vector set to the corresponding second 2D feature vector, and adding the second 2D feature vector of a newly identified current obstacle to the obstacle feature vector set.
16. The tracking detection apparatus of claim 10, wherein the processor is further configured to:
in response to the 3D position information confirming failure, adding a second 2D feature vector of the current obstacle failing to confirm to the obstacle feature vector set.
17. The tracking detection apparatus of claim 16, wherein the processor is further configured to:
determining a spatial range according to the position, moving speed, and potential turning of the previous obstacle, the position, moving speed, and potential turning of the observation point, and the time difference, wherein the confirmation passes in response to the current obstacle lying within the spatial range, and fails otherwise.
18. The tracking detection apparatus of claim 15, wherein the processor is further configured to:
determining the speed of the obstacle from the position change and the time difference between the current obstacle and the previous obstacle regarded as the same obstacle.
19. A tracking detection system for 3D obstacles, comprising:
an image capturing device for acquiring a 2D image;
the point cloud data capturing device is used for acquiring a 3D point cloud; and
a tracking detection apparatus as claimed in any one of claims 10 to 18.
20. A computer storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1-9.
CN201910126019.3A 2019-02-20 2019-02-20 Tracking detection method, device and system for 3D obstacle and computer storage medium Active CN109784315B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910126019.3A CN109784315B (en) 2019-02-20 2019-02-20 Tracking detection method, device and system for 3D obstacle and computer storage medium

Publications (2)

Publication Number Publication Date
CN109784315A CN109784315A (en) 2019-05-21
CN109784315B 2021-11-09

Family

ID=66504663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910126019.3A Active CN109784315B (en) 2019-02-20 2019-02-20 Tracking detection method, device and system for 3D obstacle and computer storage medium

Country Status (1)

Country Link
CN (1) CN109784315B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541416B * 2020-12-02 2023-07-14 DeepBlue Technology (Shanghai) Co., Ltd. Cross-radar obstacle tracking method, device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335984A * 2014-06-19 2016-02-17 Ricoh Co., Ltd. Method and apparatus for tracking object
CN106096516A * 2016-06-01 2016-11-09 常州漫道罗孚特网络科技有限公司 Method and device for target tracking
CN107330925A * 2017-05-11 2017-11-07 Beijing Jiaotong University Multi-obstacle detection and tracking method based on lidar depth images
CN107451602A * 2017-07-06 2017-12-08 Zhejiang University of Technology Fruit and vegetable detection method based on deep learning
EP3327625A1 * 2016-11-29 2018-05-30 Autoequips Tech Co., Ltd. Vehicle image processing method and system thereof
CN108734654A * 2018-05-28 2018-11-02 深圳市易成自动驾驶技术有限公司 Mapping and localization method, system and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748869B * 2017-10-26 2021-01-22 Orbbec Inc. 3D face identity authentication method and device
CN108229548A * 2017-12-27 2018-06-29 Huawei Technologies Co., Ltd. Object detection method and device
CN107944435A * 2017-12-27 2018-04-20 广州图语信息科技有限公司 Three-dimensional face recognition method, device and processing terminal
CN108537191B * 2018-04-17 2020-11-20 CloudWalk Technology Co., Ltd. Three-dimensional face recognition method based on structured light camera

Also Published As

Publication number Publication date
CN109784315A (en) 2019-05-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant