CN114677555B

CN114677555B - Iterative optimization type end-to-end intelligent vehicle sensing method and device and electronic equipment

Info

Publication number: CN114677555B
Application number: CN202210200266.5A
Authority: CN
Inventors: 郑四发; 吴浩然; 张创; 许庆; 王建强; 李克强
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2022-03-02
Filing date: 2022-03-02
Publication date: 2024-06-28
Anticipated expiration: 2042-03-02
Also published as: CN114677555A

Abstract

The application relates to the technical field of vehicles, in particular to an iterative optimization type end-to-end intelligent vehicle sensing method, an iterative optimization type end-to-end intelligent vehicle sensing device and electronic equipment, wherein the method comprises the following steps: acquiring perception information of an intelligent vehicle; inputting perception information into an end-to-end network after iterative optimization, executing a detection task, a tracking task and a prediction task, and simultaneously obtaining a detection result, a tracking result and a prediction result, wherein the end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network; and obtaining a perception result of the end-to-end intelligent vehicle based on the detection result, the tracking result and the prediction result. Therefore, the problems of interdependence of three sensing tasks, independent algorithm, lower cooperativity, target loss caused by shielding and the like in tracking, detecting and predicting are solved, and the tracking rate of the shielded object, the robustness of a tracking result and the instantaneity and precision of a sensing scheme are improved through end-to-end detection tracking, three sensing module integration and implementation of an iterative optimization scheme.

Description

Iterative optimization type end-to-end intelligent vehicle sensing method and device and electronic equipment

Technical Field

The application relates to the technical field of vehicles, in particular to an iterative optimization type end-to-end intelligent vehicle sensing method, an iterative optimization type end-to-end intelligent vehicle sensing device and electronic equipment.

Background

The existing intelligent vehicle perception scheme divides perception into three steps of detection, tracking and prediction, and is mainly realized in sequence through a deep learning algorithm. The method of detection-followed-by-tracking (Tracking by Detection) relies on a given accurate recognition model to detect the object, and then connects the results detected at different times through a separate network to complete the tracking task. The detection-followed-by-tracking method effectively utilizes the capability of a target detector based on deep learning, and is a currently dominant detection-tracking paradigm. However, the method takes detection-tracking as two tasks, two networks are required to be constructed for learning, the calculated amount is large, and the real-time performance is relatively poor. Joint Detection AND TRACKING method, such as CENTERTRACK, places the detector on two consecutive frames and on a thermodynamic diagram of the previous trajectory in dots. The detector outputs an offset vector from the center of the current object to its center in the previous frame, which is computationally inexpensive and based on which the offset is sufficient to achieve target correlation, target tracking is accomplished. Such a method greatly reduces the amount of computation in the detection-tracking step, but has poor tracking effect on the occluded object because the offset is computed based on two consecutive frames. Therefore, how to solve the tracking error caused by shielding under the condition of meeting the real-time requirement is a problem to be solved in the current detection-tracking algorithm.

For a prediction algorithm, the main stream method mainly performs feature extraction on factors such as a target historical track, an environment image, road topology information and the like based on a feature extraction network, simulates a feature interaction relation between a target and the environment through a graph neural network, and outputs a target future track prediction result considering interaction. Among them, the target history track information is the most important factor, and once the target history track is missing and shifted (especially, the starting point of the history track), the input of the neural network changes, and the prediction effect may be different from the day to day. Thus, the prediction algorithm is very strongly dependent on the detection-tracking result. Meanwhile, the current detection-tracking algorithm only provides detection results for the prediction algorithm, but cannot be synchronously optimized with the prediction algorithm, and the robustness of the prediction algorithm to error data is also not strong.

Therefore, how to improve the robustness of the prediction algorithm to the error detection-tracking data and make it possible to perform iterative optimization with the detection-tracking algorithm is a problem that needs to be solved at present under the condition that the detection-tracking result cannot be ensured to be completely correct.

Disclosure of Invention

The application provides an iterative optimization type end-to-end intelligent vehicle sensing method, an iterative optimization type end-to-end intelligent vehicle sensing device and electronic equipment, and aims to solve the problems of interdependence of three sensing tasks of tracking, detecting and predicting, independent algorithm, low cooperativity, target loss caused by shielding and the like.

An embodiment of a first aspect of the present application provides an iterative optimization type end-to-end intelligent vehicle sensing method, including the following steps:

acquiring perception information of an intelligent vehicle;

Inputting the perception information into an end-to-end network after iterative optimization, executing a detection task, a tracking task and a prediction task, and simultaneously obtaining a detection result, a tracking result and a prediction result, wherein the end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network; and

And acquiring a perception result of the end-to-end intelligent vehicle based on the detection result, the tracking result and the prediction result.

According to one embodiment of the application, before inputting the perception information into the iteratively optimized end-to-end network, the method further comprises:

and performing end-to-end training on the end-to-end network, wherein in the training process, a preset iterative optimization type training mode is adopted to perform forward propagation of the network to respectively obtain initial results of detection, tracking and prediction, iterative optimization is performed, the prediction network is used for solving an intersection ratio matrix of a tracking part to obtain a new tracking result, the detection result and the tracking result are used for the prediction network to obtain the new prediction result, the new tracking result is obtained by tracking, and iteration is repeated until a network convergence condition is met to obtain the end-to-end network after iterative optimization.

According to one embodiment of the present application, the detection formula of the multi-target detection network is:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

Wherein F ⁱ is a feature vector, IMG ⁱ is image information at the current time, IMG ^i-1 is image information at the last time, O ^i-1 is detection network output information at the last time, C ⁱ is a low-dimensional encoding vector, To detect important features related to the object, O ⁱ is the decoding vector at the current time.

According to one embodiment of the application, the tracking formula of the multi-target tracking network is:

If The target tracking is successful, and the kth target tracking result is that

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Wherein,For the decoding vector of the kth _i object of the ith frame,The predicted output of the ith frame for the kth _i-1 th target of the ith-1 th frame,And (3) calculating the intersection ratio of the kth prediction of the ith frame and the jth detection, wherein i is the ith frame in the detection, and sigma and beta are preset thresholds.

According to one embodiment of the present application, the prediction formula of the multi-target trajectory prediction network is:

Or (b)

Wherein,For a predictive vector based on historical observation information, k is the target,As the predictive vector of the history information,For the interactive results of all targets of the current frame,Predictive vectors for all targets of the current frame.

According to the iterative optimization type end-to-end intelligent vehicle sensing method, sensing information of an intelligent vehicle is obtained and is input into an end-to-end network after iterative optimization, a detection task, a tracking task and a prediction task are executed, and a sensing result of the end-to-end intelligent vehicle is obtained based on the obtained detection result, tracking result and prediction result. The end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network. Therefore, the problems of interdependence of three sensing tasks, independent algorithm, lower cooperativity, target loss caused by shielding and the like in tracking, detecting and predicting are solved, and the tracking rate of the shielded object, the robustness of a tracking result and the instantaneity and precision of a sensing scheme are improved through end-to-end detection tracking, three sensing module integration and implementation of an iterative optimization scheme.

An embodiment of a second aspect of the present application provides an iterative optimized end-to-end intelligent vehicle sensing device, including:

The acquisition module is used for acquiring the perception information of the intelligent vehicle;

The execution module is used for inputting the perception information into the end-to-end network after iterative optimization, executing a detection task, a tracking task and a prediction task, and simultaneously obtaining a detection result, a tracking result and a prediction result, wherein the end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network; and

And the perception module is used for acquiring the perception result of the end-to-end intelligent vehicle based on the detection result, the tracking result and the prediction result.

According to one embodiment of the present application, the sensing module is specifically configured to:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Or (b)

According to the iterative optimization type end-to-end intelligent vehicle sensing device, sensing information of an intelligent vehicle is obtained and input into an end-to-end network after iterative optimization, a detection task, a tracking task and a prediction task are executed, and a sensing result of the end-to-end intelligent vehicle is obtained based on the obtained detection result, tracking result and prediction result. The end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network. Therefore, the problems of interdependence of three sensing tasks, independent algorithm, lower cooperativity, target loss caused by shielding and the like in tracking, detecting and predicting are solved, and the tracking rate of the shielded object, the robustness of a tracking result and the instantaneity and precision of a sensing scheme are improved through end-to-end detection tracking, three sensing module integration and implementation of an iterative optimization scheme.

An embodiment of a third aspect of the present application provides an electronic device, including: the system comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the program to realize the iterative optimized end-to-end intelligent vehicle sensing method according to the embodiment.

An embodiment of a fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor for implementing an iterative optimized end-to-end intelligent vehicle sensing method as described in the above embodiments.

Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.

Drawings

The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow chart of one embodiment of the present application;

FIG. 2 is a schematic diagram of an iterative optimized end-to-end intelligent vehicle sensing scheme provided in accordance with one embodiment of the present application;

FIG. 3 is a schematic diagram of an end-to-end network detection portion according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an end-to-end network tracking portion provided in accordance with one embodiment of the present application;

FIG. 5 is a schematic diagram of a portion of an end-to-end network prediction according to one embodiment of the present application;

FIG. 6 is a block diagram illustration of an iterative optimized end-to-end intelligent vehicle awareness apparatus according to an embodiment of the present application;

fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.

The iterative optimization type end-to-end intelligent vehicle sensing method, device and electronic equipment of the embodiment of the application are described below with reference to the accompanying drawings. Aiming at the problems of interdependence of three sensing tasks, low robustness and instantaneity of data and poor tracking effect of the shielded object, which are mentioned in the background center, the application provides an iterative optimization type end-to-end intelligent vehicle sensing method, in the method,

And acquiring perception information of the intelligent vehicle, inputting the perception information into the end-to-end network after iterative optimization, executing a detection task, a tracking task and a prediction task, and acquiring the perception result of the end-to-end intelligent vehicle based on the acquired detection result, tracking result and prediction result. The end-to-end network comprises a multi-target detection network, a multi-target tracking network and a multi-target track prediction network. Therefore, the problems of interdependence of three sensing tasks, independent algorithm, lower cooperativity, target loss caused by shielding and the like in tracking, detecting and predicting are solved, and the tracking rate of the shielded object, the robustness of a tracking result and the instantaneity and precision of a sensing scheme are improved through end-to-end detection tracking, three sensing module integration and implementation of an iterative optimization scheme.

Specifically, fig. 1 is a schematic flow chart of an iterative optimization type end-to-end intelligent vehicle sensing method according to an embodiment of the present application.

As shown in fig. 1, the iterative optimization type end-to-end intelligent vehicle sensing method comprises the following steps:

In step S101, sensing information of the intelligent vehicle is acquired.

Specifically, the iterative optimization type end-to-end intelligent vehicle sensing method provided by the embodiment of the application mainly comprises two parts, namely an end-to-end detection-tracking-prediction network and an iterative optimization training method, as shown in fig. 2. The method can be divided into three parts of a multi-target detection network, a multi-target tracking network and a multi-target track prediction network according to output results required to be provided by an end-to-end network. The detailed description will be given by way of the following specific examples.

In step S102, the perception information is input to the end-to-end network after iterative optimization, and a detection task, a tracking task and a prediction task are executed, and a detection result, a tracking result and a prediction result are obtained at the same time, where the end-to-end network includes a multi-target detection network, a multi-target tracking network and a multi-target track prediction network.

Further, in some embodiments, before the perceptual information is input to the iteratively optimized end-to-end network, further comprising: and performing end-to-end training on the end-to-end network, wherein in the training process, a preset iterative optimization type training mode is adopted to perform forward propagation of the network to respectively obtain initial results of detection, tracking and prediction, iterative optimization is performed, the prediction network is used for solving an intersection ratio matrix of a tracking part to obtain a new tracking result, the detection result and the tracking result are used for the prediction network to obtain the new prediction result, the new tracking result is obtained by tracking, and iteration is repeated until a network convergence condition is met to obtain the end-to-end network after iterative optimization.

Further, in some embodiments, the detection formula of the multi-target detection network is:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

Further, in some embodiments, the tracking formula for the multi-target tracking network is:

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Further, in some embodiments, the prediction formula for the multi-target trajectory prediction network is:

Or (b)

In step S103, a sensing result of the end-to-end intelligent vehicle is obtained based on the detection result, the tracking result, and the prediction result.

Specifically, first, a multi-objective monitoring network part in an end-to-end network according to an embodiment of the present application will be described. Fig. 3 is a schematic diagram of an end-to-end network monitoring portion according to an embodiment of the present application, as shown in fig. 3. In the multi-target monitoring network, firstly, when multi-target detection is carried out on an ith frame of image, the network input is image information IMG ⁱ at the current moment, image information IMG ^i-1 at the last moment and detection network output information O ^i-1 at the last moment, which are acquired through a plurality of cameras, in order to simultaneously consider the follow-up tracking requirement; secondly, extracting features of the image information by using CNN (Convolutional Neural Network ) to obtain a feature vector F ⁱ, and then reducing the dimension of the feature vector F ⁱ by using FC (Full Connected layer, fully connected layer) to convert the feature vector into a coding vector C ⁱ with lower dimension; finally, the important features related to the detection target in the code vector C ⁱ are known by using the attention mechanismAnd fusing with the code vector, and then using the full connection layer or other decoding network (such as Transformer, etc.) to fuse the code vectorDecoding is carried out to obtain a decoding vector O ⁱ at the current moment. The dimensions of the decoding vector O ⁱ are defined according to the multi-target detection result required by the practical application, and may include a target center position, a target position offset, a target bounding box coordinate, a target bounding box size, an environment detection thermodynamic diagram, and the like. Thus, the specific formula of the detection section is as follows:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)； (1)

Cⁱ＝FC(Fⁱ)； (2)

Next, a multi-target tracking network part in the end-to-end network according to the embodiment of the present application will be described. Fig. 4 is a schematic diagram of an end-to-end network tracking portion according to an embodiment of the present application, as shown in fig. 4. In a multi-target tracking network, the input of the tracking network at the ith frame is the output of the current frame detection network And predicting the current frame image position of the target by the previous frame prediction networkDefining the ith frame target as K _i, tracking network needs to output all the predicted targets of the previous frameDetection result of current framePerforming IOU (Intersection over Union, cross-over ratio) test to obtain cross-over ratio matrixSecondly, regarding the target of the (i-1), if any detection result of the i frame and the IOU of the predicted result of the target are greater than a certain threshold sigma, the target is considered to be successfully tracked, and the displacement vector of the target of the current frame is output as a tracking result (the highest IOU of a plurality of detection results and the predicted result is obtained when the IOU of the plurality of detection results and the predicted result is greater than the threshold); if the predicted result of a certain target of the (i-1) frame is not consistent with any detection result of the current frame, keeping the target in the beta frame after considering the influence caused by shielding, and predicting based on the existing observation result until the predicted result is consistent with the detection result of the future frame, namely returning to the steps again; finally, if the target lost frame number exceeds the set threshold value beta, the target is removed. Observations that do not match any object are defined as new objects. In practical experiments, σ may be set to 0.6 and β to 10. The specific formula of the tracking part is as follows:

Again, the multi-objective trajectory prediction network portion of the end-to-end network of the embodiments of the present application will be described. Fig. 5 is a schematic diagram of an end-to-end network prediction part according to an embodiment of the present application. In the end-to-end network prediction, the track prediction problem needs to observe the image positions of the past continuous m frames of the target, and output a prediction result of the image positions of the future n frames of the target. Therefore, when track prediction is performed on the target k in the ith frame, the three-dimensional convolutional neural network, or the convolutional neural network plus RNN (Recurrent Neural Network, cyclic neural network), is used to predict the image position information of the previous consecutive m frames Feature extraction is carried out to obtain a prediction vector based on historical observation informationPredictive vectors for all targets (K total) based on current frameUsing interactive prediction networks such as GNN (Graph neural network, graphic neural network) and the like, the interactive results of all targets of the current frame can be solvedPredictive vector based on historical observation information for target k using interaction resultsUpdating to obtain a prediction vector considering interaction and history informationFinally, decoding the track prediction result by using a cyclic neural network and other modes, and obtaining the prediction resultAnd outputting. The specific formula of the prediction part is as follows:

Where concat indicates that the two vectors are to be spliced.

Finally, the iterative optimization training method of the embodiment of the application is introduced. As indicated by the arrow in fig. 2, an iterative optimization training mode is adopted during the training process. The output interfaces of the definition detection, tracking and prediction modules are respectively (a, b and c). In the single training process, firstly, forward transmission of the network, namely a- & gt b- & gt c is carried out to obtain initial results a ₀,b₀ and c ₀ of detection, tracking and prediction; and secondly, performing iterative optimization. The prediction network is firstly used for solving the cross-correlation matrix M of the tracking part to obtain a new tracking result b ₁, then the detection result a ₀ and the tracking result b ₁ are used for the prediction network to obtain a new prediction result c ₁, and the new prediction result can be used for the tracking module to obtain a new tracking result b ₂. The iterative process is repeated until the network convergence condition is satisfied, namely:

|c_n-c_n-1|<₁； (15)

|b_n-b_n-1|<₂； (16)

And at this time, ending the iterative optimization process, and outputting a ₀,b_n and c _n of the final iteration as the output result of the detection, tracking and prediction network at the current moment.

It should be noted that, in the internal iterative process of the method described above, because of the difficulty of internal iterative training, the iterative optimization training method is also suitable for external iteration, i.e. after the end-to-end detection-tracking-prediction network training is completed under the non-iterative condition, the same test data is input again for iterative training. The iterative training process is exactly the same as the above description and will not be repeated here. It should be noted that since the network has undergone non-iterative preliminary training, the iterative training at this time has less influence on network parameters, and the network may converge on a local optimum point, but since the training is easier, the expected training period is shorter, and therefore, in practical applications, a trade-off may be performed according to the required accuracy.

Thus, the present example specifically addresses the problems existing in the related art as follows:

(1) Aiming at the problems that the existing detection-tracking-prediction algorithms are independent and have weak cooperativity, the embodiment of the application provides an end-to-end detection tracking framework, three kinds of perception tasks are completed in the same network, and corresponding results are output simultaneously. The perception scheme can greatly improve the instantaneity and the accuracy of the perception tasks, and can realize end-to-end optimization, so that three kinds of interdependent perception tasks are closely related.

(2) Aiming at the problem that the existing detection and tracking scheme is difficult to track the blocked object, the embodiment of the application creatively uses the prediction network for the tracking part, when the object is blocked and cannot be detected, the similarity matching is carried out between the prediction result and the tracking result, if the similarity of the prediction result and the tracking result is higher, the detection position of the blocked object is considered to be the weighted result of the prediction position and the tracking position, and the reliability of the tracking algorithm is improved.

(3) Aiming at the problem that the detection, tracking and prediction are interdependent, the embodiment of the application provides an iterative optimization type sensing scheme, a prediction network result is used for a tracking part, the tracking rate of a shielded object is improved, the tracking result is input into the prediction network, the robustness of the prediction network to the tracking result is improved, and the steps are continuously and circularly carried out until the detection, tracking and prediction network finally converges to an acceptable local optimal point, namely the iterative updating is completed.

Next, an iterative optimized end-to-end intelligent vehicle sensing device according to an embodiment of the present application is described with reference to the accompanying drawings.

FIG. 6 is a block schematic diagram of an iterative optimized end-to-end intelligent vehicle awareness apparatus according to an embodiment of the present application.

As shown in fig. 6, the iterative optimization type end-to-end intelligent vehicle sensing device 10 includes: the device comprises an acquisition module 100, an execution module 200 and a perception module 300.

The acquisition module 100 is configured to acquire perception information of an intelligent vehicle;

the execution module 200 is configured to input perception information to an end-to-end network after iterative optimization, execute a detection task, a tracking task and a prediction task, and obtain a detection result, a tracking result and a prediction result at the same time, where the end-to-end network includes a multi-target detection network, a multi-target tracking network and a multi-target track prediction network; and

The sensing module 300 is configured to obtain a sensing result of the end-to-end intelligent vehicle based on the detection result, the tracking result, and the prediction result.

Further, in some embodiments, the sensing module 300 is specifically configured to:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Or (b)

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:

Memory 701, processor 702, and computer programs stored on memory 701 and executable on processor 702.

The processor 702, when executing the program, implements the iterative optimized end-to-end intelligent vehicle awareness method provided in the above embodiments.

Further, the electronic device further includes:

A communication interface 703 for communication between the memory 701 and the processor 702.

Memory 701 for storing a computer program executable on processor 702.

The memory 701 may include a high-speed RAM memory or may further include a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory.

If the memory 701, the processor 702, and the communication interface 703 are implemented independently, the communication interface 703, the memory 701, and the processor 702 may be connected to each other through a bus and perform communication with each other. The bus may be an industry standard architecture (Industry Standard Architecture, abbreviated ISA) bus, an external device interconnect (PERIPHERAL COMPONENT, abbreviated PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, abbreviated EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 7, but not only one bus or one type of bus.

Alternatively, in a specific implementation, if the memory 701, the processor 702, and the communication interface 703 are integrated on a chip, the memory 701, the processor 702, and the communication interface 703 may communicate with each other through internal interfaces.

The processor 702 may be a central processing unit (Central Processing Unit, abbreviated as CPU), or an Application SPECIFIC INTEGRATED Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the application.

The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements an iterative optimized end-to-end intelligent vehicle awareness method as described above.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.

Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more N executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or N wires, a portable computer cartridge (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. As with the other embodiments, if implemented in hardware, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.

Those of ordinary skill in the art will appreciate that all or part of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, and the program may be stored in a computer readable storage medium, where the program when executed includes one or a combination of the steps of the method embodiments.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented as software functional modules and sold or used as a stand-alone product.

The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims

1. An iterative optimization type end-to-end intelligent vehicle sensing method is characterized by comprising the following steps of:

acquiring perception information of an intelligent vehicle;

Acquiring a perception result of the end-to-end intelligent vehicle based on the detection result, the tracking result and the prediction result;

the detection formula of the multi-target detection network is as follows:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

Wherein F ⁱ is a feature vector, IMG ⁱ is image information at the current time, IMG ^i-1 is image information at the last time, O ^i-1 is detection network output information at the last time, C ⁱ is a low-dimensional encoding vector, For detecting important features related to the target, O ⁱ is a decoding vector at the current time;

The tracking formula of the multi-target tracking network is as follows:

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Wherein,For the decoding vector of the kth _i object of the ith frame,The predicted output of the ith frame for the kth _i-1 th target of the ith-1 th frame,I represents the i frame in detection, and sigma and beta are preset thresholds;

the prediction formula of the multi-target track prediction network is as follows:

Or (b)

2. The method of claim 1, further comprising, prior to inputting the awareness information into the iteratively optimized end-to-end network:

3. An iterative optimization type end-to-end intelligent vehicle sensing device, which is characterized by comprising:

The sensing module is used for acquiring a sensing result of the end-to-end intelligent vehicle based on the detection result, the tracking result and the prediction result

The detection formula of the multi-target detection network is as follows:

Fⁱ＝CNN(IMGⁱ,IMG^i-1,O^i-1)

Cⁱ＝FC(Fⁱ)

The tracking formula of the multi-target tracking network is as follows:

IfThe kth target tracking result is fromBecomes as follows

IfTarget tracking fails, but the target k is kept;

If Observation result In order to be a new goal of the present invention,

Wherein,For the decoding vector of the kth _i object of the ith frame,The predicted output of the ith frame for the kth _i-1 th target of the ith-1 th frame,Calculating the calculation result of the intersection ratio of the kth prediction of the ith frame and the jth detection, wherein i is the ith frame in the detection, and sigma and beta are preset thresholds;

Or (b)

4. A device according to claim 3, wherein the sensing module is specifically configured to:

5. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the iterative optimized end-to-end intelligent vehicle awareness method of claim 1 or 2.

6. A computer readable storage medium having stored thereon a computer program, the program being executable by a processor for implementing the iterative optimized end-to-end intelligent vehicle awareness method according to claim 1 or 2.