CN115424233A - Target detection method and target detection device based on information fusion - Google Patents

Target detection method and target detection device based on information fusion

Info

Publication number
CN115424233A
Authority
CN
China
Prior art keywords
target
millimeter wave
point cloud
cloud data
wave radar
Prior art date
Legal status
Pending
Application number
CN202210422754.0A
Other languages
Chinese (zh)
Inventor
衣春雷
尹荣彬
李兵
王秋
Current Assignee
FAW Group Corp
Original Assignee
FAW Group Corp
Priority date
Filing date
Publication date
Application filed by FAW Group Corp filed Critical FAW Group Corp
Priority to CN202210422754.0A priority Critical patent/CN115424233A/en
Publication of CN115424233A publication Critical patent/CN115424233A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
      • G01 MEASURING; TESTING
        • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
          • G01S 13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
            • G01S 13/86 Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
              • G01S 13/865 Combination of radar systems with lidar systems
            • G01S 13/88 Radar or analogous systems specially adapted for specific applications
              • G01S 13/93 Radar or analogous systems specially adapted for specific applications for anti-collision purposes
                • G01S 13/931 Radar or analogous systems specially adapted for specific applications for anti-collision purposes of land vehicles
          • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
            • G01S 17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
            • G01S 17/88 Lidar systems specially adapted for specific applications
              • G01S 17/93 Lidar systems specially adapted for specific applications for anti-collision purposes
                • G01S 17/931 Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/08 Learning methods
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 7/00 Image analysis
            • G06T 7/0002 Inspection of images, e.g. flaw detection
          • G06T 2207/00 Indexing scheme for image analysis or image enhancement
            • G06T 2207/10 Image acquisition modality
              • G06T 2207/10028 Range image; Depth image; 3D point clouds
            • G06T 2207/20 Special algorithmic details
              • G06T 2207/20081 Training; Learning
              • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Electromagnetism (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The embodiments of the disclosure provide an information fusion-based target detection method and device, a storage medium and an electronic device. The target detection method comprises the following steps: performing a first fusion of a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image; detecting the pseudo image through a detection network to acquire state information of a target; and inputting the state information of the target and the target information collected by the millimeter wave radar into a deep network model for a second fusion to obtain the actual velocity value of the target. The method combines early fusion and late fusion of the laser radar point cloud data and the millimeter wave radar point cloud data: the early fusion of the two point clouds is realized at the voxel level, and the actual velocity value of the target is obtained by fusing the velocity of the target detected by the deep network with the velocity of the target collected by the associated millimeter wave radar.

Description

Target detection method and target detection device based on information fusion
Technical Field
The embodiment of the disclosure relates to the technical field of information detection in the field of automatic driving, and in particular relates to a target detection method and device based on information fusion, a storage medium and an electronic device.
Background
In order to accurately sense the obstacles (or targets) around a self-driving vehicle (SDV) while it is driving, various heterogeneous sensors, such as a vehicle-mounted laser radar, a millimeter wave radar, a camera and an ultrasonic radar, need to be mounted on the vehicle, and the information collected by these sensors needs to be fused, so that the vehicle can achieve 360° omnidirectional perception during autonomous driving.
At present, fusion methods for the information collected by vehicle-mounted perception sensors mainly comprise front-end fusion and back-end fusion. The so-called front-end fusion belongs to raw-data-level or feature-level fusion: one sensor is used to detect targets and form a region of interest, the region of interest is projected onto the raw data of another sensor for data screening, and a deep learning method is then used to detect targets in the screened raw data; the main drawback of this approach is that its detection accuracy depends heavily on the perception performance of the first sensor. The so-called back-end fusion belongs to target-level fusion: each sensor detects targets independently, and the detection results of the multiple sensors are then fused; this approach incurs a larger information loss, whereas the information loss of the front-end fusion scheme is smaller.
Disclosure of Invention
In view of the above deficiencies of the prior art, embodiments of the present disclosure provide a method and an apparatus for target detection based on information fusion, a storage medium, and an electronic device, so as to solve the problem that the current method of using a single sensor to obtain an area of interest leads to an excessive dependence on the target detection effect of a certain sensor.
In order to solve the technical problem, the embodiment of the present disclosure adopts the following technical solutions:
an information fusion-based target detection method, comprising: performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image; detecting the pseudo image through a detection network and acquiring state information of a target; and inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion to obtain an actual speed value of the target.
In some embodiments, the first fusing the first voxel based on the lidar point cloud data and the second voxel based on the millimeter wave radar point cloud data to obtain the pseudo-image comprises: acquiring mutual interaction information between the laser radar point cloud data and the millimeter wave radar point cloud data; acquiring overall characteristics including characteristics of a laser radar and characteristics of a millimeter wave radar having the interaction information; and mapping the overall characteristics to a plane according to position codes so as to form a pseudo image.
In some embodiments, the detecting the pseudo image and acquiring the state information of the target through the detection network includes: extracting characteristic information from the pseudo image through a backbone network of the detection network; and inputting the characteristic information into a full convolution detection head of the detection network to output target state information.
In some embodiments, the state information is described by way of D = (c, x, y, w, l, θ, v), where c represents a confidence score, x and y represent a center position of the target from a BEV perspective, w represents a width of the target, l represents a length of the target, θ represents an orientation of the target, and v represents a two-dimensional velocity of the target.
In some embodiments, the inputting the state information of the target and the information of the target collected by the millimeter wave radar into a depth network model to obtain the actual velocity value of the target includes: associating a first target acquired by the detection network with a second target acquired by the millimeter wave radar; acquiring an association score between the first target and the second target; and acquiring an actual speed value of each target based on the association score.
In some embodiments, the deep network model is trained by a multitasking overall loss function, the overall loss function including at least a detection loss of the detection network to detect the first target, a speed loss of a detection phase, and a speed loss of a second fusion phase.
In some embodiments, before the first fusing the first voxel based on the lidar point cloud and the second voxel based on the millimeter wave radar to obtain the pseudo image, the method further comprises: respectively acquiring laser radar point cloud data and millimeter wave radar point cloud data through a laser radar and a millimeter wave radar; and carrying out voxelization processing on the laser radar point cloud data and the millimeter wave radar point cloud data based on a bird's eye view angle and a perspective view angle to obtain a first voxel and a second voxel.
The present disclosure also provides a target detection device based on information fusion, which includes: the first fusion module is used for performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image; the acquisition module is used for detecting the pseudo image through a detection network and acquiring the state information of a target; and the second fusion module is used for inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion so as to acquire the actual speed value of the target.
The present disclosure also provides a storage medium storing a computer program which, when executed by a processor, performs the steps of any of the methods described above.
The present disclosure also provides an electronic device comprising at least a memory having a computer program stored thereon, and a processor implementing the steps of any of the above methods when executing the computer program on the memory.
The beneficial effects of this disclosed embodiment lie in: the embodiment of the disclosure is based on early fusion and late fusion of the point cloud data of the laser radar and the point cloud data of the millimeter wave radar, namely, the early fusion of the point cloud data of the laser radar and the millimeter wave radar is realized in a voxel mode, and the detection of the actual speed value of the target is realized in a mode of fusing the speed of the target detected by the depth network and the speed of the target collected by the associated millimeter wave radar.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative effort.
FIG. 1 is a schematic diagram illustrating a method for information fusion-based target detection according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating steps of a target detection method based on information fusion according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a method for target detection based on information fusion according to an embodiment of the present disclosure;
fig. 4 is a schematic step diagram of a target detection method based on information fusion according to an embodiment of the present disclosure.
Detailed Description
Various aspects and features of the disclosure are described herein with reference to the drawings.
It should be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be considered as limiting, but merely as exemplifications of embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the disclosure.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure and, together with a general description of the disclosure given above, and the detailed description of the embodiments given below, serve to explain the principles of the disclosure.
These and other characteristics of the present disclosure will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It should also be understood that, although the present disclosure has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of the disclosure, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present disclosure will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present disclosure are described hereinafter with reference to the drawings; however, it is to be understood that the disclosed embodiments are merely examples of the disclosure, which may be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the disclosure in unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
The description may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the disclosure.
A first embodiment of the present disclosure relates to an information fusion-based target detection method, which is particularly used in an autonomous driving scenario, and in particular, the target detection method herein is implemented by a detection device provided on an autonomous driving vehicle, where the detection device includes at least an image pickup device, a laser radar, a millimeter wave radar, and the like, and detection of motion information such as an actual velocity value of a 3D target can be implemented by the target detection method.
Currently, three-dimensional (3D) target detection is generally performed with an image pickup device, a laser radar, or a combination of the two: the image pickup device provides images with rich appearance features, while the laser radar provides accurate 3D distance information. However, on the one hand, the target detection capability is limited by the sparsity of the point cloud data obtained by the laser radar (for example at long detection range) and by the sensitivity of the laser radar to weather (for example fog, rain and snow); the camera is likewise strongly affected by weather, and the detection results of these two devices carry no velocity information, so the velocity of a target can only be estimated through a subsequent tracking process. On the other hand, although a millimeter wave radar is highly robust to various weather conditions and can provide velocity information from a single measurement, it can only measure, through the Doppler effect, the radial velocity of a target relative to the autonomous vehicle and cannot provide the target's ground velocity; its data is also very sparse (usually much sparser than laser radar data), and its shape estimates for targets have large deviations. For this reason, the target detection method of this embodiment employs the image pickup device, the laser radar and the millimeter wave radar simultaneously in the target detection process, so that the target velocity can be estimated accurately.
Specifically, the target detection method related to the embodiment of the present disclosure includes an early fusion step and a late fusion step, where the early fusion step mainly implements detection of information such as position, shape, orientation, and the like of a target, and the late fusion step mainly implements estimation of a target speed, and an overall architecture implemented by the method is as shown in fig. 1.
As shown in fig. 2, an embodiment of the present disclosure provides a target detection method, which includes the following steps:
s101, performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image.
In this step, a first voxel based on the lidar point cloud and a second voxel based on the millimeter wave radar point cloud are fused to obtain a pseudo image. Specifically, the step is to input the geometric information, that is, the voxels, of the laser radar point cloud data and the millimeter wave radar point cloud data into the information fusion model to realize the early fusion of the information.
Specifically, in this step, laser radar point cloud data and millimeter wave radar point cloud data first need to be acquired by a laser radar and a millimeter wave radar arranged on the vehicle, and the two sets of point cloud data are then voxelized under a BEV view angle and a perspective view angle, respectively. The BEV representation here refers to point cloud data in which a plurality of height slices and a plurality of scans are concatenated under the bird's eye view (BEV).
Further, for example, the laser radar point cloud data and the millimeter wave radar point cloud data obtained from a plurality of scans within a predetermined time (for example, within the past 0.5 second) are used as the input of the information fusion model, so that the target's information can be estimated from sufficient data while the real-time performance of the algorithm is guaranteed. In the process of collecting point cloud data with the laser radar and the millimeter wave radar, the relevant sensors are calibrated and the pose of the autonomous vehicle is estimated by the positioning system, so all point cloud data can further be transformed into the vehicle coordinate system of the current frame.
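As an illustration of this coordinate transformation, the following sketch (in Python) transforms one past sweep into the current frame's vehicle coordinate system, assuming 4x4 homogeneous vehicle-to-world poses provided by the positioning system; the function and argument names are hypothetical and not taken from the filing.

```python
import numpy as np

def to_current_vehicle_frame(points, sweep_pose, current_pose):
    """Transform one past sweep into the current frame's vehicle coordinate system.

    points       : (N, 3) points in the sweep's own vehicle frame
    sweep_pose   : (4, 4) vehicle-to-world pose at the time of the sweep
    current_pose : (4, 4) vehicle-to-world pose of the current frame
    Both poses are assumed to come from the positioning system.
    """
    homog = np.hstack([points, np.ones((len(points), 1))])   # (N, 4) homogeneous coordinates
    world = homog @ sweep_pose.T                              # sweep vehicle frame -> world frame
    current = world @ np.linalg.inv(current_pose).T           # world frame -> current vehicle frame
    return current[:, :3]
```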
After the laser radar point cloud data is obtained, first voxels are further obtained based on it. In particular, each first voxel is represented by a characteristic value: if no point falls inside a first voxel, its characteristic value is 0; if one or more points {(x_i, y_i, z_i), i = 1..N} fall inside the first voxel, its characteristic value is computed from these points, the voxel center (a, b, c) and the voxel size (dx, dy, dz) (the exact expression is reproduced as an image in the original filing).
Similar to the acquisition of the first voxels from the laser radar point cloud described above, after the millimeter wave radar point cloud data is obtained, second voxels are further obtained based on it, where the position of each millimeter wave radar point is represented by (x, y). For each second voxel: if no point falls inside it, its characteristic value is 0; if at least one moving point falls inside it, its characteristic value is 1; and if all points falling inside it are stationary, its characteristic value is -1.
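For illustration only, the sketch below shows how the two kinds of voxel characteristic values described above might be computed. The laser radar aggregation used here (the mean offset of the points from the voxel center, normalized by the voxel size) is merely an assumption standing in for the formula that is reproduced only as an image in the filing; the radar rule follows the 0 / 1 / -1 convention stated in the text, and all names are hypothetical.

```python
import numpy as np

def lidar_voxel_feature(points, center, size):
    """Characteristic value of one laser radar (first) voxel.

    points : (N, 3) array of the lidar points falling inside the voxel
    center : (a, b, c), the voxel center
    size   : (dx, dy, dz), the voxel dimensions

    The filing gives the exact formula only as an image; the mean normalized
    offset used here is an assumed stand-in built from the same quantities.
    """
    if len(points) == 0:
        return 0.0                        # empty voxel -> characteristic value 0
    offsets = (points - np.asarray(center)) / np.asarray(size)
    return float(np.mean(offsets))        # assumed aggregation

def radar_voxel_feature(moving_flags):
    """Characteristic value of one millimeter wave radar (second) voxel.

    moving_flags : list of booleans, one per radar point in the voxel
                   (True if that point is moving).
    """
    if len(moving_flags) == 0:
        return 0.0                        # no point falls inside the voxel
    if any(moving_flags):
        return 1.0                        # at least one moving point
    return -1.0                           # all points inside the voxel are stationary
```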
The first voxels based on the laser radar point cloud data and the second voxels based on the millimeter wave radar point cloud data are then input into the information fusion model together to achieve the early fusion. Specifically, through this early fusion of information, the interaction information between the laser radar point cloud data and the millimeter wave radar point cloud data can be learned under the BEV view angle and the perspective view angle, yielding laser radar features that carry the millimeter wave radar interaction information and millimeter wave radar features that carry the laser radar interaction information. Finally, these features are concatenated along the channel dimension to obtain the overall features, and the output overall features are further mapped onto the x-y plane according to their position codes, so that a 128-channel pseudo image is finally formed.
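A minimal sketch of this early fusion step is given below, assuming the per-modality BEV feature maps carrying the interaction information have already been computed and aligned on the same x-y grid; the channel widths and the 1x1 convolution that mixes the concatenated features are assumptions, since the filing only states that the result is a 128-channel pseudo image.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Sketch of the first (early) fusion stage producing the pseudo image."""

    def __init__(self, lidar_channels=96, radar_channels=32, out_channels=128):
        super().__init__()
        # assumed 1x1 convolution mixing the concatenated channels into 128 channels
        self.mix = nn.Conv2d(lidar_channels + radar_channels, out_channels, kernel_size=1)

    def forward(self, lidar_feat, radar_feat):
        # lidar_feat: (B, C_lidar, H, W) BEV features from the first voxels
        # radar_feat: (B, C_radar, H, W) BEV features from the second voxels,
        # both already mapped onto the same x-y grid by their position codes
        fused = torch.cat([lidar_feat, radar_feat], dim=1)  # concatenate along the channel dimension
        return self.mix(fused)                              # 128-channel pseudo image
```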
S102, detecting the pseudo image through a detection network and acquiring state information of a target.
After the first voxel based on the lidar point cloud and the second voxel based on the millimeter wave radar are fused to acquire the pseudo image in step S101 above, in this step the pseudo image is detected by the detection network and the detection result of the target is acquired. The detection network used here includes the same backbone network structure as PnPNet and uses a full convolution detection head, comprising a classification branch and a regression branch, to detect and acquire targets.
Specifically, multi-scale feature information is extracted from the pseudo image of 128 channels through a backbone network of the detection network, and the feature information is output to a full convolution detection head of the detection network; the full convolution detection head processes the characteristic information output by the backbone network and outputs a target detection result.
The detection result, i.e. the state information, of a target is described here as D = (c, x, y, w, l, θ, v), where c denotes a confidence score, x and y denote the center position of the target in the BEV view, w denotes the width of the target, l denotes the length of the target, θ denotes the orientation of the target, and v = (v_x, v_y) denotes the two-dimensional velocity of the target.
Further, the confidence score c is predicted by the classification branch of the full convolution detection head, and the other parameters are predicted by the regression branch; the regression output can be expressed as (x - p_x, y - p_y, w, l, cos(θ), sin(θ), m, v_x, v_y), where (p_x, p_y) is the two-dimensional coordinate of the corresponding point cloud data center and m denotes the probability that the target is moving. If the predicted m is less than 50%, the two-dimensional velocity v is directly set to zero.
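A hedged sketch of how one pixel's output could be decoded into the state vector D described above is given below; the helper name and dictionary keys are hypothetical, while the offset decoding, the sine/cosine orientation encoding and the zeroing of the velocity for m below 50% follow the text.

```python
import math

def decode_detection(cls_score, reg, pixel_center):
    """Decode one pixel's output into D = (c, x, y, w, l, theta, v).

    cls_score    : confidence score c from the classification branch
    reg          : (x - p_x, y - p_y, w, l, cos(theta), sin(theta), m, v_x, v_y)
                   from the regression branch
    pixel_center : (p_x, p_y), the BEV coordinate of the pixel center
    """
    dx, dy, w, l, cos_t, sin_t, m, vx, vy = reg
    p_x, p_y = pixel_center
    x, y = p_x + dx, p_y + dy            # the branch regresses offsets from the pixel center
    theta = math.atan2(sin_t, cos_t)     # orientation from its sine / cosine encoding
    if m < 0.5:                          # predicted probability of motion below 50%
        vx, vy = 0.0, 0.0                # the two-dimensional velocity is set to zero
    return {"c": cls_score, "x": x, "y": y, "w": w, "l": l, "theta": theta, "v": (vx, vy)}
```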
And S103, inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion to obtain an actual speed value of the target.
After the pseudo image is detected by the detection network and the detection result of the target is acquired by the above step S102, in this step, the speed of the target is acquired based on the state information of the target acquired by the detection and the target acquired by the millimeter wave radar. As shown in fig. 3, for example, the state information of the target obtained by detection and the target collected by the millimeter wave radar may be input to the depth network to obtain an actual speed value of the target, so as to implement information fusion in a later stage by using dynamic information.
In the process of predicting and judging the motion of a target in an automatic driving scene, the real speed of the target cannot be accurately judged only by considering the radial speed due to the lack of tangential information of the motion of the target. To solve this problem, the radial velocity direction can be unified by estimating the target velocity direction, i.e. projecting the radial velocity of the target towards the detected target motion direction.
In this step, given the set of target state information detected by the detection network and the set of targets collected by the millimeter wave radar, the velocity of a target can be determined by fusing the dynamic information of the two. When using the millimeter wave radar information, it must first be determined how each target collected by the millimeter wave radar is correctly associated with the target state information detected by the detection network. This association is not a one-to-one mapping: many targets detected by the detection network may not be associated with any millimeter wave radar target, while a single detected target may be associated with several millimeter wave radar targets. Furthermore, millimeter wave radar targets contain false positives and positional noise, which also makes the association difficult. It must also be addressed how an accurate target velocity estimate can be obtained effectively from the association results.
To this end, an attention-based mechanism is proposed in this step to realize the association between the two kinds of targets. As shown in fig. 3, an example of information fusion between one target collected by the millimeter wave radar and one target detected by the detection network is illustrated; in practice, this process is applied in parallel to all targets detected by the detection network.
Firstly, the radial velocity of the millimeter wave radar target is aligned, by projection, with the motion direction of the target detected by the detection network; then the association scores between all target pairs formed by the targets detected by the detection network and the targets collected by the millimeter wave radar are predicted; finally, the velocity of a target is estimated as the weighted sum of the velocity of the target detected by the detection network and the projected velocities of all associated millimeter wave radar targets. Specifically, as shown in fig. 4, the method comprises the following steps:
s201, associating the first target acquired by the detection network with a second target acquired by the millimeter wave radar.
In this step, the first target acquired by the detection network is associated with the second target acquired by the millimeter wave radar. Specifically, a first target D detected by the detection network is given as D = (c, x, y, w, l, θ, v), and a second target Q collected by the millimeter wave radar is given as Q = (q, v_∥, m, t). The association of the two targets is first characterized as follows:
f(D, Q) = (f_det(D), f_det-radar(D, Q))    (1)
(Equation (2), which defines f_det(D), is reproduced as an image in the original filing.)
f_det-radar(D, Q) = (dx, dy, dt, v_bp)    (3)
(Equation (4), which defines v_bp, is reproduced as an image in the original filing.)
wherein (·, ·) denotes a concatenation operation, γ denotes the angle between the motion direction of the first target D and the radial direction of D, φ denotes the angle between the motion direction of the first target D and the radial direction of Q, v_bp denotes the back projection of the radial velocity of the second target Q (its upper limit is set to 40 m/s to avoid excessive velocities), v_∥ is the scalar radial velocity of the second target Q, and (dx, dy, dt) is the position and time offset between the first target D and the second target Q in the BEV view.
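For illustration, a sketch of the pairwise association feature follows. Because equations (2) and (4) are reproduced only as images in the filing, the per-detection feature f_det(D) and the back projection v_bp used here (the radar radial speed divided by cos(φ) and capped at 40 m/s) are assumptions chosen to be consistent with the surrounding text; all field and function names are hypothetical.

```python
import math

V_BP_CAP = 40.0  # m/s, upper limit on the back-projected velocity stated in the text

def pairwise_feature(det, radar):
    """Sketch of f(D, Q) = (f_det(D), f_det-radar(D, Q)), equations (1) and (3).

    det   : detection D with fields x, y, v=(vx, vy), t (timestamp)
    radar : radar target Q with fields x, y, v_r (radial speed), t (timestamp)
    """
    speed = math.hypot(det["v"][0], det["v"][1])
    motion_dir = math.atan2(det["v"][1], det["v"][0])

    # assumed f_det(D): speed magnitude and the angle gamma between D's motion
    # direction and D's own radial direction (equation (2) is an image in the filing)
    gamma = motion_dir - math.atan2(det["y"], det["x"])
    f_det = (speed, math.cos(gamma))

    # f_det-radar(D, Q) = (dx, dy, dt, v_bp), equation (3)
    dx, dy = radar["x"] - det["x"], radar["y"] - det["y"]
    dt = radar["t"] - det["t"]
    phi = motion_dir - math.atan2(radar["y"], radar["x"])   # angle between D's motion direction and Q's radial direction
    denom = max(abs(math.cos(phi)), 1e-3)
    v_bp = min(abs(radar["v_r"]) / denom, V_BP_CAP)         # assumed back projection, capped at 40 m/s
    return f_det + (dx, dy, dt, v_bp)
```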
S202, obtaining the association score between the first target and the second target.
After the first target acquired by the detection network and the second target acquired by the millimeter wave radar are associated in step S201 above, in this step an association score between the first target and the second target is acquired. Specifically, the association score between the two targets is calculated by feeding the association feature defined above into a learnable matching function, as follows:
s_{i,j} = MLP_match(f(D_i, Q_j))    (5)
In this formula, the matching function is a multi-layer perceptron (MLP) with five layers (the listed layer widths being 32, 64 and 1 neurons). When the magnitude of any element of (dx, dy, dt) between D_i and Q_j exceeds a predetermined threshold, the result of equation (5) is directly set to zero; the predetermined threshold is set manually according to the actual data.
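Below is a sketch of the matching function of equation (5), assuming the layer widths listed in the text (32, 64, 1) together with an illustrative input dimension and gating threshold; these values are assumptions, not fixed by the filing.

```python
import torch
import torch.nn as nn

class MatchMLP(nn.Module):
    """Sketch of the learnable matching function MLP_match of equation (5)."""

    def __init__(self, in_dim=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, pair_feat, dxdydt, threshold=(2.0, 2.0, 0.5)):
        # pair_feat: (B, in_dim) pairwise features f(D_i, Q_j)
        # dxdydt   : (B, 3) position and time offsets between D_i and Q_j
        # threshold: assumed gate; the score is forced to zero when any offset is too large
        score = self.net(pair_feat).squeeze(-1)
        gate = (dxdydt.abs() <= torch.tensor(threshold)).all(dim=-1)
        return score * gate.float()       # s_ij = 0 when (dx, dy, dt) exceeds the threshold
```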
S203, acquiring an actual speed value of the target based on the association score.
After the association score between the first target and the second target is acquired through the above step S202, in this step, the actual velocity value of the target is acquired based on the association score.
In this step, the association scores between all first targets detected by the detection network and all second targets collected by the millimeter wave radar are computed, and the velocity information of the second targets collected by the millimeter wave radar is aggregated to estimate the velocity of each target D_i.
To this end, the association scores between the first and second targets are first normalized. Before the score sequence s_{i,:} is normalized, a constant 1 is appended to it so that the case in which a detection has no associated radar target is also handled. The normalization is performed as follows:
(Equation (6) is reproduced as an image in the original filing.)
Then the velocities of all candidate targets are weighted and summed to refine the velocity, where the candidate targets comprise the first targets acquired by the detection network and the second targets acquired by the millimeter wave radar:
(Equation (7) is reproduced as an image in the original filing.)
In this formula, the superscript T denotes a transpose operation.
The actual two-dimensional velocity of the target is then detected and obtained by:
(Equation (8) is reproduced as an image in the original filing.)
In the above formulas, the index i is omitted for simplicity.
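For a single detection, the aggregation of equations (6) to (8) might look as follows; because those equations appear only as images in the filing, the softmax weighting and the appended constant score of 1 are assumptions consistent with the description, and the function name is hypothetical.

```python
import torch

def fuse_velocity(det_velocity, scores, radar_bp_speeds):
    """Aggregate radar evidence for one detection (sketch of equations (6)-(8)).

    det_velocity    : (2,) two-dimensional velocity predicted by the detection network
    scores          : (K,) association scores s_i,: against K millimeter wave radar targets
    radar_bp_speeds : (K,) back-projected radial speeds of those radar targets
    """
    speed = det_velocity.norm()
    direction = det_velocity / speed.clamp(min=1e-6)          # unit motion direction of the detection
    all_scores = torch.cat([torch.ones(1), scores])            # constant score 1 for the detection itself
    weights = torch.softmax(all_scores, dim=0)                 # normalization (equation (6))
    candidate_speeds = torch.cat([speed.view(1), radar_bp_speeds])
    fused_speed = (weights * candidate_speeds).sum()           # weighted sum (equation (7))
    return direction * fused_speed                             # actual two-dimensional velocity (equation (8))
```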
Furthermore, in the embodiment of the present disclosure, the deep network model involved in step S103 can be trained with a multi-task overall loss function, i.e. the overall loss function is obtained as a weighted sum of the detection loss of the detection network, the velocity loss of the detection stage, and the velocity loss of the late (second) fusion stage, as follows:
(Equation (9) is reproduced as an image in the original filing.)
In this formula, the loss terms are a cross-entropy loss on the classification score c, a smooth L1 loss on the target position, size and orientation, a cross-entropy loss on the motion probability m, a smooth L1 loss on the velocity v of the detection stage, and a smooth L1 loss on the velocity v of the second fusion stage. α, β and δ are scalar values that balance the different tasks. The method requires no explicit supervised learning of the association results between the first targets detected by the detection network and the second targets collected by the millimeter wave radar, and is realized through attention-based late fusion.
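A sketch of such a multi-task loss is given below; since equation (9) is reproduced only as an image in the filing, the particular grouping of terms and the default weights are assumptions that merely mirror the loss terms listed above, and the dictionary keys are hypothetical.

```python
import torch.nn.functional as F

def overall_loss(pred, target, alpha=1.0, beta=1.0, delta=1.0):
    """Weighted multi-task loss mirroring the terms listed for equation (9)."""
    # detection loss: cross entropy on the classification score c,
    # smooth L1 on position / size / orientation, cross entropy on motion probability m
    cls_loss = F.binary_cross_entropy_with_logits(pred["cls"], target["cls"])
    box_loss = F.smooth_l1_loss(pred["box"], target["box"])
    motion_loss = F.binary_cross_entropy_with_logits(pred["motion"], target["motion"])

    # smooth L1 velocity losses of the detection stage and of the second fusion stage
    v_det_loss = F.smooth_l1_loss(pred["v_det"], target["v"])
    v_fused_loss = F.smooth_l1_loss(pred["v_fused"], target["v"])

    # alpha, beta, delta are the scalar task-balancing weights named in the text
    return cls_loss + box_loss + alpha * motion_loss + beta * v_det_loss + delta * v_fused_loss
```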
The embodiment of the disclosure is based on early fusion and late fusion of the point cloud data of the laser radar and the point cloud data of the millimeter wave radar, namely, the early fusion of the point cloud data of the laser radar and the millimeter wave radar is realized in a voxel mode, and the detection of the actual speed value of the target is realized in a mode of fusing the speed of the target detected by the depth network and the speed of the target collected by the associated millimeter wave radar.
A second embodiment of the present disclosure relates to an information fusion-based target detection apparatus, which includes a first fusion module, an acquisition module, and a second fusion module, which cooperate with each other, wherein:
the first fusion module is used for carrying out first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data so as to obtain a pseudo image;
the acquisition module is used for detecting the pseudo image through a detection network and acquiring the state information of a target;
and the second fusion module is used for inputting the state information of the target and the information of the target collected by the millimeter wave radar into a depth network model for second fusion so as to obtain the actual speed value of the target.
Further, the first fusion module comprises:
the interactive information acquisition unit is used for acquiring interactive information between the laser radar point cloud data and the millimeter wave radar point cloud data;
a feature acquisition unit configured to acquire an overall feature including a feature of the laser radar and a feature of the millimeter wave radar having the mutual information;
a mapping unit for mapping the global features to a plane by a position code, thereby forming a pseudo-image.
Further, the obtaining module comprises:
an extraction unit, configured to extract feature information from the pseudo image through a backbone network of the detection network;
and the output unit is used for inputting the characteristic information into a full convolution detection head of the detection network so as to output the state information of the target.
Further, the state information is described by means of D = (c, x, y, w, l, θ, v), where c denotes a confidence score, x and y denote a center position of the target under BEV view, w denotes a width of the target, l denotes a length of the target, θ denotes an orientation of the target, and v denotes a two-dimensional velocity of the target.
Further, the second fusion module comprises:
the association unit is used for associating the first target acquired by the detection network with the second target acquired by the millimeter wave radar;
the association score determining unit is used for acquiring an association score between the first target and the second target;
and the speed acquisition unit is used for acquiring the actual speed value of each target based on the association score.
Further, the deep network model is trained through a multitask overall loss function, and the overall loss function at least comprises detection loss of the detection network for detecting the first target, speed loss of a detection stage and speed loss of a second fusion stage.
The disclosed embodiment further includes a preprocessing module, which includes:
the acquisition unit is used for respectively acquiring and obtaining laser radar point cloud data and millimeter wave radar point cloud data through a laser radar and a millimeter wave radar;
and the voxelization unit is used for carrying out voxelization processing on the laser radar point cloud data and the millimeter wave radar point cloud data based on a bird's eye view angle and a perspective view angle so as to obtain a first voxel and a second voxel.
The embodiment of the disclosure is based on early fusion and late fusion of the point cloud data of the laser radar and the point cloud data of the millimeter wave radar, namely, the early fusion of the point cloud data of the laser radar and the millimeter wave radar is realized in a voxel mode, and the detection of the actual speed value of the target is realized in a mode of fusing the speed of the target detected by the depth network and the speed of the target collected by the associated millimeter wave radar.
A third embodiment of the present disclosure provides a storage medium, which is a computer-readable medium storing a computer program; when the computer program is executed by a processor, it implements the method provided by the foregoing embodiments of the present disclosure, including the following steps S11 to S13:
s11, performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image;
s12, detecting the pseudo image through a detection network and acquiring state information of a target;
and S13, inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion to obtain an actual speed value of the target.
Further, the computer program, when executed by the processor, also implements the other methods provided by the first embodiment of the disclosure.
The embodiment of the disclosure is based on early fusion and late fusion of the point cloud data of the laser radar and the point cloud data of the millimeter wave radar, namely, the early fusion of the point cloud data of the laser radar and the millimeter wave radar is realized in a voxel mode, and the detection of the actual speed value of the target is realized in a mode of fusing the speed of the target detected by the depth network and the speed of the target collected by the associated millimeter wave radar.
A fourth embodiment of the present disclosure provides an electronic device, which includes at least a memory and a processor, where the memory stores a computer program thereon, and the processor implements the method provided by any of the embodiments of the present disclosure when executing the computer program on the memory. Illustratively, the electronic device computer program steps are as follows S21 to S23:
s21, performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image;
s22, detecting the pseudo image through a detection network and acquiring state information of a target;
and S23, inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion to obtain an actual speed value of the target.
Further, the processor also executes the computer program described in the third embodiment above.
The embodiment of the disclosure is based on early fusion and late fusion of the point cloud data of the laser radar and the point cloud data of the millimeter wave radar, namely, the early fusion of the point cloud data of the laser radar and the millimeter wave radar is realized in a voxel mode, and the detection of the actual speed value of the target is realized in a mode of fusing the speed of the target detected by the depth network and the speed of the target collected by the associated millimeter wave radar.
The storage medium may be included in the electronic device; or may exist separately and not be incorporated into the electronic device.
The storage medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the storage medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the storage media described above in this disclosure can be either computer-readable signal media or computer-readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the name of an element does not in some cases constitute a limitation on the element itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), application Specific Standard Products (ASSPs), system on a chip (SOCs), complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be understood by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, the above features may be replaced with (but not limited to) technical features disclosed in the present disclosure having similar functions.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
While the present disclosure has been described in detail with reference to the embodiments, the present disclosure is not limited to the specific embodiments, and those skilled in the art can make various modifications and alterations based on the concept of the present disclosure, and the modifications and alterations should fall within the scope of the present disclosure as claimed.

Claims (10)

1. A target detection method based on information fusion is characterized by comprising the following steps:
performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image;
detecting the pseudo image through a detection network and acquiring state information of a target;
and inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion to obtain an actual speed value of the target.
2. The target detection method of claim 1, wherein the first fusing the first voxel based on the lidar point cloud data and the second voxel based on the millimeter wave radar point cloud data to obtain the pseudo image comprises:
acquiring mutual information between the laser radar point cloud data and the millimeter wave radar point cloud data;
acquiring overall characteristics including characteristics of a laser radar and characteristics of a millimeter wave radar with the interaction information;
the overall features are mapped to a plane in a position code, thereby forming a pseudo-image.
3. The object detection method of claim 1, wherein the detecting the pseudo image and obtaining the state information of the object through the detection network comprises:
extracting characteristic information from the pseudo image through a backbone network of the detection network;
and inputting the characteristic information into a full convolution detection head of the detection network to output state information of a target.
4. The object detection method according to claim 3, characterized in that the state information is described by means of D = (c, x, y, w, l, θ, v), where c denotes a confidence score, x and y denote a center position of the object in a BEV view, w denotes a width of the object, l denotes a length of the object, θ denotes an orientation of the object, and v denotes a two-dimensional velocity of the object.
5. The object detection method according to claim 1, wherein the inputting the state information of the object and the information of the object collected by the millimeter wave radar into a depth network model to obtain an actual velocity value of the object comprises:
associating a first target acquired by the detection network with a second target acquired by the millimeter wave radar;
acquiring an association score between the first target and the second target;
and acquiring an actual speed value of each target based on the association score.
6. The method of claim 5, wherein the deep network model is trained by a multitasking overall loss function, the overall loss function including at least a detection loss of the detection network to detect the first target, a speed loss of a detection stage, and a speed loss of a second fusion stage.
7. The object detection method according to claim 1, further comprising, before the first fusing the first voxel based on the lidar point cloud and the second voxel based on the millimeter-wave radar to obtain the pseudo image:
respectively acquiring and obtaining laser radar point cloud data and millimeter wave radar point cloud data through a laser radar and a millimeter wave radar;
and carrying out voxelization processing on the laser radar point cloud data and the millimeter wave radar point cloud data based on a bird's eye view angle and a perspective view angle to obtain a first voxel and a second voxel.
8. An object detection device based on information fusion, characterized by comprising:
the first fusion module is used for performing first fusion on a first voxel based on laser radar point cloud data and a second voxel based on millimeter wave radar point cloud data to obtain a pseudo image;
the acquisition module is used for detecting the pseudo image through a detection network and acquiring the state information of a target;
and the second fusion module is used for inputting the state information of the target and the information of the target acquired by the millimeter wave radar into a depth network model for second fusion so as to acquire the actual speed value of the target.
9. A storage medium storing a computer program, characterized in that the computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. An electronic device comprising at least a memory, a processor, the memory having a computer program stored thereon, characterized in that the processor realizes the steps of the method of any one of claims 1 to 7 when executing the computer program on the memory.
CN202210422754.0A 2022-04-21 2022-04-21 Target detection method and target detection device based on information fusion Pending CN115424233A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210422754.0A CN115424233A (en) 2022-04-21 2022-04-21 Target detection method and target detection device based on information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210422754.0A CN115424233A (en) 2022-04-21 2022-04-21 Target detection method and target detection device based on information fusion

Publications (1)

Publication Number Publication Date
CN115424233A true CN115424233A (en) 2022-12-02

Family

ID=84196287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210422754.0A Pending CN115424233A (en) 2022-04-21 2022-04-21 Target detection method and target detection device based on information fusion

Country Status (1)

Country Link
CN (1) CN115424233A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116381674A (en) * 2023-06-02 2023-07-04 陕西欧卡电子智能科技有限公司 Fusion method of water surface laser radar point cloud and millimeter wave Lei Dadian cloud
CN116381674B (en) * 2023-06-02 2023-08-22 陕西欧卡电子智能科技有限公司 Fusion method of water surface laser radar point cloud and millimeter wave Lei Dadian cloud
CN117368876A (en) * 2023-10-18 2024-01-09 广州易而达科技股份有限公司 Human body detection method, device, equipment and storage medium
CN117368876B (en) * 2023-10-18 2024-03-29 广州易而达科技股份有限公司 Human body detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination