CN114677655A - Multi-sensor target detection method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114677655A
Authority
CN
China
Prior art keywords
target detection
confidence
spatial position
fused
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210136782.6A
Other languages
Chinese (zh)
Inventor
徐辉 (Xu Hui)
叶汇贤 (Ye Huixian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Core Technology Co., Ltd.
Original Assignee
Shanghai Core Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Core Technology Co., Ltd.
Priority to CN202210136782.6A
Publication of CN114677655A
Priority to PCT/CN2022/110149 (WO2023155387A1)
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/25 - Fusion techniques
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/08 - Learning methods

Abstract

Embodiments of the invention disclose a multi-sensor target detection method and device, electronic equipment, and a storage medium. The method comprises the following steps: determining first target detection results respectively acquired by at least two types of sensors, where each target detection result comprises a confidence and a spatial position, and the spatial position is represented by target center point coordinates and size; performing confidence fusion and spatial position fusion according to the first target detection results acquired by the sensors to obtain a fused confidence and a fused spatial position; and determining a second target detection result from the fused confidence and the fused spatial position for executing an automatic driving operation. The scheme overcomes the impact of single-sensor limitations on target detection accuracy, realizes complementary fusion of different types of sensors, and improves target detection accuracy.

Description

Multi-sensor target detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of target detection in automatic driving technology, and in particular to a multi-sensor target detection method and device, an electronic device, and a storage medium.
Background
Target detection is one of the focal research directions of computer vision, and it drives progress in environment perception and automatic driving technology.
In recent years, the continued development of target detection has stimulated extensive research on 2D and 3D object detection, semantic segmentation, and object tracking. Because each type of sensor has certain limitations, an autonomous vehicle can be equipped with different types of sensors simultaneously so as to improve detection accuracy by exploiting their complementary characteristics. However, as the number of sensors grows, fusing data from different types of sensors such as radar and cameras becomes more difficult, which reduces the accuracy and efficiency of target detection.
Disclosure of Invention
The invention provides a multi-sensor target detection method and device, an electronic device, and a storage medium, and aims to solve the problem that target detection results from different types of sensors cannot be fused well.
According to an aspect of the present invention, there is provided a multi-sensor target detection method including:
determining first target detection results respectively acquired by at least two types of sensors; the target detection result comprises a confidence and a spatial position, and the spatial position is represented by target center point coordinates and size;
performing confidence fusion and spatial position fusion according to the first target detection results acquired by the sensors to obtain a fused confidence and a fused spatial position;
and determining a second target detection result according to the fused confidence and the fused spatial position so as to execute an automatic driving operation.
According to another aspect of the present invention, there is provided a multi-sensor object detecting device including:
the acquisition module is used for determining first target detection results acquired by the at least two types of sensors respectively; the target detection result comprises a confidence and target position characteristics, and the target position characteristics are represented by target center point coordinates and size;
the fusion module is used for performing confidence fusion and spatial position fusion according to the first target detection result acquired by each sensor to obtain a fused confidence and a fused spatial position;
and the detection module is used for determining a second target detection result according to the fused confidence and the fused spatial position so as to execute an automatic driving operation.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform a multi-sensor object detection method according to any embodiment of the invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the multi-sensor object detection method according to any one of the embodiments of the present invention when executed.
According to the technical solution of the embodiment of the present invention, first target detection results respectively acquired by at least two types of sensors are determined; each target detection result comprises a confidence and a spatial position, and the spatial position is represented by target center point coordinates and size; confidence fusion and spatial position fusion are performed according to the first target detection results acquired by the sensors to obtain a fused confidence and a fused spatial position; and a second target detection result is determined from the fused confidence and the fused spatial position for executing an automatic driving operation. This overcomes the impact of single-sensor limitations on target detection accuracy, realizes complementary fusion of different types of sensors, and improves target detection accuracy.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will be readily apparent from the following specification.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below. The drawings described below illustrate only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a multi-sensor target detection method provided according to an embodiment of the present invention;
FIG. 2 is a diagram of a multi-sensor target detection architecture applicable to embodiments of the present invention;
FIG. 3 is a schematic diagram of confidence fusion in multi-sensor target detection applicable to embodiments of the present invention;
FIG. 4 is a schematic diagram of spatial position fusion in multi-sensor target detection applicable to embodiments of the present invention;
FIG. 5 is a block diagram of the overall architecture of spatial position fusion in multi-sensor target detection applicable to embodiments of the present invention;
FIG. 6 is a diagram of spatial position fusion in multi-sensor target detection applicable to embodiments of the present invention;
FIG. 7 is a diagram of spatial position fusion in another multi-sensor target detection example applicable to embodiments of the present invention;
FIG. 8 is a schematic structural diagram of a multi-sensor target detection device provided according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an electronic device implementing the multi-sensor target detection method according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The multi-sensor target detection method, apparatus, electronic device, and storage medium provided in the present application are described in detail in the following embodiments and their optional solutions.
Fig. 1 is a flowchart of a multi-sensor target detection method provided in an embodiment of the present invention, where the embodiment is applicable to a case where a target in front of an autonomous vehicle is detected, and the method may be performed by a multi-sensor target detection device, which may be implemented in hardware and/or software, and may be configured in any electronic device having a network communication function. As shown in fig. 1, the method may include the steps of:
s110, determining first target detection results respectively acquired by at least two types of sensors; the target detection result comprises a confidence coefficient and a space position, and the space position is represented by the coordinates and the size of the target center point.
The multi-sensor target detection method of the present application can be applied to, but is not limited to, terminal devices and servers, where the electronic device is used to detect targets in front of an autonomous vehicle. The terminal device may include, but is not limited to, a mobile phone, a tablet, a vehicle-mounted computer, and the like.
Pre-configured sensors may be employed during autonomous driving to detect targets in the environment ahead of the vehicle, where the targets may include, but are not limited to, obstacles ahead of the vehicle, other vehicles, pedestrians, and the like. The at least two types of sensors may include a millimeter-wave radar sensor, a lidar sensor, and an image sensor.
The target detection results of the various types of sensors may include the spatial position of a detected object ahead of the vehicle and a corresponding confidence. The confidence may be a value in the interval [0, 1]; it indicates the likelihood that the sensor has detected a real object ahead of the vehicle, and a larger confidence means a higher likelihood that the object exists. The spatial position may be represented by target center point coordinates and size: the target center point is represented by the three coordinates x, y, and z, and the size is represented by the width w, length l, and height h.
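As a concrete illustration of the per-sensor detection result described above, the following sketch defines a simple container for the seven values; the class and field names are hypothetical and not part of the original disclosure.

```python
from dataclasses import dataclass

@dataclass
class DetectionResult:
    """First target detection result from a single sensor (hypothetical field names)."""
    x: float  # center point coordinate x
    y: float  # center point coordinate y
    z: float  # center point coordinate z (0 for a two-dimensional image detection)
    w: float  # width
    l: float  # length
    h: float  # height (0 for a two-dimensional image detection)
    c: float  # confidence in [0, 1]

    def as_vector(self):
        # 7-element vector [x, y, z, w, l, h, c] used later for feature splicing
        return [self.x, self.y, self.z, self.w, self.l, self.h, self.c]
```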
And S120, performing confidence fusion and spatial position fusion respectively according to the first target detection results acquired by the sensors to obtain a fused confidence and a fused spatial position.
Each type of sensor has certain limitations. For example, millimeter-wave radar can provide accurate 3D measurements, but the point cloud it generates becomes sparse at long range, which reduces the ability to accurately detect distant targets. Images provide rich appearance features but are a poor source of information for depth prediction. Lidar, in turn, is sensitive to weather conditions, and its ability to accurately detect distant targets degrades in extreme weather.
Therefore, an autonomous vehicle can be equipped with different types of sensors so that, by exploiting their complementary characteristics, the confidences and spatial positions in the target detection results acquired by the different sensors are fused in a complementary manner. Subsequent target detection then uses the fused confidence and the fused spatial position, which reduces the large impact on detection accuracy caused by relying on a single type of sensor.
In an alternative of this embodiment, performing confidence fusion and spatial position fusion according to the first target detection results acquired by the sensors to obtain the fused confidence and the fused spatial position may include the following steps A1-A3:
Step A1, splicing the feature results of the confidence and the spatial position in the first target detection results acquired by the different types of sensors to obtain a spliced matrix feature.
Referring to fig. 2, taking the millimeter-wave radar, the lidar, and the image sensor as examples, the first target detection results collected by the different types of sensors are as follows: the preliminary first target detection result of the millimeter-wave radar is recorded as x1, y1, z1, w1, l1, h1, c1, where c denotes the confidence; the preliminary first target detection result of the lidar is recorded as x2, y2, z2, w2, l2, h2, c2; and the preliminary first target detection result of the image sensor is recorded as x3, y3, 0, w3, l3, 0, c3, where z and h are 0 because the detection result obtained from the image sensor is two-dimensional.
Referring to fig. 2, the first target detection result acquired by each type of sensor contains 7 element values, and splicing the first target detection results acquired by the three types of sensors yields a matrix feature of dimension 3x7.
Optionally, splicing the feature results of the confidence and the spatial position in the first target detection results acquired by the different types of sensors includes: normalizing the confidence and the spatial position in the first target detection results acquired by the different types of sensors, and splicing the normalized confidence and spatial position corresponding to each first target detection result into the feature result.
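A minimal sketch of the splicing step described above, assuming a simple per-element scaling for the normalization; the disclosure does not specify the normalization scheme, so the scales below are placeholders only.

```python
import numpy as np

def splice_detections(radar_det, lidar_det, image_det, scales=None):
    """Normalize each 7-element detection [x, y, z, w, l, h, c] and stack into a 3x7 matrix.

    `scales` holds assumed per-element normalization factors; min-max style scaling
    is used here purely for illustration.
    """
    dets = np.array([radar_det, lidar_det, image_det], dtype=np.float32)  # shape (3, 7)
    if scales is None:
        scales = np.array([100.0, 100.0, 10.0, 10.0, 10.0, 5.0, 1.0])     # hypothetical scales
    return dets / scales                                                   # spliced 3x7 matrix feature
```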
Step A2, adjusting the confidence in each first target detection result according to the differences between the spatial positions in the spliced matrix feature, to obtain the fused confidences.
Referring to fig. 2, after the spliced matrix feature is obtained, for the confidence of any row, a feature extraction operation can perform complementary fine-tuning of that confidence by analyzing the difference between the spatial position of the row and the spatial positions of the other rows. After the feature extraction operation, fused confidences C1, C2, C3 of dimension 3x1 are obtained. The feature extraction can be implemented with a series of convolutional layers or a neural network model. For example, if the feature extraction is set as a 1x7 convolution kernel operation, the convolution over the 3x7 matrix feature yields a fused feature of dimension 3x1.
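The following sketch illustrates the 1x7 convolution kernel operation mentioned above, applied to the 3x7 spliced matrix feature to produce a 3x1 fused confidence. The sigmoid at the end is an added assumption to keep the confidences in [0, 1]; the actual model may use a deeper network.

```python
import torch
import torch.nn as nn

class ConfidenceFusion(nn.Module):
    """Toy 1x7 convolution over the spliced 3x7 matrix feature (minimal sketch)."""

    def __init__(self):
        super().__init__()
        # treat the 3x7 matrix as a 1-channel image; a 1x7 kernel spans one full row
        self.conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(1, 7))

    def forward(self, spliced):               # spliced: tensor of shape (3, 7)
        x = spliced.view(1, 1, 3, 7)          # (batch, channel, rows, cols)
        fused = self.conv(x)                  # (1, 1, 3, 1): one value per sensor row
        return torch.sigmoid(fused).view(3)   # fused confidences C1, C2, C3 in [0, 1]
```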
In an optional example of this embodiment, adjusting the confidence in each first target detection result according to the differences between the spatial positions in the spliced matrix feature to obtain the fused confidence may include the following step:
inputting the spliced matrix feature into a preset confidence fusion extraction model to obtain the fused confidence; the confidence fusion extraction model is used to analyze whether the same target is detected by different types of sensors within a preset range of the same position, and to adjust the confidence in each first target detection result according to the analysis result; when the same target is detected by different types of sensors within the preset range of the same position, the confidence is increased after confidence fusion is triggered.
Referring to fig. 3, taking the millimeter-wave radar, the lidar, and the image sensor as examples, the preliminary detection results of the three sensors are feature-spliced to obtain a 3x7 matrix feature. The confidence fusion extraction model can be implemented with a series of convolutional layers or a neural network model. For example, if the confidence fusion extraction model is set as a 1x7 convolution kernel operation, the convolution over the 3x7 feature yields a fused feature of dimension 3x1. These 3x1 values are the fused confidences of the detection results of the three sensors.
Referring to fig. 3, the confidence fusion extraction model determines, based on the spatial positions, whether the same target is detected by different sensors within a preset range of the same position, and adjusts the confidence in each first target detection result according to that determination. The intuition of confidence fusion is that when the same object is detected by different sensors near the same location, their confidences should be higher, i.e., the object is more likely to be present. Conversely, if only one sensor detects a target, its confidence may be reduced.
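The confidence fusion extraction model learns this behavior from data. Purely as an illustration of the intended effect, a hand-written agreement rule might look like the following; the distance threshold and the boost/damp factors are hypothetical and not part of the disclosure.

```python
import numpy as np

def agreement_adjusted_confidences(dets, dist_thresh=2.0, boost=1.2, damp=0.8):
    """Rule-of-thumb illustration of the confidence-fusion behavior described above.

    dets: (N, 7) array of [x, y, z, w, l, h, c], one row per sensor.
    """
    centers, conf = dets[:, :3], dets[:, 6].copy()
    for i in range(len(dets)):
        # how many other sensors place a detection within dist_thresh of this one
        d = np.linalg.norm(centers - centers[i], axis=1)
        agree = np.sum(d < dist_thresh) - 1          # exclude the detection itself
        conf[i] = min(1.0, conf[i] * (boost if agree > 0 else damp))
    return conf
```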
Step A3, adjusting the spatial position in each first target detection result according to the confidences in the spliced matrix feature, to obtain the fused spatial positions.
Referring to fig. 2 and fig. 3, taking the millimeter-wave radar, the lidar, and the image sensor as examples, a spatial transformation matrix T of dimension 3x2x3x4 can be obtained after the feature extraction operation on the spliced feature. The spatial transformation matrix is used to transform the detected target positions to obtain new fused position coordinates. The dimension of the fused position information is still 3x6. For example, the millimeter-wave radar position fusion result is recorded as X1, Y1, Z1, W1, L1, H1; the lidar position fusion result is recorded as X2, Y2, Z2, W2, L2, H2; and the image sensor position fusion result is recorded as X3, Y3, Z3, W3, L3, H3. Here the fused position information of the image sensor is no longer a two-dimensional result: after the spatial position transformation, the spatial position can have three dimensions.
In an optional example of this embodiment, adjusting the spatial position in each first target detection result according to the confidences in the spliced matrix feature to obtain the fused spatial position may include the following steps B1-B2:
Step B1, inputting the spliced matrix feature into a preset spatial position fusion extraction model to obtain a spatial transformation matrix; the spatial position fusion extraction model resolves, based on the confidences in the spliced matrix feature, a spatial transformation matrix for transforming and adjusting the spatial position in the target detection result.
Step B2, transforming the spatial position in the first target detection result through the affine transformation matrix and the translation transformation matrix included in the spatial transformation matrix, to obtain the fused spatial position.
Referring to fig. 4, in spatial position fusion the spatial position fusion extraction model may be implemented with a series of convolution operations or a neural network model; feature extraction yields a spatial transformation matrix comprising an affine transformation matrix and a translation transformation matrix (e.g., the 3x3 affine transformation matrix A and the 3x1 translation transformation matrix S shown in fig. 4; A and S can be combined into a spatial transformation matrix of dimension 3x4). Since the spatial position information of each sensor contains not only the 3 center point coordinates but also 3 size values, each sensor obtains 2 spatial transformation matrices through training, i.e., dimension 2x3x4. The spatial transformation matrix T for the three sensors therefore has dimension 3x2x3x4. After applying the spatial transformation matrix, the position information in the original detection result is updated to the fused position information, with dimension 3x6.
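A minimal sketch of applying one sensor's 2x3x4 spatial transformation matrix to its detection, using the affine-then-translation order described for the radar and lidar case. Array shapes follow the dimensions given above; the function name is illustrative only.

```python
import numpy as np

def apply_spatial_transform(T_sensor, det):
    """Apply a per-sensor spatial transform T_sensor of shape (2, 3, 4) to one detection.

    T_sensor[0] transforms the center point [x, y, z]; T_sensor[1] transforms the size
    [w, l, h]. Each 3x4 block is [A | S]: a 3x3 affine matrix A and a 3x1 translation S.
    """
    center, size = np.asarray(det[:3]), np.asarray(det[3:6])
    out = []
    for block, vec in ((T_sensor[0], center), (T_sensor[1], size)):
        A, S = block[:, :3], block[:, 3]
        out.append(A @ vec + S)        # affine first, then translation: X = A x + S
    return np.concatenate(out)         # fused [X, Y, Z, W, L, H]
```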
Referring to fig. 5, the overall architecture of spatial position fusion includes the three spatial position transformation processes of the millimeter-wave radar, the lidar, and the image sensor. During spatial position fusion, the new spatial position coordinates X, Y, Z are obtained by applying a matrix transformation to the original spatial position coordinates x, y, z. As shown in the following formula, the spatial transformation matrix comprises an affine transformation matrix A and a translation transformation matrix S: A is the 3x3 matrix formed by the elements a-i, and S is the 3x1 matrix formed by the elements j-l.
$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
=
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
+
\begin{bmatrix} j \\ k \\ l \end{bmatrix}
$$
Referring to fig. 6, the spatial position fusion process of the millimeter-wave radar or the lidar is shown. The basic dimension of the spatial transformation matrix T is 3x4: the left three columns (3x3) form the affine transformation matrix A, and the right column (3x1) forms the translation transformation matrix S. During spatial position fusion, the original spatial position information can first be multiplied by the affine transformation matrix and then have the translation transformation added to obtain the new spatial position information. Alternatively, the original spatial position information can be translated first and then affine-transformed to obtain the new spatial position information.
Referring to fig. 7, the spatial position fusion process of the image sensor is shown. The fusion process is similar to that of the radar. The difference is that, for the image sensor, the original spatial position information must be translated first and then affine-transformed to obtain the new spatial position information. If the affine transformation were applied first, the parameters at the corresponding positions of the affine matrix would be ineffective, because one dimension is missing in the original position information of the image sensor.
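For the image sensor the order is reversed; a small sketch under the same assumptions as above, showing why translating first lets the affine matrix act on the missing z dimension.

```python
import numpy as np

def fuse_image_sensor_position(A, S, det2d):
    """Translate first, then apply the affine matrix, for a 2D image detection.

    A: 3x3 affine matrix, S: length-3 translation, det2d: [x3, y3, 0] with z = 0.
    Translating first gives the z slot a nonzero value, so the third column of A
    can contribute; applying A first would multiply that column by 0.
    """
    x = np.asarray(det2d, dtype=np.float32)
    return A @ (x + S)   # fused [X3, Y3, Z3]
```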
S130, determining a second target detection result according to the fused confidence and the fused spatial position, so as to execute an automatic driving operation.
Determining the second target detection result according to the fused confidence and the fused spatial position may include the following step: performing non-maximum suppression on the fused confidences and the fused spatial positions to obtain the second target detection result.
After the confidence fusion and the spatial position fusion, the final three-dimensional target detection result is obtained by applying non-maximum suppression (NMS) to the new confidences and spatial positions, and the final second target detection result can then be used to guide the automatic driving operation.
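A simple sketch of the non-maximum suppression step, using an axis-aligned bird's-eye-view IoU as a stand-in; the disclosure does not specify the exact NMS variant, so this is illustrative only.

```python
import numpy as np

def nms_bev(boxes, scores, iou_thresh=0.5):
    """Greedy NMS over fused detections.

    boxes: (N, 6) array of [X, Y, Z, W, L, H]; scores: (N,) fused confidences.
    Overlap is computed as axis-aligned bird's-eye-view IoU for simplicity.
    """
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # axis-aligned BEV overlap between box i and the remaining boxes
        x1 = np.maximum(boxes[i, 0] - boxes[i, 3] / 2, boxes[rest, 0] - boxes[rest, 3] / 2)
        x2 = np.minimum(boxes[i, 0] + boxes[i, 3] / 2, boxes[rest, 0] + boxes[rest, 3] / 2)
        y1 = np.maximum(boxes[i, 1] - boxes[i, 4] / 2, boxes[rest, 1] - boxes[rest, 4] / 2)
        y2 = np.minimum(boxes[i, 1] + boxes[i, 4] / 2, boxes[rest, 1] + boxes[rest, 4] / 2)
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = boxes[i, 3] * boxes[i, 4]
        area_r = boxes[rest, 3] * boxes[rest, 4]
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou < iou_thresh]   # drop detections that overlap the kept box too much
    return keep
```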
According to the technical solution of the embodiment of the present invention, first target detection results respectively acquired by at least two types of sensors are determined; each target detection result comprises a confidence and a spatial position, and the spatial position is represented by target center point coordinates and size; confidence fusion and spatial position fusion are performed according to the first target detection results acquired by the sensors to obtain a fused confidence and a fused spatial position; and a second target detection result is determined from the fused confidence and the fused spatial position for executing an automatic driving operation. This overcomes the impact of single-sensor limitations on target detection accuracy and realizes complementary fusion of different types of sensors; in particular, fusing the millimeter-wave radar, lidar, and image detection results improves target detection accuracy. The fusion scheme is also extensible: it is not limited to data fusion of the three sensors (millimeter-wave radar, lidar, and image sensor) and is equally applicable to the fusion of two sensors or of more than three sensors.
Fig. 8 is a block diagram of a multi-sensor object detection device provided in an embodiment of the present invention, where the embodiment is applicable to a situation of detecting an object in front of an autonomous vehicle, the multi-sensor object detection device may be implemented in a form of hardware and/or software, and the multi-sensor object detection device may be configured in any electronic device with a network communication function. As shown in fig. 8, the apparatus may include: an acquisition module 810, a fusion module 820, and a detection module 830. Wherein:
an acquisition module 810, configured to determine first target detection results respectively acquired by at least two types of sensors; the target detection result comprises a confidence and target position characteristics, and the target position characteristics are represented by target center point coordinates and size;
a fusion module 820, configured to perform confidence fusion and spatial position fusion respectively according to the first target detection result acquired by each sensor, so as to obtain a fused confidence and a fused spatial position;
and the detecting module 830 is configured to determine a second target detection result according to the fused confidence and the fused spatial position, so as to perform an automatic driving operation.
On the basis of the foregoing embodiment, optionally, the fusion module 820 includes:
The characteristic splicing unit is used for splicing the characteristic results of the confidence level and the space position in the first target detection results acquired by the sensors of different types to obtain matrix characteristics after splicing;
the confidence fusion unit is used for respectively adjusting the confidence in each first target detection result according to the difference between each space position in the spliced matrix characteristics to obtain the fused confidence;
and the spatial position fusion unit is used for respectively adjusting the spatial position in each first target detection result according to the confidence degree in the spliced matrix characteristics to obtain the fused spatial position.
On the basis of the foregoing embodiment, optionally, the confidence fusion unit includes:
inputting the spliced matrix characteristics into a preset confidence fusion extraction model to obtain a fused confidence; the confidence fusion extraction model is used for analyzing and judging whether the same target can be detected by different types of sensors in the same position preset range, and adjusting the confidence in each first target detection result according to the analysis and judgment result; when the same target is detected by different types of sensors in the same position preset range, the confidence coefficient is increased after the confidence coefficient fusion is triggered.
On the basis of the foregoing embodiment, optionally, the spatial position fusion unit includes:
inputting the spliced matrix characteristics into a preset spatial position fusion extraction model to obtain a spatial transformation matrix; the spatial position fusion extraction model is used for resolving, based on the magnitudes of the confidences in the spliced matrix characteristics, a spatial transformation matrix for transforming and adjusting the spatial position in the target detection result;
and converting the spatial position in the first target detection result through an affine transformation matrix and a translation transformation matrix included in the spatial transformation matrix to obtain a fused spatial position.
On the basis of the foregoing embodiment, optionally, the detecting module 830 includes:
and carrying out non-maximum suppression on the fused confidence coefficient and the fused spatial position to obtain a second target detection result.
On the basis of the foregoing embodiment, optionally, the feature result splicing between the confidence level and the spatial position in the first target detection results acquired by the sensors of different types includes:
and normalizing the confidence degrees and the space positions in the first target detection results acquired by the sensors of different types, and splicing the normalized confidence degrees corresponding to the first target detection results and the space positions into characteristic results.
On the basis of the foregoing embodiments, optionally, the at least two types of sensors include a millimeter wave radar sensor, a laser radar sensor, and an image sensor.
The multi-sensor target detection device provided in the embodiment of the present invention may execute the multi-sensor target detection method provided in any embodiment of the present invention, and has corresponding functions and beneficial effects for executing the multi-sensor target detection method, and the detailed process refers to the related operations of the multi-sensor target detection method in the foregoing embodiments.
FIG. 9 illustrates a block diagram of an electronic device 10 that may be used to implement embodiments of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 9, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM)12, a Random Access Memory (RAM)13, and the like, wherein the memory stores computer programs executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer programs stored in the Read Only Memory (ROM)12 or the computer programs loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as a multi-sensor object detection method.
In some embodiments, the multi-sensor object detection method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the multi-sensor object detection method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the multi-sensor object detection method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), load programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present invention can be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium may include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described herein may be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A multi-sensor target detection method, comprising:
determining first target detection results respectively acquired by at least two types of sensors; the target detection result comprises a confidence coefficient and a spatial position, and the spatial position is represented by target central point coordinates and size;
performing confidence fusion and spatial position fusion respectively according to the first target detection result acquired by each sensor to obtain fused confidence and fused spatial position;
And determining a second target detection result according to the fused confidence coefficient and the fused spatial position so as to execute automatic driving operation.
2. The method of claim 1, wherein performing confidence fusion and spatial position fusion respectively according to the first target detection result acquired by each sensor to obtain a fused confidence and a fused spatial position comprises:
splicing feature results of confidence levels and spatial positions in first target detection results acquired by different types of sensors to obtain matrix features after splicing;
respectively adjusting confidence degrees in the first target detection results according to the difference between the space positions in the spliced matrix characteristics to obtain a fused confidence degree;
and respectively adjusting the spatial position in each first target detection result according to each confidence coefficient in the spliced matrix characteristics to obtain a fused spatial position.
3. The method of claim 2, wherein the step of adjusting confidence levels in the first target detection results according to differences between the spatial positions in the matrix features after the stitching to obtain the fused confidence level comprises:
inputting the spliced matrix characteristics into a preset confidence fusion extraction model to obtain a fused confidence; the confidence fusion extraction model is used for analyzing and judging whether the same target can be detected by different types of sensors in the same position preset range, and adjusting the confidence in each first target detection result according to the analysis and judgment result; when the same target is detected by different types of sensors in the same position preset range, the confidence coefficient is increased after the confidence coefficient fusion is triggered.
4. The method of claim 2, wherein the step of adjusting the spatial position of each first target detection result according to the confidence level of each spliced matrix feature to obtain the fused spatial position comprises:
inputting the spliced matrix characteristics into a preset spatial position fusion extraction model to obtain a spatial transformation matrix; the spatial position fusion extraction model is used for resolving a spatial transformation matrix for changing and adjusting the spatial position in the target detection result based on the confidence degree in the spliced matrix characteristics;
and converting the spatial position in the first target detection result through an affine transformation matrix and a translation transformation matrix included in the spatial transformation matrix to obtain a fused spatial position.
5. The method of claim 1, wherein determining the second target detection result according to the fused confidence level and the fused spatial position comprises:
and carrying out non-maximum suppression on the fused confidence coefficient and the fused spatial position to obtain a second target detection result.
6. The method of claim 2, wherein splicing the feature results of the confidence and the spatial position in the first target detection results collected by the different types of sensors comprises:
And normalizing the confidence coefficient and the space position in the first target detection results acquired by the sensors of different types, and splicing the characteristic results of the normalized confidence coefficient and the space position corresponding to each first target detection result.
7. The method of any of claims 1-6, wherein the at least two types of sensors include millimeter wave radar sensors, lidar sensors, and image sensors.
8. A multi-sensor object detection device, comprising:
the acquisition module is used for determining first target detection results respectively acquired by at least two types of sensors; the target detection result comprises a confidence coefficient and a target position characteristic, and the target position characteristic is represented by target center point coordinates and size;
the fusion module is used for respectively carrying out confidence fusion and spatial position fusion according to the first target detection result acquired by each sensor to obtain the fused confidence and the fused spatial position;
and the detection module is used for determining a second target detection result according to the fused confidence coefficient and the fused spatial position so as to execute automatic driving operation.
9. An electronic device, characterized in that the electronic device comprises:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the multi-sensor object detection method of any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a processor to perform the multi-sensor object detection method of any one of claims 1-7 when executed.
CN202210136782.6A 2022-02-15 2022-02-15 Multi-sensor target detection method and device, electronic equipment and storage medium Pending CN114677655A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210136782.6A CN114677655A (en) 2022-02-15 2022-02-15 Multi-sensor target detection method and device, electronic equipment and storage medium
PCT/CN2022/110149 WO2023155387A1 (en) 2022-02-15 2022-08-04 Multi-sensor target detection method and apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210136782.6A CN114677655A (en) 2022-02-15 2022-02-15 Multi-sensor target detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114677655A (en) 2022-06-28

Family

ID=82071973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210136782.6A Pending CN114677655A (en) 2022-02-15 2022-02-15 Multi-sensor target detection method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114677655A (en)
WO (1) WO2023155387A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023155387A1 (en) * 2022-02-15 2023-08-24 上海芯物科技有限公司 Multi-sensor target detection method and apparatus, electronic device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108663677A (en) * 2018-03-29 2018-10-16 上海智瞳通科技有限公司 A kind of method that multisensor depth integration improves target detection capabilities
CN111583337B (en) * 2020-04-25 2023-03-21 华南理工大学 Omnibearing obstacle detection method based on multi-sensor fusion
CN113378969B (en) * 2021-06-28 2023-08-08 北京百度网讯科技有限公司 Fusion method, device, equipment and medium of target detection results
CN113763356A (en) * 2021-09-08 2021-12-07 国网江西省电力有限公司电力科学研究院 Target detection method based on visible light and infrared image fusion
CN114677655A (en) * 2022-02-15 2022-06-28 上海芯物科技有限公司 Multi-sensor target detection method and device, electronic equipment and storage medium


Also Published As

Publication number Publication date
WO2023155387A1 (en) 2023-08-24

Similar Documents

Publication Publication Date Title
US20190156144A1 (en) Method and apparatus for detecting object, method and apparatus for training neural network, and electronic device
CN113902897B (en) Training of target detection model, target detection method, device, equipment and medium
CN113379718B (en) Target detection method, target detection device, electronic equipment and readable storage medium
CN113378760A (en) Training target detection model and method and device for detecting target
CN113378693B (en) Method and device for generating target detection system and detecting target
CN112785625A (en) Target tracking method and device, electronic equipment and storage medium
CN113377888A (en) Training target detection model and method for detecting target
CN114519853A (en) Three-dimensional target detection method and system based on multi-mode fusion
CN114842445A (en) Target detection method, device, equipment and medium based on multi-path fusion
CN116503803A (en) Obstacle detection method, obstacle detection device, electronic device and storage medium
CN115457152A (en) External parameter calibration method and device, electronic equipment and storage medium
CN113378694B (en) Method and device for generating target detection and positioning system and target detection and positioning
CN113177980B (en) Target object speed determining method and device for automatic driving and electronic equipment
CN114528941A (en) Sensor data fusion method and device, electronic equipment and storage medium
CN114677655A (en) Multi-sensor target detection method and device, electronic equipment and storage medium
CN113627298A (en) Training method of target detection model and method and device for detecting target object
CN117078767A (en) Laser radar and camera calibration method and device, electronic equipment and storage medium
CN115719436A (en) Model training method, target detection method, device, equipment and storage medium
CN116469073A (en) Target identification method, device, electronic equipment, medium and automatic driving vehicle
CN115951344A (en) Data fusion method and device for radar and camera, electronic equipment and storage medium
CN115761698A (en) Target detection method, device, equipment and storage medium
CN110634159A (en) Target detection method and device
CN114581711A (en) Target object detection method, apparatus, device, storage medium, and program product
CN115147561A (en) Pose graph generation method, high-precision map generation method and device
CN113869147A (en) Target detection method and device

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination