CN117809145A - Fusion method, device, mobile device and storage medium of sensor semantic information

Info

Publication number
CN117809145A
Authority
CN
China
Prior art keywords
radar
segmented
aerial view
point cloud
rgb image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211167242.0A
Other languages
Chinese (zh)
Inventor
严旭
任巨龙
邵枭虎
管沁朴
张哲�
赵凭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Idriverplus Technologies Co Ltd
Original Assignee
Beijing Idriverplus Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Idriverplus Technologies Co Ltd filed Critical Beijing Idriverplus Technologies Co Ltd
Priority to CN202211167242.0A
Publication of CN117809145A

Landscapes

  • Traffic Control Systems (AREA)

Abstract

An embodiment of the invention provides a fusion method, a device, a mobile device and a storage medium for sensor semantic information. The method comprises the following steps: generating an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor; selecting, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and generating a radar aerial view of the ground; performing semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image; performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view; and fusing the segmented RGB image with the segmented radar aerial view. According to the embodiment of the invention, the semantic segmentation information obtained by fusing the RGB image segmentation with the radar aerial view segmentation carries both rich semantic features and two-dimensional planar panoramic information around the vehicle, so that the passable area in the road is determined more accurately.

Description

Fusion method, device, mobile device and storage medium of sensor semantic information
Technical Field
The invention relates to the field of unmanned driving, and in particular to a fusion method, a device, a mobile device and a storage medium for sensor semantic information.
Background
Joint progress in robotics and artificial intelligence has driven the rapid development of unmanned-driving technology. Unmanned vehicles are increasingly used across industries such as unmanned warehousing, public transportation operations, and sanitation cleaning. For unmanned vehicles to travel safely and autonomously in a variety of environments, reliable environmental perception plays a critical role, particularly in accurately distinguishing the passable area ahead of the vehicle.
Passable-area detection in the prior art generally relies on image semantic segmentation information, including RGB (red-green-blue) image segmentation, 3D lidar point cloud semantic segmentation, and multi-sensor fusion semantic segmentation (fusing RGB image and radar point cloud segmentation). The multi-sensor fusion approach projects the point cloud onto the image by spherical or perspective projection to obtain the corresponding image semantic segmentation information, then projects the related image pixels back into the point cloud space, where the multi-sensor fusion is performed.
In the process of implementing the present invention, the inventor finds that at least the following problems exist in the related art:
in the existing image semantic segmentation methods, RGB-only semantic segmentation is easily disturbed by lighting, so the collected data often contain noise, which is dangerous for applications such as autonomous driving. Lidar alone can provide reliable environmental information and spatial geometric information, but the data it collects are usually very sparse and lack color and texture information, which makes fine-grained semantic segmentation based only on lidar data difficult. Multi-sensor fusion semantic segmentation methods that project the point cloud onto the image by spherical or perspective projection also exist, but because the point cloud is projected into the image, image texture information is severely lost and the ideal effect is difficult to obtain.
Disclosure of Invention
To address at least the shortcomings of RGB-only image semantic segmentation and lidar-only segmentation in the prior art, the loss of image texture information caused by spherical- or perspective-projection multi-sensor fusion semantic segmentation, and the resulting problem that the passable area in the road cannot be accurately distinguished, in a first aspect an embodiment of the present invention provides a method for fusing sensor semantic information, applied to a mobile device, comprising:
generating an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
selecting, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and generating a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
performing semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
and fusing the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
In a second aspect, an embodiment of the present invention provides a fusion execution device for sensor semantic information, including:
an image generation module, configured to generate an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
an aerial view generation module, configured to select, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and to generate a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
an image segmentation module, configured to perform semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
an aerial view segmentation module, configured to perform semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
and a fusion module, configured to fuse the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising: at least one processor, and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the fusion method of sensor semantic information of any one of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a mobile device, including a body and an electronic apparatus according to any one of the embodiments of the present invention mounted on the body.
In a fifth aspect, an embodiment of the present invention provides a storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the steps of the fusion method of sensor semantic information of any embodiment of the present invention.
In a sixth aspect, an embodiment of the present invention further provides a computer program product, which when run on a computer, causes the computer to execute the method for fusing sensor semantic information according to any one of the embodiments of the present invention.
The embodiments of the invention have the following beneficial effects: multi-sensor fusion semantic segmentation is performed based on RGB image semantic segmentation and radar aerial view semantic segmentation; the RGB image segmentation provides rich semantic features, while the radar aerial view generated from the laser point cloud can perceive the two-dimensional planar panoramic information around the vehicle, so the fused semantic segmentation information has both rich semantic features and two-dimensional planar panoramic information around the vehicle, and the passable area in the road is determined more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 2 is a camera RGB diagram of a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 3 is a 3D point cloud ROI area of a fusion method of sensor semantic information according to an embodiment of the present invention;
FIG. 4 is a schematic view of a region of interest (ROI) set for generating a radar bird's eye view for a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 5 is a radar bird's eye view of a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 6 is an RGB image segmentation effect diagram of a fusion method of sensor semantic information according to an embodiment of the present invention;
FIG. 7 is a radar bird's eye view segmentation effect diagram of a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 8 is a semantic segmentation fusion effect diagram of a method for fusing sensor semantic information according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a device for performing fusion of sensor semantic information according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an embodiment of an electronic device for fusing sensor semantic information according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Those skilled in the art will appreciate that embodiments of the present application may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms, namely: complete hardware, complete software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
For ease of understanding, the technical terms referred to in this application are explained as follows:
the term "mobile device" as used herein includes, but is not limited to, six classes of automated driving technology vehicles, such as those specified by the International society of automaton (Society of Automotive Engineers International, SAE International) or the national Standard for automotive Automation Classification, L0-L5.
In some embodiments, the mobile device may be a vehicle device or a robotic device having various functions:
(1) Manned functions such as home cars, buses, etc.;
(2) Cargo functions such as common trucks, van type trucks, swing trailers, closed trucks, tank trucks, flatbed trucks, container trucks, dump trucks, special structure trucks, and the like;
(3) Tool functions such as logistics distribution vehicles, automatic guided vehicles AGVs, patrol vehicles, cranes, excavators, bulldozers, shovels, road rollers, loaders, off-road engineering vehicles, armored engineering vehicles, sewage treatment vehicles, sanitation vehicles, dust collection vehicles, floor cleaning vehicles, watering vehicles, floor sweeping robots, meal delivery robots, shopping guide robots, mowers, golf carts, and the like;
(4) Entertainment functions such as recreational vehicles, casino autopilots, balance cars, etc.;
(5) Special rescue functions such as fire trucks, ambulances, electric power emergency vehicles, engineering emergency vehicles and the like.
Fig. 1 is a flowchart of a method for fusing sensor semantic information according to an embodiment of the present invention, including the following steps:
S11: generating an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
S12: selecting, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and generating a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
S13: performing semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
S14: performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
S15: fusing the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
In the present embodiment, the present method can be applied to various types of mobile devices, and these mobile devices are generally equipped with cameras and lidars. For example, the system is applied to mobile devices such as intelligent ground washing vehicles in industrial parks, express delivery vehicles in communities, cargo transport vehicles in warehouses and the like.
For step S11, taking an intelligent ground washing vehicle in an industrial park as an example: when the vehicle performs a cleaning task on a road, it uses its on-board camera sensor to capture video in real time and its laser radar sensor to collect radar point cloud data in real time. The sensors acquire data synchronously in real time, and RGB images are generated in real time from the video that is time-synchronized with the radar point cloud data. Fig. 2 shows an RGB image extracted from video captured in an industrial park: vehicles are parked on the left side of the image, there are trees and a wall on the right, an object is present in the middle area, and the ground extends toward the right.
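As an illustrative sketch of the time synchronization in step S11 (the function and parameter names below are assumptions for illustration, not part of the claimed embodiment), each lidar sweep can be paired with the video frame whose timestamp is nearest to it, and that frame is then decoded into the RGB image used in the later steps:

```python
import bisect

def match_frame_to_scan(frame_timestamps, scan_timestamp, max_offset=0.05):
    """Return the index of the video frame closest in time to a lidar scan,
    or None if no frame lies within max_offset seconds.

    frame_timestamps: ascending list of frame times in seconds.
    """
    if not frame_timestamps:
        return None
    i = bisect.bisect_left(frame_timestamps, scan_timestamp)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(frame_timestamps)]
    best = min(candidates, key=lambda j: abs(frame_timestamps[j] - scan_timestamp))
    return best if abs(frame_timestamps[best] - scan_timestamp) <= max_offset else None
```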
For step S12, a radar aerial view is generated from the radar point cloud data collected in step S11; the laser radar of the intelligent ground washing vehicle scans outward and obtains the surrounding laser point cloud data centered on the vehicle. Specifically, as shown in fig. 3, the 3D laser radar points are selected by taking the origin of the coordinate system of the intelligent ground washing vehicle as the center and using the region extending 48 m to the front and 16 m to the rear, left and right as the ROI (region of interest); 48 m and 16 m are only examples, and the specific values can be adapted to the actual situation (the corresponding radar aerial view ROI setting is shown schematically in fig. 4). A radar aerial view of the ground as shown in fig. 5 is then generated; specifically, the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud.
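A minimal sketch of the ROI selection described above, assuming a vehicle coordinate frame with x pointing forward and y pointing left (the frame convention and function name are assumptions for illustration):

```python
import numpy as np

def select_roi_points(points, front=48.0, rear=16.0, side=16.0):
    """Keep only the 3D points inside the region of interest centered on the
    origin of the vehicle coordinate system.

    points: (N, 3) array of x, y, z; front/rear/side mirror the example
    values of 48 m and 16 m from the text and can be adapted as needed.
    """
    x, y = points[:, 0], points[:, 1]
    mask = (x >= -rear) & (x <= front) & (np.abs(y) <= side)
    return points[mask]
```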
Determining the pixel value of each channel in the radar aerial view from the height values of the 3D point cloud includes:
counting the maximum height difference value of the 3D point cloud, the standard deviation value of the 3D point cloud, the maximum height value of the 3D point cloud and the average height of the 3D point cloud in each cell in the radar aerial view;
determining a pixel value of an R channel in the radar aerial view based on the maximum height difference value of the 3D point cloud and a preset pixel value transformation coefficient;
determining a pixel value of a G channel in the radar aerial view based on the standard deviation value of the 3D point cloud and the preset pixel value transformation coefficient;
determining a pixel value of a B channel in the radar aerial view based on the maximum height value of the 3D point cloud and the preset pixel value transformation coefficient;
and determining the pixel value of the HSV channel in the radar aerial view based on the average height value of the 3D point cloud and the preset pixel value transformation coefficient.
In the present embodiment, the pixel values of the R, G, B and HSV channels in the radar aerial view are computed from the point heights Z_i. As shown in fig. 3, the ROI is divided into a plurality of cells, and each laser point falls into one cell of the radar aerial view. Since the point cloud records three-dimensional data in space, every 3D point falling into a cell contributes a height Z_i. Within each cell, the minimum height is MIN(Z_i), the maximum height is MAX(Z_i), the standard deviation std(Z_i) is the square root of the arithmetic mean of the squared deviations of the heights from their mean, and the average height is MEAN(Z_i).
The pixel value determination relationship for the R channel is as follows:
R = α · (MAX(Z_i) − MIN(Z_i))
The pixel value determination relationship of the G channel is as follows:
G = α · std(Z_i), with std(Z_i) = √( (1/N) · Σ (Z_i − MEAN(Z_i))² )
The pixel value determination relationship of the B channel is as follows:
B = α · MAX(Z_i)
The pixel value determination relationship of the HSV (H) channel is as follows:
H = α · MEAN(Z_i)
wherein α represents the preset pixel value transformation coefficient, which can be set to 40, and N represents the total number of 3D points in each cell.
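A minimal sketch of the channel computation, assuming the per-cell statistics are scaled linearly by the transformation coefficient and clipped to the 0-255 pixel range (the grid resolution, the clipping, and the function name are assumptions for illustration, not stated in the embodiment):

```python
import numpy as np

def radar_bev_from_points(points, x_range=(-16.0, 48.0), y_range=(-16.0, 16.0),
                          cell_size=0.1, alpha=40.0):
    """Build a 4-channel radar aerial view (R, G, B, H) from ROI-filtered points.

    points: (N, 3) array of x, y, z in the vehicle frame (x forward, y left).
    R = height range, G = height standard deviation, B = maximum height,
    H = mean height of the points in each cell, each scaled by alpha.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    h = int(round((y_range[1] - y_range[0]) / cell_size))
    w = int(round((x_range[1] - x_range[0]) / cell_size))
    cols = np.clip(((x - x_range[0]) / cell_size).astype(int), 0, w - 1)
    rows = np.clip(((y - y_range[0]) / cell_size).astype(int), 0, h - 1)

    bev = np.zeros((h, w, 4), dtype=np.uint8)
    cell_ids = rows * w + cols
    for cid in np.unique(cell_ids):              # per-cell height statistics
        zi = z[cell_ids == cid]
        r, c = divmod(int(cid), w)
        stats = np.array([zi.max() - zi.min(), zi.std(), zi.max(), zi.mean()])
        # Heights below zero saturate at 0 after clipping; add an offset first
        # if the vehicle-frame origin sits above the ground.
        bev[r, c] = np.clip(alpha * stats, 0, 255).astype(np.uint8)
    return bev
```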
For step S13, semantic segmentation is performed, using an RGB image semantic segmentation model, on the part of the RGB image other than the ground. Because the image collected by the camera has rich color and texture information, it can be segmented into maps with many semantic categories. Image semantic segmentation assigns a high-level semantic label to each pixel, i.e., classifies every pixel, where the high-level labels cover the object categories (such as people, animals and cars) and background categories (such as sky) in the image. The semantic segmentation task demands both high classification accuracy and high localization accuracy: the contour boundary of each object must be positioned accurately, and the region inside the contour must be classified accurately, so that specific objects are cleanly separated from the background (fig. 6 shows the RGB image segmentation effect; the segmentation map is overlaid on the original image for display).
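A minimal sketch of step S13, assuming an off-the-shelf per-pixel segmentation model is available as a callable and that the label id of the ground/road class is known (both `segmentation_model` and `GROUND_CLASS_ID` are illustrative assumptions):

```python
import numpy as np

GROUND_CLASS_ID = 0  # assumed id of the ground/road class in the model's label map

def segment_non_ground(rgb_image, segmentation_model):
    """Run semantic segmentation on the RGB image and keep only non-ground labels.

    segmentation_model(rgb_image) is assumed to return an (H, W) array of
    class ids. Ground pixels are marked -1 so that the later fusion step only
    carries the richer non-ground semantics from the camera into the aerial view.
    """
    labels = np.asarray(segmentation_model(rgb_image)).astype(np.int32)
    labels[labels == GROUND_CLASS_ID] = -1
    return labels
```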
For step S14, the radar aerial view can similarly be semantically segmented. As one embodiment, performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view includes:
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view with two semantic categories on the ground, the two categories comprising: passable area and obstacle.
In this embodiment, because the radar aerial view lacks rich color and texture information, it is segmented into only two categories, as shown in fig. 7: road (the wave-patterned region in the figure), indicating the passable area, and object (the dot-cluster regions outside the road), indicating obstacles.
For step S15, the segmented RGB image obtained in step S13 is fused with the segmented radar aerial view obtained in step S14, yielding semantic information for perceiving the two-dimensional planar panorama around the mobile device, from which the passable area can be determined accurately. Fig. 8 shows the final fused semantic segmentation effect; it can be seen that the semantic features of pedestrians have been fused into the radar aerial view.
As one embodiment, fusing the segmented RGB image with the segmented radar aerial view includes:
fusing the segmented RGB image with the passable area in the segmented radar aerial view.
In this embodiment, the segmented RGB image is fused only with the passable area in the radar aerial view.
Specifically, this fusion mode projects the segmented RGB image onto the segmented radar aerial view. This avoids the prior-art problem in which projecting the point cloud onto the image severely loses image texture information and makes the ideal effect hard to achieve; moreover, the image only covers the area in front of the vehicle and cannot determine the panoramic information of the 2D plane around the vehicle. The radar aerial view accurately represents the 2D planar information around the vehicle, so projecting the RGB image into the radar aerial view yields much richer 2D planar panoramic semantic information.
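A minimal sketch of the projection described above, assuming a calibrated camera (intrinsic matrix K, rotation R and translation t mapping camera coordinates into the vehicle frame, all NumPy arrays), a locally flat ground at z = 0 in the vehicle frame, and a camera mounted above the ground; all names and the grid layout are illustrative assumptions:

```python
import numpy as np

def project_labels_to_bev(label_map, K, R_cam_to_veh, t_cam_to_veh, bev_labels,
                          x_range=(-16.0, 48.0), y_range=(-16.0, 16.0), cell_size=0.1):
    """Project segmented RGB labels onto the segmented radar aerial view grid.

    Each labeled pixel is back-projected as a ray, intersected with the ground
    plane z = 0 of the vehicle frame, and written into the BEV cell it hits.
    label_map: (H, W) int array from segment_non_ground (-1 marks ground/ignored).
    bev_labels: (rows, cols) int array holding the BEV segmentation, updated in place.
    """
    vs, us = np.nonzero(label_map >= 0)                       # pixels carrying semantics
    pix = np.stack([us, vs, np.ones_like(us)]).astype(float)  # homogeneous pixel coords
    rays = R_cam_to_veh @ (np.linalg.inv(K) @ pix)            # ray directions, vehicle frame
    ok = rays[2] < 0                                          # keep rays heading toward the ground
    s = -t_cam_to_veh[2] / rays[2, ok]                        # scale so that z = 0
    pts = t_cam_to_veh.reshape(3, 1) + s * rays[:, ok]        # ground hit points
    cols = ((pts[0] - x_range[0]) / cell_size).astype(int)
    rows = ((pts[1] - y_range[0]) / cell_size).astype(int)
    inside = (rows >= 0) & (rows < bev_labels.shape[0]) & (cols >= 0) & (cols < bev_labels.shape[1])
    bev_labels[rows[inside], cols[inside]] = label_map[vs[ok][inside], us[ok][inside]]
    return bev_labels
```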
The RGB image provides richer segmentation of the content above the ground and weaker segmentation of the ground itself, whereas the radar aerial view segments the ground accurately. While keeping hardware resource consumption low, the fused features of the segmented RGB image and the segmented radar aerial view are therefore complementary, and the passable area is determined efficiently and accurately.
As another embodiment, fusing the segmented RGB image with the segmented radar aerial view further includes:
fusing the segmented RGB image with the obstacles in the segmented radar aerial view.
In this embodiment, the segmented RGB image is fused only with the obstacles in the radar aerial view, which is intended to characterize the obstacles in the road more accurately. Because the RGB image segments the content above the ground in richer detail, the obstacle features in the radar aerial view are further enhanced.
The two fusion modes suit different practical scenarios. Fusing the segmented RGB image only with the passable area in the radar aerial view determines the passable area efficiently; for example, a user who wants an efficient yet accurate mode can use passable-area fusion. A user who wants to identify the characteristics of each obstacle in the road more precisely can use the precise mode that fuses only with the obstacles in the radar aerial view, choosing whichever fits their needs.
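The choice between the two modes can be exposed as a simple configuration switch; the enum, the function name and the category ids below are purely illustrative assumptions:

```python
from enum import Enum

class FusionMode(Enum):
    PASSABLE_AREA_ONLY = "passable_area"   # efficient mode: fuse only with the passable area
    OBSTACLE_ONLY = "obstacle"             # precise mode: fuse only with obstacles

def fusion_target_mask(bev_segmentation, mode, passable_id=1, obstacle_id=2):
    """Return the boolean BEV mask that the segmented RGB labels are fused with."""
    target_id = passable_id if mode is FusionMode.PASSABLE_AREA_ONLY else obstacle_id
    return bev_segmentation == target_id
```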
According to this embodiment, multi-sensor fusion semantic segmentation is performed based on RGB image semantic segmentation and radar aerial view semantic segmentation. The RGB image segmentation provides rich semantic features, and the radar aerial view generated from the laser point cloud perceives the two-dimensional planar panoramic information around the vehicle, so the fused semantic segmentation information has both rich semantic features and two-dimensional planar panoramic information around the vehicle, and the passable area in the road is determined more accurately.
Fig. 9 is a schematic structural diagram of a fusion execution device for sensor semantic information according to an embodiment of the present invention; the device can perform the fusion method of sensor semantic information according to any of the foregoing embodiments and can be configured in a terminal.
The fusion execution device 10 for sensor semantic information provided in this embodiment includes: an image generation module 11, an aerial view generation module 12, an image segmentation module 13, an aerial view segmentation module 14 and a fusion module 15.
The image generation module 11 is configured to generate an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a laser radar sensor; the aerial view generation module 12 is configured to select, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and to generate a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud; the image segmentation module 13 is configured to perform semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image; the aerial view segmentation module 14 is configured to perform semantic segmentation on the radar aerial view to obtain a segmented radar aerial view; and the fusion module 15 is configured to fuse the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
Further, the aerial view segmentation module is configured to:
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view with two semantic categories on the ground, the two categories comprising: passable area and obstacle.
Further, the fusion module is configured to:
and fusing the segmented RGB image with a passable area in the segmented radar aerial view.
Further, the fusion module is further configured to:
and fusing the segmented RGB image with the obstacle in the segmented radar aerial view.
Further, the fusion module is configured to:
and projecting the segmented RGB image to the segmented radar aerial view for fusion.
Further, the aerial view generation module is configured to:
counting the maximum height difference value of the 3D point cloud, the standard deviation value of the 3D point cloud, the maximum height value of the 3D point cloud and the average height of the 3D point cloud in each cell in the radar aerial view;
determining a pixel value of an R channel in the radar aerial view based on the maximum height difference value of the 3D point cloud and a preset pixel value transformation coefficient;
determining a pixel value of a G channel in the radar aerial view based on the standard deviation value of the 3D point cloud and the preset pixel value transformation coefficient;
determining a pixel value of a B channel in the radar aerial view based on the maximum height value of the 3D point cloud and the preset pixel value transformation coefficient;
and determining the pixel value of the HSV channel in the radar aerial view based on the average height value of the 3D point cloud and the preset pixel value transformation coefficient.
The embodiment of the invention also provides a non-volatile computer storage medium, which stores computer-executable instructions that can perform the fusion method of sensor semantic information in any of the above method embodiments.
as one embodiment, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
generating an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
selecting, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and generating a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
performing semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
and fusing the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
As a non-volatile computer readable storage medium, it may be used to store a non-volatile software program, a non-volatile computer executable program, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present invention. One or more program instructions are stored in a non-transitory computer readable storage medium that, when executed by a processor, perform the method of fusing sensor semantic information in any of the method embodiments described above.
The embodiment of the invention also provides an electronic device, comprising: at least one processor, and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the fusion method of sensor semantic information.
In some embodiments, the present disclosure further provides a mobile device, including a body and the electronic apparatus according to any one of the foregoing embodiments mounted on the body. The mobile device may be an unmanned vehicle, such as an unmanned sweeper, an unmanned ground washing vehicle, an unmanned logistics vehicle, an unmanned passenger vehicle, an unmanned sanitation vehicle, an unmanned trolley/bus, a truck, a mine car, etc., or may be a robot, etc.
In some embodiments, the present embodiments further provide a computer program product, which when run on a computer, causes the computer to perform the method of fusion of sensor semantic information according to any of the embodiments of the present invention.
Fig. 10 is a schematic hardware structure diagram of an electronic device of a method for fusing sensor semantic information according to another embodiment of the present application, where, as shown in fig. 10, the device includes:
one or more processors 1010, and a memory 1020, one processor 1010 being illustrated in fig. 10. The device of the fusion method of the sensor semantic information may further include: an input device 1030 and an output device 1040.
The processor 1010, memory 1020, input device 1030, and output device 1040 may be connected by a bus or other means, for example in fig. 10.
The memory 1020 is used as a non-volatile computer readable storage medium, and may be used to store a non-volatile software program, a non-volatile computer executable program, and a module, such as a program instruction/module corresponding to the fusion method of sensor semantic information in the embodiment of the present application. The processor 1010 executes various functional applications of the server and data processing by running nonvolatile software programs, instructions and modules stored in the memory 1020, i.e., implements the fusion method of sensor semantic information of the above-described method embodiment.
Memory 1020 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data, etc. In addition, memory 1020 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 1020 may optionally include memory located remotely from processor 1010, which may be connected to the mobile device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 1030 may receive input numeric or character information. The output 1040 may include a display device such as a display screen.
The one or more modules are stored in the memory 1020 that, when executed by the one or more processors 1010, perform the fusion method of sensor semantic information in any of the method embodiments described above.
The product can execute the method provided by the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present application.
The non-transitory computer readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, etc. Further, the non-volatile computer-readable storage medium may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium may optionally include memory remotely located relative to the processor, which may be connected to the apparatus via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiment of the invention also provides an electronic device, comprising: at least one processor, and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the fusion method of sensor semantic information of any one of the embodiments of the present invention.
The electronic device of the embodiments of the present application exist in a variety of forms including, but not limited to:
(1) Mobile communication devices, which are characterized by mobile communication functionality and are aimed at providing voice, data communication. Such terminals include smart phones, multimedia phones, functional phones, low-end phones, and the like.
(2) Ultra mobile personal computer equipment, which belongs to the category of personal computers, has the functions of calculation and processing and generally has the characteristic of mobile internet surfing. Such terminals include PDA, MID, and UMPC devices, etc., such as tablet computers.
(3) Portable entertainment devices such devices can display and play multimedia content. The device comprises an audio player, a video player, a palm game machine, an electronic book, an intelligent toy and a portable vehicle navigation device.
(4) Other electronic devices with data processing functions.
In this document, relational terms such as first and second, and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A fusion method of sensor semantic information, applied to a mobile device, comprising the following steps:
generating an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
selecting, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and generating a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
performing semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
and fusing the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
2. The method of claim 1, wherein semantically segmenting the radar aerial view to obtain a segmented radar aerial view comprises:
performing semantic segmentation on the radar aerial view to obtain a segmented radar aerial view with two semantic categories on the ground, wherein the two semantic categories comprise: a passable area and obstacles.
3. The method of claim 2, wherein the fusing the segmented RGB image with the segmented radar aerial view comprises:
fusing the segmented RGB image with the passable area in the segmented radar aerial view.
4. The method of claim 2, wherein the fusing the segmented RGB image with the segmented radar aerial view further comprises:
fusing the segmented RGB image with the obstacles in the segmented radar aerial view.
5. The method of claim 1, wherein the fusing the segmented RGB image with the segmented radar aerial view comprises:
projecting the segmented RGB image onto the segmented radar aerial view for fusion.
6. The method of claim 1, wherein determining the pixel value of each channel in the radar aerial view from the height values of the 3D point cloud comprises:
counting the maximum height difference value of the 3D point cloud, the standard deviation value of the 3D point cloud, the maximum height value of the 3D point cloud and the average height of the 3D point cloud in each cell in the radar aerial view;
determining a pixel value of an R channel in the radar aerial view based on the maximum height difference value of the 3D point cloud and a preset pixel value transformation coefficient;
determining a pixel value of a G channel in the radar aerial view based on the standard deviation value of the 3D point cloud and the preset pixel value transformation coefficient;
determining a pixel value of a B channel in the radar aerial view based on the maximum height value of the 3D point cloud and the preset pixel value transformation coefficient;
and determining the pixel value of the HSV channel in the radar aerial view based on the average height value of the 3D point cloud and the preset pixel value transformation coefficient.
7. A fusion execution device of sensor semantic information, comprising:
an image generation module, configured to generate an RGB image from video acquired by a camera sensor, the video being time-synchronized with radar point cloud data acquired by a lidar sensor;
an aerial view generation module, configured to select, from the radar point cloud data, the 3D point cloud falling within a region of interest centered on the origin of the coordinate system of the mobile device, and to generate a radar aerial view of the ground, wherein the pixel value of each channel in the radar aerial view is determined by the height values of the 3D point cloud;
an image segmentation module, configured to perform semantic segmentation on the part of the RGB image other than the ground to obtain a segmented RGB image;
an aerial view segmentation module, configured to perform semantic segmentation on the radar aerial view to obtain a segmented radar aerial view;
and a fusion module, configured to fuse the segmented RGB image with the segmented radar aerial view to obtain semantic information for perceiving the two-dimensional planar panorama around the mobile device.
8. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-6.
9. A mobile device comprising a body and the electronic apparatus of claim 8 mounted on the body.
10. A storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the method according to any of claims 1-6.
CN202211167242.0A 2022-09-23 2022-09-23 Fusion method, device, mobile device and storage medium of sensor semantic information Pending CN117809145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211167242.0A CN117809145A (en) 2022-09-23 2022-09-23 Fusion method, device, mobile device and storage medium of sensor semantic information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211167242.0A CN117809145A (en) 2022-09-23 2022-09-23 Fusion method, device, mobile device and storage medium of sensor semantic information

Publications (1)

Publication Number Publication Date
CN117809145A true CN117809145A (en) 2024-04-02

Family

ID=90418654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211167242.0A Pending CN117809145A (en) 2022-09-23 2022-09-23 Fusion method, device, mobile device and storage medium of sensor semantic information

Country Status (1)

Country Link
CN (1) CN117809145A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination