WO2023162497A1 - Image-processing device, image-processing method, and image-processing program - Google Patents

Image-processing device, image-processing method, and image-processing program

Info

Publication number
WO2023162497A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
point cloud
cloud data
vehicle
image processing
Prior art date
Application number
PCT/JP2023/000707
Other languages
French (fr)
Japanese (ja)
Inventor
友城 門野
Original Assignee
Sony Semiconductor Solutions Corporation
Priority date
Filing date
Publication date
Application filed by Sony Semiconductor Solutions Corporation
Publication of WO2023162497A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C3/00 Measuring distances in line of sight; Optical rangefinders
    • G01C3/02 Details
    • G01C3/06 Use of electric means to obtain final indication
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 Lidar systems specially adapted for specific applications
    • G01S17/89 Lidar systems specially adapted for specific applications for mapping or imaging
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 Lidar systems specially adapted for specific applications
    • G01S17/93 Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931 Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/521 Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light

Definitions

  • The present disclosure relates to an image processing device, an image processing method, and an image processing program. More specifically, the present disclosure relates to image processing applied to an image captured by a camera provided in a moving object such as an automobile.
  • When point cloud data obtained by LiDAR is used to analyze information captured by other sensors (for example, two-dimensional image data captured by a camera), the installation positions of the sensors differ, so the point cloud data may not be used appropriately.
  • For example, the point cloud data may be used as erroneous depth information.
  • Therefore, the present disclosure proposes an image processing device, an image processing method, and an image processing program that can appropriately utilize point cloud data obtained by sensing.
  • According to the present disclosure, an image processing apparatus includes an acquisition unit that acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that surrounding objects have been detected, and an identifying unit that, when the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR, identifies, based on the irradiation information of the LiDAR, false transmission points in the superimposed point cloud data, that is, points at which the laser did not actually strike an object in the image.
  • FIG. 1 is a diagram (1) for explaining an overview of image processing according to the present disclosure.
  • FIG. 2 is a diagram (2) for explaining an overview of image processing according to the present disclosure.
  • FIG. 3 is a diagram (3) for explaining an overview of image processing according to the present disclosure.
  • FIG. 4 is a block diagram for explaining the flow of image processing according to the present disclosure.
  • FIG. 5 is a diagram illustrating a configuration example of the image processing apparatus according to the embodiment.
  • FIG. 6 is a flowchart showing the flow of processing according to the embodiment.
  • FIG. 7 is a block diagram showing a schematic functional configuration example of a vehicle control system to which the present technology can be applied.
  • FIG. 8 is a diagram illustrating an example of sensing areas of the vehicle control system to which the present technology can be applied.
  • FIG. 9 is a hardware configuration diagram showing an example of a computer that implements the functions of the image processing apparatus according to the present disclosure.
  • 1. Embodiment
  • 1-1. Outline of image processing according to the present disclosure
  • 1-2. Configuration example of the image processing apparatus according to the embodiment
  • 1-3. Image processing procedure according to the embodiment
  • 1-4.
  • FIG. 1 is a diagram (1) for explaining an overview of image processing according to the present disclosure.
  • Image processing according to the present disclosure is performed by an image processing device 100 mounted on the vehicle 1 shown in FIG. 1.
  • The image processing apparatus 100 executes the image processing according to the embodiment (acquisition of point cloud data using LiDAR, superimposition processing of the point cloud data on an image, and the like) by operating various functional units described later.
  • the vehicle 1 is, for example, a four-wheeled vehicle equipped with technology related to automatic driving. For example, using the object detection function performed by the image processing device 100, the vehicle 1 automatically parks in a predetermined parking space, controls behavior to avoid objects, and selects an appropriate route.
  • The situation outside the vehicle is detected by various sensors, and it is determined from the detected information whether or not an object is actually present.
  • Examples of such sensors include LiDAR, which uses a laser to measure the position of and distance to an object; image sensors mounted in cameras; and millimeter-wave radar (radar sensors), which uses the reflection of radio waves such as millimeter waves.
  • One object detection method is to input an image captured by a camera (image sensor) into a detector and determine, through image recognition, whether or not an object is present in the image.
  • Another object detection method is to detect an object by using point cloud data obtained by LiDAR (such as depth information derived from laser light reflected from an object) as input to a detector.
  • With the dramatic improvement in image recognition accuracy, image recognition and object detection using a camera have mainly been used.
  • In some cases, multiple sensors are used together; for example, a LiDAR is used together with a camera or the like.
  • Detection using multiple sensors has the advantage that an object can be detected more accurately than with a single sensor.
  • For example, point cloud data obtained from the LiDAR is superimposed on the image captured by the camera, and the depth information of each object included in the image is used to accurately measure the distance to each object.
  • Further, a highly accurate distance estimation model can be trained by learning the relationship between each object on the image and its distance, using the objects captured in the image and the depth information as ground-truth data.
  • To appropriately utilize such point cloud data, the image processing apparatus 100 executes the following processing. That is, the image processing apparatus 100 acquires, from the LiDAR, point cloud data indicating that surrounding objects have been detected, and, when the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR, identifies, based on the irradiation information of the LiDAR, false transmission points in the superimposed point cloud data, that is, points at which the laser did not actually strike an object in the image. Then, the image processing apparatus 100 deletes the false transmission points from the point cloud data and generates an image on which the point cloud data excluding the false transmission points is superimposed. As a result, the image processing apparatus 100 can generate image data on which only the point cloud data that actually hit objects is superimposed, so that learning processing and analysis processing using the images can be performed with high accuracy.
  • As shown in FIG. 1, the vehicle 1 has a LiDAR 150 on its roof.
  • the vehicle 1 also includes a camera 160 for capturing an image of the front.
  • the vehicle 1 may include more LiDARs 150 and cameras 160.
  • FIG. 1 shows a point 310, indicating that a laser emitted by the LiDAR 150 hit an arbitrary object located further ahead than the preceding vehicle 200, and a point 330, indicating that a laser hit the rear of the vehicle 200.
  • the camera 160 captures an imaging range including the vehicle 200 ahead of the vehicle 1 .
  • the camera 160 captures an image including the point 330 in the imaging range.
  • the image captured by the camera 160 includes the vehicle 200 in the imaging range and does not include the point 310 because the vehicle 200 blocks the target ahead.
  • However, the image includes a point 320 lying on the extension line toward the point 310. That is, although the laser emitted by the LiDAR 150 did not actually strike the vehicle 200 at the point 320, when the point cloud data is superimposed on the image, the point 320 appears as if the laser had struck the vehicle 200; it is a false transmission point.
  • the point 330 is a point indicating that the laser actually emitted from the LiDAR 150 hits the vehicle 200, and the depression angle (inclination), which is the irradiation information of the laser, is obtained from the line segment connecting the LiDAR 150 and the point 330.
  • On the other hand, the point 320, which is a false transmission point, is actually based on the laser beam emitted from the LiDAR 150 at the depression angle at which the point 310 was obtained. That is, although the point 320 appears on the vehicle 200, it was obtained from a laser irradiated at a depression angle shallower than that of the point 330.
  • FIG. 2 is a diagram (2) for explaining an outline of image processing according to the present disclosure.
  • An image 340 shown in FIG. 2 shows a state in which points 330 and 320, which are point cloud data obtained from the LiDAR 150, are superimposed on the captured image.
  • The point 320 is obtained from the laser with a shallow depression angle ("Line 1" shown in FIG. 2) emitted from the LiDAR 150, which actually indicates that an object further ahead than the vehicle 200 (the point 310) was hit; the point 320 is therefore the false transmission point.
  • a point 330 is obtained by a laser beam emitted from the LiDAR 150 and having a deeper depression angle than the laser beam corresponding to the point 310 (“Line 2” in FIG. 2).
  • Further, the image 342 in FIG. 2 is obtained by superimposing the point cloud data obtained by the LiDAR 150 on an image of the vehicle 200 captured by the image processing device 100.
  • The image 342 shows a state in which point cloud data 332, obtained based on substantially the same elevation/depression angle information as the point 330 (in the image, their vertical-axis values appear substantially the same), and point cloud data 322, obtained based on substantially the same elevation/depression angle information as the point 320, are superimposed on the rear portion of the vehicle 200.
  • The image 342 also includes point cloud data 334 and point cloud data 336 indicating that the vehicle 200 was actually hit by the laser. Also, although not shown in the image 342, other point cloud data indicating that the laser actually hit the vehicle 200 may be superimposed in the vicinity of the point cloud data 322, which is the false transmission point.
  • The point cloud data 322 is point cloud data that is superimposed on the image 342 and observed as if it hit the vehicle 200, even though the vehicle 200 was not actually hit by the laser. Therefore, if an attempt is made to use the depth information and the like included in the image 342 and the point cloud data 322 as learning data, the distance to the vehicle 200 and the depth information included in the point cloud data 322 will contradict each other, making the data less reliable.
  • FIG. 3 is a diagram (3) for explaining an outline of image processing according to the present disclosure.
  • When the LiDAR 150 irradiates a laser, it can acquire irradiation information including the "elevation" indicating the angle in the height direction at which the laser is emitted, that is, elevation/depression angle information 410, and the "azimuth" indicating the horizontal angle with respect to the vehicle 1, that is, azimuth angle information 420. The LiDAR 150 can also acquire identification information (an irradiation ID) for each irradiation as irradiation information. The image processing apparatus 100 acquires this irradiation information together with the point cloud data.
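  • As a concrete illustration of the irradiation information described above, the following sketch shows one possible way to represent a single LiDAR return together with its irradiation ID, elevation/depression angle, and azimuth angle. The field names are hypothetical and are used only for the examples that follow; they are not taken from the present disclosure.
```python
from dataclasses import dataclass

@dataclass
class LidarPoint:
    """One LiDAR return and the irradiation information attached to it (hypothetical fields)."""
    irradiation_id: int   # identification information assigned to each laser irradiation
    elevation_deg: float  # "elevation": angle in the height direction at which the laser was emitted
    azimuth_deg: float    # "azimuth": horizontal angle of the laser with respect to the vehicle
    range_m: float        # measured distance to the reflecting surface
```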
  • Based on the irradiation ID, the image processing apparatus 100 can specify, for each point of the point cloud data obtained from the LiDAR 150, the elevation/depression angle information 410 and the azimuth angle information 420 at the time the laser was irradiated.
  • From this information, the image processing apparatus 100 can specify the height (y-axis) and horizontal position (x-axis) of each point in the image 430 captured by the camera 160.
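  • The following is a minimal sketch of how a point described by elevation/depression and azimuth angles could be mapped to image coordinates under a simple pinhole camera model. For simplicity it assumes the camera and the LiDAR share the same origin and orientation; in practice the two sensors are installed at different positions, which is precisely what gives rise to the false transmission points discussed here. The function and parameter names are hypothetical, and the point is assumed to carry the LidarPoint fields introduced above.
```python
import numpy as np

def project_to_image(point, fx, fy, cx, cy):
    """Map a LiDAR return to pixel coordinates (u, v), with v increasing downward."""
    el = np.radians(point.elevation_deg)
    az = np.radians(point.azimuth_deg)
    # Spherical to Cartesian in a camera-like frame: x right, y down, z forward.
    z = point.range_m * np.cos(el) * np.cos(az)
    x = point.range_m * np.cos(el) * np.sin(az)
    y = -point.range_m * np.sin(el)   # a larger elevation places the point higher in the image
    u = fx * x / z + cx               # horizontal pixel coordinate, driven mainly by the azimuth
    v = fy * y / z + cy               # vertical pixel coordinate, driven mainly by the elevation
    return u, v
```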
  • Using this information, the image processing apparatus 100 identifies false transmission points. First, the image processing apparatus 100 selects one point to be processed from the point cloud data superimposed on the image 342; for example, it selects the point 330. Subsequently, it selects another point at substantially the same x-coordinate; here, it selects the point 320.
  • The image processing apparatus 100 then compares the irradiation information associated with the two irradiation IDs. Specifically, it compares the elevation/depression angle information of the two points. If there is a contradiction in the elevation/depression angles of points at the same x-coordinate, the image processing apparatus 100 identifies the contradictory point as a false transmission point. Specifically, the image processing apparatus 100 detects that the point 320, which was irradiated at a shallower depression angle and therefore should be superimposed at a higher position on the y-axis of the image 342 than the point 330, is in fact superimposed at a lower position than the point 330.
  • In other words, when the point 330 is drawn above the point 320 in the image 342 and the "elevation" value of the point 330 is smaller than the "elevation" value of the point 320, the image processing apparatus 100 identifies the point 320 as a false transmission point. Conversely, if the "elevation" value of the point 330 is greater than that of the point 320, the image processing apparatus 100 determines that neither the point 330 nor the point 320 is a false transmission point.
  • Thereafter, the next point to be processed (the point 338 in the example of FIG. 3) is compared with the irradiation information of the point 330, and false transmission points are identified in order.
  • The image processing apparatus 100 deletes each identified false transmission point and does not superimpose it on the image 342.
  • The image processing apparatus 100 can identify all the false transmission points included in the image 342 by performing this processing for all the point cloud data of the image 342.
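  • The comparison described above can be sketched as follows: among points that share approximately the same horizontal pixel coordinate, the point with the larger elevation value should appear higher in the image (a smaller v in standard image coordinates); if it appears lower, it is flagged. This is a simplified illustration of the rule, not the implementation of the present disclosure, and it reuses the hypothetical LidarPoint fields and projected pixel coordinates introduced earlier.
```python
def find_false_points(points, pixels, x_tol=2.0, el_tol=1e-3):
    """Return the irradiation IDs of points whose drawn height contradicts their elevation."""
    false_ids = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(pixels[i][0] - pixels[j][0]) > x_tol:
                continue  # not at (approximately) the same x-coordinate in the image
            if abs(points[i].elevation_deg - points[j].elevation_deg) < el_tol:
                continue  # essentially the same elevation; nothing to compare
            hi, lo = (i, j) if points[i].elevation_deg > points[j].elevation_deg else (j, i)
            # The higher-elevation point should be drawn higher, i.e. with a smaller v.
            if pixels[hi][1] > pixels[lo][1]:
                false_ids.add(points[hi].irradiation_id)
    return false_ids
```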
  • The image processing apparatus 100 may also identify false transmission points using not only the elevation/depression angle information but also the azimuth angle information. This is because, depending on the relationship between the installation positions of the LiDAR 150 and the camera 160, a false transmission point may arise not only in the situation described above, where a laser passing above an object is detected because of the elevation-based difference in installation positions, but also where a laser that wraps around the side of an object and strikes something beyond it is detected.
  • Here, the image processing apparatus 100 has identified the points 320, 324, and 326 as false transmission points.
  • The image processing apparatus 100 removes the points 320, 324, and 326 and generates an image by superimposing the remaining point cloud data on the image 342.
  • the image processing apparatus 100 can obtain an image in which only point cloud data having accurate depth information corresponding to the object on the image are superimposed.
  • FIG. 4 is a block diagram for explaining the flow of image processing according to the present disclosure.
  • FIG. 4 is a schematic block diagram showing an example of an automated driving procedure of the vehicle 1, including the image processing executed by the image processing device 100.
  • the image processing device 100 acquires a camera image 550 and LiDAR data 552 (point cloud data). Then, the image processing apparatus 100 generates a LiDAR data superimposed image 554 in which the LiDAR data 552 is superimposed on the camera image 550 by the image processing described above.
  • Subsequently, the image processing device 100 executes processing such as 3D semantic segmentation (3D Semantic Segmentation) 556 for detecting or recognizing surrounding objects, roads, and the like, based on the camera image 550 and the LiDAR data superimposed image 554.
  • the detection technique is not limited to 3D semantic segmentation, and the image processing apparatus 100 may use other known techniques.
  • the image processing device 100 uses the acquired peripheral information to perform predetermined automatic driving processing (task execution 558) such as parking processing in a parking space and driving to a destination.
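  • The flow of FIG. 4 can be summarized by the following structural sketch, in which each block is passed in as a callable; the function names are placeholders, not APIs defined by the present disclosure.
```python
from typing import Any, Callable

def run_pipeline(
    capture_image: Callable[[], Any],                 # camera image 550
    scan_lidar: Callable[[], Any],                    # LiDAR data 552 (point cloud data)
    remove_false_points: Callable[[Any, Any], Any],   # false transmission point removal
    superimpose: Callable[[Any, Any], Any],           # LiDAR data superimposed image 554
    segment_3d: Callable[[Any, Any], Any],            # 3D semantic segmentation 556
    execute_task: Callable[[Any], None],              # task execution 558 (parking, driving, ...)
) -> None:
    image = capture_image()
    points = scan_lidar()
    cleaned = remove_false_points(points, image)
    superimposed = superimpose(cleaned, image)
    scene = segment_3d(image, superimposed)
    execute_task(scene)
```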
  • FIG. 5 is a diagram showing a configuration example of the image processing device 100 according to the embodiment of the present disclosure.
  • The image processing apparatus 100 has a communication unit 110, a storage unit 120, a control unit 130, and a detection unit 140.
  • the configuration shown in FIG. 5 is a functional configuration, and the hardware configuration may differ from this.
  • the functions of the image processing apparatus 100 may be distributed and implemented in a plurality of physically separated apparatuses.
  • the communication unit 110 is implemented by, for example, a network interface controller or NIC (Network Interface Card).
  • the communication unit 110 may be a USB interface configured by a USB (Universal Serial Bus) host controller, a USB port, or the like.
  • the communication unit 110 may be a wired interface or a wireless interface.
  • the communication unit 110 may be a wireless communication interface of a wireless LAN system or a cellular communication system.
  • the communication unit 110 functions as communication means or transmission means of the image processing apparatus 100 .
  • the communication unit 110 is connected to the network N by wire or wirelessly, and transmits/receives information to/from another information processing terminal such as an external device such as a cloud server via the network N.
  • The network N is realized by, for example, Bluetooth (registered trademark), the Internet, Wi-Fi (registered trademark), UWB (Ultra Wide Band), LPWA (Low Power Wide Area), ELTRES (registered trademark), or another wireless communication standard or method.
  • the storage unit 120 is implemented by, for example, a semiconductor memory device such as RAM (Random Access Memory) or flash memory, or a storage device such as a hard disk or optical disk.
  • the storage unit 120 stores various data.
  • the storage unit 120 stores irradiation information when the LiDAR 150 emits laser light, image data captured by the camera 160, and the like.
  • the storage unit 120 may store a learning device (object detection model) trained for object detection, image data used for learning, and the like.
  • The storage unit 120 may also store map data or the like for executing automated driving. Note that although the present disclosure shows an example in which the storage unit 120 is installed in the image processing device 100 (that is, in the vehicle 1), the data stored in the storage unit 120 may instead be stored on an external device such as a cloud server.
  • The detection unit 140 detects various types of information regarding the vehicle 1 and the image processing device 100. Specifically, the detection unit 140 detects the environment around the vehicle 1, the location information of the vehicle 1, information related to other devices connected to the image processing device 100 mounted on the vehicle 1, and the like. The detection unit 140 may be read as a sensor that detects various types of information.
  • the detection unit 140 has a LiDAR 150 and a camera 160 as sensors.
  • The LiDAR 150 is a sensor that reads the three-dimensional structure of the surrounding environment of the vehicle 1. Specifically, the LiDAR 150 irradiates a surrounding object with a laser beam such as an infrared laser and measures the time it takes for the laser beam to be reflected and return, thereby detecting the distance to the object and the relative speed.
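  • The time-of-flight principle mentioned above reduces to a simple relation: the distance is half the round-trip time multiplied by the speed of light. The following worked example is illustrative only.
```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def tof_distance_m(round_trip_time_s: float) -> float:
    """Distance to the reflecting object from the laser round-trip time."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0

# A 200 ns round trip corresponds to roughly 30 m.
assert abs(tof_distance_m(200e-9) - 29.98) < 0.01
```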
  • the camera 160 is a sensor that has a function of imaging the surroundings of the vehicle 1.
  • the camera 160 may take any form, such as a stereo camera, a monocular camera, or a lensless camera. Also, the camera 160 is not limited to a visible light camera such as an RGB camera, and may be a camera with a depth sensor including a ToF (Time of Flight) sensor.
  • the camera 160 may also include an AI-equipped image sensor capable of object detection and recognition processing.
  • the detection unit 140 may have various sensors other than the LiDAR 150 and the camera 160.
  • the detection unit 140 may include a ranging system using millimeter wave radar.
  • the detection unit 140 may include a depth sensor for acquiring depth data.
  • The detection unit 140 may include a sonar that searches the surrounding environment with sound waves.
  • The detection unit 140 may also include a microphone that collects sounds around the vehicle 1, an illuminance sensor that detects the illuminance around the vehicle 1, a humidity sensor that detects the humidity around the vehicle 1, and a geomagnetic sensor or the like that detects the magnetic field at the location of the vehicle 1.
  • the image processing apparatus 100 may include a display unit that displays various information.
  • the display unit is a mechanism for outputting various information, such as a liquid crystal display.
  • the display unit may display an image captured by the detection unit 140 or an object detected by the image processing device 100 in the image.
  • the display unit may also serve as a processing unit for receiving various operations from a user or the like who uses the image processing apparatus 100 .
  • the display unit may receive input of various information via key operations, a touch panel, or the like.
  • The control unit 130 is implemented, for example, by a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing a program stored inside the image processing apparatus 100 (for example, an image processing program according to the present disclosure), using a RAM (Random Access Memory) or the like as a work area.
  • the control unit 130 is a controller, and may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
  • control unit 130 includes an acquisition unit 131, an imaging unit 132, an identification unit 133, and a generation unit 134, and implements or executes the information processing functions and actions described below.
  • the internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 5, and may be another configuration as long as it performs information processing described later.
  • the acquisition unit 131 acquires various types of information. For example, the acquisition unit 131 acquires point cloud data indicating detection of surrounding objects from the LiDAR 150, which is a sensor using a laser.
  • the acquisition unit 131 acquires the point cloud data as well as the laser irradiation information when the point cloud data was obtained. For example, the acquisition unit 131 acquires elevation/depression angle information and azimuth angle information of laser irradiation as the irradiation information. In other words, the acquisition unit 131 acquires, as the irradiation information, a numerical value (elevation) indicating the height direction and a numerical value (azimuth) indicating the horizontal direction when the laser is irradiated.
  • the imaging unit 132 captures a two-dimensional image including the laser irradiation range of the LiDAR 150 in the imaging range. Specifically, the imaging unit 132 controls the camera 160 to capture an image of the surroundings of the vehicle 1 and captures an image including the laser irradiation range of the LiDAR 150 in the imaging range.
  • When the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR 150, the specifying unit 133 identifies, based on the irradiation information of the LiDAR 150, false transmission points in the superimposed point cloud data, that is, points at which the laser did not actually strike an object in the image. Specifically, the specifying unit 133 specifies the false transmission points based on the irradiation information of the LiDAR 150 when the point cloud data is superimposed on the image captured by the imaging unit 132.
  • the specifying unit 133 specifies the false transmission point based on the elevation/depression angle information and the azimuth angle information in the laser irradiation as the irradiation information.
  • For example, the specifying unit 133 first specifies two points of the point cloud data having substantially the same azimuth angle information. Then, the specifying unit 133 identifies false transmission points by comparing the elevation/depression angle information corresponding to the two specified points with the values of their vertical axis coordinates when the two points are projected onto the image.
  • More specifically, the specifying unit 133 extracts, from the point cloud data superimposed on the image, two points having substantially the same azimuth angle information (in other words, substantially the same horizontal axis coordinate (x-axis coordinate) on the image). Furthermore, of the two extracted points, the specifying unit 133 identifies as a false transmission point the point that should be drawn at a higher vertical axis coordinate (y-axis coordinate), that is, the point having the larger elevation value, if it is actually drawn at a position lower than the other point. Note that when the specifying unit 133 determines that the two points are drawn on the image without contradiction, it determines that neither point is a false transmission point and proceeds to compare the next two points.
  • the specifying unit 133 may specify a false transmission point not only based on elevation/depression angle information, that is, contradiction with respect to the height direction, but also based on contradiction with respect to the horizontal direction.
  • Such a contradiction in the horizontal direction may occur, for example, when the laser irradiation extends beyond the horizontal imaging range of the camera 160, and a laser that passes around an object in the foreground of the captured image strikes an object in the background that is not included in the image.
  • In this case, the specifying unit 133 specifies two points of the point cloud data having substantially the same elevation/depression angle information. Then, the specifying unit 133 identifies false transmission points by comparing the azimuth angle information corresponding to the two points with the values of their horizontal axis coordinates when the two points are projected onto the two-dimensional image.
  • More specifically, the specifying unit 133 extracts, from the point cloud data superimposed on the image, two points having substantially the same elevation/depression angle information (in other words, substantially the same vertical axis coordinate (y-axis coordinate) on the image). Furthermore, of the two extracted points, the specifying unit 133 identifies as a false transmission point the point that should be drawn at a larger (or smaller) horizontal axis coordinate (x-axis coordinate), that is, the point having the larger (or smaller) azimuth value, if it is actually drawn at an inconsistent position to the left or right of the other point. Note that when the specifying unit 133 determines that the two points are drawn on the image without contradiction, it determines that neither point is a false transmission point and proceeds to compare the next two points.
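  • A sketch of this horizontal-direction check, mirroring the vertical-direction sketch shown earlier, is given below. Which of the two contradictory points is flagged follows the same interpretation as in the vertical case (the point whose drawn position contradicts its irradiation angle); this choice, like the field names, is an assumption made for illustration.
```python
def find_false_points_horizontal(points, pixels, y_tol=2.0, az_tol=1e-3):
    """Return indices of points whose drawn horizontal position contradicts their azimuth."""
    false_idx = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(pixels[i][1] - pixels[j][1]) > y_tol:
                continue  # not at (approximately) the same y-coordinate in the image
            if abs(points[i].azimuth_deg - points[j].azimuth_deg) < az_tol:
                continue  # essentially the same azimuth; nothing to compare
            right, left = (i, j) if points[i].azimuth_deg > points[j].azimuth_deg else (j, i)
            # With azimuth increasing to the right, the larger-azimuth point should
            # be drawn further to the right (a larger u).
            if pixels[right][0] < pixels[left][0]:
                false_idx.add(right)
    return false_idx
```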
  • the generation unit 134 removes the false transmission points identified by the identification unit 133 and generates an image in which the point cloud data excluding the removed false transmission points is superimposed.
  • For example, the generation unit 134 removes these four points from the original image and generates an image on which the remaining point cloud data is superimposed. As a result, the generation unit 134 can generate an image on which only the point cloud data that accurately struck the objects in the image is superimposed, and can therefore provide an image that does not interfere with subsequent object detection processing and learning processing.
  • FIG. 6 is a flowchart showing the flow of processing according to the embodiment.
  • the image processing device 100 acquires point cloud data from the LiDAR 150 (step S31). Also, the image processing device 100 uses the camera 160 to capture an image of a range including the irradiation range of the LiDAR 150 (step S32).
  • the image processing device 100 superimposes the acquired point cloud data on the captured image (step S33). Then, the image processing apparatus 100 extracts two points to be processed from the superimposed plurality of point cloud data (step S34).
  • the image processing apparatus 100 executes the above-described specific processing for the relationship between the extracted two points, and determines whether there is a point that contradicts the irradiation information when the LiDAR 150 irradiates the laser (step S35). If contradictory points exist (step S35; Yes), the image processing apparatus 100 deletes the contradictory points (step S36).
  • If no contradictory point exists (step S35; No), or after deleting the contradictory points, the image processing apparatus 100 determines whether or not all of the point cloud data has been processed at that time (step S37). If point cloud data to be processed remains (step S37; No), the image processing apparatus 100 repeats the process of extracting the next two points and identifying false transmission points.
  • When all of the point cloud data has been processed (step S37; Yes), the image processing apparatus 100 generates an image on which the point cloud data remaining after deletion of the false transmission points is superimposed (step S38).
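  • Putting the steps of FIG. 6 together, a per-frame sketch could look as follows. It reuses the hypothetical helpers sketched earlier (project_to_image and find_false_points) and uses OpenCV only as one convenient way to draw the surviving points; none of this is prescribed by the present disclosure.
```python
import cv2  # assumed available purely for drawing the superimposed points

def process_frame(points, image, fx, fy, cx, cy):
    """Acquire -> superimpose -> prune false transmission points -> redraw (S31-S38)."""
    pixels = [project_to_image(p, fx, fy, cx, cy) for p in points]   # S33: superimpose coordinates
    false_ids = find_false_points(points, pixels)                    # S34-S37: pairwise checks
    out = image.copy()
    for p, (u, v) in zip(points, pixels):
        if p.irradiation_id in false_ids:                            # S36: drop contradictory points
            continue
        cv2.circle(out, (int(round(u)), int(round(v))), 2, (0, 255, 0), -1)
    return out                                                       # S38: superimposed image
```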
  • a mobile object that executes image processing may be a small vehicle such as a motorcycle or a tricycle, a large vehicle such as a bus or truck, or an autonomous mobile object such as a robot or drone.
  • The image processing apparatus 100 is not necessarily integrated with a mobile object such as the vehicle 1, and may be a cloud server or the like that acquires information from the mobile object via a network and performs image processing based on the acquired information.
  • the image processing device 100 may be realized by an autonomous mobile body (automobile) that automatically drives.
  • The vehicle 1 and the image processing device 100 may have the configurations shown in FIGS. 7 and 8 in addition to the configuration shown in FIG. 5.
  • each part shown below may be included in each part shown in FIG. 5, for example.
  • FIG. 7 is a block diagram showing a schematic functional configuration example of the vehicle control system 11 to which the present technology can be applied.
  • the vehicle control system 11 is provided in the vehicle 1 and performs processing related to driving support and automatic driving of the vehicle 1.
  • The vehicle control system 11 includes a vehicle control ECU (Electronic Control Unit) 21, a communication unit 22, a map information storage unit 23, a GNSS (Global Navigation Satellite System) receiving unit 24, an external recognition sensor 25, an in-vehicle sensor 26, a vehicle sensor 27, a recording unit 28, a driving support/automatic driving control unit 29, a DMS (Driver Monitoring System) 30, an HMI (Human Machine Interface) 31, and a vehicle control unit 32.
  • The vehicle control ECU 21, the communication unit 22, the map information storage unit 23, the GNSS reception unit 24, the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, the recording unit 28, the driving support/automatic driving control unit 29, the DMS 30, the HMI 31, and the vehicle control unit 32 are communicably connected to each other via a communication network 41.
  • The communication network 41 is composed of an in-vehicle communication network, a bus, or the like conforming to a digital two-way communication standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), FlexRay (registered trademark), or Ethernet (registered trademark).
  • the communication network 41 may be selectively used depending on the type of data to be communicated.
  • For example, CAN is applied to data related to vehicle control, and Ethernet is applied to large-capacity data.
  • Each part of the vehicle control system 11 may also be connected directly, without going through the communication network 41, using wireless communication intended for relatively short-range communication, such as NFC (Near Field Communication) or Bluetooth (registered trademark).
  • the vehicle control ECU 21 is composed of various processors such as a CPU (Central Processing Unit) and an MPU (Micro Processing Unit).
  • the vehicle control ECU 21 controls all or part of the functions of the vehicle control system 11 .
  • the communication unit 22 communicates with various devices inside and outside the vehicle, other vehicles, servers, base stations, etc., and transmits and receives various data. At this time, the communication unit 22 can perform communication using a plurality of communication methods.
  • the communication with the outside of the vehicle that can be performed by the communication unit 22 will be described schematically.
  • For example, the communication unit 22 communicates with a server located on an external network (hereinafter referred to as an external server) via a base station or an access point, using a wireless communication method such as 5G (fifth-generation mobile communication system), LTE (Long Term Evolution), or DSRC (Dedicated Short Range Communications).
  • the external network with which the communication unit 22 communicates is, for example, the Internet, a cloud network, or a provider's own network.
  • the communication method for communicating with the external network by the communication unit 22 is not particularly limited as long as it is a wireless communication method capable of digital two-way communication at a predetermined communication speed or higher and at a predetermined distance or longer.
  • the communication unit 22 can communicate with a terminal existing in the vicinity of the own vehicle using P2P (Peer To Peer) technology.
  • Terminals in the vicinity of the own vehicle include, for example, terminals worn by pedestrians, bicycles, and other moving bodies that move at relatively low speeds, terminals installed at fixed locations such as stores, and MTC (Machine Type Communication) terminals.
  • the communication unit 22 can also perform V2X communication.
  • V2X communication refers to communication between the own vehicle and others, such as vehicle-to-vehicle communication with other vehicles, vehicle-to-infrastructure communication with roadside equipment and the like, vehicle-to-home communication, and vehicle-to-pedestrian communication with a terminal or the like carried by a pedestrian.
  • the communication unit 22 can receive from the outside a program for updating the software that controls the operation of the vehicle control system 11 (Over The Air).
  • the communication unit 22 can also receive map information, traffic information, information around the vehicle 1, and the like from the outside.
  • the communication unit 22 can transmit information about the vehicle 1, information about the surroundings of the vehicle 1, and the like to the outside.
  • the information about the vehicle 1 that the communication unit 22 transmits to the outside includes, for example, data indicating the state of the vehicle 1, recognition results by the recognition unit 73, and the like.
  • the communication unit 22 performs communication corresponding to a vehicle emergency call system such as e-call.
  • the communication with the inside of the vehicle that can be performed by the communication unit 22 will be described schematically.
  • the communication unit 22 can communicate with each device in the vehicle using, for example, wireless communication.
  • For example, the communication unit 22 can perform wireless communication with devices in the vehicle using a communication method that enables digital two-way communication at a communication speed equal to or higher than a predetermined value, such as wireless LAN, Bluetooth, NFC, or WUSB (Wireless USB).
  • the communication unit 22 can also communicate with each device in the vehicle using wired communication.
  • the communication unit 22 can communicate with each device in the vehicle by wired communication via a cable connected to a connection terminal (not shown).
  • For example, the communication unit 22 can communicate with each device in the vehicle using a wired communication method that enables digital two-way communication at a predetermined communication speed or higher, such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface) (registered trademark), or MHL (Mobile High-definition Link).
  • equipment in the vehicle refers to equipment that is not connected to the communication network 41 in the vehicle, for example.
  • in-vehicle devices include mobile devices and wearable devices possessed by passengers such as drivers, information devices that are brought into the vehicle and temporarily installed, and the like.
  • the communication unit 22 receives electromagnetic waves transmitted by a vehicle information and communication system (VICS (registered trademark)) such as radio beacons, optical beacons, and FM multiplex broadcasting.
  • The map information accumulation unit 23 accumulates one or both of a map obtained from the outside and a map created by the vehicle 1. For example, the map information accumulation unit 23 accumulates a three-dimensional high-precision map and a global map that covers a wide area but is lower in accuracy than the high-precision map.
  • High-precision maps are, for example, dynamic maps, point cloud maps, and vector maps.
  • the dynamic map is, for example, a map consisting of four layers of dynamic information, quasi-dynamic information, quasi-static information, and static information, and is provided to the vehicle 1 from an external server or the like.
  • a point cloud map is a map composed of a point cloud (point cloud data).
  • the vector map refers to a map adapted to ADAS (Advanced Driver Assistance System) in which traffic information such as lane and signal positions are associated with a point cloud map.
  • The point cloud map and the vector map may be provided from an external server or the like, or may be created by the vehicle 1 based on the sensing results of the radar 52, the LiDAR 53, and the like as maps for matching against a local map described later, and stored in the map information accumulation unit 23. Further, when a high-precision map is provided from an external server or the like, map data of, for example, several hundred meters square concerning the planned route that the vehicle 1 will travel is acquired from the external server or the like in order to reduce the communication volume.
  • the GNSS receiver 24 receives GNSS signals from GNSS satellites and acquires position information of the vehicle 1 .
  • the received GNSS signal is supplied to the driving support/automatic driving control unit 29 .
  • the GNSS receiver 24 is not limited to the method using the GNSS signal, and may acquire the position information using, for example, a beacon.
  • the external recognition sensor 25 includes various sensors used for recognizing situations outside the vehicle 1 and supplies sensor data from each sensor to each part of the vehicle control system 11 .
  • the type and number of sensors included in the external recognition sensor 25 are arbitrary.
  • the external recognition sensor 25 includes a camera 51 , a radar 52 , a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) 53 , and an ultrasonic sensor 54 .
  • However, the configuration is not limited to this, and the external recognition sensor 25 may be configured to include one or more types of sensors among the camera 51, the radar 52, the LiDAR 53, and the ultrasonic sensor 54.
  • the numbers of cameras 51 , radars 52 , LiDARs 53 , and ultrasonic sensors 54 are not particularly limited as long as they are realistically installable in the vehicle 1 .
  • the type of sensor provided in the external recognition sensor 25 is not limited to this example, and the external recognition sensor 25 may be provided with other types of sensors. An example of the sensing area of each sensor included in the external recognition sensor 25 will be described later.
  • the shooting method of the camera 51 is not particularly limited as long as it is a shooting method that enables distance measurement.
  • the camera 51 may be a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, an infrared camera, or any other type of camera as required.
  • the camera 51 is not limited to this, and may simply acquire a photographed image regardless of distance measurement.
  • the external recognition sensor 25 can include an environment sensor for detecting the environment with respect to the vehicle 1.
  • the environment sensor is a sensor for detecting the environment such as weather, weather, brightness, etc., and can include various sensors such as raindrop sensors, fog sensors, sunshine sensors, snow sensors, and illuminance sensors.
  • the external recognition sensor 25 includes a microphone used for detecting the sound around the vehicle 1 and the position of the sound source.
  • the in-vehicle sensor 26 includes various sensors for detecting information inside the vehicle, and supplies sensor data from each sensor to each part of the vehicle control system 11 .
  • the types and number of various sensors included in the in-vehicle sensor 26 are not particularly limited as long as they are realistically installable in the vehicle 1 .
  • the in-vehicle sensor 26 can include one or more sensors among cameras, radars, seating sensors, steering wheel sensors, microphones, and biosensors.
  • the camera provided in the in-vehicle sensor 26 for example, cameras of various shooting methods capable of distance measurement, such as a ToF camera, a stereo camera, a monocular camera, and an infrared camera, can be used.
  • the camera included in the in-vehicle sensor 26 is not limited to this, and may simply acquire a photographed image regardless of distance measurement.
  • the biosensors included in the in-vehicle sensor 26 are provided, for example, in seats, steering wheels, etc., and detect various biometric information of passengers such as the driver.
  • the vehicle sensor 27 includes various sensors for detecting the state of the vehicle 1, and supplies sensor data from each sensor to each section of the vehicle control system 11.
  • the types and number of various sensors included in the vehicle sensor 27 are not particularly limited as long as they can be installed in the vehicle 1 realistically.
  • the vehicle sensor 27 includes a speed sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU (Inertial Measurement Unit)) integrating them.
  • the vehicle sensor 27 includes a steering angle sensor that detects the steering angle of the steering wheel, a yaw rate sensor, an accelerator sensor that detects the amount of operation of the accelerator pedal, and a brake sensor that detects the amount of operation of the brake pedal.
  • For example, the vehicle sensor 27 includes a rotation sensor that detects the rotation speed of the engine or motor, an air pressure sensor that detects tire air pressure, a slip rate sensor that detects the tire slip rate, and a wheel speed sensor that detects the rotational speed of the wheels.
  • the vehicle sensor 27 includes a battery sensor that detects the remaining battery level and temperature, and an impact sensor that detects external impact.
  • the recording unit 28 includes at least one of a nonvolatile storage medium and a volatile storage medium, and stores data and programs.
  • The recording unit 28 uses, for example, an EEPROM (Electrically Erasable Programmable Read Only Memory) and a RAM (Random Access Memory), and as a storage medium, a magnetic storage device such as an HDD (Hard Disc Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device can be applied.
  • the recording unit 28 records various programs and data used by each unit of the vehicle control system 11 .
  • The recording unit 28 includes an EDR (Event Data Recorder) and a DSSAD (Data Storage System for Automated Driving), and records information on the vehicle 1 before and after an event such as an accident, as well as biometric information acquired by the in-vehicle sensor 26.
  • the driving support/automatic driving control unit 29 controls driving support and automatic driving of the vehicle 1 .
  • the driving support/automatic driving control unit 29 includes an analysis unit 61 , an action planning unit 62 and an operation control unit 63 .
  • the analysis unit 61 analyzes the vehicle 1 and its surroundings.
  • the analysis unit 61 includes a self-position estimation unit 71 , a sensor fusion unit 72 and a recognition unit 73 .
  • the self-position estimation unit 71 estimates the self-position of the vehicle 1 based on the sensor data from the external recognition sensor 25 and the high-precision map accumulated in the map information accumulation unit 23. For example, the self-position estimation unit 71 generates a local map based on sensor data from the external recognition sensor 25, and estimates the self-position of the vehicle 1 by matching the local map and the high-precision map.
  • The position of the vehicle 1 is based on, for example, the center of the rear wheel axle.
  • a local map is, for example, a three-dimensional high-precision map created using techniques such as SLAM (Simultaneous Localization and Mapping), an occupancy grid map, or the like.
  • the three-dimensional high-precision map is, for example, the point cloud map described above.
  • the occupancy grid map is a map that divides the three-dimensional or two-dimensional space around the vehicle 1 into grids (lattice) of a predetermined size and shows the occupancy state of objects in grid units.
  • the occupancy state of an object is indicated, for example, by the presence or absence of the object and the existence probability.
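  • As a rough illustration of the occupancy grid map described above (an assumption for illustration, not the system's actual data structure), each cell around the vehicle can hold an occupancy probability, with 0.5 representing an unknown state.
```python
import numpy as np

class OccupancyGrid:
    """2D grid around the vehicle; each cell stores an occupancy probability."""

    def __init__(self, size_m: float = 100.0, cell_m: float = 0.5):
        n = int(size_m / cell_m)
        self.cell_m = cell_m
        self.origin_m = size_m / 2.0      # vehicle at the center of the grid
        self.prob = np.full((n, n), 0.5)  # 0.5 = unknown

    def update(self, x_m: float, y_m: float, occupied: bool, p: float = 0.9) -> None:
        i = int(np.clip((x_m + self.origin_m) / self.cell_m, 0, self.prob.shape[0] - 1))
        j = int(np.clip((y_m + self.origin_m) / self.cell_m, 0, self.prob.shape[1] - 1))
        self.prob[i, j] = p if occupied else 1.0 - p
```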
  • the local map is also used, for example, by the recognizing unit 73 for detection processing and recognition processing of the situation outside the vehicle 1 .
  • the self-position estimation unit 71 may estimate the self-position of the vehicle 1 based on the GNSS signal and sensor data from the vehicle sensor 27.
  • the sensor fusion unit 72 combines a plurality of different types of sensor data (for example, image data supplied from the camera 51 and sensor data supplied from the radar 52) to perform sensor fusion processing to obtain new information.
  • Methods for combining different types of sensor data include integration, fusion, federation, and the like.
  • the recognition unit 73 executes a detection process for detecting the situation outside the vehicle 1 and a recognition process for recognizing the situation outside the vehicle 1 .
  • The recognition unit 73 performs detection processing and recognition processing of the situation outside the vehicle 1 based on information from the external recognition sensor 25, information from the self-position estimation unit 71, information from the sensor fusion unit 72, and the like.
  • the recognition unit 73 performs detection processing and recognition processing of objects around the vehicle 1 .
  • Object detection processing is, for example, processing for detecting the presence or absence, size, shape, position, movement, and the like of an object.
  • Object recognition processing is, for example, processing for recognizing an attribute such as the type of an object or identifying a specific object.
  • detection processing and recognition processing are not always clearly separated, and may overlap.
  • For example, the recognition unit 73 detects objects around the vehicle 1 by performing clustering, which classifies a point cloud based on sensor data from the LiDAR 53, the radar 52, or the like into clusters of points. As a result, the presence or absence, size, shape, and position of objects around the vehicle 1 are detected.
  • The recognition unit 73 detects the movement of objects around the vehicle 1 by performing tracking, which follows the movement of the clusters of points classified by clustering. As a result, the speed and traveling direction (movement vector) of objects around the vehicle 1 are detected.
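  • As one common, hypothetical way to realize the clustering mentioned above (not necessarily the method used by the recognition unit 73), points whose mutual distances chain together within a radius can be grouped into a single cluster, as in the following sketch.
```python
import numpy as np

def euclidean_cluster(xyz: np.ndarray, radius: float = 0.5) -> list:
    """Group rows of an (N, 3) point array into clusters of mutually chained nearby points."""
    unvisited = list(range(len(xyz)))
    clusters = []
    while unvisited:
        frontier = [unvisited.pop()]
        cluster = list(frontier)
        while frontier:
            idx = frontier.pop()
            remaining = []
            for p in unvisited:
                if np.linalg.norm(xyz[p] - xyz[idx]) <= radius:
                    frontier.append(p)   # expand the cluster through this neighbor
                    cluster.append(p)
                else:
                    remaining.append(p)
            unvisited = remaining
        clusters.append(cluster)
    return clusters
```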
  • the recognition unit 73 detects or recognizes vehicles, people, bicycles, obstacles, structures, roads, traffic lights, traffic signs, road markings, etc. from the image data supplied from the camera 51 . Also, the types of objects around the vehicle 1 may be recognized by performing recognition processing such as semantic segmentation.
  • For example, the recognition unit 73 can perform recognition processing of traffic rules around the vehicle 1 based on the map accumulated in the map information accumulation unit 23, the self-position estimation result from the self-position estimation unit 71, and the recognition result of objects around the vehicle 1 by the recognition unit 73. Through this processing, the recognition unit 73 can recognize the position and state of traffic signals, the content of traffic signs and road markings, the content of traffic restrictions, the lanes in which the vehicle can travel, and the like.
  • the recognition unit 73 can perform recognition processing of the environment around the vehicle 1 .
  • the surrounding environment to be recognized by the recognition unit 73 includes the weather, temperature, humidity, brightness, road surface conditions, and the like.
  • the action plan section 62 creates an action plan for the vehicle 1.
  • the action planning unit 62 creates an action plan by performing route planning and route following processing.
  • Route planning is the process of planning a rough route from the start to the goal; this route planning is also referred to as trajectory planning.
  • Trajectory generation (local path planning) processing is also included in this route planning.
  • Route planning may be distinguished as long-term path planning, and startup generation as short-term path planning or local path planning.
  • A safety-priority route represents a concept similar to startup generation, short-term path planning, or local path planning.
  • Route following is the process of planning actions to safely and accurately travel the route planned by route planning within the planned time.
  • the action planning unit 62 can, for example, calculate the target speed and target angular speed of the vehicle 1 based on the result of this route following processing.
  • the motion control unit 63 controls the motion of the vehicle 1 in order to implement the action plan created by the action planning unit 62.
  • For example, the operation control unit 63 controls a steering control unit 81, a brake control unit 82, and a drive control unit 83 included in the vehicle control unit 32, which will be described later, and performs acceleration/deceleration control and direction control so that the vehicle 1 travels along the trajectory calculated by the trajectory planning.
  • the operation control unit 63 performs cooperative control aimed at realizing ADAS functions such as collision avoidance or shock mitigation, follow-up driving, vehicle speed maintenance driving, collision warning of own vehicle, and lane deviation warning of own vehicle.
  • the operation control unit 63 performs cooperative control aimed at automatic driving in which the vehicle autonomously travels without depending on the operation of the driver.
  • the DMS 30 performs driver authentication processing, driver state recognition processing, etc., based on sensor data from the in-vehicle sensor 26 and input data input to the HMI 31, which will be described later.
  • the driver's condition to be recognized by the DMS 30 includes, for example, physical condition, wakefulness, concentration, fatigue, gaze direction, drunkenness, driving operation, posture, and the like.
  • the DMS 30 may perform authentication processing for passengers other than the driver and processing for recognizing the state of the passenger. Further, for example, the DMS 30 may perform recognition processing of the situation inside the vehicle based on the sensor data from the sensor 26 inside the vehicle. Conditions inside the vehicle to be recognized include temperature, humidity, brightness, smell, and the like, for example.
  • the HMI 31 inputs various data, instructions, etc., and presents various data to the driver.
  • the HMI 31 comprises an input device for human input of data.
  • the HMI 31 generates an input signal based on data, instructions, etc. input from an input device, and supplies the input signal to each section of the vehicle control system 11 .
  • the HMI 31 includes operators such as a touch panel, buttons, switches, and levers as input devices.
  • the HMI 31 is not limited to this, and may further include an input device capable of inputting information by a method other than manual operation using voice, gestures, or the like. Further, the HMI 31 may use, as an input device, a remote control device using infrared rays or radio waves, or an externally connected device such as a mobile device or wearable device corresponding to the operation of the vehicle control system 11 .
  • The presentation of data by the HMI 31 will be briefly explained.
  • the HMI 31 generates visual information, auditory information, and tactile information for the passenger or outside the vehicle.
  • the HMI 31 also performs output control for controlling the output, output content, output timing, output method, and the like of each of the generated information.
  • the HMI 31 generates and outputs visual information such as an operation screen, a status display of the vehicle 1, a warning display, an image such as a monitor image showing the situation around the vehicle 1, and information indicated by light.
  • the HMI 31 also generates and outputs information indicated by sounds such as voice guidance, warning sounds, warning messages, etc., as auditory information.
  • the HMI 31 generates and outputs, as tactile information, information given to the passenger's tactile sense by force, vibration, motion, or the like.
  • As an output device with which the HMI 31 outputs visual information, a display device that presents visual information by displaying an image by itself or a projector device that presents visual information by projecting an image can be applied.
  • The display device may be a device that displays visual information within the passenger's field of view, such as a head-up display, a transmissive display, or a wearable device with an AR (Augmented Reality) function.
  • the HMI 31 can also use display devices such as a navigation device, an instrument panel, a CMS (Camera Monitoring System), an electronic mirror, and lamps provided in the vehicle 1 as output devices for outputting visual information.
  • Audio speakers, headphones, and earphones can be applied as output devices for the HMI 31 to output auditory information.
  • a haptic element using haptic technology can be applied as an output device for the HMI 31 to output tactile information.
  • a haptic element is provided at a portion of the vehicle 1 that is in contact with a passenger, such as a steering wheel or a seat.
  • the vehicle control unit 32 controls each unit of the vehicle 1.
  • the vehicle control section 32 includes a steering control section 81 , a brake control section 82 , a drive control section 83 , a body system control section 84 , a light control section 85 and a horn control section 86 .
  • the steering control unit 81 detects and controls the state of the steering system of the vehicle 1 .
  • the steering system includes, for example, a steering mechanism including a steering wheel, an electric power steering, and the like.
  • the steering control unit 81 includes, for example, a control unit such as an ECU that controls the steering system, an actuator that drives the steering system, and the like.
  • the brake control unit 82 detects and controls the state of the brake system of the vehicle 1 .
  • the brake system includes, for example, a brake mechanism including a brake pedal, an ABS (Antilock Brake System), a regenerative brake mechanism, and the like.
  • the brake control unit 82 includes, for example, a control unit such as an ECU that controls the brake system.
  • the drive control unit 83 detects and controls the state of the drive system of the vehicle 1 .
  • the drive system includes, for example, an accelerator pedal, a driving force generator for generating driving force such as an internal combustion engine or a driving motor, and a driving force transmission mechanism for transmitting the driving force to the wheels.
  • the drive control unit 83 includes, for example, a control unit such as an ECU that controls the drive system.
  • the body system control unit 84 detects and controls the state of the body system of the vehicle 1 .
  • the body system includes, for example, a keyless entry system, smart key system, power window device, power seat, air conditioner, air bag, seat belt, shift lever, and the like.
  • the body system control unit 84 includes, for example, a control unit such as an ECU that controls the body system.
  • the light control unit 85 detects and controls the states of various lights of the vehicle 1 .
  • Lights to be controlled include, for example, headlights, backlights, fog lights, turn signals, brake lights, projections, bumper displays, and the like.
  • the light control unit 85 includes a control unit such as an ECU for controlling lights.
  • the horn control unit 86 detects and controls the state of the car horn of the vehicle 1 .
  • the horn control unit 86 includes, for example, a control unit such as an ECU that controls the car horn.
  • The control unit 130 shown in FIG. 5 corresponds to the vehicle control ECU 21 and the like. The detection unit 140 shown in FIG. 5 corresponds to the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, and the like.
  • FIG. 8 is a diagram showing an example of sensing areas by the camera 51, the radar 52, the LiDAR 53, the ultrasonic sensor 54, and the like of the external recognition sensor 25 in FIG. 7. FIG. 8 schematically shows the vehicle 1 viewed from above; the left end side is the front end (front) side of the vehicle 1, and the right end side is the rear end (rear) side of the vehicle 1.
  • a sensing area 101F and a sensing area 101B are examples of sensing areas of the ultrasonic sensor 54.
  • The sensing area 101F covers the periphery of the front end of the vehicle 1 with a plurality of ultrasonic sensors 54.
  • the sensing area 101B covers the periphery of the rear end of the vehicle 1 with a plurality of ultrasonic sensors 54 .
  • the sensing results in the sensing area 101F and the sensing area 101B are used, for example, for parking assistance of the vehicle 1 and the like.
  • Sensing areas 102F to 102B show examples of sensing areas of the radar 52 for short or medium range.
  • the sensing area 102F covers the front of the vehicle 1 to a position farther than the sensing area 101F.
  • the sensing area 102B covers the rear of the vehicle 1 to a position farther than the sensing area 101B.
  • the sensing area 102L covers the rear periphery of the left side surface of the vehicle 1 .
  • the sensing area 102R covers the rear periphery of the right side surface of the vehicle 1 .
  • the sensing result in the sensing area 102F is used, for example, to detect vehicles, pedestrians, etc. existing in front of the vehicle 1.
  • the sensing result in the sensing area 102B is used, for example, for the rear collision prevention function of the vehicle 1 or the like.
  • the sensing results in the sensing area 102L and the sensing area 102R are used, for example, to detect an object in a blind spot on the side of the vehicle 1, or the like.
  • Sensing areas 103F to 103B show examples of sensing areas by the camera 51 .
  • the sensing area 103F covers the front of the vehicle 1 to a position farther than the sensing area 102F.
  • the sensing area 103B covers the rear of the vehicle 1 to a position farther than the sensing area 102B.
  • the sensing area 103L covers the periphery of the left side surface of the vehicle 1 .
  • the sensing area 103R covers the periphery of the right side surface of the vehicle 1 .
  • the sensing results in the sensing area 103F can be used, for example, for recognition of traffic lights and traffic signs, lane departure prevention support systems, and automatic headlight control systems.
  • a sensing result in the sensing area 103B can be used for parking assistance and a surround view system, for example.
  • Sensing results in the sensing area 103L and the sensing area 103R can be used, for example, in a surround view system.
  • the sensing area 104 shows an example of the sensing area of the LiDAR53.
  • the sensing area 104 covers the front of the vehicle 1 to a position farther than the sensing area 103F.
  • the sensing area 104 has a narrower lateral range than the sensing area 103F.
  • the sensing results in the sensing area 104 are used, for example, to detect objects such as surrounding vehicles.
  • a sensing area 105 shows an example of a sensing area of the long-range radar 52 .
  • the sensing area 105 covers the front of the vehicle 1 to a position farther than the sensing area 104 .
  • the sensing area 105 has a narrower lateral range than the sensing area 104 .
  • the sensing results in the sensing area 105 are used, for example, for ACC (Adaptive Cruise Control), emergency braking, and collision avoidance.
  • the sensing regions of the cameras 51, the radar 52, the LiDAR 53, and the ultrasonic sensors 54 included in the external recognition sensor 25 may have various configurations other than those shown in FIG. Specifically, the ultrasonic sensor 54 may also sense the sides of the vehicle 1 , and the LiDAR 53 may sense the rear of the vehicle 1 . Moreover, the installation position of each sensor is not limited to each example mentioned above. Also, the number of each sensor may be one or plural.
  • each component of each device illustrated is functionally conceptual and does not necessarily need to be physically configured as illustrated.
  • The specific form of distribution and integration of each device is not limited to the one shown in the figure, and all or part of the devices can be functionally or physically distributed and integrated in arbitrary units according to various loads and usage conditions.
  • The image processing of the present disclosure is not limited to object detection on a moving body, and may be used for various task processing in other applications.
  • the image processing apparatus (the image processing apparatus 100 in the embodiment) according to the present disclosure includes the acquisition section (the acquisition section 131 in the embodiment) and the specifying section (the specifying section 133 in the embodiment).
  • the acquisition unit acquires point cloud data indicating detection of surrounding objects from LiDAR (Light Detection and Ranging), which is a sensor using a laser.
  • When the point cloud data is superimposed on an image that includes the laser irradiation range of the LiDAR in its imaging range, the identifying unit identifies, based on the irradiation information of the LiDAR, false transmission points among the superimposed point cloud data, which are points that are not actually irradiated onto the object in the image.
  • the image processing apparatus identifies, among the point cloud data superimposed on the image, false transmission points, which are points that are not actually illuminated on the object in the image.
  • the image processing apparatus can utilize only the points that are actually irradiated on the object for the subsequent processing, so that the point cloud data obtained by sensing can be appropriately utilized.
  • the image processing apparatus further includes a generating unit (generating unit 134 in the embodiment) that removes the identified false transmission points and generates an image in which the point cloud data excluding the removed false transmission points is superimposed.
  • The image processing device generates an image on which the point cloud data remaining after removal of the false transmission points is superimposed.
  • the image processing apparatus can use the image as correct data in the detection process and the learning process, so that the process can be performed with higher accuracy in the subsequent stages.
  • the image processing apparatus further includes an imaging unit (the imaging unit 132 in the embodiment) that captures an image including the laser irradiation range of the LiDAR in the imaging range.
  • the specifying unit specifies the false transmission point based on the irradiation information of the LiDAR when the point cloud data is superimposed on the image captured by the imaging unit.
  • the image processing device may identify the false transmission point using the image captured by its own device.
  • Since the image processing apparatus executes the imaging process and the false transmission point identification process during automatic driving, task processing such as automatic driving can be executed while eliminating false transmission points that could hinder automatic driving.
  • the specifying unit specifies the false transmission point based on the elevation/depression angle information and the azimuth angle information in the laser irradiation as the irradiation information.
  • the image processing apparatus can accurately identify the false transmission point by using the elevation/depression angle information and the azimuth angle information included in the irradiation information.
  • The specifying unit specifies two point cloud data having substantially the same azimuth angle information among the point cloud data, and identifies the false transmission point by comparing the elevation/depression angle information corresponding to the two point cloud data with the values of the vertical axis coordinates of the two point cloud data when projected onto the image.
  • The specifying unit may also specify two point cloud data having substantially the same elevation/depression angle information among the point cloud data, and identify the false transmission point by comparing the azimuth angle information corresponding to the two point cloud data with the values of the horizontal axis coordinates of the two point cloud data when projected onto the image.
  • The image processing apparatus compares the irradiation information with the coordinates on the image and determines whether the coordinates contradict the irradiation information, so that the false transmission point can be specified with high accuracy. A minimal code sketch of this comparison appears at the end of this list.
  • FIG. 9 is a hardware configuration diagram showing an example of a computer 1000 that implements the functions of the image processing apparatus 100 according to the present disclosure.
  • the image processing apparatus 100 according to the embodiment will be described below as an example of the computer 1000 .
  • the computer 1000 has a CPU 1100 , a RAM 1200 , a ROM (Read Only Memory) 1300 , a HDD (Hard Disk Drive) 1400 , a communication interface 1500 and an input/output interface 1600 .
  • Each part of computer 1000 is connected by bus 1050 .
  • the CPU 1100 operates based on programs stored in the ROM 1300 or HDD 1400 and controls each section. For example, the CPU 1100 loads programs stored in the ROM 1300 or HDD 1400 into the RAM 1200 and executes processes corresponding to various programs.
  • the ROM 1300 stores a boot program such as BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 is started, and programs dependent on the hardware of the computer 1000.
  • the HDD 1400 is a computer-readable recording medium that non-temporarily records programs executed by the CPU 1100 and data used by such programs.
  • HDD 1400 is a recording medium that records an image processing program according to the present disclosure, which is an example of program data 1450 .
  • a communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet).
  • CPU 1100 receives data from another device via communication interface 1500, and transmits data generated by CPU 1100 to another device.
  • the input/output interface 1600 is an interface for connecting the input/output device 1650 and the computer 1000 .
  • the CPU 1100 receives data from input devices such as a keyboard and mouse via the input/output interface 1600 .
  • the CPU 1100 also transmits data to an output device such as a display, speaker, or printer via the input/output interface 1600 .
  • the input/output interface 1600 may function as a media interface for reading a program or the like recorded on a predetermined recording medium.
  • Media include, for example, optical recording media such as DVDs (Digital Versatile Discs) and PDs (Phase change rewritable disks), magneto-optical recording media such as MOs (Magneto-Optical disks), tape media, magnetic recording media, and semiconductor memories.
  • the CPU 1100 of the computer 1000 implements the functions of the control unit 130 and the like by executing an image processing program loaded onto the RAM 1200.
  • the HDD 1400 also stores an image processing program according to the present disclosure and data in the storage unit 120 .
  • The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it; as another example, these programs may be obtained from another device via the external network 1550.
  • (1) An image processing device comprising: an acquisition unit that acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that a surrounding object has been detected; and a specifying unit that, when the point cloud data is superimposed on an image including the laser irradiation range of the LiDAR in its imaging range, specifies, based on the irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, which is a point that is not actually irradiated onto the object in the image.
  • (2) The image processing device described above, further comprising a generating unit that removes the specified false transmission point and generates the image on which the point cloud data excluding the removed false transmission point is superimposed.
  • (3) The image processing device described above, further comprising an imaging unit that captures an image including the laser irradiation range of the LiDAR in its imaging range, wherein the specifying unit specifies the false transmission point based on the irradiation information of the LiDAR when the point cloud data is superimposed on the image captured by the imaging unit.
  • (4) The image processing device according to any one of (1) to (3) above, wherein the specifying unit specifies the false transmission point based on elevation/depression angle information and azimuth angle information in the irradiation of the laser as the irradiation information.
  • (5) The image processing device according to (4) above, wherein the specifying unit specifies two point cloud data having substantially the same azimuth angle information among the point cloud data, and identifies the false transmission point by comparing the elevation/depression angle information corresponding to the two point cloud data with the values of the vertical axis coordinates of the two point cloud data when projected onto the image.
  • (6) The image processing device according to (4) above, wherein the specifying unit specifies two point cloud data having substantially the same elevation/depression angle information among the point cloud data, and identifies the false transmission point by comparing the azimuth angle information corresponding to the two point cloud data with the values of the horizontal axis coordinates of the two point cloud data when projected onto the image.
  • (7) An image processing method in which a computer acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that a surrounding object has been detected, and, when the point cloud data is superimposed on an image including the laser irradiation range of the LiDAR in its imaging range, specifies, based on the irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, which is a point that is not actually irradiated onto the object in the image.
  • (8) An image processing program for causing a computer to function as: an acquisition unit that acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that a surrounding object has been detected; and a specifying unit that, when the point cloud data is superimposed on an image including the laser irradiation range of the LiDAR in its imaging range, specifies, based on the irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, which is a point that is not actually irradiated onto the object in the image.
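For concreteness, the comparison described in items (4) and (5) above can be sketched in code. The following Python fragment is only an illustrative sketch: the record fields, the azimuth tolerance, and the image-coordinate convention (v increasing downward) are assumptions introduced here for illustration and are not part of the disclosure.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class ProjectedPoint:
        point_id: int      # irradiation ID of the laser shot
        elevation: float   # elevation/depression angle of the shot (degrees, upward positive)
        azimuth: float     # azimuth angle of the shot (degrees)
        u: float           # horizontal pixel coordinate after projection onto the image
        v: float           # vertical pixel coordinate (origin at the top, increasing downward)

    def false_transmission_candidate(a: ProjectedPoint, b: ProjectedPoint,
                                     azimuth_tol: float = 0.1) -> Optional[int]:
        """Compare two points with substantially the same azimuth and return the
        point_id judged to be a false transmission point, or None if the pair is
        consistent. The shot fired at the higher elevation (shallower depression)
        angle should also appear higher in the image (smaller v); if that order is
        contradicted, the higher-elevation shot must have passed beyond the object
        and is flagged."""
        if abs(a.azimuth - b.azimuth) > azimuth_tol:
            return None                      # not on substantially the same image column
        hi, lo = (a, b) if a.elevation > b.elevation else (b, a)
        if hi.v >= lo.v:                     # drawn at or below the lower-elevation shot
            return hi.point_id               # contradiction: hi is a false transmission point
        return None

The check of item (6), which compares azimuth order against horizontal image coordinates, follows the same pattern with the roles of elevation/azimuth and v/u exchanged.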

Abstract

An image processing device (100) according to the present disclosure comprises: an acquisition unit (131) which acquires, from light detection and ranging (LiDAR) constituted by a sensor that uses a laser, point cloud data indicating that an object in the surroundings has been detected; and an identification unit (133) which, in a case in which the point cloud data has been superimposed on an image in which the range of laser irradiation by the LiDAR is included in an imaging range, identifies, from among the superimposed point cloud data and on the basis of irradiation information of the LiDAR, false transmission points which are points that are not actually irradiated onto the object in the image.

Description

画像処理装置、画像処理方法及び画像処理プログラムImage processing device, image processing method and image processing program
 本開示は、画像処理装置、画像処理方法及び画像処理プログラムに関する。詳しくは、自動車等の移動体が備えるカメラによって撮像された画像に適用される画像処理に関する。 The present disclosure relates to an image processing device, an image processing method, and an image processing program. More specifically, the present invention relates to image processing applied to an image captured by a camera provided in a moving object such as an automobile.
 自動運転に関連する技術の一つとして、自動車が備える複数のセンサを併用することで物体検出の精度を向上させる技術が提案されている。 As one of the technologies related to autonomous driving, a technology has been proposed that improves the accuracy of object detection by using multiple sensors installed in the car.
 例えば、レーザを用いたセンサであるLiDAR(Light Detection and Ranging)を複数使用することで、測定された点群データのうち適切でない点(外れ値)を除去する技術が知られている。 For example, there is a known technology that removes inappropriate points (outliers) from the measured point cloud data by using multiple LiDAR (Light Detection and Ranging) sensors that use lasers.
特開2021-47157号公報JP 2021-47157 A
 従来技術は、複数のLiDARを使用することで外れ値を除去する。しかしながら、自動運転を行う移動体の形状や構造によっては、複数のLiDARを備えることが難しい場合もある。 Conventional technology removes outliers by using multiple LiDARs. However, depending on the shape and structure of a mobile object that automatically operates, it may be difficult to provide multiple LiDARs.
 また、例えば、LiDARによって得られた点群データを、他のセンサが捉えた情報(例えば、カメラによって撮像された二次元画像データなど)の解析に利用する場合、それぞれのセンサの設置位置が異なることから、適切に点群データを利用できない場合がある。一例として、カメラが捉えた画像上の物体に関して、実際にはレーザが物体に照射されていないにもかかわらず、点群データが画像に重畳された際に物体と重なってしまうことにより、画像解析の際に、誤った深度情報として点群データが用いられる可能性がある。 Also, for example, when using point cloud data obtained by LiDAR to analyze information captured by other sensors (for example, two-dimensional image data captured by a camera), the installation position of each sensor is different. Therefore, point cloud data may not be used appropriately. As an example, regarding an object on the image captured by the camera, even though the object is not actually irradiated with the laser, when the point cloud data is superimposed on the image, it overlaps with the object, resulting in image analysis. , the point cloud data may be used as erroneous depth information.
 そこで、本開示では、センシングによって得られる点群データを適切に活用することができる画像処理装置、画像処理方法及び画像処理プログラムを提案する。 Therefore, this disclosure proposes an image processing device, an image processing method, and an image processing program that can appropriately utilize point cloud data obtained by sensing.
 上記の課題を解決するために、本開示に係る一形態の画像処理装置は、レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する取得部と、前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する特定部と、を備える。 In order to solve the above problems, an image processing apparatus according to one embodiment of the present disclosure provides point cloud data indicating that surrounding objects have been detected from LiDAR (Light Detection and Ranging), which is a sensor using a laser. and when the point cloud data is superimposed on an image including the laser irradiation range of the LiDAR in the imaging range, based on the irradiation information of the LiDAR, out of the superimposed point cloud data, an identifying unit for identifying a false transmission point, which is actually a point not illuminated on an object in the image.
FIG. 1 is a diagram (1) for explaining an overview of image processing according to the present disclosure.
FIG. 2 is a diagram (2) for explaining an overview of image processing according to the present disclosure.
FIG. 3 is a diagram (3) for explaining an overview of image processing according to the present disclosure.
FIG. 4 is a block diagram for explaining the flow of image processing according to the present disclosure.
FIG. 5 is a diagram showing a configuration example of the image processing apparatus according to the embodiment.
FIG. 6 is a flowchart showing the flow of processing according to the embodiment.
FIG. 7 is a block diagram showing a schematic functional configuration example of a vehicle control system to which the present technology can be applied.
FIG. 8 is a diagram showing an example of sensing areas by a vehicle control system to which the present technology can be applied.
FIG. 9 is a hardware configuration diagram showing an example of a computer that implements the functions of the image processing apparatus according to the present disclosure.
 以下に、本開示の実施形態について図面に基づいて詳細に説明する。なお、以下の各実施形態において、同一の部位には同一の符号を付することにより重複する説明を省略する。 Below, embodiments of the present disclosure will be described in detail based on the drawings. In addition, in each of the following embodiments, the same parts are denoted by the same reference numerals, thereby omitting redundant explanations.
 以下に示す項目順序に従って本開示を説明する。
  1.実施形態
   1-1.本開示に係る画像処理の概要
   1-2.実施形態に係る画像処理装置の構成例
   1-3.実施形態に係る画像処理の手順
   1-4.実施形態に係る変形例
  2.その他の実施形態
   2-1.移動体の構成
   2-2.その他
  3.本開示に係る画像処理装置の効果
  4.ハードウェア構成
The present disclosure will be described according to the order of items shown below.
1. Embodiment
 1-1. Outline of image processing according to present disclosure
 1-2. Configuration example of image processing apparatus according to embodiment
 1-3. Image processing procedure according to embodiment
 1-4. Modified example according to the embodiment
2. Other Embodiments
 2-1. Configuration of moving body
 2-2. Others
3. Effects of the image processing apparatus according to the present disclosure
4. Hardware configuration
(1.実施形態)
(1-1.本開示に係る画像処理の概要)
 まず、図1乃至図3を用いて、本開示に係る画像処理の概要を説明する。図1は、本開示に係る画像処理の概要を説明するための図(1)である。本開示に係る画像処理は、図1に示す車両1に搭載された画像処理装置100によって実行される。
(1. Embodiment)
(1-1. Overview of image processing according to the present disclosure)
First, an overview of image processing according to the present disclosure will be described with reference to FIGS. 1 to 3. FIG. 1 is a diagram (1) for explaining an overview of image processing according to the present disclosure. Image processing according to the present disclosure is performed by the image processing device 100 mounted on the vehicle 1 shown in FIG. 1.
 画像処理装置100は、後述する各種機能部を動作することにより、実施形態に係る画像処理(LiDARを用いた点群データの取得、および、画像への点群データの重畳処理等)を実行する。 The image processing apparatus 100 executes image processing according to the embodiment (acquisition of point cloud data using LiDAR, superimposition processing of point cloud data on an image, etc.) by operating various functional units to be described later. .
 車両1は、例えば四輪自動車であり、自動運転に係る技術を搭載した車両である。例えば、車両1は、画像処理装置100が行う物体検出機能を用いて、所定の駐車区画に自動で駐車を行ったり、物体を回避する挙動を制御したり、適切な経路を選択したりする。 The vehicle 1 is, for example, a four-wheeled vehicle equipped with technology related to automatic driving. For example, using the object detection function performed by the image processing device 100, the vehicle 1 automatically parks in a predetermined parking space, controls behavior to avoid objects, and selects an appropriate route.
 車両1のような自動運転を行う移動体が行う物体検出では、各種センサにより外部状況が検知され、検知した情報から、物体が実際に所在しているか否か等の検出が実行される。 In the object detection performed by a moving object that performs automatic driving, such as the vehicle 1, the external situation is detected by various sensors, and from the detected information, it is detected whether or not the object is actually located.
 センサの例としては、レーザを用いて物体の位置や物体までの距離を測定するLiDARや、カメラに搭載されるイメージセンサや、ミリ波等の電波の反射を利用したミリ波レーダ(レーダセンサ)等がある。物体検出の一例は、カメラ(イメージセンサ)によって撮像された画像を入力とし、検出器で画像認識することにより、画像内に物体として判定されるものがあるか否かを判定する手法である。他の一例は、LiDARによって得られた点群データ(物体から反射される情報によって得られる深度情報等)を検出器への入力として物体検出を行う手法である。近年では、画像認識精度の飛躍的な向上に伴い、カメラ(イメージセンサ)による画像認識および物体検出が主に利用される。 Examples of sensors include LiDAR, which uses lasers to measure the position of an object and the distance to an object, image sensors mounted on cameras, and millimeter-wave radar (radar sensor), which uses the reflection of radio waves such as millimeter waves. etc. One example of object detection is a method of determining whether or not there is an object in the image by inputting an image captured by a camera (image sensor) and recognizing the image with a detector. Another example is a method of detecting an object by using point cloud data obtained by LiDAR (such as depth information obtained from information reflected from an object) as input to a detector. In recent years, image recognition and object detection using a camera (image sensor) have been mainly used with the dramatic improvement in image recognition accuracy.
 さらに、物体検出では、複数のセンサを併用することも行われる。例えば、LiDARとカメラ等を併用することで、複数のセンサから得られた特徴情報を検出器であるニューラルネットワークへの入力とし、物体検出結果という出力を得る検出器を生成しうる。複数センサを利用した検出は、単独のセンサを利用した検出と比較して、物体を正確に検出できるという利点がある。 Furthermore, in object detection, multiple sensors are also used together. For example, by using a LiDAR together with a camera or the like, it is possible to generate a detector that obtains an output as an object detection result by inputting feature information obtained from a plurality of sensors into a neural network that is a detector. Detection using multiple sensors has the advantage of being able to accurately detect an object compared to detection using a single sensor.
It is also possible to train detectors and the like based on information obtained by sensing. In one example of learning, point cloud data obtained from the LiDAR is superimposed on the image captured by the camera, and the depth information of each object included in the image is used to accurately measure the distance to each object. Then, using the objects captured in the image and the depth information as correct data, a highly accurate distance estimation model can be learned by learning the relationship between each object on the image and its distance.
 しかしながら、かかる手法において、複数のセンサの設置位置が異なることにより、点群データを適切に取り扱うことのできない場合がある。例えば、一般にLiDARは広い範囲への照射を行うため、比較的高い位置(車両1のルーフなど)に設置される。また、カメラは、前後左右の進行方向を撮像するため、フロントパネルやリアパネル付近に設置される。かかる状況では、LiDARが照射された物体と、カメラが捉えた物体とに齟齬が生じる可能性がある。この場合、点群データを画像に描画した際、カメラが捉えた物体よりも遠方に当たっている点群が、当該物体に当たっているように誤認された画像が生成される。かかる画像を正解データとして学習が行われると、精度の高い学習を行うことのできないおそれがある。 However, in such a method, it may not be possible to handle point cloud data appropriately due to differences in the installation positions of multiple sensors. For example, since LiDAR generally irradiates a wide range, it is installed at a relatively high position (such as the roof of the vehicle 1). In addition, the cameras are installed near the front panel and the rear panel in order to take images in the front, rear, left, and right directions of travel. In such a situation, there may be discrepancies between the object illuminated by the LiDAR and the object captured by the camera. In this case, when the point cloud data is drawn on an image, an image is generated in which the point cloud that hits the object farther than the object captured by the camera is erroneously recognized as hitting the object. If learning is performed using such images as correct data, highly accurate learning may not be possible.
Therefore, the image processing apparatus 100 according to the present disclosure executes the following processing. That is, the image processing apparatus 100 acquires, from the LiDAR, point cloud data indicating that surrounding objects have been detected, and, when the point cloud data is superimposed on an image that includes the laser irradiation range of the LiDAR in its imaging range, identifies, based on the irradiation information of the LiDAR, false transmission points among the superimposed point cloud data, which are points that are not actually irradiated onto the objects in the image. The image processing apparatus 100 then deletes the false transmission points from the point cloud data and generates an image on which the point cloud data excluding the false transmission points is superimposed. As a result, the image processing apparatus 100 can generate image data on which only the point cloud data that actually hit the objects are superimposed, so that learning processing and analysis processing using the image can be performed with high accuracy.
Such image processing will be described below with reference to FIGS. 1 to 3. First, the difference between the irradiation range of the LiDAR 150 provided in the vehicle 1 and the imaging range of the camera 160 will be described with reference to FIG. 1. As shown in FIG. 1, the vehicle 1 has the LiDAR 150 mounted at the top of the vehicle. The vehicle 1 also includes the camera 160 for capturing an image of the area ahead. Although not shown in FIG. 1, the vehicle 1 may include additional LiDARs 150 and cameras 160.
The vehicle 1 continuously performs detection with the LiDAR 150 and the camera 160 while traveling. FIG. 1 shows a point 310 indicating that the laser emitted by the LiDAR 150 hit an arbitrary object located further ahead than the vehicle 200 in front, and a point 330 indicating that the laser hit the rear of the vehicle 200 in front.
The camera 160 captures an imaging range that includes the vehicle 200 ahead of the vehicle 1. For example, the camera 160 captures an image that includes the point 330 in its imaging range. At this time, the image captured by the camera 160 includes the vehicle 200 in the imaging range but does not include the point 310, because the vehicle 200 blocks the object beyond it. On the other hand, when the data corresponding to the point 310 is superimposed on the image captured by the camera 160, the image includes a point 320 that lies on the extension line toward the point 310. That is, the point 320 is a point that, although the laser emitted by the LiDAR 150 did not actually hit the vehicle 200 there, appears in the image as if the laser had hit the vehicle 200 when the point cloud data is superimposed on the image (a false transmission point).
Here, the point 330 is a point indicating that the laser actually emitted from the LiDAR 150 hit the vehicle 200, and its depression angle, which is part of the irradiation information of that laser, is obtained from the line segment connecting the LiDAR 150 and the point 330. On the other hand, the point 320, which is a false transmission point, is actually based on the laser emitted from the LiDAR 150 at the depression angle at which the point 310 was obtained. Therefore, although the point 320 is located below the point 330 on the vertical axis of the image, it is a point obtained from a laser emitted at a shallower depression angle than that of the point 330.
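The geometric situation described above can be checked with a few lines of arithmetic. The mounting heights, distances, and focal length below are invented purely for illustration (they are not values from the disclosure); they merely reproduce the effect that a laser shot passing over the vehicle 200 has a shallower depression angle at the LiDAR 150 and yet projects to a lower row of the camera image.

    import math

    LIDAR_H, CAMERA_H, FOCAL = 1.5, 1.0, 1000.0   # metres, metres, pixels (illustrative)

    far_hit  = (20.0, 0.3)   # (forward distance, height): a hit beyond the vehicle ahead, like point 310
    near_hit = (10.0, 0.8)   # a hit on the rear of the vehicle ahead, like point 330

    def depression_at_lidar(hit):
        d, h = hit
        return math.degrees(math.atan2(LIDAR_H - h, d))   # angle below horizontal at the LiDAR

    def rows_below_image_centre(hit):
        d, h = hit
        return FOCAL * (CAMERA_H - h) / d                  # simple pinhole projection at the camera

    for name, hit in (("far hit (cf. 310)", far_hit), ("near hit (cf. 330)", near_hit)):
        print(f"{name}: depression {depression_at_lidar(hit):.1f} deg, "
              f"{rows_below_image_centre(hit):+.0f} px below centre")
    # far hit (cf. 310): depression 3.4 deg, +35 px below centre
    # near hit (cf. 330): depression 4.0 deg, +20 px below centre

The far hit is drawn below the near hit even though it was fired at a shallower depression angle, which is exactly the contradiction that the identification processing exploits.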
 この点について、図2を用いて詳細に説明する。図2は、本開示に係る画像処理の概要を説明するための図(2)である。 This point will be explained in detail using FIG. FIG. 2 is a diagram (2) for explaining an outline of image processing according to the present disclosure.
An image 340 shown in FIG. 2 shows a state in which the points 330 and 320, which are point cloud data obtained from the LiDAR 150, are superimposed on the captured image. Here, the point 320 is obtained from a laser emitted from the LiDAR 150 at a shallow depression angle ("Line 1" in FIG. 2) and is the false transmission point of the point 310, which indicates that the laser actually hit an object ahead of the vehicle 200. The point 330 is obtained from a laser emitted from the LiDAR 150 at a deeper depression angle than the laser corresponding to the point 310 ("Line 2" in FIG. 2).
The image 342 in FIG. 2 is obtained by superimposing the point cloud data obtained by the LiDAR 150 on an image of the vehicle 200 captured by the image processing device 100. Specifically, the image 342 shows a state in which point cloud data 332, obtained with substantially the same elevation/depression angle information as the point 330 (so that their vertical-axis values appear substantially the same in the image), and point cloud data 322, obtained with substantially the same elevation/depression angle information as the point 320, are superimposed on the rear portion of the vehicle 200.
In this example, all four points included in the point cloud data 322 are false transmission points. In addition to the point cloud data 322 and the point cloud data 332, the image 342 also includes point cloud data 334 and point cloud data 336 indicating that the laser actually hit the vehicle 200. Although not shown in the image 342, other point cloud data indicating that the laser actually hit the vehicle 200 may also be superimposed in the vicinity of the point cloud data 322, which consists of false transmission points.
In this way, the point cloud data 322 is point cloud data that, when superimposed on the image 342, is observed as if it hit the vehicle 200, even though the laser did not actually hit the vehicle 200. For this reason, if the image 342 and the depth information included in the point cloud data 322 are used as learning data, the distance to the vehicle 200 and the depth information included in the point cloud data 322 contradict each other, so the reliability of such data as correct data is low.
Therefore, the image processing apparatus 100 identifies the point 320 (and the point cloud data 322) as false transmission points and executes processing for removing the identified false transmission points. This point will be described with reference to FIG. 3. FIG. 3 is a diagram (3) for explaining an overview of image processing according to the present disclosure.
When emitting a laser, the LiDAR 150 can acquire irradiation information including "elevation", that is, elevation/depression angle information 410 indicating the height angle at which the laser is emitted, and "azimuth", that is, azimuth angle information 420 indicating the horizontal angle with respect to the vehicle 1. The LiDAR 150 can also acquire, as irradiation information, identification information (an irradiation ID) for each emission. The image processing apparatus 100 acquires this irradiation information together with the point cloud data. That is, for each piece of point cloud data obtained from the LiDAR 150, the image processing apparatus 100 can specify, based on the irradiation ID, the elevation/depression angle information 410 and the azimuth angle information 420 at the time the laser was emitted. In addition, the image processing apparatus 100 can specify the height (y-axis) information and the horizontal position (x-axis) in the image 430 captured by the camera 160.
 かかる情報に基づいて、画像処理装置100は、偽透過点を特定する。まず、画像処理装置100は、画像342に重畳される点群データのうち、処理対象とする点を一つ選択する。例えば、画像処理装置100は、点330を選択するものとする。続いて、画像処理装置100は、略同一のx軸上にある他の一つの点を選択する。ここでは、画像処理装置100は、略同一のx軸上にある他の点である点320を選択する。 Based on this information, the image processing apparatus 100 identifies false transparent points. First, the image processing apparatus 100 selects one point to be processed from the point cloud data superimposed on the image 342 . For example, the image processing apparatus 100 selects point 330 . Subsequently, the image processing apparatus 100 selects another point on substantially the same x-axis. Here, the image processing apparatus 100 selects a point 320 which is another point on substantially the same x-axis.
Then, the image processing apparatus 100 compares the irradiation information of the two points based on their irradiation IDs. Specifically, the image processing apparatus 100 compares the elevation/depression angle information of the two points. If a contradiction arises in the elevation/depression angles even though the points lie on the same x-axis, the image processing apparatus 100 identifies the contradicting point as a false transmission point. Specifically, the point 320 was emitted at a shallower depression angle and should therefore be superimposed at a higher position on the y-axis of the image 342 than the point 330; if it is nevertheless superimposed at a lower position than the point 330 in the image 342, the image processing apparatus 100 identifies the point 320 as a false transmission point. In other words, when the irradiation information of the point cloud data is specified as "(elevation, azimuth)", the image processing apparatus 100 identifies the point 320 as a false transmission point if the point 330 is drawn above the point 320 in the image 342 even though the "elevation" value of the point 330 is smaller than that of the point 320. Conversely, if the "elevation" value of the point 330 is larger than that of the point 320, the image processing apparatus 100 determines that neither the point 330 nor the point 320 is a false transmission point, compares the irradiation information of the next point to be processed (the point 338 in the example of FIG. 3) with that of the point 330, and identifies false transmission points in order.
The image processing apparatus 100 deletes the identified false transmission points and does not superimpose them on the image 342. By performing this processing on all of the point cloud data for the image 342, the image processing apparatus 100 can identify all of the false transmission points included in the image 342.
Note that the image processing apparatus 100 may identify false transmission points using not only the elevation/depression angle information but also the azimuth angle information. Depending on the relationship between the installation positions of the LiDAR 150 and the camera 160, a false transmission point may arise not only when a laser passes above an object because of the difference in installation height (elevation), as described above, but also when a laser wraps around the side of an object and hits something beyond it.
In the example of FIG. 3, it is assumed that the image processing apparatus 100 has identified the points 320, 324, and 326 as false transmission points. In this case, the image processing apparatus 100 removes the points 320, 324, and 326 and generates an image by superimposing the remaining point cloud data on the image 342. As a result, the image processing apparatus 100 can obtain an image on which only point cloud data having accurate depth information corresponding to the objects in the image is superimposed.
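One possible way to apply a pairwise rule of this kind across an entire frame is sketched below; it reuses the false_transmission_candidate() function and the ProjectedPoint record from the earlier sketch. Grouping shots into azimuth bins and comparing neighbouring elevations within each bin are implementation assumptions for illustration, not requirements of the disclosure.

    from collections import defaultdict

    def remove_false_transmission_points(points, azimuth_bin=0.1):
        """Return only the points kept after the false-transmission check.

        points: iterable of ProjectedPoint already projected onto the image."""
        flagged = set()
        columns = defaultdict(list)
        for p in points:                                   # group shots sharing (almost) the same azimuth
            columns[round(p.azimuth / azimuth_bin)].append(p)
        for column in columns.values():
            column.sort(key=lambda p: p.elevation)         # scan each column from low to high elevation
            for a, b in zip(column, column[1:]):
                bad = false_transmission_candidate(a, b)
                if bad is not None:
                    flagged.add(bad)
        return [p for p in points if p.point_id not in flagged]

The surviving points can then be drawn onto the camera image to produce the superimposed image used in the subsequent detection or learning processing.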
An outline of the processing procedure for the image processing described above will now be given using FIG. 4. FIG. 4 is a block diagram for explaining the flow of image processing according to the present disclosure. FIG. 4 shows, as a schematic block diagram, an example of the procedure of automatic driving by the vehicle 1 including the image processing executed by the image processing device 100.
 図4に示すように、画像処理装置100は、カメラ画像550と、LiDARデータ552(点群データ)とを取得する。そして、画像処理装置100は、上述した画像処理により、LiDARデータ552をカメラ画像550に重畳したLiDARデータ重畳画像554を生成する。 As shown in FIG. 4, the image processing device 100 acquires a camera image 550 and LiDAR data 552 (point cloud data). Then, the image processing apparatus 100 generates a LiDAR data superimposed image 554 in which the LiDAR data 552 is superimposed on the camera image 550 by the image processing described above.
 その後、画像処理装置100は、カメラ画像550や、LiDARデータ重畳画像554に基づいて、周囲のオブジェクトや道路等を検出もしくは認識するための3Dセマンティックセグメンテーション(3D Semantic Segmentation)556等の処理を実行する。なお、検出手法は3Dセマンティックセグメンテーションに限られず、画像処理装置100は、他の公知技術を利用してもよい。そして、画像処理装置100は、獲得した周辺の情報を用いて、駐車区画への駐車処理や、目的地までの走行など、所定の自動運転処理(タスク実行558)を行う。 After that, the image processing device 100 executes processing such as 3D semantic segmentation (3D Semantic Segmentation) 556 for detecting or recognizing surrounding objects, roads, etc. based on the camera image 550 and the LiDAR data superimposed image 554. . Note that the detection technique is not limited to 3D semantic segmentation, and the image processing apparatus 100 may use other known techniques. Then, the image processing device 100 uses the acquired peripheral information to perform predetermined automatic driving processing (task execution 558) such as parking processing in a parking space and driving to a destination.
(1-2.実施形態に係る画像処理装置の構成例)
 次に、図5を用いて、画像処理装置100の構成について説明する。図5は、本開示の実施形態に係る画像処理装置100の構成例を示す図である。図5に示すように、画像処理装置100は、通信部110と、記憶部120と、制御部130と、検知部140とを有する。なお、図5に示した構成は機能的な構成であり、ハードウェア構成はこれとは異なっていてもよい。また、画像処理装置100の機能は、複数の物理的に分離された装置に分散して実装されてもよい。
(1-2. Configuration example of image processing apparatus according to embodiment)
Next, the configuration of the image processing apparatus 100 will be described using FIG. 5. FIG. 5 is a diagram showing a configuration example of the image processing device 100 according to the embodiment of the present disclosure. As shown in FIG. 5, the image processing apparatus 100 has a communication section 110, a storage section 120, a control section 130, and a detection section 140. Note that the configuration shown in FIG. 5 is a functional configuration, and the hardware configuration may differ from this. Also, the functions of the image processing apparatus 100 may be distributed and implemented in a plurality of physically separated apparatuses.
The communication unit 110 is implemented by, for example, a network interface controller (NIC, Network Interface Card) or the like. The communication unit 110 may be a USB interface configured by a USB (Universal Serial Bus) host controller, a USB port, or the like. The communication unit 110 may be a wired interface or a wireless interface; for example, it may be a wireless communication interface of a wireless LAN system or a cellular communication system. The communication unit 110 functions as a communication means or transmission means of the image processing apparatus 100. For example, the communication unit 110 is connected to a network N by wire or wirelessly, and transmits and receives information to and from other information processing terminals, such as an external device like a cloud server, via the network N. The network N is realized by, for example, a wireless communication standard or system such as Bluetooth (registered trademark), the Internet, Wi-Fi (registered trademark), UWB (Ultra Wide Band), LPWA (Low Power Wide Area), or ELTRES (registered trademark).
The storage unit 120 is implemented by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 120 stores various data. For example, the storage unit 120 stores the irradiation information from when the LiDAR 150 emitted a laser, image data captured by the camera 160, and the like. The storage unit 120 may also store a learning device (object detection model) trained for object detection, image data used for learning, and the like. The storage unit 120 may further store map data and the like for executing automatic driving. Although the present disclosure shows an example in which the storage unit 120 is mounted in the image processing device 100 (that is, in the vehicle 1), the data stored in the storage unit 120 may instead be stored on an external device such as a cloud server.
The detection unit 140 detects various types of information regarding the vehicle 1 and the image processing device 100. Specifically, the detection unit 140 detects the environment around the vehicle 1, information on the position of the vehicle 1, information on other devices connected to the image processing device 100 mounted on the vehicle 1, and the like. The detection unit 140 may be read as a sensor that detects various types of information.
 例えば、検知部140は、センサとして、LiDAR150やカメラ160を有する。LiDAR150は、車両1の周辺環境の三次元的な構造を読み取るセンサである。具体的には、LiDAR150は、赤外線レーザ等のレーザ光線を周囲の物体に照射し、反射して戻るまでの時間を計測することにより、物体までの距離や相対速度を検知する。 For example, the detection unit 140 has a LiDAR 150 and a camera 160 as sensors. The LiDAR 150 is a sensor that reads the three-dimensional structure of the surrounding environment of the vehicle 1 . Specifically, the LiDAR 150 irradiates a surrounding object with a laser beam such as an infrared laser and measures the time it takes for the laser beam to reflect and return, thereby detecting the distance to the object and the relative speed.
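The ranging principle described here reduces to a one-line computation; the echo delay below is an invented example value given purely for illustration.

    SPEED_OF_LIGHT = 299_792_458.0                     # m/s
    round_trip_time = 66.7e-9                          # seconds until the reflected laser returns (example)
    distance = SPEED_OF_LIGHT * round_trip_time / 2    # one-way distance, roughly 10 m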
 カメラ160は、車両1の周囲を撮像する機能を有するセンサである。カメラ160は、ステレオカメラや単眼カメラ、レンズレスカメラ等、どのような形態であってもよい。また、カメラ160は、RGBカメラのような可視光カメラに限らず、ToF(Time of Flight)センサを備える深度センサ付きカメラ等であってもよい。また、カメラ160は、物体の検出や認識処理が可能なAI付きイメージセンサを備えてもよい。 The camera 160 is a sensor that has a function of imaging the surroundings of the vehicle 1. The camera 160 may take any form, such as a stereo camera, a monocular camera, or a lensless camera. Also, the camera 160 is not limited to a visible light camera such as an RGB camera, and may be a camera with a depth sensor including a ToF (Time of Flight) sensor. The camera 160 may also include an AI-equipped image sensor capable of object detection and recognition processing.
The detection unit 140 may also have various sensors other than the LiDAR 150 and the camera 160. For example, the detection unit 140 may include a ranging system using a millimeter-wave radar. The detection unit 140 may include a depth sensor for acquiring depth data. The detection unit 140 may also include a sonar that probes the surrounding environment with sound waves. Further, the detection unit 140 may include a microphone that collects sounds around the vehicle 1, an illuminance sensor that detects the illuminance around the vehicle 1, a humidity sensor that detects the humidity around the vehicle 1, a geomagnetic sensor that detects the magnetic field at the position of the vehicle 1, and the like.
 また、図5での図示は省略するが、画像処理装置100は、各種情報を表示する表示部を備えてもよい。表示部は、各種情報を出力するための機構であり、例えば液晶ディスプレイ等である。例えば、表示部は、検知部140によって撮像された画像を表示したり、画像内で画像処理装置100が検出した物体を表示したりしてもよい。また、表示部は、画像処理装置100を利用するユーザ等から各種操作を受け付けるための処理部を兼ねてもよい。例えば、表示部は、キー操作やタッチパネル等を介して、各種情報の入力を受け付けてもよい。 Although not shown in FIG. 5, the image processing apparatus 100 may include a display unit that displays various information. The display unit is a mechanism for outputting various information, such as a liquid crystal display. For example, the display unit may display an image captured by the detection unit 140 or an object detected by the image processing device 100 in the image. Further, the display unit may also serve as a processing unit for receiving various operations from a user or the like who uses the image processing apparatus 100 . For example, the display unit may receive input of various information via key operations, a touch panel, or the like.
 制御部130は、例えば、CPU(Central Processing Unit)やMPU(Micro Processing Unit)等によって、画像処理装置100内部に記憶されたプログラム(例えば、本開示に係る画像処理プログラム)がRAM(Random Access Memory)等を作業領域として実行されることにより実現される。また、制御部130は、コントローラ(controller)であり、例えば、ASIC(Application Specific Integrated Circuit)やFPGA(Field Programmable Gate Array)等の集積回路により実現されてもよい。 The control unit 130 stores a program (for example, an image processing program according to the present disclosure) stored inside the image processing apparatus 100 by a CPU (Central Processing Unit) or MPU (Micro Processing Unit), for example, in a RAM (Random Access Memory). ) etc. as a work area. Also, the control unit 130 is a controller, and may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
 図5に示すように、制御部130は、取得部131と、撮像部132と、特定部133と、生成部134とを有し、以下に説明する情報処理の機能や作用を実現または実行する。なお、制御部130の内部構成は、図5に示した構成に限られず、後述する情報処理を行う構成であれば他の構成であってもよい。 As shown in FIG. 5, the control unit 130 includes an acquisition unit 131, an imaging unit 132, an identification unit 133, and a generation unit 134, and implements or executes the information processing functions and actions described below. . Note that the internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 5, and may be another configuration as long as it performs information processing described later.
 取得部131は、各種情報を取得する。例えば、取得部131は、レーザを用いたセンサであるLiDAR150から、周囲の対象物を検出したことを示す点群データを取得する。 The acquisition unit 131 acquires various types of information. For example, the acquisition unit 131 acquires point cloud data indicating detection of surrounding objects from the LiDAR 150, which is a sensor using a laser.
 また、取得部131は、点群データとともに、かかる点群データが得られた際のレーザの照射情報を取得する。例えば、取得部131は、照射情報として、レーザが照射された仰俯角情報および方位角情報を取得する。言い換えれば、取得部131は、照射情報として、レーザが照射された際の高さ方向を示す数値(elevation)および水平方向を示す数値(azimuth)を取得する。 In addition, the acquisition unit 131 acquires the point cloud data as well as the laser irradiation information when the point cloud data was obtained. For example, the acquisition unit 131 acquires elevation/depression angle information and azimuth angle information of laser irradiation as the irradiation information. In other words, the acquisition unit 131 acquires, as the irradiation information, a numerical value (elevation) indicating the height direction and a numerical value (azimuth) indicating the horizontal direction when the laser is irradiated.
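 By way of illustration only, the following is a minimal sketch of how a single LiDAR return carrying this irradiation information might be represented, assuming a simple range/elevation/azimuth record and a standard spherical-to-Cartesian conversion; the class and field names are not taken from the present disclosure.

```python
import math
from dataclasses import dataclass

@dataclass
class LidarPoint:
    """One LiDAR return with the irradiation information described above.

    range_m:   measured distance to the reflecting surface in meters
    elevation: vertical irradiation angle in radians (aimed higher = larger value)
    azimuth:   horizontal irradiation angle in radians
    """
    range_m: float
    elevation: float
    azimuth: float

    def to_xyz(self):
        """Convert the polar measurement to sensor-frame Cartesian coordinates."""
        x = self.range_m * math.cos(self.elevation) * math.cos(self.azimuth)
        y = self.range_m * math.cos(self.elevation) * math.sin(self.azimuth)
        z = self.range_m * math.sin(self.elevation)
        return (x, y, z)
```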
 The imaging unit 132 captures a two-dimensional image whose imaging range includes the laser irradiation range of the LiDAR 150. Specifically, the imaging unit 132 controls the camera 160 to capture an image of the surroundings of the vehicle 1, the imaging range of which includes the laser irradiation range of the LiDAR 150.
 When the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR 150, the identification unit 133 identifies, based on the irradiation information of the LiDAR 150, false transmission points, that is, points in the superimposed point cloud data that were not actually irradiated onto an object within the image. Specifically, the identification unit 133 identifies the false transmission points based on the irradiation information of the LiDAR 150 when the point cloud data is superimposed on the image captured by the imaging unit 132.
 As described with reference to FIGS. 1 to 3, the identification unit 133 identifies false transmission points based on the elevation/depression angle information and the azimuth angle information of the laser irradiation as the irradiation information.
 Specifically, the identification unit 133 first identifies two points in the point cloud data that have substantially the same azimuth angle information. The identification unit 133 then identifies a false transmission point by comparing the elevation/depression angle information corresponding to the two identified points with their vertical-axis coordinate values when the two points are projected onto the image.
 More specifically, among the point cloud data superimposed on the image, the identification unit 133 extracts two points that have substantially the same azimuth angle information in the image (in other words, the same horizontal-axis coordinate (x-axis coordinate) on the image). Of the two extracted points, if the point that should originally have the higher vertical-axis coordinate (y-axis coordinate), that is, the point with the larger elevation value, is drawn at a lower position than the other point, the identification unit 133 identifies that point as a false transmission point. If the identification unit 133 determines that the two points are drawn on the image without contradiction, it regards them as not being false transmission points and moves on to the process of comparing the next two points.
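 To make the comparison concrete, the following is a minimal sketch of the height-direction check, assuming that each projected point carries its elevation value and pixel coordinates and that the image v-axis grows downward (the usual pixel convention, whereas the description above phrases the same check with an upward-growing y-axis); the names ProjectedPoint, find_false_points_vertical and azimuth_tol are illustrative and not taken from the disclosure.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class ProjectedPoint:
    elevation: float  # vertical irradiation angle (larger = aimed higher)
    azimuth: float    # horizontal irradiation angle
    u: float          # image x-coordinate in pixels
    v: float          # image y-coordinate in pixels (grows downward)

def find_false_points_vertical(points, azimuth_tol=1e-3):
    """Flag points whose projected height contradicts their elevation angle.

    For every pair with (nearly) the same azimuth, the point aimed higher
    (larger elevation) should appear higher in the image, i.e. with a smaller
    v value under the pixel convention assumed here. If it is drawn lower
    than the other point, it is flagged as a false transmission point.
    Returns the set of indices of flagged points.
    """
    flagged = set()
    for i, j in combinations(range(len(points)), 2):
        a, b = points[i], points[j]
        if abs(a.azimuth - b.azimuth) > azimuth_tol:
            continue  # compare only points on (nearly) the same vertical line
        (hi_i, hi), lo = ((i, a), b) if a.elevation > b.elevation else ((j, b), a)
        if hi.v > lo.v:  # drawn below the lower-aimed point: contradiction
            flagged.add(hi_i)
    return flagged
```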
 The identification unit 133 may identify false transmission points not only based on the elevation/depression angle information, that is, a contradiction in the height direction, but also based on a contradiction in the horizontal direction. Such a situation can occur, for example, when the LiDAR 150 is installed at an edge of the vehicle 1 and the laser irradiation extends beyond the horizontal imaging range of the camera 160, so that the laser hits an object in the background that is not included in the image rather than a foreground object that is included in the captured image.
 In this case, the identification unit 133 identifies two points in the point cloud data that have substantially the same elevation/depression angle information. The identification unit 133 then identifies a false transmission point by comparing the azimuth angle information corresponding to the two points with their horizontal-axis coordinate values when the two points are projected onto the two-dimensional image.
 More specifically, among the point cloud data superimposed on the image, the identification unit 133 extracts two points that have substantially the same elevation/depression angle information in the image (in other words, the same vertical-axis coordinate (y-axis coordinate) on the image). Of the two extracted points, if the point that should originally have the higher (or lower) horizontal-axis coordinate (x-axis coordinate), that is, the point with the larger (or smaller) azimuth value, is drawn at a contradictory position to the left or right of the other point, the identification unit 133 identifies that point as a false transmission point. If the identification unit 133 determines that the two points are drawn on the image without contradiction, it regards them as not being false transmission points and moves on to the process of comparing the next two points.
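 The horizontal check can be sketched in the same style, operating on the same ProjectedPoint records as the previous sketch. Which side of the image a larger azimuth should map to depends on how the LiDAR and camera are mounted, so the expected direction is passed in as an explicit assumption; the names are again illustrative.

```python
from itertools import combinations

def find_false_points_horizontal(points, elevation_tol=1e-3, azimuth_grows_rightward=True):
    """Flag points whose left/right position contradicts their azimuth ordering.

    For every pair with (nearly) the same elevation, the point with the larger
    azimuth should lie on a known side of the other point in the image; if it
    is drawn on the opposite side, it is flagged as a false transmission point.
    Returns the set of indices of flagged points.
    """
    flagged = set()
    for i, j in combinations(range(len(points)), 2):
        a, b = points[i], points[j]
        if abs(a.elevation - b.elevation) > elevation_tol:
            continue  # compare only points on (nearly) the same horizontal line
        (big_i, big), small = ((i, a), b) if a.azimuth > b.azimuth else ((j, b), a)
        drawn_right_of_other = big.u > small.u
        if drawn_right_of_other != azimuth_grows_rightward:  # contradictory side
            flagged.add(big_i)
    return flagged
```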
 The generation unit 134 removes the false transmission points identified by the identification unit 133 and generates an image on which the point cloud data excluding the removed false transmission points is superimposed.
 Taking FIG. 2 as an example, when the point 350 and three other points having substantially the same elevation/depression angle information are identified as false transmission points, the generation unit 134 removes these four points from the original image and generates an image on which the remaining point cloud data is superimposed. This allows the generation unit 134 to generate an image on which only the point cloud data actually irradiated onto objects in the image is superimposed, thereby providing an image that does not interfere with subsequent object detection processing or learning processing.
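 A minimal sketch of regenerating the overlay after removal might look as follows, assuming OpenCV is available for drawing and reusing the ProjectedPoint records and flagged index sets from the sketches above; the drawing style (green dots) is an arbitrary illustration, not part of the disclosure.

```python
import cv2  # OpenCV is assumed here purely for drawing the overlay

def draw_overlay(image, projected_points, false_indices, radius=2):
    """Redraw the superimposed image using only the surviving points.

    image            : H x W x 3 BGR image as a numpy array
    projected_points : sequence of ProjectedPoint records from the sketches above
    false_indices    : set of indices flagged by the identification step
    """
    out = image.copy()
    for idx, p in enumerate(projected_points):
        if idx in false_indices:
            continue  # drop points judged not to lie on an object in the image
        cv2.circle(out, (int(round(p.u)), int(round(p.v))), radius, (0, 255, 0), -1)
    return out
```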
(1-3. Image processing procedure according to the embodiment)
 The procedure of the image processing described above will be described with reference to FIG. 6. FIG. 6 is a flowchart showing the flow of processing according to the embodiment.
 As shown in FIG. 6, the image processing device 100 acquires point cloud data from the LiDAR 150 (step S31). The image processing device 100 also uses the camera 160 to capture an image of a range including the irradiation range of the LiDAR 150 (step S32).
 Subsequently, the image processing device 100 superimposes the acquired point cloud data on the captured image (step S33). The image processing device 100 then extracts two points to be processed from the superimposed point cloud data (step S34).
 The image processing device 100 executes the identification processing described above on the relationship between the two extracted points and determines whether there is a point that contradicts the irradiation information at the time the LiDAR 150 emitted the laser (step S35). If a contradicting point exists (step S35; Yes), the image processing device 100 deletes the contradicting point (step S36).
 On the other hand, if no contradicting point exists (step S35; No), the image processing device 100 determines whether all of the point cloud data has been processed at that time (step S37). If point cloud data to be processed remains (step S37; No), the image processing device 100 extracts the next two points and repeats the process of identifying false transmission points.
 When all of the point cloud data has been processed (step S37; Yes), the image processing device 100 generates an image on which the remaining point cloud data, after the false transmission points have been deleted, is superimposed (step S38).
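 As a sketch only, the flow of FIG. 6 can be strung together as follows; lidar.read(), camera.capture() and project_to_image() are hypothetical stand-ins for steps S31 to S33, and the contradiction checks and overlay drawing are the sketches shown earlier in this description.

```python
def process_frame(lidar, camera, project_to_image):
    """One pass through the flow of FIG. 6, written as a sketch."""
    points = lidar.read()                         # S31: acquire point cloud data
    image = camera.capture()                      # S32: capture an image covering the laser range
    projected = project_to_image(points, image)   # S33: superimpose the points on the image

    # S34 to S37: examine pairs of points and collect those that contradict
    # the irradiation information
    flagged = find_false_points_vertical(projected)
    flagged |= find_false_points_horizontal(projected)

    # S38: regenerate the overlay from the remaining point cloud data only
    return draw_overlay(image, projected, flagged)
```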
(1-4. Modifications of the embodiment)
 In the embodiment described above, an example was shown in which the vehicle 1, an autonomous vehicle equipped with the LiDAR 150 and the camera 160, executes the image processing according to the embodiment. However, the image processing according to the embodiment is not limited to autonomous vehicles (so-called automobiles) and may be executed by various mobile bodies.
 For example, the mobile body that executes the image processing according to the embodiment may be a small vehicle such as a motorcycle or a motorized tricycle, a large vehicle such as a bus or a truck, or an autonomous mobile body such as a robot or a drone. The image processing device 100 does not necessarily have to be integrated with a mobile body such as the vehicle 1, and may be, for example, a cloud server that acquires information from the mobile body via a network and performs image processing based on the acquired information.
(2. Other embodiments)
 The processing according to each of the embodiments described above may be implemented in various forms other than the embodiments described above.
(2-1. Configuration of the mobile body)
 For example, the image processing device 100 may be realized by an autonomous mobile body (automobile) that performs automated driving. In this case, the vehicle 1 and the image processing device 100 may have the configurations shown in FIGS. 7 and 8 in addition to the configuration shown in FIG. 5. Each of the units described below may be included in, for example, the corresponding units shown in FIG. 5.
 That is, the image processing device 100 of the present technology can also be configured as part of the vehicle control system 11 described below. FIG. 7 is a block diagram showing an example of a schematic functional configuration of the vehicle control system 11 to which the present technology can be applied.
 The vehicle control system 11 is provided in the vehicle 1 and performs processing related to driving support and automated driving of the vehicle 1.
 The vehicle control system 11 includes a vehicle control ECU (Electronic Control Unit) 21, a communication unit 22, a map information storage unit 23, a GNSS (Global Navigation Satellite System) receiving unit 24, an external recognition sensor 25, an in-vehicle sensor 26, a vehicle sensor 27, a recording unit 28, a driving support/automated driving control unit 29, a DMS (Driver Monitoring System) 30, an HMI (Human Machine Interface) 31, and a vehicle control unit 32.
 The vehicle control ECU 21, the communication unit 22, the map information storage unit 23, the GNSS receiving unit 24, the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, the recording unit 28, the driving support/automated driving control unit 29, the DMS 30, the HMI 31, and the vehicle control unit 32 are communicably connected to each other via a communication network 41. The communication network 41 is configured by an in-vehicle communication network, a bus, or the like that conforms to a digital two-way communication standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), FlexRay (registered trademark), or Ethernet (registered trademark). The communication network 41 may be used selectively depending on the type of data to be communicated; for example, CAN is applied to data related to vehicle control, and Ethernet is applied to large-volume data. Each unit of the vehicle control system 11 may also be connected directly, without going through the communication network 41, using wireless communication intended for relatively short-range communication, such as NFC (Near Field Communication) or Bluetooth (registered trademark).
 Hereinafter, when the units of the vehicle control system 11 communicate via the communication network 41, the description of the communication network 41 will be omitted. For example, when the vehicle control ECU 21 and the communication unit 22 communicate via the communication network 41, it will simply be described that the vehicle control ECU 21 and the communication unit 22 communicate.
 The vehicle control ECU 21 is configured by various processors such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit). The vehicle control ECU 21 controls all or part of the functions of the vehicle control system 11.
 The communication unit 22 communicates with various devices inside and outside the vehicle, other vehicles, servers, base stations, and the like, and transmits and receives various types of data. At this time, the communication unit 22 can perform communication using a plurality of communication methods.
 The communication that the communication unit 22 can perform with the outside of the vehicle will be described schematically. The communication unit 22 communicates with a server on an external network (hereinafter referred to as an external server) or the like via a base station or an access point using a wireless communication method such as 5G (fifth-generation mobile communication system), LTE (Long Term Evolution), or DSRC (Dedicated Short Range Communications). The external network with which the communication unit 22 communicates is, for example, the Internet, a cloud network, or an operator-specific network. The communication method used by the communication unit 22 for the external network is not particularly limited as long as it is a wireless communication method capable of digital two-way communication at a communication speed equal to or higher than a predetermined value and over a distance equal to or longer than a predetermined value.
 For example, the communication unit 22 can communicate with a terminal present in the vicinity of the own vehicle using P2P (Peer To Peer) technology. Terminals present in the vicinity of the own vehicle include, for example, terminals worn by mobile bodies that move at relatively low speeds, such as pedestrians and bicycles, terminals installed at fixed positions in stores and the like, and MTC (Machine Type Communication) terminals. Furthermore, the communication unit 22 can also perform V2X communication. V2X communication refers to communication between the own vehicle and others, such as vehicle-to-vehicle communication with other vehicles, vehicle-to-infrastructure communication with roadside units and the like, vehicle-to-home communication, and vehicle-to-pedestrian communication with terminals carried by pedestrians.
 The communication unit 22 can receive, for example, a program for updating the software that controls the operation of the vehicle control system 11 from the outside (Over The Air). The communication unit 22 can further receive map information, traffic information, information on the surroundings of the vehicle 1, and the like from the outside. For example, the communication unit 22 can also transmit information on the vehicle 1, information on the surroundings of the vehicle 1, and the like to the outside. The information on the vehicle 1 that the communication unit 22 transmits to the outside includes, for example, data indicating the state of the vehicle 1 and recognition results from the recognition unit 73. Furthermore, for example, the communication unit 22 performs communication corresponding to a vehicle emergency call system such as eCall.
 The communication that the communication unit 22 can perform with the inside of the vehicle will be described schematically. The communication unit 22 can communicate with each device in the vehicle using, for example, wireless communication. The communication unit 22 can perform wireless communication with devices in the vehicle using a communication method capable of digital two-way communication at a communication speed equal to or higher than a predetermined value, such as wireless LAN, Bluetooth, NFC, or WUSB (Wireless USB). The communication unit 22 is not limited to this and can also communicate with each device in the vehicle using wired communication. For example, the communication unit 22 can communicate with each device in the vehicle by wired communication via a cable connected to a connection terminal (not shown), using a communication method capable of digital two-way communication at a communication speed equal to or higher than a predetermined value, such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface) (registered trademark), or MHL (Mobile High-definition Link).
 Here, a device in the vehicle refers to, for example, a device in the vehicle that is not connected to the communication network 41. Examples of devices in the vehicle include mobile devices and wearable devices carried by passengers such as the driver, and information devices that are brought into the vehicle and temporarily installed.
 For example, the communication unit 22 receives electromagnetic waves transmitted by a road traffic information communication system (VICS (Vehicle Information and Communication System) (registered trademark)) such as radio beacons, optical beacons, and FM multiplex broadcasting.
 The map information storage unit 23 stores one or both of a map acquired from the outside and a map created by the vehicle 1. For example, the map information storage unit 23 stores a three-dimensional high-precision map, a global map that has lower precision than the high-precision map but covers a wider area, and the like.
 High-precision maps include, for example, dynamic maps, point cloud maps, and vector maps. A dynamic map is, for example, a map consisting of four layers of dynamic information, quasi-dynamic information, quasi-static information, and static information, and is provided to the vehicle 1 from an external server or the like. A point cloud map is a map composed of a point cloud (point cloud data). Here, a vector map refers to a map adapted to ADAS (Advanced Driver Assistance System) in which traffic information such as the positions of lanes and traffic signals is associated with a point cloud map.
 The point cloud map and the vector map may be provided, for example, from an external server or the like, or may be created by the vehicle 1 as maps for matching with a local map, described later, based on sensing results from the radar 52, the LiDAR 53, and the like, and stored in the map information storage unit 23. When a high-precision map is provided from an external server or the like, map data of, for example, a several-hundred-meter square region concerning the planned route on which the vehicle 1 will travel is acquired from the external server or the like in order to reduce the communication volume.
 The GNSS receiving unit 24 receives GNSS signals from GNSS satellites and acquires position information of the vehicle 1. The received GNSS signals are supplied to the driving support/automated driving control unit 29. The GNSS receiving unit 24 is not limited to a method using GNSS signals and may acquire position information using, for example, a beacon.
 The external recognition sensor 25 includes various sensors used for recognizing the situation outside the vehicle 1 and supplies sensor data from each sensor to each unit of the vehicle control system 11. The types and number of sensors included in the external recognition sensor 25 are arbitrary.
 For example, the external recognition sensor 25 includes a camera 51, a radar 52, a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) 53, and an ultrasonic sensor 54. The external recognition sensor 25 is not limited to this and may be configured to include one or more types of sensors among the camera 51, the radar 52, the LiDAR 53, and the ultrasonic sensor 54. The numbers of cameras 51, radars 52, LiDARs 53, and ultrasonic sensors 54 are not particularly limited as long as they can realistically be installed in the vehicle 1. The types of sensors included in the external recognition sensor 25 are not limited to this example, and the external recognition sensor 25 may include other types of sensors. Examples of the sensing areas of the sensors included in the external recognition sensor 25 will be described later.
 The imaging method of the camera 51 is not particularly limited as long as it allows distance measurement. For example, cameras of various imaging methods, such as a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, or an infrared camera, can be applied as the camera 51 as necessary. The camera 51 is not limited to this and may simply be for acquiring captured images regardless of distance measurement.
 For example, the external recognition sensor 25 can include an environment sensor for detecting the environment with respect to the vehicle 1. The environment sensor is a sensor for detecting the environment, such as weather, meteorological conditions, and brightness, and can include various sensors such as a raindrop sensor, a fog sensor, a sunshine sensor, a snow sensor, and an illuminance sensor.
 Furthermore, for example, the external recognition sensor 25 includes a microphone used for detecting sounds around the vehicle 1 and the position of a sound source.
 The in-vehicle sensor 26 includes various sensors for detecting information inside the vehicle and supplies sensor data from each sensor to each unit of the vehicle control system 11. The types and number of the various sensors included in the in-vehicle sensor 26 are not particularly limited as long as they can realistically be installed in the vehicle 1.
 For example, the in-vehicle sensor 26 can include one or more types of sensors among a camera, a radar, a seating sensor, a steering wheel sensor, a microphone, and a biometric sensor. As the camera included in the in-vehicle sensor 26, cameras of various imaging methods capable of distance measurement, such as a ToF camera, a stereo camera, a monocular camera, or an infrared camera, can be used. The camera included in the in-vehicle sensor 26 is not limited to this and may simply be for acquiring captured images regardless of distance measurement. The biometric sensor included in the in-vehicle sensor 26 is provided, for example, in a seat, the steering wheel, or the like, and detects various types of biometric information of a passenger such as the driver.
 The vehicle sensor 27 includes various sensors for detecting the state of the vehicle 1 and supplies sensor data from each sensor to each unit of the vehicle control system 11. The types and number of the various sensors included in the vehicle sensor 27 are not particularly limited as long as they can realistically be installed in the vehicle 1.
 For example, the vehicle sensor 27 includes a speed sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU) that integrates them. For example, the vehicle sensor 27 includes a steering angle sensor that detects the steering angle of the steering wheel, a yaw rate sensor, an accelerator sensor that detects the amount of operation of the accelerator pedal, and a brake sensor that detects the amount of operation of the brake pedal. For example, the vehicle sensor 27 includes a rotation sensor that detects the rotational speed of the engine or motor, an air pressure sensor that detects tire air pressure, a slip rate sensor that detects the tire slip rate, and a wheel speed sensor that detects the rotational speed of the wheels. For example, the vehicle sensor 27 includes a battery sensor that detects the remaining charge and temperature of the battery, and an impact sensor that detects external impacts.
 The recording unit 28 includes at least one of a non-volatile storage medium and a volatile storage medium, and stores data and programs. The recording unit 28 is used, for example, as an EEPROM (Electrically Erasable Programmable Read Only Memory) and a RAM (Random Access Memory), and as the storage medium, a magnetic storage device such as an HDD (Hard Disc Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device can be applied. The recording unit 28 records various programs and data used by each unit of the vehicle control system 11. For example, the recording unit 28 includes an EDR (Event Data Recorder) and a DSSAD (Data Storage System for Automated Driving), and records information on the vehicle 1 before and after an event such as an accident and biometric information acquired by the in-vehicle sensor 26.
 The driving support/automated driving control unit 29 controls driving support and automated driving of the vehicle 1. For example, the driving support/automated driving control unit 29 includes an analysis unit 61, an action planning unit 62, and an operation control unit 63.
 The analysis unit 61 performs analysis processing of the vehicle 1 and its surroundings. The analysis unit 61 includes a self-position estimation unit 71, a sensor fusion unit 72, and a recognition unit 73.
 The self-position estimation unit 71 estimates the self-position of the vehicle 1 based on sensor data from the external recognition sensor 25 and the high-precision map stored in the map information storage unit 23. For example, the self-position estimation unit 71 generates a local map based on sensor data from the external recognition sensor 25 and estimates the self-position of the vehicle 1 by matching the local map against the high-precision map. The position of the vehicle 1 is referenced, for example, to the center of the rear-wheel axle.
 The local map is, for example, a three-dimensional high-precision map or an occupancy grid map created using a technique such as SLAM (Simultaneous Localization and Mapping). The three-dimensional high-precision map is, for example, the point cloud map described above. The occupancy grid map is a map in which the three-dimensional or two-dimensional space around the vehicle 1 is divided into grid cells of a predetermined size and the occupancy state of objects is indicated per grid cell. The occupancy state of an object is indicated, for example, by the presence or absence of the object or its existence probability. The local map is also used, for example, in the detection processing and recognition processing of the situation outside the vehicle 1 by the recognition unit 73.
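 A two-dimensional occupancy grid of the kind described above can be sketched as follows; the cell size, the mapped extent, and the simple per-cell hit count used as the occupancy measure are assumptions made for illustration only.

```python
import numpy as np

def build_occupancy_grid(points_xy, cell_size=0.2, extent=50.0):
    """Build a simple 2D occupancy grid centered on the vehicle.

    points_xy : iterable of (x, y) obstacle positions in meters, vehicle frame
    cell_size : edge length of one grid cell in meters
    extent    : half-width of the mapped square area in meters
    Returns a 2D array where a nonzero value marks an occupied cell.
    """
    n = int(2 * extent / cell_size)
    grid = np.zeros((n, n), dtype=np.uint16)
    for x, y in points_xy:
        if abs(x) >= extent or abs(y) >= extent:
            continue  # outside the mapped area
        col = int((x + extent) / cell_size)
        row = int((y + extent) / cell_size)
        grid[row, col] += 1  # count returns per cell as a crude occupancy measure
    return grid
```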
 The self-position estimation unit 71 may estimate the self-position of the vehicle 1 based on GNSS signals and sensor data from the vehicle sensor 27.
 The sensor fusion unit 72 performs sensor fusion processing that combines a plurality of different types of sensor data (for example, image data supplied from the camera 51 and sensor data supplied from the radar 52) to obtain new information. Methods for combining different types of sensor data include integration, fusion, and association.
 The recognition unit 73 executes detection processing for detecting the situation outside the vehicle 1 and recognition processing for recognizing the situation outside the vehicle 1.
 For example, the recognition unit 73 performs detection processing and recognition processing of the situation outside the vehicle 1 based on information from the external recognition sensor 25, information from the self-position estimation unit 71, information from the sensor fusion unit 72, and the like.
 Specifically, for example, the recognition unit 73 performs detection processing and recognition processing of objects around the vehicle 1. Object detection processing is, for example, processing for detecting the presence or absence, size, shape, position, movement, and the like of an object. Object recognition processing is, for example, processing for recognizing attributes such as the type of an object or identifying a specific object. However, detection processing and recognition processing are not always clearly separated and may overlap.
 For example, the recognition unit 73 detects objects around the vehicle 1 by performing clustering that classifies the point cloud based on sensor data from the LiDAR 53, the radar 52, or the like into clusters of points. As a result, the presence or absence, size, shape, and position of objects around the vehicle 1 are detected.
 For example, the recognition unit 73 detects the movement of objects around the vehicle 1 by performing tracking that follows the movement of the clusters of points classified by the clustering. As a result, the speed and traveling direction (movement vector) of objects around the vehicle 1 are detected.
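 As an illustration of this clustering and tracking, the following sketch uses DBSCAN from scikit-learn as one common choice of Euclidean clustering and a finite-difference velocity as a rudimentary tracking step; the disclosure does not prescribe these particular algorithms, and the parameter values are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN  # one common clustering choice, assumed here

def cluster_point_cloud(points_xyz, eps=0.5, min_samples=5):
    """Group LiDAR returns into object candidates by Euclidean clustering.

    points_xyz : (N, 3) array of points in the vehicle frame
    Returns a list of (centroid, point_indices) per detected cluster;
    the DBSCAN noise label (-1) is discarded.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)
    clusters = []
    for label in set(labels):
        if label == -1:
            continue
        idx = np.where(labels == label)[0]
        clusters.append((points_xyz[idx].mean(axis=0), idx))
    return clusters

def estimate_motion(prev_centroid, curr_centroid, dt):
    """Rudimentary tracking step: finite-difference velocity of a matched cluster."""
    velocity = (np.asarray(curr_centroid) - np.asarray(prev_centroid)) / dt
    return velocity  # speed is its norm, traveling direction its unit vector
```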
 For example, the recognition unit 73 detects or recognizes vehicles, people, bicycles, obstacles, structures, roads, traffic lights, traffic signs, road markings, and the like from the image data supplied from the camera 51. The recognition unit 73 may also recognize the types of objects around the vehicle 1 by performing recognition processing such as semantic segmentation.
 For example, the recognition unit 73 can perform recognition processing of the traffic rules around the vehicle 1 based on the map stored in the map information storage unit 23, the result of self-position estimation by the self-position estimation unit 71, and the result of recognition of objects around the vehicle 1 by the recognition unit 73. Through this processing, the recognition unit 73 can recognize the positions and states of traffic signals, the contents of traffic signs and road markings, the contents of traffic regulations, the lanes in which the vehicle can travel, and the like.
 For example, the recognition unit 73 can perform recognition processing of the environment around the vehicle 1. The surrounding environment to be recognized by the recognition unit 73 includes, for example, weather, temperature, humidity, brightness, and road surface conditions.
 The action planning unit 62 creates an action plan for the vehicle 1. For example, the action planning unit 62 creates the action plan by performing route planning and route following processing.
 Route planning (global path planning) is the process of planning a rough route from a start point to a goal. This route planning also includes processing called trajectory planning, which generates a trajectory (local path planning) along the planned route that allows the vehicle 1 to proceed safely and smoothly in its vicinity, taking the motion characteristics of the vehicle 1 into account. Route planning may be distinguished as long-term route planning, and trajectory generation as short-term route planning or local route planning. A safety-priority route represents a concept similar to trajectory generation, short-term route planning, or local route planning.
 Route following is the process of planning operations for traveling safely and accurately along the route planned by the route planning within the planned time. The action planning unit 62 can, for example, calculate the target speed and target angular velocity of the vehicle 1 based on the result of this route following processing.
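 Purely as an illustration (the disclosure does not specify the computation), a target speed and target angular velocity toward the next waypoint of the planned route could be derived roughly as follows; the gain, cruise speed, and function names are assumptions.

```python
import math

def follow_route_step(pose_xy_theta, waypoint_xy, cruise_speed=5.0, k_heading=1.5):
    """Compute a target speed and target angular velocity toward the next waypoint.

    pose_xy_theta : (x, y, heading) of the vehicle in a map frame (m, m, rad)
    waypoint_xy   : (x, y) of the next point on the planned route
    Returns (target_speed, target_angular_velocity).
    """
    x, y, theta = pose_xy_theta
    dx, dy = waypoint_xy[0] - x, waypoint_xy[1] - y
    heading_error = math.atan2(dy, dx) - theta
    # wrap the error into (-pi, pi] so the vehicle turns the shorter way around
    heading_error = math.atan2(math.sin(heading_error), math.cos(heading_error))
    target_speed = cruise_speed * max(0.0, math.cos(heading_error))  # slow down when badly misaligned
    target_yaw_rate = k_heading * heading_error
    return target_speed, target_yaw_rate
```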
 The operation control unit 63 controls the operation of the vehicle 1 in order to realize the action plan created by the action planning unit 62.
 For example, the operation control unit 63 controls a steering control unit 81, a brake control unit 82, and a drive control unit 83 included in the vehicle control unit 32, described later, to perform acceleration/deceleration control and direction control so that the vehicle 1 proceeds along the trajectory calculated by the trajectory planning. For example, the operation control unit 63 performs cooperative control aimed at realizing ADAS functions such as collision avoidance or impact mitigation, follow-up driving, vehicle speed maintenance driving, collision warning for the own vehicle, and lane departure warning for the own vehicle. For example, the operation control unit 63 performs cooperative control aimed at automated driving in which the vehicle travels autonomously without depending on the driver's operation.
 The DMS 30 performs driver authentication processing, driver state recognition processing, and the like based on sensor data from the in-vehicle sensor 26, input data input to the HMI 31 described later, and the like. In this case, the driver states to be recognized by the DMS 30 include, for example, physical condition, wakefulness, concentration, fatigue, gaze direction, degree of intoxication, driving operation, and posture.
 The DMS 30 may perform authentication processing of passengers other than the driver and recognition processing of the states of those passengers. For example, the DMS 30 may perform recognition processing of the situation inside the vehicle based on sensor data from the in-vehicle sensor 26. The situations inside the vehicle to be recognized include, for example, temperature, humidity, brightness, and odor.
 The HMI 31 receives input of various types of data, instructions, and the like, and presents various types of data to the driver and others.
 The input of data by the HMI 31 will be described schematically. The HMI 31 includes an input device for a person to input data. The HMI 31 generates input signals based on data, instructions, and the like input via the input device and supplies them to each unit of the vehicle control system 11. The HMI 31 includes operators such as a touch panel, buttons, switches, and levers as the input device. The HMI 31 is not limited to this and may further include an input device through which information can be input by a method other than manual operation, such as by voice or gesture. Furthermore, the HMI 31 may use, as the input device, a remote control device using infrared rays or radio waves, or an externally connected device such as a mobile device or a wearable device that supports operation of the vehicle control system 11.
 The presentation of data by the HMI 31 will be described schematically. The HMI 31 generates visual information, auditory information, and tactile information for the passengers or for the outside of the vehicle. The HMI 31 also performs output control for controlling the output, output content, output timing, output method, and the like of each piece of generated information. As the visual information, the HMI 31 generates and outputs information indicated by images or light, such as an operation screen, a status display of the vehicle 1, a warning display, and a monitor image showing the situation around the vehicle 1. As the auditory information, the HMI 31 generates and outputs information indicated by sounds, such as voice guidance, warning sounds, and warning messages. Furthermore, as the tactile information, the HMI 31 generates and outputs information imparted to the passengers' sense of touch by, for example, force, vibration, or motion.
 As an output device with which the HMI 31 outputs the visual information, for example, a display device that presents visual information by displaying an image itself or a projector device that presents visual information by projecting an image can be applied. In addition to a display device having an ordinary display, the display device may be a device that displays visual information within the passenger's field of view, such as a head-up display, a transmissive display, or a wearable device having an AR (Augmented Reality) function. The HMI 31 can also use, as output devices for outputting visual information, display devices included in a navigation device, an instrument panel, a CMS (Camera Monitoring System), an electronic mirror, lamps, and the like provided in the vehicle 1.
 As an output device with which the HMI 31 outputs the auditory information, for example, an audio speaker, headphones, or earphones can be applied.
 As an output device with which the HMI 31 outputs the tactile information, for example, a haptic element using haptics technology can be applied. The haptic element is provided in a part of the vehicle 1 that a passenger touches, such as the steering wheel or a seat.
 The vehicle control unit 32 controls each unit of the vehicle 1. The vehicle control unit 32 includes a steering control unit 81, a brake control unit 82, a drive control unit 83, a body system control unit 84, a light control unit 85, and a horn control unit 86.
 The steering control unit 81 performs detection, control, and the like of the state of the steering system of the vehicle 1. The steering system includes, for example, a steering mechanism including a steering wheel and the like, and electric power steering. The steering control unit 81 includes, for example, a control unit such as an ECU that controls the steering system and an actuator that drives the steering system.
 The brake control unit 82 performs detection, control, and the like of the state of the brake system of the vehicle 1. The brake system includes, for example, a brake mechanism including a brake pedal and the like, an ABS (Antilock Brake System), and a regenerative brake mechanism. The brake control unit 82 includes, for example, a control unit such as an ECU that controls the brake system.
 The drive control unit 83 performs detection, control, and the like of the state of the drive system of the vehicle 1. The drive system includes, for example, an accelerator pedal, a driving force generation device for generating driving force, such as an internal combustion engine or a drive motor, and a driving force transmission mechanism for transmitting the driving force to the wheels. The drive control unit 83 includes, for example, a control unit such as an ECU that controls the drive system.
 The body system control unit 84 performs detection, control, and the like of the state of the body system of the vehicle 1. The body system includes, for example, a keyless entry system, a smart key system, power window devices, power seats, an air conditioner, airbags, seat belts, and a shift lever. The body system control unit 84 includes, for example, a control unit such as an ECU that controls the body system.
 The light control unit 85 performs detection, control, and the like of the states of various lights of the vehicle 1. The lights to be controlled include, for example, headlights, backlights, fog lights, turn signals, brake lights, projections, and bumper displays. The light control unit 85 includes a control unit such as an ECU that controls the lights.
 The horn control unit 86 performs detection, control, and the like of the state of the car horn of the vehicle 1. The horn control unit 86 includes, for example, a control unit such as an ECU that controls the car horn.
 When the image processing device 100 is configured as part of the vehicle control system 11, for example, the control unit 130 shown in FIG. 5 corresponds to the vehicle control ECU 21 and the like, and the detection unit 140 shown in FIG. 5 corresponds to the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, and the like.
 FIG. 8 is a diagram showing examples of the sensing areas of the camera 51, the radar 52, the LiDAR 53, the ultrasonic sensor 54, and the like of the external recognition sensor 25 of FIG. 7. FIG. 8 schematically shows the vehicle 1 viewed from above, with the left end being the front end of the vehicle 1 and the right end being the rear end of the vehicle 1.
 The sensing area 101F and the sensing area 101B show examples of sensing areas of the ultrasonic sensor 54. The sensing area 101F covers the area around the front end of the vehicle 1 with a plurality of ultrasonic sensors 54. The sensing area 101B covers the area around the rear end of the vehicle 1 with a plurality of ultrasonic sensors 54.
 The sensing results in the sensing area 101F and the sensing area 101B are used, for example, for parking assistance of the vehicle 1.
 The sensing areas 102F to 102B show examples of sensing areas of the radar 52 for short or medium range. The sensing area 102F covers, in front of the vehicle 1, positions farther than the sensing area 101F. The sensing area 102B covers, behind the vehicle 1, positions farther than the sensing area 101B. The sensing area 102L covers the area around the rear of the left side of the vehicle 1. The sensing area 102R covers the area around the rear of the right side of the vehicle 1.
 The sensing result in the sensing area 102F is used, for example, to detect vehicles, pedestrians, and the like present in front of the vehicle 1. The sensing result in the sensing area 102B is used, for example, for a rear collision prevention function of the vehicle 1. The sensing results in the sensing area 102L and the sensing area 102R are used, for example, to detect objects in blind spots on the sides of the vehicle 1.
 The sensing areas 103F to 103B show examples of sensing areas of the camera 51. The sensing area 103F covers, in front of the vehicle 1, positions farther than the sensing area 102F. The sensing area 103B covers, behind the vehicle 1, positions farther than the sensing area 102B. The sensing area 103L covers the area around the left side of the vehicle 1. The sensing area 103R covers the area around the right side of the vehicle 1.
 The sensing results in the sensing area 103F can be used, for example, for recognition of traffic signals and traffic signs, a lane departure prevention support system, and an automatic headlight control system. The sensing result in the sensing area 103B can be used, for example, for parking assistance and a surround view system. The sensing results in the sensing area 103L and the sensing area 103R can be used, for example, for a surround view system.
 センシング領域104は、LiDAR53のセンシング領域の例を示している。センシング領域104は、車両1の前方において、センシング領域103Fより遠い位置までカバーしている。一方、センシング領域104は、センシング領域103Fより左右方向の範囲が狭くなっている。 The sensing area 104 shows an example of the sensing area of the LiDAR 53. The sensing area 104 covers the front of the vehicle 1 to a position farther than the sensing area 103F. On the other hand, the sensing area 104 has a narrower lateral range than the sensing area 103F.
 センシング領域104におけるセンシング結果は、例えば、周辺車両等の物体検出に用いられる。 The sensing results in the sensing area 104 are used, for example, to detect objects such as surrounding vehicles.
 センシング領域105は、長距離用のレーダ52のセンシング領域の例を示している。センシング領域105は、車両1の前方において、センシング領域104より遠い位置までカバーしている。一方、センシング領域105は、センシング領域104より左右方向の範囲が狭くなっている。 The sensing area 105 shows an example of the sensing area of the long-range radar 52. The sensing area 105 covers the front of the vehicle 1 to a position farther than the sensing area 104. On the other hand, the sensing area 105 has a narrower lateral range than the sensing area 104.
 センシング領域105におけるセンシング結果は、例えば、ACC(Adaptive Cruise Control)、緊急ブレーキ、衝突回避等に用いられる。 The sensing results in the sensing area 105 are used, for example, for ACC (Adaptive Cruise Control), emergency braking, and collision avoidance.
 なお、外部認識センサ25が含むカメラ51、レーダ52、LiDAR53、及び、超音波センサ54の各センサのセンシング領域は、図8以外に各種の構成をとってもよい。具体的には、超音波センサ54が車両1の側方もセンシングするようにしてもよいし、LiDAR53が車両1の後方をセンシングするようにしてもよい。また、各センサの設置位置は、上述した各例に限定されない。また、各センサの数は、1つでも良いし、複数であっても良い。 The sensing areas of the camera 51, the radar 52, the LiDAR 53, and the ultrasonic sensor 54 included in the external recognition sensor 25 may have various configurations other than those shown in FIG. 8. Specifically, the ultrasonic sensor 54 may also sense the sides of the vehicle 1, and the LiDAR 53 may sense the rear of the vehicle 1. Moreover, the installation position of each sensor is not limited to the examples described above. Also, the number of each sensor may be one or more.
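For illustration only, the sensing-region layout described above could be represented in code roughly as follows. This is a sketch: the region identifiers follow the text, but the numeric ranges and the coverage-lookup helper are assumptions introduced here, not values or functions from this publication.

from dataclasses import dataclass

@dataclass
class SensingRegion:
    sensor: str          # which sensor provides the region (from the description above)
    direction: str       # rough direction relative to the vehicle
    max_range_m: float   # assumed maximum range in meters (illustrative only)
    example_use: str     # example use mentioned in the text

REGIONS = {
    "101F": SensingRegion("ultrasonic", "front", 3.0, "parking assistance"),
    "101B": SensingRegion("ultrasonic", "rear", 3.0, "parking assistance"),
    "102F": SensingRegion("radar (short/medium)", "front", 30.0, "vehicle/pedestrian detection"),
    "102B": SensingRegion("radar (short/medium)", "rear", 30.0, "rear collision prevention"),
    "103F": SensingRegion("camera", "front", 80.0, "traffic light/sign recognition"),
    "104":  SensingRegion("LiDAR", "front (narrow)", 150.0, "object detection"),
    "105":  SensingRegion("radar (long)", "front (narrow)", 250.0, "ACC, emergency braking, collision avoidance"),
}

def regions_toward(direction: str) -> list[str]:
    """Return the IDs of regions whose rough direction starts with `direction`."""
    return [rid for rid, r in REGIONS.items() if r.direction.startswith(direction)]

if __name__ == "__main__":
    print(regions_toward("front"))  # ['101F', '102F', '103F', '104', '105']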
(2-2.その他)
 上記各実施形態において説明した各処理のうち、自動的に行われるものとして説明した処理の全部または一部を手動的に行うこともでき、あるいは、手動的に行われるものとして説明した処理の全部または一部を公知の方法で自動的に行うこともできる。この他、上記文書中や図面中で示した処理手順、具体的名称、各種のデータやパラメータを含む情報については、特記する場合を除いて任意に変更することができる。例えば、各図に示した各種情報は、図示した情報に限られない。
(2-2. Others)
Of the processes described in each of the above embodiments, all or part of the processes described as being performed automatically can be performed manually, and conversely, all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the information including the processing procedures, specific names, and various data and parameters shown in the above documents and drawings can be changed arbitrarily unless otherwise specified. For example, the various pieces of information shown in each drawing are not limited to the illustrated information.
 また、図示した各装置の各構成要素は機能概念的なものであり、必ずしも物理的に図示の如く構成されていることを要しない。すなわち、各装置の分散・統合の具体的形態は図示のものに限られず、その全部または一部を、各種の負荷や使用状況などに応じて、任意の単位で機能的または物理的に分散・統合して構成することができる。 Also, each component of each illustrated device is functionally conceptual and does not necessarily need to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to the illustrated form, and all or part of the devices can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.
 また、上述してきた各実施形態及び変形例は、処理内容を矛盾させない範囲で適宜組み合わせることが可能である。 In addition, the above-described embodiments and modifications can be appropriately combined within a range that does not contradict the processing content.
 また、実施形態では、移動体における物体検出に本開示の画像処理が適用される例を示したが、本開示の画像処理は、移動体における物体検出に限らず、その他の用途における各種タスク処理に利用されてもよい。 Further, in the embodiments, an example in which the image processing of the present disclosure is applied to object detection on a moving body has been described; however, the image processing of the present disclosure is not limited to object detection on a moving body and may be used for various task processing in other applications.
 また、本明細書に記載された効果はあくまで例示であって限定されるものでは無く、他の効果があってもよい。 In addition, the effects described in this specification are only examples and are not limited, and other effects may be provided.
(3.本開示に係る画像処理装置の効果)
 上述のように、本開示に係る画像処理装置(実施形態では画像処理装置100)は、取得部(実施形態では取得部131)と、特定部(実施形態では特定部133)とを備える。取得部は、レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する。特定部は、LiDARによるレーザの照射範囲を撮像範囲に含む画像上に点群データを重畳した場合に、LiDARの照射情報に基づいて、重畳した点群データのうち、実際には画像内の対象物に照射されていない点である偽透過点を特定する。
(3. Effect of the image processing device according to the present disclosure)
As described above, the image processing apparatus (the image processing apparatus 100 in the embodiment) according to the present disclosure includes the acquisition section (the acquisition section 131 in the embodiment) and the specifying section (the specifying section 133 in the embodiment). The acquisition unit acquires point cloud data indicating detection of surrounding objects from LiDAR (Light Detection and Ranging), which is a sensor using a laser. When the point cloud data is superimposed on the image including the irradiation range of the laser by LiDAR in the imaging range, the identifying unit determines, based on the irradiation information of the LiDAR, among the superimposed point cloud data, the target in the image actually. Identify false transmission points, which are points not illuminated on the object.
 このように、本開示に係る画像処理装置は、画像に重畳した点群データのうち、実際には画像内の対象物に照射されていない点である偽透過点を特定する。これにより、画像処理装置は、実際に対象物に照射された点のみを後段の処理に活用することができるので、センシングによって得られる点群データを適切に活用することができる。 In this way, the image processing apparatus according to the present disclosure identifies, among the point cloud data superimposed on the image, false transmission points, which are points that are not actually illuminated on the object in the image. As a result, the image processing apparatus can utilize only the points that are actually irradiated on the object for the subsequent processing, so that the point cloud data obtained by sensing can be appropriately utilized.
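As a rough sketch of the superimposition step assumed throughout this discussion, LiDAR points can be projected into the camera image with a pinhole camera model. The calibration matrices and the point values below are illustrative placeholders, not parameters disclosed in this publication.

import numpy as np

# Assumed camera intrinsics: focal lengths fx, fy and principal point cx, cy.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])

# Assumed LiDAR-to-camera extrinsics. The identity rotation means the example
# points below are already expressed with z pointing forward, as in a camera frame.
R = np.eye(3)
t = np.array([0.0, -0.2, 0.1])

def project_points(points_lidar: np.ndarray) -> np.ndarray:
    """Project Nx3 LiDAR points (meters) to Mx2 pixel coordinates (u, v),
    keeping only points that end up in front of the camera."""
    pts_cam = points_lidar @ R.T + t      # LiDAR frame -> camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # discard points behind the camera
    uvw = pts_cam @ K.T                   # pinhole projection
    return uvw[:, :2] / uvw[:, 2:3]       # normalize by depth to get pixels

if __name__ == "__main__":
    cloud = np.array([[0.5, 0.0, 10.0],
                      [-1.0, 0.2, 20.0]])
    print(project_points(cloud))  # two (u, v) rows, one per input point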
 また、画像処理装置は、特定した偽透過点を除去し、除去した偽透過点を除く点群データを重畳した画像を生成する生成部(実施形態では生成部134)をさらに備える。 The image processing apparatus further includes a generating unit (generating unit 134 in the embodiment) that removes the identified false transmission points and generates an image in which the point cloud data excluding the removed false transmission points is superimposed.
 このように、画像処理装置は、偽透過点を除去したのちの点群データを重畳した画像を生成する。これにより、画像処理装置は、かかる画像を検出処理や学習処理における正解データとして利用することができるので、後段において、より精度の高い処理を実行することができる。 In this way, the image processing apparatus generates an image on which the point cloud data remaining after the false transmission points have been removed is superimposed. As a result, the image processing apparatus can use such an image as correct data in detection processing and learning processing, so that more accurate processing can be performed in subsequent stages.
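A minimal sketch of this generation step, under the same assumptions as the projection sketch above: points flagged as false transmission points are dropped, and the remaining projected points are drawn onto the image. The small square marker used for drawing is a placeholder for whatever rendering the real system performs.

import numpy as np

def overlay_points(image: np.ndarray,
                   pixels: np.ndarray,
                   is_false: np.ndarray,
                   color=(0, 255, 0)) -> np.ndarray:
    """Return a copy of `image` with the non-false points drawn as small squares.

    image   : HxWx3 uint8 array
    pixels  : Nx2 array of (u, v) pixel coordinates of projected LiDAR points
    is_false: length-N boolean array, True for identified false transmission points
    """
    out = image.copy()
    h, w = out.shape[:2]
    for (u, v) in pixels[~is_false].astype(int):
        if 0 <= u < w and 0 <= v < h:
            out[max(v - 1, 0):v + 2, max(u - 1, 0):u + 2] = color  # 3x3 marker
    return out

if __name__ == "__main__":
    img = np.zeros((480, 640, 3), dtype=np.uint8)
    px = np.array([[100, 200], [300, 250]])
    flags = np.array([False, True])  # second point treated as a false transmission point
    result = overlay_points(img, px, flags)
    print(result[200, 100], result[250, 300])  # first point drawn, second left black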
 また、画像処理装置は、LiDARによるレーザの照射範囲を撮像範囲に含んだ画像を撮像する撮像部(実施形態では撮像部132)をさらに備える。特定部は、撮像部により撮像された画像上に点群データを重畳した場合に、LiDARの照射情報に基づいて、偽透過点を特定する。 In addition, the image processing apparatus further includes an imaging unit (the imaging unit 132 in the embodiment) that captures an image including the laser irradiation range of the LiDAR in the imaging range. The specifying unit specifies the false transmission point based on the irradiation information of the LiDAR when the point cloud data is superimposed on the image captured by the imaging unit.
 このように、画像処理装置は、自装置で撮像された画像を用いて偽透過点を特定してもよい。かかる構成により、画像処理装置は、自動運転中に撮像処理と偽透過点の特定処理とを実行するので、自動運転の支障となりうる偽透過点を除去しつつ、自動運転等のタスク処理を実行することができる。 In this way, the image processing apparatus may identify the false transmission points using an image captured by the apparatus itself. With such a configuration, the image processing apparatus executes the imaging processing and the false transmission point identification processing during automated driving, and can therefore execute task processing such as automated driving while removing false transmission points that could hinder the automated driving.
 また、特定部は、照射情報として、レーザの照射における仰俯角情報および方位角情報に基づいて、偽透過点を特定する。 Further, the specifying unit specifies the false transmission point based on the elevation/depression angle information and the azimuth angle information in the laser irradiation as the irradiation information.
 このように、画像処理装置は、照射情報に含まれる仰俯角情報および方位角情報を利用することで、精度よく偽透過点を特定することができる。 In this way, the image processing apparatus can accurately identify the false transmission point by using the elevation/depression angle information and the azimuth angle information included in the irradiation information.
 また、特定部は、点群データのうち、略同一の方位角情報を有する2つの点群データを特定するとともに、2つの点群データに対応する仰俯角情報と、2つの点群データが画像上に投影された場合の縦軸座標の値とを比較することにより、偽透過点を特定する。 Further, the specifying unit specifies, among the point cloud data, two pieces of point cloud data having substantially the same azimuth angle information, and specifies the false transmission point by comparing the elevation/depression angle information corresponding to the two pieces of point cloud data with the values of the vertical axis coordinates of the two pieces of point cloud data when projected onto the image.
 また、特定部は、点群データのうち、略同一の仰俯角情報を有する2つの点群データを特定するとともに、2つの点群データに対応する方位角情報と、2つの点群データが画像上に投影された場合の横軸座標の値とを比較することにより、偽透過点を特定してもよい。 Alternatively, the specifying unit may specify, among the point cloud data, two pieces of point cloud data having substantially the same elevation/depression angle information, and specify the false transmission point by comparing the azimuth angle information corresponding to the two pieces of point cloud data with the values of the horizontal axis coordinates of the two pieces of point cloud data when projected onto the image.
 このように、画像処理装置は、照射情報と画像上の座標とを比較し、照射情報と矛盾する座標か否かを判定するので、精度よく偽透過点を特定することができる。 In this way, the image processing apparatus compares the irradiation information with the coordinates on the image and determines whether the coordinates contradict the irradiation information, so that the false transmission point can be specified with high accuracy.
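A minimal sketch of one way such a consistency check could be implemented, assuming each point carries its azimuth and elevation angles, its projected image row coordinate, and its measured range. Which point of an inconsistent pair gets flagged is also an assumption made here (the farther one); the publication only states that the comparison identifies the false transmission point.

import numpy as np

def find_false_transmission_points(azimuth_deg: np.ndarray,
                                   elevation_deg: np.ndarray,
                                   pixel_v: np.ndarray,
                                   ranges_m: np.ndarray,
                                   az_tol_deg: float = 0.2) -> np.ndarray:
    """Return a boolean mask of points flagged as false transmission points.

    For two points with substantially the same azimuth, the point with the
    larger elevation angle should project to a smaller row index (higher in
    the image, rows increasing downward). If the ordering disagrees, the pair
    is inconsistent, and the farther point is flagged (an assumption for this sketch).
    """
    n = len(azimuth_deg)
    flagged = np.zeros(n, dtype=bool)
    order = np.argsort(azimuth_deg)
    for idx in range(n - 1):
        i, j = order[idx], order[idx + 1]
        if abs(azimuth_deg[i] - azimuth_deg[j]) > az_tol_deg:
            continue  # not "substantially the same" azimuth
        elev_says_i_above = elevation_deg[i] > elevation_deg[j]
        image_says_i_above = pixel_v[i] < pixel_v[j]
        if elev_says_i_above != image_says_i_above:
            flagged[i if ranges_m[i] > ranges_m[j] else j] = True
    return flagged

if __name__ == "__main__":
    az = np.array([10.0, 10.1])
    el = np.array([1.0, 2.0])      # second point aimed higher...
    v = np.array([200.0, 300.0])   # ...but projected lower in the image
    rng = np.array([15.0, 40.0])
    print(find_false_transmission_points(az, el, v, rng))  # [False  True]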
(4.ハードウェア構成)
 上述してきた本開示に係る画像処理装置100等の情報処理装置は、例えば図9に示すような構成のコンピュータ1000によって実現される。図9は、本開示に係る画像処理装置100の機能を実現するコンピュータ1000の一例を示すハードウェア構成図である。以下では、コンピュータ1000の一例として、実施形態に係る画像処理装置100を例に挙げて説明する。コンピュータ1000は、CPU1100、RAM1200、ROM(Read Only Memory)1300、HDD(Hard Disk Drive)1400、通信インターフェイス1500、及び入出力インターフェイス1600を有する。コンピュータ1000の各部は、バス1050によって接続される。
(4. Hardware configuration)
An information processing apparatus such as the image processing apparatus 100 according to the present disclosure described above is realized by a computer 1000 configured as shown in FIG. 9, for example. FIG. 9 is a hardware configuration diagram showing an example of a computer 1000 that implements the functions of the image processing apparatus 100 according to the present disclosure. The image processing apparatus 100 according to the embodiment will be described below as an example of the computer 1000 . The computer 1000 has a CPU 1100 , a RAM 1200 , a ROM (Read Only Memory) 1300 , a HDD (Hard Disk Drive) 1400 , a communication interface 1500 and an input/output interface 1600 . Each part of computer 1000 is connected by bus 1050 .
 CPU1100は、ROM1300又はHDD1400に格納されたプログラムに基づいて動作し、各部の制御を行う。例えば、CPU1100は、ROM1300又はHDD1400に格納されたプログラムをRAM1200に展開し、各種プログラムに対応した処理を実行する。 The CPU 1100 operates based on programs stored in the ROM 1300 or HDD 1400 and controls each section. For example, the CPU 1100 loads programs stored in the ROM 1300 or HDD 1400 into the RAM 1200 and executes processes corresponding to various programs.
 ROM1300は、コンピュータ1000の起動時にCPU1100によって実行されるBIOS(Basic Input Output System)等のブートプログラムや、コンピュータ1000のハードウェアに依存するプログラム等を格納する。 The ROM 1300 stores a boot program such as BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 is started, and programs dependent on the hardware of the computer 1000.
 HDD1400は、CPU1100によって実行されるプログラム、及び、かかるプログラムによって使用されるデータ等を非一時的に記録する、コンピュータが読み取り可能な記録媒体である。具体的には、HDD1400は、プログラムデータ1450の一例である本開示に係る画像処理プログラムを記録する記録媒体である。 The HDD 1400 is a computer-readable recording medium that non-temporarily records programs executed by the CPU 1100 and data used by such programs. Specifically, the HDD 1400 is a recording medium that records an image processing program according to the present disclosure, which is an example of the program data 1450.
 通信インターフェイス1500は、コンピュータ1000が外部ネットワーク1550(例えばインターネット)と接続するためのインターフェイスである。例えば、CPU1100は、通信インターフェイス1500を介して、他の機器からデータを受信したり、CPU1100が生成したデータを他の機器へ送信したりする。 A communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet). For example, CPU 1100 receives data from another device via communication interface 1500, and transmits data generated by CPU 1100 to another device.
 入出力インターフェイス1600は、入出力デバイス1650とコンピュータ1000とを接続するためのインターフェイスである。例えば、CPU1100は、入出力インターフェイス1600を介して、キーボードやマウス等の入力デバイスからデータを受信する。また、CPU1100は、入出力インターフェイス1600を介して、ディスプレイやスピーカーやプリンタ等の出力デバイスにデータを送信する。また、入出力インターフェイス1600は、所定の記録媒体(メディア)に記録されたプログラム等を読み取るメディアインターフェイスとして機能してもよい。メディアとは、例えばDVD(Digital Versatile Disc)、PD(Phase change rewritable Disk)等の光学記録媒体、MO(Magneto-Optical disk)等の光磁気記録媒体、テープ媒体、磁気記録媒体、または半導体メモリ等である。 The input/output interface 1600 is an interface for connecting the input/output device 1650 and the computer 1000. For example, the CPU 1100 receives data from input devices such as a keyboard and a mouse via the input/output interface 1600. The CPU 1100 also transmits data to output devices such as a display, a speaker, and a printer via the input/output interface 1600. In addition, the input/output interface 1600 may function as a media interface that reads a program or the like recorded on a predetermined recording medium. Examples of such media include optical recording media such as a DVD (Digital Versatile Disc) and a PD (Phase change rewritable Disk), magneto-optical recording media such as an MO (Magneto-Optical disk), tape media, magnetic recording media, and semiconductor memories.
 例えば、コンピュータ1000が実施形態に係る画像処理装置100として機能する場合、コンピュータ1000のCPU1100は、RAM1200上にロードされた画像処理プログラムを実行することにより、制御部130等の機能を実現する。また、HDD1400には、本開示に係る画像処理プログラムや、記憶部120内のデータが格納される。なお、CPU1100は、プログラムデータ1450をHDD1400から読み取って実行するが、他の例として、外部ネットワーク1550を介して、他の装置からこれらのプログラムを取得してもよい。 For example, when the computer 1000 functions as the image processing apparatus 100 according to the embodiment, the CPU 1100 of the computer 1000 implements the functions of the control unit 130 and the like by executing the image processing program loaded onto the RAM 1200. The HDD 1400 also stores the image processing program according to the present disclosure and the data in the storage unit 120. The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it; as another example, these programs may be acquired from another device via the external network 1550.
 なお、本技術は以下のような構成も取ることができる。
(1)
 レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する取得部と、
 前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する特定部と、
 を備える画像処理装置。
(2)
 前記特定した偽透過点を除去し、除去した偽透過点を除く前記点群データを重畳した前記画像を生成する生成部、
 をさらに備える前記(1)に記載の画像処理装置。
(3)
 前記LiDARによるレーザの照射範囲を撮像範囲に含んだ画像を撮像する撮像部、をさらに備え、
 前記特定部は、
 前記撮像部により撮像された画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、前記偽透過点を特定する、
 前記(1)または(2)に記載の画像処理装置。
(4)
 前記特定部は、
 前記照射情報として、前記レーザの照射における仰俯角情報および方位角情報に基づいて、前記偽透過点を特定する、
 前記(1)から(3)のいずれかに記載の画像処理装置。
(5)
 前記特定部は、
 前記点群データのうち、略同一の方位角情報を有する2つの点群データを特定するとともに、当該2つの点群データに対応する仰俯角情報と、当該2つの点群データが前記画像上に投影された場合の縦軸座標の値とを比較することにより、前記偽透過点を特定する、
 前記(4)に記載の画像処理装置。
(6)
 前記特定部は、
 前記点群データのうち、略同一の仰俯角情報を有する2つの点群データを特定するとともに、当該2つの点群データに対応する方位角情報と、当該2つの点群データが前記画像上に投影された場合の横軸座標の値とを比較することにより、前記偽透過点を特定する、
 前記(4)に記載の画像処理装置。
(7)
 コンピュータが、
 レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得し、
 前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する、
 ことを含む画像処理方法。
(8)
 コンピュータを、
 レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する取得部と、
 前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する特定部と、
 を備える画像処理装置として機能させるための画像処理プログラム。
Note that the present technology can also take the following configuration.
(1)
An acquisition unit that acquires point cloud data indicating that a surrounding object has been detected from LiDAR (Light Detection and Ranging), which is a sensor using a laser;
When the point cloud data is superimposed on the image including the irradiation range of the laser by the LiDAR in the imaging range, based on the irradiation information of the LiDAR, out of the superimposed point cloud data, actually in the image a specifying unit that specifies a false transmission point that is a point that is not irradiated on the object;
An image processing device comprising:
(2)
a generating unit that removes the identified false transmission points and generates the image in which the point cloud data excluding the removed false transmission points is superimposed;
The image processing apparatus according to (1), further comprising:
(3)
An imaging unit that captures an image including the laser irradiation range of the LiDAR in the imaging range,
The identification unit
identifying the false transmission point based on the irradiation information of the LiDAR when the point cloud data is superimposed on the image captured by the imaging unit;
The image processing apparatus according to (1) or (2) above.
(4)
The identification unit
Identifying the false transmission point based on elevation/depression angle information and azimuth angle information in the irradiation of the laser as the irradiation information;
The image processing apparatus according to any one of (1) to (3) above.
(5)
The identification unit
Among the point cloud data, two point cloud data having substantially the same azimuth angle information are specified, and elevation/depression angle information corresponding to the two point cloud data and the two point cloud data are displayed on the image. Identifying the false transmission point by comparing the value of the vertical axis coordinate when projected;
The image processing device according to (4) above.
(6)
The identification unit
Among the point cloud data, two point cloud data having substantially the same elevation/depression angle information are specified, and azimuth angle information corresponding to the two point cloud data and the two point cloud data are displayed on the image. Identifying the false transmission point by comparing the value of the abscissa coordinate when projected;
The image processing device according to (4) above.
(7)
the computer
From LiDAR (Light Detection and Ranging), which is a sensor using a laser, acquire point cloud data indicating that the surrounding objects have been detected,
When the point cloud data is superimposed on the image including the irradiation range of the laser by the LiDAR in the imaging range, based on the irradiation information of the LiDAR, out of the superimposed point cloud data, actually in the image identifying false transmission points, which are points not illuminated on the object;
An image processing method comprising:
(8)
the computer,
An acquisition unit that acquires point cloud data indicating that a surrounding object has been detected from LiDAR (Light Detection and Ranging), which is a sensor using a laser;
When the point cloud data is superimposed on the image including the irradiation range of the laser by the LiDAR in the imaging range, based on the irradiation information of the LiDAR, out of the superimposed point cloud data, actually in the image a specifying unit that specifies a false transmission point that is a point that is not irradiated on the object;
An image processing program for functioning as an image processing device comprising:
 1   車両
 100 画像処理装置
 110 通信部
 120 記憶部
 130 制御部
 131 取得部
 132 撮像部
 133 特定部
 134 生成部
 140 検知部
 150 LiDAR
 160 カメラ
1 vehicle 100 image processing device 110 communication unit 120 storage unit 130 control unit 131 acquisition unit 132 imaging unit 133 identification unit 134 generation unit 140 detection unit 150 LiDAR
160 camera

Claims (8)

  1.  レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する取得部と、
     前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する特定部と、
     を備える画像処理装置。
    An image processing device comprising:
    an acquisition unit that acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that surrounding objects have been detected; and
    a specifying unit that, when the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR, specifies, based on irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, the false transmission point being a point that is not actually irradiated onto an object in the image.
  2.  前記特定した偽透過点を除去し、除去した偽透過点を除く前記点群データを重畳した前記画像を生成する生成部、
     をさらに備える請求項1に記載の画像処理装置。
    The image processing device according to claim 1, further comprising:
    a generating unit that removes the specified false transmission point and generates the image on which the point cloud data excluding the removed false transmission point is superimposed.
  3.  前記LiDARによるレーザの照射範囲を撮像範囲に含んだ画像を撮像する撮像部、をさらに備え、
     前記特定部は、
     前記撮像部により撮像された画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、前記偽透過点を特定する、
     請求項1に記載の画像処理装置。
    The image processing device according to claim 1, further comprising
    an imaging unit that captures an image whose imaging range includes the laser irradiation range of the LiDAR,
    wherein the specifying unit specifies the false transmission point based on the irradiation information of the LiDAR when the point cloud data is superimposed on the image captured by the imaging unit.
  4.  前記特定部は、
     前記照射情報として、前記レーザの照射における仰俯角情報および方位角情報に基づいて、前記偽透過点を特定する、
     請求項1に記載の画像処理装置。
    The image processing device according to claim 1,
    wherein the specifying unit specifies the false transmission point based on, as the irradiation information, elevation/depression angle information and azimuth angle information of the laser irradiation.
  5.  前記特定部は、
     前記点群データのうち、略同一の方位角情報を有する2つの点群データを特定するとともに、当該2つの点群データに対応する仰俯角情報と、当該2つの点群データが前記画像上に投影された場合の縦軸座標の値とを比較することにより、前記偽透過点を特定する、
     請求項4に記載の画像処理装置。
    The image processing device according to claim 4,
    wherein the specifying unit specifies, among the point cloud data, two pieces of point cloud data having substantially the same azimuth angle information, and specifies the false transmission point by comparing elevation/depression angle information corresponding to the two pieces of point cloud data with the values of the vertical axis coordinates of the two pieces of point cloud data when projected onto the image.
  6.  前記特定部は、
     前記点群データのうち、略同一の仰俯角情報を有する2つの点群データを特定するとともに、当該2つの点群データに対応する方位角情報と、当該2つの点群データが前記画像上に投影された場合の横軸座標の値とを比較することにより、前記偽透過点を特定する、
     請求項4に記載の画像処理装置。
    The image processing device according to claim 4,
    wherein the specifying unit specifies, among the point cloud data, two pieces of point cloud data having substantially the same elevation/depression angle information, and specifies the false transmission point by comparing azimuth angle information corresponding to the two pieces of point cloud data with the values of the horizontal axis coordinates of the two pieces of point cloud data when projected onto the image.
  7.  コンピュータが、
     レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得し、
     前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する、
     ことを含む画像処理方法。
    An image processing method comprising, by a computer:
    acquiring, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that surrounding objects have been detected; and
    when the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR, specifying, based on irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, the false transmission point being a point that is not actually irradiated onto an object in the image.
  8.  コンピュータを、
     レーザを用いたセンサであるLiDAR(Light Detection and Ranging)から、周囲の対象物を検出したことを示す点群データを取得する取得部と、
     前記LiDARによるレーザの照射範囲を撮像範囲に含む画像上に前記点群データを重畳した場合に、前記LiDARの照射情報に基づいて、重畳した前記点群データのうち、実際には前記画像内の対象物に照射されていない点である偽透過点を特定する特定部と、
     を備える画像処理装置として機能させるための画像処理プログラム。
    An image processing program for causing a computer to function as an image processing device comprising:
    an acquisition unit that acquires, from LiDAR (Light Detection and Ranging), which is a sensor using a laser, point cloud data indicating that surrounding objects have been detected; and
    a specifying unit that, when the point cloud data is superimposed on an image whose imaging range includes the laser irradiation range of the LiDAR, specifies, based on irradiation information of the LiDAR, a false transmission point among the superimposed point cloud data, the false transmission point being a point that is not actually irradiated onto an object in the image.
PCT/JP2023/000707 2022-02-22 2023-01-13 Image-processing device, image-processing method, and image-processing program WO2023162497A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-025916 2022-02-22
JP2022025916 2022-02-22

Publications (1)

Publication Number Publication Date
WO2023162497A1 true WO2023162497A1 (en) 2023-08-31

Family

ID=87765399

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/000707 WO2023162497A1 (en) 2022-02-22 2023-01-13 Image-processing device, image-processing method, and image-processing program

Country Status (1)

Country Link
WO (1) WO2023162497A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11218410A (en) * 1998-02-02 1999-08-10 Matsushita Electric Ind Co Ltd Range finder device and image transmission device
JP2004259114A (en) * 2003-02-27 2004-09-16 Seiko Epson Corp Object identification method, object identification device, and object identification program
JP2020061114A (en) * 2018-10-09 2020-04-16 財團法人工業技術研究院Industrial Technology Research Institute Depth estimation device, self-driving vehicle using depth estimation device, and depth estimation method used for self-driving vehicle
US20210158079A1 (en) * 2019-11-22 2021-05-27 Samsung Electronics Co., Ltd. System and method for joint image and lidar annotation and calibration


Similar Documents

Publication Publication Date Title
US20200409387A1 (en) Image processing apparatus, image processing method, and program
WO2020116195A1 (en) Information processing device, information processing method, program, mobile body control device, and mobile body
US20210027486A1 (en) Controller, control method, and program
WO2021241189A1 (en) Information processing device, information processing method, and program
WO2022138123A1 (en) Available parking space identification device, available parking space identification method, and program
US20220383749A1 (en) Signal processing device, signal processing method, program, and mobile device
WO2023153083A1 (en) Information processing device, information processing method, information processing program, and moving device
WO2022158185A1 (en) Information processing device, information processing method, program, and moving device
US20230245423A1 (en) Information processing apparatus, information processing method, and program
JP2023062484A (en) Information processing device, information processing method, and information processing program
WO2023162497A1 (en) Image-processing device, image-processing method, and image-processing program
WO2023149089A1 (en) Learning device, learning method, and learning program
WO2024024471A1 (en) Information processing device, information processing method, and information processing system
WO2023063145A1 (en) Information processing device, information processing method, and information processing program
WO2022019117A1 (en) Information processing device, information processing method, and program
WO2023068116A1 (en) On-vehicle communication device, terminal device, communication method, information processing method, and communication system
WO2023145460A1 (en) Vibration detection system and vibration detection method
WO2023007785A1 (en) Information processing device, information processing method, and program
US20230206596A1 (en) Information processing device, information processing method, and program
WO2023090001A1 (en) Information processing device, information processing method, and program
WO2023058342A1 (en) Information processing device, information processing method, and program
WO2022024569A1 (en) Information processing device, information processing method, and program
WO2022201892A1 (en) Information processing apparatus, information processing method, and program
WO2023054090A1 (en) Recognition processing device, recognition processing method, and recognition processing system
WO2024009829A1 (en) Information processing device, information processing method, and vehicle control system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23759503

Country of ref document: EP

Kind code of ref document: A1