WO2021025364A1

WO2021025364A1 - Method and system using lidar and camera to enhance depth information about image feature point

Info

Publication number: WO2021025364A1
Application number: PCT/KR2020/009992
Authority: WO
Inventors: 이동환; 김덕화
Original assignee: 네이버랩스 주식회사
Priority date: 2019-08-02
Filing date: 2020-07-29
Publication date: 2021-02-11
Also published as: KR20210015516A

Abstract

Disclosed are a method and system that use LiDAR and a camera to enhance depth information about an image feature point. The method for enhancing depth information about an image feature point includes the steps of: acquiring, through a camera and a LiDAR sensor, an image and LiDAR scan data for creating a map; and using the LiDAR scan data together with the image to acquire depth information about a feature point extracted from the image.

Description

Method and system for improving depth information of image feature points using lidar and camera

The following description relates to a technology for generating a three-dimensional structure using an image.

Mobile robots must not only be able to determine their location within a given environment, but must also be able to map their surroundings by themselves when placed in a new environment that they have not experienced before.

Mapping by a mobile robot means generating the prior data required for autonomous driving of the mobile robot. A robot that localizes using lidar needs precise 3D data, and a robot that localizes based on images needs data composed of image-based 3D feature points.

As an example of a mobile robot mapping technology, Korean Patent Laid-Open Publication No. 10-2011-0001932 (published on January 06, 2011) discloses a technology that identifies the locations of obstacles in an indoor space while moving a robot in an arbitrary direction, and creates an indoor map by displaying the obstacle locations on a map.

Provided are a method and system that can generate a three-dimensional structure for map creation using LiDAR and a camera.

Provided are a method and system that can accurately obtain the depth of an image by using lidar scan data together with the image acquired by a camera.

Provided are a method and system for refining depth values using lidar-based SLAM (Simultaneous Localization And Map-Building) and images.

A method executed on a computer system, the computer system comprising at least one processor configured to execute computer-readable instructions contained in a memory, the method comprising: acquiring, by the at least one processor, an image for map creation and LiDAR scan data through a camera and a LiDAR sensor; and obtaining, by the at least one processor, depth information of a feature point extracted from the image by using the lidar scan data together with the image.

According to an aspect, in the acquiring of the image and lidar scan data, the image and lidar scan data of the same time period may be acquired using a timestamp.

According to another aspect, the method may further include generating, by the at least one processor, a 3D map to which a pose is tagged using the lidar scan data.

According to another aspect, in the generating step, a point cloud map to which a pose is tagged may be generated by performing Simultaneous Localization And Map-Building (SLAM) using the lidar scan data.

According to another aspect, the obtaining of depth information of a feature point extracted from the image may include: extracting a feature point having a feature that is invariant to rotation and scale from the image; configuring a point cloud by accumulating the lidar scan data acquired at the same time as the image; and determining a point in the point cloud that lies on an extension line connecting the lens of the camera and the feature point as the three-dimensional coordinates of the feature point.

According to another aspect, the obtaining of depth information of the feature point extracted from the image may further include extracting, from the point cloud, a point cloud of a view frustum corresponding to the pose information of the image, and in the determining of the three-dimensional coordinates of the feature point, the three-dimensional coordinates of the feature point may be determined in the extracted point cloud of the view frustum.

According to another aspect, in the determining of the three-dimensional coordinates of the feature point, the three-dimensional coordinates of the feature point may be determined by changing the coordinates of the image and the point cloud into a spherical coordinate system.

According to another aspect, the determining of the three-dimensional coordinates of the feature point may include using ray casting to find the intersection between the line of sight passing through the feature point and an object in the direction of that line of sight, and determining the intersection as the three-dimensional coordinates of the feature point.

According to another aspect, the determining of the three-dimensional coordinates of the feature point may include finding the three-dimensional coordinates of the feature point in the point cloud through NNS (nearest neighbor search) using a multidimensional tree structure (KD-tree).

According to another aspect, the method may further include, by the at least one processor, refining depth information of the redundantly extracted feature points using a plurality of images having depth information.

There is provided a computer program stored in a non-transitory computer-readable recording medium for executing the method on the computer system.

Provided is a non-transitory computer-readable recording medium in which a program for executing the method on a computer is recorded.

A computer system comprising at least one processor configured to execute computer-readable instructions contained in a memory, wherein the at least one processor includes: a data acquisition unit that acquires an image for map creation and lidar scan data through a camera and a LiDAR sensor; and a depth information acquisition unit that acquires depth information of a feature point extracted from the image by using the lidar scan data together with the image.

According to embodiments of the present invention, it is possible to create a more accurate map by creating a 3D structure using a lidar and a camera.

According to embodiments of the present invention, depth information of a three-dimensional structure may be improved by obtaining the depth of an image as an actually measured value using lidar scan data.

According to embodiments of the present invention, an optimized 3D structure may be obtained by refining a depth value using a lidar-based SLAM and an image.

FIG. 1 is a block diagram illustrating an example of an internal configuration of a computer system according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating an example of components that may be included in a processor of a computer system according to an embodiment of the present invention.

FIG. 3 is a flowchart illustrating an example of a method for enhancing depth information that can be performed by a computer system according to an embodiment of the present invention.

FIG. 4 is a view for explaining an example of a pose tagging process through lidar SLAM in an embodiment of the present invention.

FIG. 5 is a view for explaining an example of a process of obtaining depth information of a feature point of an image according to an embodiment of the present invention.

FIG. 6 is an exemplary diagram for explaining a process of extracting a feature point of an image according to an embodiment of the present invention.

FIG. 7 is an exemplary diagram for explaining a point cloud configured by LiDAR scan data in an embodiment of the present invention.

FIG. 8 is an exemplary diagram for explaining a view frustum culling process for a LiDAR point cloud according to an embodiment of the present invention.

FIG. 9 is an exemplary view for explaining a process of converting the coordinates of feature points and point clouds of an image into a spherical coordinate system in an embodiment of the present invention.

FIG. 10 is an exemplary diagram for explaining a ray casting-based depth association process according to an embodiment of the present invention.

FIG. 11 shows an example of a multidimensional tree structure according to an embodiment of the present invention.

FIG. 12 is an exemplary diagram for explaining a nearest neighbor search (NNS) process according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to a technology for generating a three-dimensional structure using an image.

Embodiments including those specifically disclosed in the present specification can obtain the 3D structure required for image-based positioning using images and lidar, and thereby achieve significant advantages in various aspects such as accuracy, precision, optimization, and generation speed of the map.

FIG. 1 is a block diagram showing an example of a computer system according to an embodiment of the present invention. For example, the depth information enhancement system according to embodiments of the present invention may be implemented by the computer system 100 illustrated in FIG. 1.

As shown in FIG. 1, the computer system 100 may include, as components for executing the depth information enhancement method according to embodiments of the present invention, a memory 110, a processor 120, a communication interface 130, and an input/output interface 140.

The memory 110 is a computer-readable recording medium and may include random access memory (RAM), read only memory (ROM), and a permanent mass storage device such as a disk drive. Here, a permanent mass storage device such as a ROM or a disk drive may also be included in the computer system 100 as a separate permanent storage device distinct from the memory 110. In addition, an operating system and at least one program code may be stored in the memory 110. These software components may be loaded into the memory 110 from a computer-readable recording medium separate from the memory 110. Such a separate computer-readable recording medium may include a floppy drive, disk, tape, DVD/CD-ROM drive, or memory card. In another embodiment, software components may be loaded into the memory 110 through the communication interface 130 rather than from a computer-readable recording medium. For example, software components may be loaded into the memory 110 of the computer system 100 based on a computer program installed from files received over the network 160.

The processor 120 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. Instructions may be provided to the processor 120 by the memory 110 or the communication interface 130. For example, the processor 120 may be configured to execute a command received according to a program code stored in a recording device such as the memory 110.

The communication interface 130 may provide a function for the computer system 100 to communicate with other devices through the network 160. For example, a request, command, data, file, or the like generated by the processor 120 of the computer system 100 according to program code stored in a recording device such as the memory 110 may be delivered to other devices through the network 160 under the control of the communication interface 130. Conversely, signals, commands, data, files, and the like from other devices may be received by the computer system 100 through the communication interface 130 via the network 160. Signals, commands, data, and the like received through the communication interface 130 may be transferred to the processor 120 or the memory 110, and files and the like may be stored in a storage medium (the permanent storage device described above) that the computer system 100 may further include.

The communication method is not limited, and may include not only communication methods using a communication network that the network 160 may include (for example, a mobile communication network, wired Internet, wireless Internet, or broadcasting network) but also short-range wired/wireless communication between devices. For example, the network 160 may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), and the Internet. In addition, the network 160 may include any one or more of network topologies including a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network, but is not limited thereto.

The input/output interface 140 may be a means for interfacing with the input/output device 150. For example, the input device may include a device such as a microphone, keyboard, camera, or mouse, and the output device may include a device such as a display or a speaker. As another example, the input/output interface 140 may be a means for interfacing with a device in which input and output functions are integrated into one, such as a touch screen. The input/output device 150 may be configured as a single device together with the computer system 100.

Further, in other embodiments, the computer system 100 may include fewer or more components than those shown in FIG. 1. However, most prior-art components need not be explicitly illustrated. For example, the computer system 100 may be implemented to include at least some of the input/output devices 150 described above, or may further include other components such as a transceiver, a global positioning system (GPS) module, a camera, various sensors, and a database. As a more specific example, the computer system 100 may be implemented in the form of a mobile robot for creating a map, and may further include components required for mapping, such as a camera module, a lidar sensor, an acceleration sensor or gyro sensor, various physical buttons, buttons using a touch panel, and input/output ports.

The present invention relates to an image-based positioning (visual localization) technology, and is a technology applicable to both indoor and outdoor maps.

For image-based positioning, it is necessary to know the 3D structure of the corresponding area. Here, the 3D structure may include 3D coordinates corresponding to a specific pixel of the image, and may also include a depth value of the image. In other words, knowing the three-dimensional structure is equivalent to knowing the three-dimensional coordinates and depth values of an image.

In general, a 3D structure can be obtained using an RGB image through a technology such as SfM (structure from motion). However, in the case of using only an image, a scale value cannot be obtained, and in the case of a depth value, it is only an estimated value, so accuracy is poor.
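The scale ambiguity of image-only reconstruction can be illustrated with a minimal sketch. It assumes an ideal pinhole camera; the function names `project` and `scale` and the focal length value are hypothetical and not part of the disclosure. Scaling the whole scene by any factor leaves the pixel coordinates unchanged, which is why images alone cannot fix the metric scale.

```python
import math

def project(point, f=500.0):
    """Project a 3D point (x, y, z) in camera coordinates onto the image plane
    of an ideal pinhole camera with focal length f (hypothetical value)."""
    x, y, z = point
    return (f * x / z, f * y / z)

def scale(point, s):
    """Scale a 3D point by the factor s."""
    return tuple(s * c for c in point)

point = (0.4, -0.2, 2.0)
original = project(point)
# The same scene, ten times larger and ten times farther away.
scaled = project(scale(point, 10.0))
# Both projections land on the same pixel, so the scale cannot be observed.
same = all(math.isclose(a, b) for a, b in zip(original, scaled))
```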

Although it is possible to use an RGB-D image sensor with a depth sensor attached, the RGB-D image sensor has the disadvantages that the noise of the depth value is very severe and that the range over which depth can be sensed is limited, so it cannot be used for outdoor mapping.

Embodiments of the present invention relate to a method of accurately obtaining the depth of an image using lidar scan data together with an image.

FIG. 2 is a diagram showing an example of components that may be included in a processor of a computer system according to an embodiment of the present invention, and FIG. 3 is a flowchart showing an example of a depth information enhancement method that may be performed by a computer system according to an embodiment of the present invention.

As shown in FIG. 2, the processor 120 may include a data acquisition unit 201, a map generation unit 202, a depth information acquisition unit 203, and a depth information refiner 204. Components of the processor 120 may be expressions of different functions performed by the processor 120 according to a control command provided by at least one program code. For example, the data acquisition unit 201 may be used as a functional representation that the processor 120 operates to control the computer system 100 to acquire data from a camera and a lidar.

The processor 120 and components of the processor 120 may perform steps S310 to S340 included in the method of improving depth information of FIG. 3. For example, the processor 120 and the components of the processor 120 may be implemented to execute an instruction according to the code of the operating system included in the memory 110 and the at least one program code described above. Here, at least one program code may correspond to a code of a program implemented to process the depth information enhancement method.

The depth information enhancement method may not occur in the illustrated order, and some of the steps may be omitted or an additional process may be further included.

The processor 120 may load, into the memory 110, program code stored in a program file for the depth information enhancement method. For example, the program file may be stored in a permanent storage device separate from the memory 110, and the processor 120 may control the computer system 100 so that the program code is loaded from the program file stored in the permanent storage device into the memory 110 through a bus. Here, the processor 120 and each of the data acquisition unit 201, the map generation unit 202, the depth information acquisition unit 203, and the depth information refiner 204 included in the processor 120 may be different functional expressions of the processor 120 for executing the subsequent steps S310 to S340 by executing instructions of the corresponding part of the program code loaded into the memory 110. To execute steps S310 to S340, the processor 120 and components of the processor 120 may directly process an operation according to a control command or control the computer system 100.

In step S310, the data acquisition unit 201 may acquire an image for creating a map through the camera and acquire lidar scan data through a lidar sensor. In the present application, a 3D structure may be generated by reconstructing, from the camera image and the lidar scan data, 3D points for the pixels of the camera image. To this end, the data acquisition unit 201 may acquire an image and lidar scan data together through a mobile robot for creating an indoor or outdoor map. The depth information obtained from the lidar sensor is an actual measurement and therefore very accurate, so it can be used not only for indoor maps but also for outdoor maps. In this case, the data acquisition unit 201 may obtain a camera image and lidar scan data together for each timestamp. In other words, the data acquisition unit 201 may use timestamps to acquire an image and lidar scan data captured in the same time period.
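The timestamp-based pairing described in step S310 could be sketched as follows. This is only an illustration under stated assumptions: timestamps are in seconds, the scan timestamps are sorted, and the function name `pair_by_timestamp` and the tolerance `max_gap` are hypothetical and do not come from the disclosure.

```python
from bisect import bisect_left

def pair_by_timestamp(image_stamps, scan_stamps, max_gap=0.05):
    """For each image timestamp, find the index of the closest lidar scan.

    image_stamps, scan_stamps: timestamps in seconds; scan_stamps must be sorted.
    max_gap: hypothetical tolerance; pairs farther apart than this are dropped.
    Returns a list of (image_index, scan_index) pairs.
    """
    pairs = []
    for i, t in enumerate(image_stamps):
        k = bisect_left(scan_stamps, t)
        # The closest scan is either the one just before or just after t.
        candidates = [j for j in (k - 1, k) if 0 <= j < len(scan_stamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(scan_stamps[j] - t))
        if abs(scan_stamps[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs
```

An image with no scan inside the tolerance is simply left unpaired, so it would not receive lidar-based depth in the later steps.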

In step S320, the map generator 202 may generate a 3D map to which a pose is tagged by using the lidar scan data acquired in step S310. Visual localization (VL) may be performed based on pose information (including a 3-axis position value and a 3-axis direction value) of an image generated by the map generator 202.

As an example, referring to FIG. 4, the map generator 202 may obtain pose information of the images by performing lidar SLAM using the lidar scan data, and may generate a point cloud map to which the poses of the images are tagged. As the lidar SLAM is performed, a 3D map is created. In this case, pose tagging through the lidar SLAM yields a point cloud map in which the pose of each image is tagged on the 3D structure of the map.

The pose information of the image and the 3D structure built from the lidar data can be used to obtain a 3D point corresponding to each pixel of the image.

Referring again to FIG. 3, in step S330, the depth information acquisition unit 203 may acquire depth information of a feature point extracted from the corresponding image by using the image and the lidar scan data of the same time period. In this case, the depth information acquisition unit 203 may obtain depth information of the image by calculating a vector from the lens of the camera to a specific pixel of the image using the parameters of the camera. According to the present invention, it is possible to create a more accurate map by accurately obtaining depth information of an image using the lidar scan data. A process of obtaining depth information using lidar scan data will be described in detail below.
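As an illustration of computing the vector from the camera lens to a pixel, the following sketch assumes a pinhole model without lens distortion. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and the function name `pixel_to_ray` are assumptions made for this example; the disclosure only states that the camera parameters are used.

```python
import math

def pixel_to_ray(u, v, fx, fy, cx, cy):
    """Return the unit direction, in camera coordinates, of the ray from the
    camera center through pixel (u, v), for a pinhole model without distortion."""
    # Back-project the pixel onto the plane z = 1 in front of the camera.
    x = (u - cx) / fx
    y = (v - cy) / fy
    z = 1.0
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)
```

The depth of a feature point is then the distance along this ray at which a lidar point is found.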

In step S340, the depth information refiner 204 may refine depth information by using a plurality of images having depth information. This addresses the problem that a feature point extracted from one image is extracted again from the next image: the depth information can be optimized by correcting the previously calculated 3D points using the 3D point of each feature point and the poses of the images. For example, the depth information refiner 204 may refine the depth value of a redundantly extracted feature point by determining the camera parameters and the absolute coordinates corresponding to the given 3D points using an optimization technique such as bundle adjustment. In doing so, the depth information refiner 204 may use the pixel errors between the 3D points and the feature points extracted from each of the plurality of images to estimate the points and image poses in the direction that minimizes the error.
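The pixel error that an optimization such as bundle adjustment would minimize can be sketched as a reprojection cost. The code below only evaluates that cost for one 3D point and does not perform the optimization itself; the pinhole intrinsics, the world-to-camera pose convention `(R, t)`, and all function names are assumptions for illustration.

```python
def reprojection_error(point_w, pose, pixel, fx, fy, cx, cy):
    """Squared pixel error between an observed feature and the projection of
    its 3D point.

    point_w: 3D point (x, y, z) in world coordinates.
    pose: (R, t), with R a 3x3 world-to-camera rotation (list of rows) and
          t a translation vector (assumed convention).
    pixel: observed feature location (u, v).
    """
    R, t = pose
    # Transform the world point into the camera frame: p_c = R * p_w + t.
    pc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    u = fx * pc[0] / pc[2] + cx
    v = fy * pc[1] / pc[2] + cy
    return (u - pixel[0]) ** 2 + (v - pixel[1]) ** 2

def total_error(point_w, observations, fx, fy, cx, cy):
    """Sum of squared reprojection errors of one 3D point over several images.

    observations: list of (pose, pixel) pairs, one per image that sees the point.
    """
    return sum(reprojection_error(point_w, pose, px, fx, fy, cx, cy)
               for pose, px in observations)
```

A refinement step would adjust the 3D point (and possibly the poses) so that `total_error` decreases across all images in which the same feature point appears.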

Referring to FIG. 5, in step S501, the depth information acquisition unit 203 may extract a feature point (keypoint) corresponding to a specific pixel from an image. Here, a feature point refers to a pixel having a characteristic that is invariant to rotation and scale in the image. For example, the depth information acquisition unit 203 may extract feature points from the image using widely used feature point extraction algorithms such as SIFT (Scale-Invariant Feature Transform), FAST (Features from Accelerated Segment Test), BRIEF (Binary Robust Independent Elementary Features), and ORB (Oriented FAST and Rotated BRIEF).

FIG. 6 shows an example of a result of extracting feature points, in which feature points 601 that are invariant to rotation and scale can be identified on the image 600.

Referring again to FIG. 5, in step S502, the depth information acquisition unit 203 may configure a point cloud by accumulating lidar scan data acquired within a specific time around the time point at which the image was acquired. The depth information acquisition unit 203 may configure a lidar point cloud using lidar scan data acquired at the same time point, that is, at the time point at which the image is acquired.
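Accumulating scans around the image time into one point cloud might look like the sketch below. It assumes that each scan carries a sensor-to-world pose (for example from the lidar SLAM) so that scans can be merged in a common frame; the data layout, the function name `accumulate_scans`, and the `window` value are hypothetical.

```python
def accumulate_scans(scans, image_time, window=0.5):
    """Merge lidar scans taken within +/- window seconds of image_time.

    scans: list of (timestamp, pose, points); pose is (R, t) mapping sensor
           coordinates to world coordinates, points is a list of (x, y, z).
    Returns one list of points expressed in world coordinates.
    """
    cloud = []
    for stamp, (R, t), points in scans:
        if abs(stamp - image_time) > window:
            continue  # scan is too far from the image in time
        for p in points:
            # Move the point into the common frame: p_w = R * p + t.
            cloud.append(tuple(
                sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)))
    return cloud
```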

FIG. 7 shows an example of a lidar point cloud 700 configured with lidar scan data acquired at the same time for the image 600 of FIG. 6.

Referring again to FIG. 5, in step S503, the depth information acquisition unit 203 may extract a point cloud of a view frustum corresponding to the pose information of the image from the point cloud configured in step S502. To reduce computation time, the depth information acquisition unit 203 may use the pose information of the image to extract only the points relevant to the corresponding view, rather than using the entire point cloud.

Referring to FIG. 8, the depth information acquisition unit 203 may filter the raw point cloud 700, configured from the lidar scan data acquired at the same time as the image, according to the field of view (FOV) of the camera, thereby obtaining a culled point cloud 800 for the view area given by the pose information of the image.

Referring again to FIG. 5, in step S504, the depth information acquisition unit 203 may determine, in the point cloud extracted in step S503, a point lying on the extension line connecting the lens of the camera and the feature point extracted from the image, and use it as the 3D coordinate corresponding to the depth information of the feature point.
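A minimal sketch of culling by field of view is shown below. It assumes the points have already been transformed into camera coordinates with the z axis pointing forward, and it uses hypothetical `near` and `far` limits; the function name `cull_to_view` is illustrative only.

```python
import math

def cull_to_view(points_cam, h_fov_deg, v_fov_deg, near=0.1, far=100.0):
    """Keep only points inside the camera's view frustum.

    points_cam: list of (x, y, z) in camera coordinates, z pointing forward.
    h_fov_deg, v_fov_deg: full horizontal and vertical field of view in degrees.
    near, far: hypothetical depth limits of the frustum.
    """
    tan_h = math.tan(math.radians(h_fov_deg) / 2.0)
    tan_v = math.tan(math.radians(v_fov_deg) / 2.0)
    kept = []
    for x, y, z in points_cam:
        if not (near <= z <= far):
            continue  # behind the camera or outside the depth range
        if abs(x) <= z * tan_h and abs(y) <= z * tan_v:
            kept.append((x, y, z))
    return kept
```

Only the surviving points need to be searched when associating depth with feature points, which is the computation saving the culling step aims for.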

To this end, the depth information acquisition unit 203 may first convert the image coordinate p of the feature point and the point cloud map into a spherical coordinate system, as shown in FIG. 9.

The conversion from the rectangular (Cartesian) coordinate system (x,y,z) to the spherical coordinate system is given by Equation 1.

[Equation 1]

The conversion from the spherical coordinate system to the Cartesian coordinate system (x,y,z) is given by Equation 2.

[Equation 2]

At this time, the depth information acquisition unit 203 may ignore the radial coordinate and use only the angular coordinates to take the nearest point as the 3D point of the feature point.
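The coordinate conversion and the angle-only matching can be sketched as follows. The formulas are the standard Cartesian/spherical conversion under one common convention (polar angle measured from the +z axis, azimuth from the +x axis); whether this matches the convention of Equations 1 and 2 is an assumption, and the function names are hypothetical.

```python
import math

def to_spherical(x, y, z):
    """Cartesian (x, y, z) -> spherical (r, theta, phi) for a non-zero point.
    Assumed convention: theta is the polar angle from +z, phi the azimuth from +x."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi

def to_cartesian(r, theta, phi):
    """Spherical (r, theta, phi) -> Cartesian (x, y, z), same convention."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def match_by_angle(ray_dir, points):
    """Return the point whose angles (theta, phi) are closest to those of
    ray_dir, ignoring the radial distance r."""
    _, t0, p0 = to_spherical(*ray_dir)

    def angular_gap(p):
        _, t, ph = to_spherical(*p)
        # Wrap the azimuth difference into [-pi, pi).
        dphi = (ph - p0 + math.pi) % (2 * math.pi) - math.pi
        return (t - t0) ** 2 + dphi ** 2

    return min(points, key=angular_gap)
```

The radial coordinate of the matched point then serves as the depth of the feature point along that viewing direction.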

The depth information acquisition unit 203 may acquire depth information by performing depth association based on ray casting. Ray casting examines the view ray passing through each pixel point in an image and checks whether a scene object intersects that line of sight, in order to find, for each pixel, the intersection point closest to the pixel.

Referring to FIG. 10, ray casting is a visible-surface detection method: assuming that the line of sight starts from the lens of the camera, it finds the closest object blocking the line-of-sight path passing through one pixel point of the image. That is, if the surface of an object faces a light source and light hits this surface, the light is either unblocked or creates a shadow. In this case, the shading of the surface can be calculated using a widely known shading technique. One such technique, the depth buffer algorithm, processes surfaces one at a time and calculates depth values for all projection points. The calculated depth of a surface is compared with the previously stored depth value to determine the surface visible from the pixel. Ray casting processes each pixel in the image one by one and calculates the depth of any surface on the projection path leading to that pixel.
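Applied to a point cloud rather than to surfaces, ray casting-based depth association could be sketched as below. This is an illustrative simplification: a point is treated as hit when it lies within a hypothetical tolerance `max_offset` of the ray, and the closest such point along the ray is returned. The function name `cast_ray` is not from the disclosure.

```python
import math

def cast_ray(ray_dir, points, max_offset=0.05):
    """Return (depth, point) for the closest point hit by a ray from the
    camera center, or None if nothing is hit.

    ray_dir: unit direction of the ray in camera coordinates.
    points: list of (x, y, z) in camera coordinates.
    max_offset: hypothetical tolerance, in the cloud's units, for how far a
                point may lie from the ray and still count as hit.
    """
    best = None
    for p in points:
        depth = sum(d * c for d, c in zip(ray_dir, p))  # distance along the ray
        if depth <= 0:
            continue  # point is behind the camera
        # Perpendicular distance from the point to the ray.
        offset = math.sqrt(max(sum(c * c for c in p) - depth * depth, 0.0))
        if offset <= max_offset and (best is None or depth < best[0]):
            best = (depth, p)
    return best
```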

Furthermore, it is important to quickly find a point in order to acquire depth information of a feature point using a point cloud.

The depth information acquisition unit 203 may perform NNS (nearest neighbor search) using a multidimensional tree structure (KD-tree) to quickly find, in the point cloud extracted in step S503, the point to be used as the 3D point of a feature point.

FIG. 11 shows an example of a multidimensional tree structure for NNS. For convenience of description, FIG. 11 illustrates a division of two-dimensional space.

The multidimensional tree structure is an extension of the binary search tree (BST) to a multidimensional space, in which the comparison dimension alternates from one tree level to the next. The multidimensional tree structure has a hierarchical structure that divides a region along one axis at a time and cycles through the axes at each successive level.

Referring to FIG. 11, the space is first divided along the X axis and then along the Y axis. Assuming that the dividing plane is selected in the order X→Y, all points whose x coordinates are less than or equal to the value x₀₀ selected at the root can be placed in the left node, and points with larger x coordinates in the right node. The points of the lower nodes are then divided using the y coordinate values y₀₁ and y₁₁. Of the two subscripts, the first indicates the position of the node within its level, starting from 0 for the leftmost node, and the second indicates the level at which the node is located.

FIG. 12 shows a process of finding the point closest to the pixel coordinate P using the tree structure of FIG. 11.

Referring to FIG. 12, the search starts from the node containing P and moves to neighboring nodes to find the point closest to P. A sphere centered on P can be used to find the smallest squared distance between two points. If a node's region overlaps the sphere centered on P, the squared distance between P and the point in that node is calculated and the lowest value is stored. After the search has been repeated for all nodes overlapping the sphere centered on P, the point with the smallest stored value may be determined as the nearest neighbor of P.

Accordingly, the depth information acquisition unit 203 may more quickly find the 3D point of the feature point in the point cloud through the NNS using the multidimensional tree structure (KD-tree).
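The tree construction and the sphere-overlap search described with reference to FIGS. 11 and 12 can be sketched in a few lines. This is a generic textbook KD-tree, given here only for illustration; the tuple-based node layout and the function names `build`, `sq_dist`, and `nearest` are assumptions and not the patented implementation.

```python
def build(points, depth=0):
    """Build a KD-tree as nested tuples (point, left, right), cycling the
    split axis at each level (X, then Y, ... then back to X)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, target, depth=0, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    if best is None or sq_dist(point, target) < sq_dist(best, target):
        best = point
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff <= 0 else (right, left)
    # Search the side of the split that contains the target first.
    best = nearest(near, target, depth + 1, best)
    # Visit the other side only if the sphere around the target, whose radius
    # is the best distance so far, crosses the splitting plane.
    if diff * diff < sq_dist(best, target):
        best = nearest(far, target, depth + 1, best)
    return best
```

Pruning subtrees whose region does not overlap the sphere is what lets the search avoid comparing the target against every point in the cloud.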

As described above, according to embodiments of the present invention, a more accurate map can be created by generating a three-dimensional structure using a lidar and a camera, and in particular, the depth of an image is determined using lidar scan data obtained through a lidar sensor. Since it can be obtained as an actual value, the depth information of the three-dimensional structure can be improved.

The apparatus described above may be implemented as a hardware component, a software component, and/or a combination of a hardware component and a software component. For example, the devices and components described in the embodiments are a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable gate array (PLU). It may be implemented using one or more general purpose computers or special purpose computers, such as a logic unit), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications executed on the operating system. In addition, the processing device may access, store, manipulate, process, and generate data in response to the execution of software. For the convenience of understanding, although it is sometimes described that one processing device is used, one of ordinary skill in the art, the processing device is a plurality of processing elements and/or multiple types of processing elements. It can be seen that it may include. For example, the processing device may include a plurality of processors or one processor and one controller. In addition, other processing configurations are possible, such as a parallel processor.

The software may include a computer program, code, instructions, or a combination of one or more of these, configuring the processing unit to behave as desired or processed independently or collectively. You can command the device. Software and/or data may be embodyed in any type of machine, component, physical device, computer storage medium or device to be interpreted by the processing device or to provide instructions or data to the processing device. have. The software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. In this case, the medium may be one that continuously stores a program executable by a computer, or temporarily stores a program for execution or download. In addition, the medium may be a variety of recording means or storage means in a form in which a single or several pieces of hardware are combined, but is not limited to a medium directly connected to a computer system, but may be distributed on a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magnetic-optical media such as floptical disks, and And a ROM, RAM, flash memory, and the like, and may be configured to store program instructions. In addition, examples of other media include an app store that distributes applications, a site that supplies or distributes various software, and a recording medium or a storage medium managed by a server.

Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, an appropriate result can be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims set forth below.

Claims

A method executed on a computer system,

wherein the computer system includes at least one processor configured to execute computer-readable instructions contained in a memory,

the method comprising:

acquiring, by the at least one processor, an image and LiDAR scan data for map creation through a camera and a LiDAR sensor; and

obtaining, by the at least one processor, depth information of a feature point extracted from the image by using the LiDAR scan data together with the image.
The method of claim 1, wherein the acquiring of the image and the LiDAR scan data comprises acquiring the image and the LiDAR scan data from the same time period by using timestamps.
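The timestamp-based acquisition recited above can be illustrated with a minimal sketch. It assumes each image and each LiDAR scan carries a timestamp in seconds and pairs every image with the closest scan; the function name `pair_by_timestamp` and the tolerance `max_gap` are hypothetical and do not appear in the disclosure.

```python
from bisect import bisect_left


def pair_by_timestamp(image_stamps, scan_stamps, max_gap=0.05):
    """For each image timestamp, return the index of the closest LiDAR scan.

    scan_stamps must be sorted in ascending order. An entry is None when no
    scan lies within max_gap seconds of the image (max_gap is an assumed
    tolerance, not a value taken from the disclosure).
    """
    pairs = []
    for t in image_stamps:
        i = bisect_left(scan_stamps, t)
        # Only the scans just before and just after t can be the closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(scan_stamps)]
        best = min(candidates, key=lambda j: abs(scan_stamps[j] - t), default=None)
        if best is not None and abs(scan_stamps[best] - t) > max_gap:
            best = None
        pairs.append(best)
    return pairs
```

For example, `pair_by_timestamp([0.10, 0.50], [0.0, 0.11, 0.2, 0.9])` pairs the first image with scan index 1 and leaves the second image unpaired, because no scan falls within the tolerance.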
The method of claim 1, further comprising generating, by the at least one processor, a pose-tagged 3D map by using the LiDAR scan data.
The method of claim 3, wherein the generating comprises creating a pose-tagged point cloud map by performing SLAM (Simultaneous Localization And Map-Building) using the LiDAR scan data.
The method of claim 1,

The step of obtaining depth information of the feature point extracted from the image,

Extracting feature points having characteristics that are invariant to rotation and scale from the image;

Configuring a point cloud by accumulating the lidar scan data acquired at the same time as the image; And

Determining a point existing on an extension line connecting the lens of the camera and the feature point in the point cloud as a three-dimensional coordinate of the feature point

How to include.
The method of claim 5, wherein the obtaining of the depth information of the feature point extracted from the image further comprises extracting, from the point cloud, a point cloud of a view frustum corresponding to pose information of the image, and

wherein the determining of the three-dimensional coordinates of the feature point comprises determining the three-dimensional coordinates of the feature point in the extracted point cloud of the view frustum.
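The view frustum extraction recited above can be sketched as follows. This is a minimal illustration that assumes a pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy`, a pose given as a camera-to-world rotation matrix and a camera position, and near and far clipping distances; all of these names and default values are assumptions for illustration and are not specified in the disclosure.

```python
def world_to_camera(point, rotation, translation):
    """Map a world point into the camera frame: p_cam = R^T (p - t).

    rotation is a 3x3 camera-to-world matrix (list of rows) and translation
    is the camera position in world coordinates (both assumed pose formats).
    """
    d = [point[i] - translation[i] for i in range(3)]
    return tuple(sum(rotation[r][c] * d[r] for r in range(3)) for c in range(3))


def in_frustum(p_cam, fx, fy, cx, cy, width, height, near=0.1, far=100.0):
    """True if a camera-frame point projects inside the image and depth range."""
    x, y, z = p_cam
    if not (near <= z <= far):
        return False
    u = fx * x / z + cx  # pinhole projection onto the image plane
    v = fy * y / z + cy
    return 0 <= u < width and 0 <= v < height


def extract_frustum(points, rotation, translation, fx, fy, cx, cy,
                    width, height, near=0.1, far=100.0):
    """Keep only the world points that fall inside the camera's view frustum."""
    return [
        p for p in points
        if in_frustum(world_to_camera(p, rotation, translation),
                      fx, fy, cx, cy, width, height, near, far)
    ]
```

Restricting the search to this subset keeps the later coordinate lookup from considering points behind the camera or outside the image.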
The method of claim 5, wherein the determining of the three-dimensional coordinates of the feature point comprises determining the three-dimensional coordinates of the feature point by converting the coordinates of the image and the point cloud into a spherical coordinate system.
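One way to read the spherical-coordinate conversion recited above is sketched below. It assumes a pinhole camera (intrinsics `fx`, `fy`, `cx`, `cy`), a point cloud already expressed in the camera frame with z pointing forward, and an angular tolerance `tol` in radians; these names, the frame convention, and the rule of keeping the closest point among angular matches are assumptions for illustration, not details taken from the disclosure.

```python
import math


def to_spherical(x, y, z):
    """Convert a camera-frame point to (azimuth, elevation, range)."""
    r = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(x, z)    # angle around the vertical axis, 0 = straight ahead
    elevation = math.asin(y / r)  # angle above or below the optical axis
    return azimuth, elevation, r


def pixel_direction(u, v, fx, fy, cx, cy):
    """Azimuth and elevation of the viewing ray through pixel (u, v)."""
    az, el, _ = to_spherical((u - cx) / fx, (v - cy) / fy, 1.0)
    return az, el


def lookup_feature_point(u, v, points_cam, fx, fy, cx, cy, tol=0.01):
    """Return the nearest point whose direction matches the feature's ray.

    Among the points within tol radians of the ray, the one with the smallest
    range is kept, on the assumption that it is the visible surface.
    Azimuth wrap-around at +/- pi (behind the camera) is ignored.
    """
    f_az, f_el = pixel_direction(u, v, fx, fy, cx, cy)
    best = None
    for p in points_cam:
        if not any(p):
            continue  # skip the origin, whose direction is undefined
        az, el, r = to_spherical(*p)
        if math.hypot(az - f_az, el - f_el) <= tol and (best is None or r < best[0]):
            best = (r, p)
    return best[1] if best else None
```

In spherical form, the image feature and the LiDAR points share the same two angular coordinates, so matching reduces to a comparison of angles and the range of the matched point gives the depth.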
The method of claim 5, wherein the determining of the three-dimensional coordinates of the feature point comprises determining the three-dimensional coordinates of the feature point by using ray casting to find the intersection between a line of sight passing through the feature point and an object lying in the direction of the line of sight.
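The ray casting recited above can be approximated on a raw point cloud as in the following sketch. Because a point cloud has no explicit surfaces, the sketch treats each point as a small sphere of radius `radius` and returns the first point the ray hits; that radius, the function name `cast_ray`, and the sphere approximation itself are assumptions for illustration.

```python
import math


def cast_ray(origin, direction, points, radius=0.05):
    """Return the first point hit by a ray, or None if nothing is hit.

    A point counts as hit when its perpendicular distance to the ray is at
    most radius (an assumed tolerance). Among all hits, the one closest to
    the origin along the ray is returned.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]  # unit direction of the line of sight
    best, best_t = None, math.inf
    for p in points:
        v = [p[i] - origin[i] for i in range(3)]
        t = sum(v[i] * d[i] for i in range(3))  # distance along the ray
        if t <= 0:
            continue  # point lies behind the camera
        perp2 = sum(c * c for c in v) - t * t   # squared distance off the ray
        if perp2 <= radius * radius and t < best_t:
            best, best_t = p, t
    return best
```

Here `origin` would be the camera center and `direction` the line of sight through the feature point, both expressed in the same frame as the point cloud.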
The method of claim 5, wherein the determining of the three-dimensional coordinates of the feature point comprises finding the three-dimensional coordinates of the feature point in the point cloud through a nearest neighbor search (NNS) using a multidimensional tree structure (KD-tree).
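A nearest neighbor search over a KD-tree, as recited above, can be sketched in plain Python as follows. Each entry is a `(key, payload)` pair; what the key represents (for example, the angular coordinates of a LiDAR point, with the 3D point as payload) is an assumption left to the caller, and the function names are hypothetical.

```python
def build_kdtree(items, depth=0):
    """Build a KD-tree from a list of (key, payload) pairs.

    Each key is a tuple of equal length; the splitting axis cycles with depth.
    """
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared_distance, (key, payload)) of the entry closest to query."""
    if node is None:
        return best
    key = node["item"][0]
    d2 = sum((a - b) ** 2 for a, b in zip(key, query))
    if best is None or d2 < best[0]:
        best = (d2, node["item"])
    diff = query[node["axis"]] - key[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best
```

Compared with scanning every point for every feature, the tree is built once per accumulated point cloud and each query then visits only a small part of it on average.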
The method of claim 1, further comprising refining, by the at least one processor, depth information of duplicated feature points by using a plurality of images having depth information.
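The refinement recited above does not name a specific fusion rule. As a minimal sketch, assuming the same feature point has already been matched across several images and that each image contributes one 3D estimate for it, the estimates could be fused with a per-axis median, which tolerates an occasional outlier. The identifier `feature_id`, the function name `refine_depths`, and the choice of the median are all assumptions for illustration.

```python
from statistics import median


def refine_depths(observations):
    """Fuse repeated 3D estimates of the same feature point.

    observations maps a feature_id (assumed to come from matching features
    across images) to a list of (x, y, z) estimates, one per image.
    Returns one refined (x, y, z) per feature using the per-axis median.
    """
    refined = {}
    for feature_id, estimates in observations.items():
        if not estimates:
            continue  # nothing to fuse for this feature
        refined[feature_id] = tuple(median(axis) for axis in zip(*estimates))
    return refined
```

With three estimates such as `(1.0, 2.0, 3.0)`, `(1.1, 2.0, 3.1)`, and an outlier `(9.0, 9.0, 9.0)`, the median keeps the result close to the two consistent observations.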
A computer program stored on a non-transitory computer-readable recording medium to execute the method of any one of claims 1 to 10 on a computer system.
A non-transitory computer-readable recording medium in which a program for executing the method of any one of claims 1 to 10 on a computer is recorded.
A computer system comprising:

at least one processor configured to execute computer-readable instructions contained in a memory,

wherein the at least one processor comprises:

a data acquisition unit that acquires an image and LiDAR scan data for map creation through a camera and a LiDAR sensor; and

a depth information acquisition unit that acquires depth information of a feature point extracted from the image by using the LiDAR scan data together with the image.
The computer system of claim 13, wherein the at least one processor further comprises a map generator that generates a pose-tagged 3D map by using the LiDAR scan data.
The computer system of claim 13, wherein the at least one processor further comprises a depth information refiner that refines depth information of duplicated feature points by using a plurality of images having depth information.
The computer system of claim 13, wherein the data acquisition unit acquires the image and the LiDAR scan data from the same time period by using timestamps.
The computer system of claim 13, wherein the depth information acquisition unit:

extracts, from the image, feature points having characteristics that are invariant to rotation and scale;

configures a point cloud by accumulating the LiDAR scan data acquired at the same time as the image; and

determines, as the three-dimensional coordinates of the feature point, a point in the point cloud that lies on an extension of the line connecting the lens of the camera and the feature point.
The computer system of claim 17, wherein the depth information acquisition unit extracts, from the point cloud, a point cloud of a view frustum corresponding to pose information of the image, and determines the three-dimensional coordinates of the feature point from the extracted point cloud of the view frustum.
The computer system of claim 17, wherein the depth information acquisition unit determines the three-dimensional coordinates of the feature point by using ray casting to find the intersection between a line of sight passing through the feature point and an object lying in the direction of the line of sight.
The computer system of claim 17, wherein the depth information acquisition unit finds the three-dimensional coordinates of the feature point in the point cloud through a nearest neighbor search (NNS) using a multidimensional tree structure (KD-tree).