WO2023224457A1 - Method for obtaining a feature point of a depth map - Google Patents

Method for obtaining a feature point of a depth map

Info

Publication number
WO2023224457A1
WO2023224457A1 (Application PCT/KR2023/009383)
Authority
WO
WIPO (PCT)
Application number
PCT/KR2023/009383
Other languages
English (en)
Korean (ko)
Inventor
최성광
Original Assignee
주식회사 브이알크루
Application filed by 주식회사 브이알크루
Publication of WO2023224457A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • This disclosure relates to artificial intelligence technology, and more specifically, to a method for acquiring feature points of a depth map using an artificial intelligence-based model.
  • Machine-learning technology, including deep learning, has received attention by showing results that exceed the performance of existing methods in analyzing various types of data such as video, voice, and text.
  • machine learning technology is being introduced and utilized in various fields due to the scalability and flexibility inherent in the technology itself.
  • The image-related field is one of the fields most actively introducing machine learning technology, in various detailed areas such as object recognition and object classification.
  • Visual localization refers to a technology that identifies the surrounding environment through a camera and, based on this, determines where the camera user is currently located. Visual localization can estimate, in real time and from the captured image, the location of the user who took the image and the direction the camera is looking (the camera pose). Such visual localization can be performed by analyzing images captured by a camera, for example, by acquiring feature points from a depth map using a feature point acquisition model. In relation to this, Republic of Korea Patent No. 10-2014093 (registered on August 20, 2019) has been issued.
  • This disclosure was developed in response to the above-described background technology, and seeks to provide a method for acquiring feature points of a depth map using an artificial intelligence-based model.
  • Disclosed is a method for obtaining feature points of a depth map, performed by a computing device, the method comprising: obtaining a first depth map; and obtaining a second depth map by converting the first depth map, using a function determined based on a method of acquiring the first depth map or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of at least one pre-stored reference depth map.
  • In some embodiments, obtaining the first depth map may include receiving the first depth map and device information from a device, and obtaining the second depth map may include converting the first depth map, using a function corresponding to the device information, so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • obtaining the first depth map may include receiving an image in a format other than a depth map from a device; and obtaining the first depth map from the image using a pre-learned depth acquisition model.
  • The depth range of the first depth map may be determined by cutting off the initial depth range of the first depth map based on a predetermined range.
  • the depth range of the first depth map may be calculated using a mapping function that maps to a predetermined range.
  • matching the final feature point with a feature point of the at least one reference depth map respectively; and determining at least one of the location or pose of the device related to the first depth map based on a matching reference depth map that matches the final feature point among the at least one reference depth map.
  • the at least one reference depth map may be obtained from a 3D model generated based on spatial information about a space related to the first depth map scanned or photographed using an imaging device.
  • Also disclosed is a computer program stored in a computer-readable storage medium, wherein the computer program is configured to be used by a processor of a computing device to acquire feature points of a depth map and includes instructions for performing the following steps: obtaining a first depth map; and obtaining a second depth map by converting the first depth map, using a function determined based on a method of acquiring the first depth map or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of at least one pre-stored reference depth map.
  • Also disclosed is a computing device for acquiring feature points of a depth map, including: a processor; a memory storing a computer program executable by the processor; and a network unit. The processor acquires a first depth map and obtains a second depth map by converting the first depth map, using a function determined based on a method of acquiring the first depth map or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of at least one pre-stored reference depth map.
  • the present disclosure can acquire feature points of a depth map using an artificial intelligence-based model.
  • FIG. 1 is a diagram illustrating a system for acquiring feature points of a depth map using an artificial intelligence-based model according to some embodiments of the present disclosure.
  • Figure 2 is a schematic diagram showing network functions according to some embodiments of the present disclosure.
  • FIGS. 3 to 5 are diagrams for explaining a method of obtaining feature points of a depth map performed in a computing device according to some embodiments of the present disclosure.
  • FIG. 6 shows a brief, general schematic diagram of an example computing environment in which embodiments of the present disclosure may be implemented.
  • a component may be, but is not limited to, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer.
  • an application running on a computing device and the computing device can be a component.
  • One or more components may reside within a processor and/or thread of execution.
  • a component may be localized within one computer.
  • a component may be distributed between two or more computers. Additionally, these components can execute from various computer-readable media having various data structures stored thereon.
  • Components may communicate through local and/or remote processes, for example, by transmitting signals with one or more data packets (e.g., data and/or signals from one component interacting with another component in a local system or a distributed system, or with other systems over a network such as the Internet), depending on the data being transmitted.
  • The term “or” is intended to mean an inclusive “or” and not an exclusive “or.” That is, unless otherwise specified or clear from context, “X utilizes A or B” is intended to mean one of the natural inclusive substitutions: if X uses A, if X uses B, or if X uses both A and B, then “X utilizes A or B” applies to any of these cases. Additionally, the term “and/or” as used herein should be understood to refer to and include all possible combinations of one or more of the related listed items.
  • the term “at least one of A or B” should be interpreted to mean “a case containing only A,” “a case containing only B,” and “a case of combining A and B.”
  • The terms network function, artificial neural network, and neural network may be used interchangeably.
  • FIG. 1 is a diagram illustrating a system for acquiring feature points of a depth map using an artificial intelligence-based model according to some embodiments of the present disclosure.
  • A system for acquiring feature points of a depth map using an artificial intelligence-based model may include a computing device 100, a device 200, and a network.
  • the configuration of the system shown in Figure 1 is only a simplified example. In some embodiments of the present disclosure, the system may include different configurations, and only some of the disclosed configurations may constitute the system.
  • the computing device 100 may include a processor 110, a memory 130, and a network unit 150.
  • the configuration of the computing device 100 shown in FIG. 1 is only a simplified example. In one embodiment of the present disclosure, the computing device 100 may include different configurations for performing the computing environment of the computing device 100, and only some of the disclosed configurations may configure the computing device 100.
  • The processor 110 may be composed of one or more cores, and may include a processor for data analysis and deep learning, such as a central processing unit (CPU), a general purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of the computing device.
  • the processor 110 may read a computer program stored in the memory 130 and perform data processing for machine learning according to an embodiment of the present disclosure. According to an embodiment of the present disclosure, the processor 110 may perform an operation for learning a neural network.
  • The processor 110 may perform calculations for learning neural networks, such as processing input data for learning in deep learning (DL), extracting features from the input data, calculating errors, and updating the weights of the neural network using backpropagation.
  • At least one of the CPU, GPGPU, and TPU of the processor 110 may process learning of the network function.
  • CPU and GPGPU can work together to process learning of network functions and data classification using network functions.
  • the processors of a plurality of computing devices can be used together to process learning of network functions and data classification using network functions.
  • a computer program executed in a computing device according to an embodiment of the present disclosure may be a CPU, GPGPU, or TPU executable program.
  • the memory 130 may store any type of information generated or determined by the processor 110 and any type of information received by the network unit 150.
  • The memory 130 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), RAM (Random Access Memory), ROM (Read-Only Memory), magnetic memory, a magnetic disk, and an optical disk.
  • the computing device 100 may operate in connection with web storage that performs a storage function of the memory 130 on the Internet.
  • the description of the memory described above is merely an example, and the present disclosure is not limited thereto.
  • the network unit 150 may include any wired or wireless communication network capable of transmitting and receiving arbitrary types of data and signals, etc., as expressed in the present disclosure.
  • the computing device 100 may acquire feature points of a depth map using an artificial intelligence-based model.
  • the computing device 100 may acquire feature points of the depth map using a feature point acquisition model, which is an artificial intelligence-based model.
  • the feature point acquisition model may be a neural network built through deep learning or machine learning.
  • the feature point acquisition model can acquire feature points from the depth map included in the input data. Additionally, learning of the feature point acquisition model may be performed in advance using the first dataset. For example, when depth maps in which corresponding areas are photographed in different environments are input, the feature point acquisition model may be learned so that feature points obtained from each of the depth maps correspond to each other. A detailed description of the neural network will be described later with reference to FIG. 2.
  • the first dataset may refer to a set of data for performing learning and verification of a neural network.
  • the first dataset may include a first training dataset and/or a first validation dataset.
  • the first learning dataset may be a set of data used in the learning process of a neural network.
  • the first learning dataset may be a set of data used for learning in the learning process of a feature point acquisition model.
  • the first validation dataset may be a set of data used to evaluate the neural network.
  • For example, the first validation dataset may be a set of data used to evaluate the feature point acquisition model.
  • a depth map may be an image that shows the relative distances of each pixel within a specific image. Accordingly, the depth map may include information related to the distance from the location where a specific image is taken to the surface of the subject.
  • the processor 110 of the computing device 100 may obtain a first depth map.
  • the processor 110 may obtain the first depth map from the device 200.
  • the processor 110 may receive the first depth map and device information from the device 200.
  • the first depth map may be an image showing the relative distances of each pixel present in the first image captured by the device 200. Accordingly, the first depth map may include information related to the distance from the location of the device 200 that captures the first image to the surface of the subject.
  • the device information may include at least one of information about the type of the device 200 and/or information about the camera of the device 200.
  • the depth map may be determined based on device information.
  • the value of the first depth map generated by the device 200 may be determined based on device information of the device 200.
  • the processor 110 may receive an image from the device 200.
  • the image may be an image other than a depth map.
  • the image may include at least one red-green-blue (RGB) image and/or at least one grayscale image other than the depth map.
  • the processor 110 may obtain a first depth map from an image (an image other than the depth map) using a pre-learned depth acquisition model.
  • the depth acquisition model can be a neural network built through deep learning or machine learning.
  • the depth acquisition model may be a model learned to obtain a depth map from an image.
  • the depth acquisition model may include at least one of a Deep Plane Sweep Network (DPS-Net) model and/or a DenseDepth model.
  • the DPS-Net model may be a model that can obtain a depth map by setting the cost volume from deep features using a plane sweep algorithm and normalizing it.
  • the plane sweep algorithm may be an algorithm that finds the intersection between line segments in a set of line segments (eg, polygons, etc.).
  • The DPS-Net model may be composed of a feature extraction part that extracts features of 3D points from neighboring consecutive images with baselines captured by a camera, a cost volume generation network for multi-view images, a cost aggregation part with a convolutional layer structure related to context-awareness, and a depth map regression part that infers a depth map through a CNN.
  • a DenseDepth model can be composed of encoders and decoders.
  • the DenseDepth model can ultimately reconstruct the depth map through a series of processes such as feature extraction, down-sampling, combining, and up-sampling of input data.
  • the DenseDepth model can perform feature extraction and down-sampling processes on input images (e.g., RGB images, gray-scale images, etc.) through the encoder part.
  • The DenseDepth model can perform an up-sampling process through the decoder part, applying concatenation operations to the extracted features with reference to the size of the image.
  • The weights can be updated in a way that minimizes the loss value obtained through the loss function using the depth map information serving as the correct answer label.
  • The quality comparison between the depth map estimated by the DenseDepth model and the depth map data corresponding to the correct answer label can be performed through analysis of indicators such as SSIM (structural similarity index metric), PSNR (peak signal-to-noise ratio), background noise, and profile characteristics at object boundaries.
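  • As a minimal sketch (not part of the patent), the SSIM and PSNR indicators mentioned above could be computed as follows; the array sizes and value ranges are illustrative assumptions.

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def compare_depth_maps(estimated: np.ndarray, ground_truth: np.ndarray) -> dict:
    """Return simple quality indicators for an estimated depth map against its label."""
    data_range = float(ground_truth.max() - ground_truth.min())
    ssim = structural_similarity(ground_truth, estimated, data_range=data_range)
    psnr = peak_signal_noise_ratio(ground_truth, estimated, data_range=data_range)
    return {"ssim": ssim, "psnr": psnr}

# Example with random stand-in data; a real comparison would use model output and labels.
gt = np.random.rand(240, 320).astype(np.float32)
pred = gt + 0.01 * np.random.randn(240, 320).astype(np.float32)
print(compare_depth_maps(pred, gt))
```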
  • the depth acquisition model is not limited to this and may include a neural network based on various depth map acquisition technologies built through deep learning or machine learning.
  • the second dataset may refer to a set of data for learning and verifying the depth acquisition model.
  • the second dataset may include a second training dataset and/or a second validation dataset.
  • the second learning dataset may be a set of data used for learning in the learning process of the depth acquisition model.
  • the second validation dataset may be a set of data used to evaluate the depth acquisition model.
  • The processor 110 may obtain a second depth map by converting the first depth map, using a function determined based on a method of acquiring the first depth map or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of at least one pre-stored reference depth map.
  • For example, the processor 110 may obtain the second depth map by converting the first depth map, using a function determined based on the device 200 or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • The value of the first depth map may mean a numerical value representing the properties and/or characteristics of the first depth map.
  • For example, the value of the first depth map may include the size of the first depth map, its bit depth, etc. However, the value is not limited to this and may include various numerical values.
  • For example, the processor 110 may obtain a second depth map by converting the first depth map, using a function corresponding to the device information, so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • the memory 130 of the computing device 100 may have functions corresponding to device information stored in advance.
  • the memory 130 may have a function corresponding to the type of device 200 included in the device information stored in advance.
  • The processor 110 may obtain the second depth map by converting the first depth map, using the function corresponding to the device information previously stored in the memory 130, so that the value of the first depth map corresponds to the value of the at least one reference depth map.
  • the device 200 may generate the first depth map with 8 bits.
  • the memory 130 may store at least one pre-stored reference depth map in 16 bits.
  • In this case, the processor 110 may obtain the second depth map by using a function corresponding to the device information to convert the bits of the first depth map into 16 bits, the bit depth of the at least one pre-stored reference depth map.
  • the second depth map and the at least one pre-stored reference depth map have bits corresponding to each other, so that the type of the second depth map and the type of the at least one pre-stored reference depth map may correspond to each other.
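  • A minimal sketch of the kind of device-specific conversion function described above, assuming an 8-bit depth map from the device and 16-bit reference depth maps; the function name, the linear scaling rule, and the device-type key are illustrative assumptions, not the patent's.

```python
import numpy as np

def convert_8bit_to_16bit(depth_8bit: np.ndarray) -> np.ndarray:
    """Rescale an 8-bit depth map (0..255) to the 16-bit range (0..65535) of the reference maps."""
    assert depth_8bit.dtype == np.uint8
    return depth_8bit.astype(np.uint16) * 257  # 255 * 257 == 65535, so the full range is preserved

# One conversion function per device type could be kept in a lookup keyed by device information.
CONVERSIONS = {"device_type_a": convert_8bit_to_16bit}

first_depth_map = np.random.randint(0, 256, size=(240, 320), dtype=np.uint8)
second_depth_map = CONVERSIONS["device_type_a"](first_depth_map)
print(second_depth_map.dtype, second_depth_map.max())
```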
  • As another example, the processor 110 may obtain the first depth map from the image using a pre-learned depth acquisition model, and may then use a pre-learned transformation model to obtain a second depth map by transforming the first depth map so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • the transformation model can be a neural network built through deep learning or machine learning.
  • The transformation model may be a model learned to obtain, from the first depth map, a second depth map converted so that the type of the first depth map corresponds to the type of the at least one pre-stored reference depth map.
  • The transformation model can be learned by setting the loss function in a direction in which the first depth map and the reference depth map matching the first depth map correspond to each other, or in a direction in which the feature points of the first depth map and the feature points of the matching reference depth map are matched smoothly. Additionally, learning of the transformation model may be performed in advance using a third dataset.
  • the third data set may refer to a set of data for performing learning and verification of the transformation model.
  • the third dataset may include a third training dataset and/or a third validation dataset.
  • the third learning dataset may be a set of data used for learning in the learning process of the transformation model.
  • the third validation dataset may be a set of data used to evaluate the transformation model.
  • the processor 110 may use a pre-learned feature point acquisition model to obtain at least one candidate feature point and/or at least one descriptor corresponding to each of the at least one candidate feature point from the second depth map. For example, the processor 110 may obtain at least one candidate feature point from the second depth map using a pre-learned feature point acquisition model.
  • At least one candidate feature point may be a coordinate for each feature portion in the second depth map.
  • At least one descriptor may include at least one of information about the directionality and size of each of the at least one candidate feature point, and/or the relationship between pixels surrounding each of the at least one candidate feature point.
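  • A minimal sketch (not the patent's model) of the shape of the data this step produces: each candidate feature point is a coordinate in the second depth map and carries a descriptor vector. The gradient-based stand-in model and the descriptor length of 128 are illustrative assumptions.

```python
import numpy as np

def dummy_feature_model(depth_map: np.ndarray):
    """Stand-in feature point acquisition model: strongest-gradient locations plus random descriptors."""
    gy, gx = np.gradient(depth_map.astype(np.float32))
    response = np.hypot(gx, gy)
    strongest = np.argsort(response.ravel())[::-1][:100]
    keypoints = np.column_stack(np.unravel_index(strongest, depth_map.shape))  # (N, 2) row/col coordinates
    descriptors = np.random.rand(len(keypoints), 128).astype(np.float32)       # (N, D) descriptors
    return keypoints, descriptors

second_depth_map = np.random.rand(240, 320).astype(np.float32)
candidate_points, descriptors = dummy_feature_model(second_depth_map)
print(candidate_points.shape, descriptors.shape)  # (100, 2) (100, 128)
```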
  • the processor 110 may obtain the boundary area of the first depth map based on the depth range of the first depth map.
  • Depth can have values from 0 to infinity, but since the depth map has a certain size, it can have values in a finite range. Therefore, the depth range may mean that the initial depth, which has an infinite value, is expressed as a finite range of values in the depth map.
  • the depth range may be determined by cutting off the initial depth range based on a predetermined range.
  • the depth range of the first depth map may be determined by cutting off the initial depth range of the first depth map based on a predetermined range. Cutoff may mean limiting the initial depth range to a predetermined range in order to represent a depth with an infinite value as a finite range.
  • the processor 110 may obtain the depth range of the first depth map by cutting off the initial depth range of the first depth map based on the predetermined range. Additionally, the processor 110 may obtain the boundary area of the first depth map based on the obtained depth range of the first depth map.
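  • A minimal sketch of the cut-off described above, assuming a predetermined range expressed in metres; out-of-range values are set to 0 here, matching the zero-processing mentioned later in the description.

```python
import numpy as np

D_MIN, D_MAX = 0.3, 10.0  # illustrative predetermined range

def cut_off_depth(depth: np.ndarray, d_min: float = D_MIN, d_max: float = D_MAX) -> np.ndarray:
    """Limit an (in principle unbounded) depth map to a predetermined finite range."""
    cut = depth.copy()
    cut[(depth < d_min) | (depth > d_max)] = 0.0  # out-of-range areas processed as 0
    return cut
```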
  • processor 110 may receive a plurality of depth maps from device 200.
  • the processor 110 may obtain the boundary area of the first depth map based on the final depth range of the first depth map included in the plurality of depth maps.
  • the final depth range of the first depth map may be a range adjusted from the initial depth range of the first depth map based on a plurality of depth maps excluding the first depth map.
  • For example, the processor 110 may obtain, from among the plurality of depth maps excluding the first depth map, a third depth map corresponding to a first area that was cut off from the first depth map and expressed as 0.
  • the processor 110 may substitute the value of the area of the third depth map corresponding to the first area into the value of the first area. That is, the processor 110 may replace the value of the first area with the value of the area of the third depth map corresponding to the first area rather than 0.
  • The processor 110 may then determine the final depth range of the first depth map as the range extended to cover the area that was previously expressed as 0. Accordingly, the final depth range of the first depth map may be wider than the initial depth range of the first depth map.
  • the depth range can be calculated using a mapping function that maps to a predetermined range.
  • the depth range of the first depth map can be calculated using a mapping function that maps to a predetermined range.
  • the processor 110 may obtain the depth range of the first depth map using a mapping function that maps to a predetermined range. Additionally, the processor 110 may obtain the boundary area of the first depth map based on the obtained depth range of the first depth map.
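  • A minimal sketch of the alternative mentioned above: instead of a cut-off, depth values in [0, ∞) are mapped into a predetermined finite range with a monotonic mapping function. The particular function d / (d + 1) is an illustrative assumption, not specified by the patent.

```python
import numpy as np

def map_depth_to_range(depth: np.ndarray, range_max: float = 1.0) -> np.ndarray:
    """Map depths in [0, inf) monotonically into [0, range_max)."""
    depth = np.asarray(depth, dtype=np.float32)
    return range_max * depth / (depth + 1.0)
```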
  • the boundary area of the first depth map may refer to an area located at the boundary between an area within the depth range of the first depth map and an area outside the depth range of the first depth map.
  • the processor 110 may process the value of an area outside the depth range of the first depth map as 0.
  • The size of the difference between the maximum value of the depth range of the first depth map and the value of an area outside the depth range of the first depth map may be large enough to be judged as a feature point. Accordingly, when the processor 110 acquires candidate feature points using the feature point acquisition model, a false feature point generated by an error occurring while processing the first depth map, rather than an actual feature, may be acquired as a candidate feature point in the boundary area.
  • the processor 110 may obtain the final feature point by removing at least one boundary candidate feature point that exists in the boundary area from among the at least one candidate feature point.
  • the processor 110 may obtain at least one final feature point containing only real feature points by removing boundary candidate feature points that are fake feature points generated due to an error.
  • the processor 110 may match the final feature point with the feature point of at least one reference depth map.
  • the processor 110 may determine the reference depth map with the largest number of feature points matching the final feature point among at least one reference depth map as the matching reference depth map.
  • the processor 110 may determine at least one of the location and/or pose of the device 200 related to the first depth map based on a matching reference depth map that matches the final feature point among the at least one reference depth map.
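  • A minimal sketch of the matching step: descriptors of the final feature points are compared with the descriptors of each reference depth map, and the reference depth map with the largest number of matched feature points is selected. Matching by mutual nearest neighbour is an illustrative assumption; the patent only requires counting matches.

```python
import numpy as np

def mutual_nn_matches(desc_a: np.ndarray, desc_b: np.ndarray) -> int:
    """Count mutual nearest-neighbour matches between descriptor sets of shape (N, D) and (M, D)."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    nn_ab = dists.argmin(axis=1)   # best match in B for each descriptor in A
    nn_ba = dists.argmin(axis=0)   # best match in A for each descriptor in B
    return int(np.sum(nn_ba[nn_ab] == np.arange(len(desc_a))))

def select_matching_reference(final_desc: np.ndarray, reference_descs: list[np.ndarray]) -> int:
    """Return the index of the reference depth map with the most matched feature points."""
    counts = [mutual_nn_matches(final_desc, ref) for ref in reference_descs]
    return int(np.argmax(counts))

# Toy usage: descriptors close to reference 1 should select index 1.
refs = [np.random.rand(50, 128).astype(np.float32) for _ in range(3)]
final = refs[1][:20] + 0.01
print(select_matching_reference(final, refs))
```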
  • the pose of the device 200 may include the direction and/or angle of the device 200.
  • the device 200 related to the first depth map may refer to a device that transmitted the first depth map to the computing device 100.
  • As another example, when the processor 110 acquires the first depth map from an image in a format other than a depth map using a pre-learned depth acquisition model, the device 200 related to the first depth map may refer to the device that transmitted that image to the computing device 100.
  • the processor 110 may generate device location information including at least one of the location and/or pose of the device 200.
  • the processor 110 may transmit the generated device location information to the device 200 using the network unit 150. Accordingly, the computing device 100 may transmit device location information to the device 200, allowing the user of the device 200 to recognize the current location and/or pose.
  • At least one reference depth map may be obtained from a 3D model generated based on spatial information about a space related to a first depth map scanned or photographed using an imaging device.
  • An imaging device may refer to any type of equipment for detecting optical images, converting them into electrical signals, and inputting them into the computing device 100.
  • the imaging device may include at least one of a scanner, Lidar, and/or a vision sensor.
  • the computing device 100 may include an imaging device or may be linked to an external imaging device wirelessly or wired.
  • Spatial information may refer to information about the interior of a space related to the first depth map.
  • spatial information may mean information about any type of object existing in a space related to the first depth map.
  • the spatial information may include at least one of distance, direction, and/or speed from the imaging device to an object existing in a space associated with the first depth map.
  • the spatial information may include at least one of the characteristics of the object's color, temperature, material distribution, and/or concentration.
  • the processor 110 may generate point cloud information including color data and/or depth data based on spatial information.
  • Point cloud information may refer to location information of measurement points related to the color and/or location of a basic object existing in a space related to the first depth map.
  • the basic object may be an object related to a feature point acquired from a feature point acquisition model.
  • the basic object may be an object excluding a moving object and a predetermined specific object.
  • the point cloud information may be information from which information related to at least one of lighting, a moving object, and/or a predetermined specific object has been removed.
  • Lighting information may be information related to light.
  • a moving object is an object that can move (eg, a car, a bicycle, etc.), and may not be acquired as a feature point in the feature point acquisition model. In other words, the moving object may not be related to the feature point obtained from the feature point acquisition model.
  • the predetermined specific object is a floating object (for example, a plant) rather than a moving object, and may not be acquired as a feature point in the feature point acquisition model. That is, a specific predetermined object may not be related to a feature point obtained from a feature point acquisition model.
  • the predetermined specific object may be an object set by the user.
  • a specific predetermined object may be an object that the user has set to not be included in the default object.
  • point cloud information may include color data and/or depth data.
  • the color data may be data related to the color of an object existing in a space related to the first depth map.
  • the color data may include a vertex color value including RGB color values of an object existing in a space related to the first depth map.
  • color data is not limited to this and may include various values representing colors.
  • Depth data may be data related to the location of an object existing in a space related to the first depth map.
  • depth data may include x, y, and z location coordinates of an object existing in a space related to the first depth map.
  • the meaning of depth data is not limited to the above-mentioned example, and may include various values indicating the location or depth of an object.
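  • A minimal sketch of the point cloud information described above: each measurement point carries depth data (x, y, z location coordinates) and color data (an RGB vertex color), and points belonging to moving objects or a predetermined specific object are dropped. The label field and the excluded labels are assumptions used only to illustrate the filtering; the patent does not prescribe how objects are labelled.

```python
from dataclasses import dataclass

@dataclass
class CloudPoint:
    x: float
    y: float
    z: float          # depth data: location coordinates
    r: int
    g: int
    b: int            # color data: vertex color value
    label: str = ""   # hypothetical object label (e.g., "car", "plant", "wall")

EXCLUDED_LABELS = {"car", "bicycle", "plant"}  # moving objects and a predetermined specific object

def filter_point_cloud(points: list[CloudPoint]) -> list[CloudPoint]:
    """Keep only points belonging to basic objects (exclude moving / predetermined specific objects)."""
    return [p for p in points if p.label not in EXCLUDED_LABELS]
```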
  • the processor 110 may generate a 3D model based on point cloud information.
  • the 3D model may be a model created in three dimensions to correspond to a space related to the first depth map based on point cloud information. Accordingly, the processor 110 may generate a 3D model corresponding to the space related to the first depth map based on point cloud information from which information related to at least one of lighting, a moving object, and/or a predetermined specific object has been removed.
  • the space related to the first depth map may be a space scanned or photographed using an imaging device.
  • The processor 110 may obtain at least one reference depth map from the generated 3D model and store the at least one reference depth map in the memory 130.
  • For example, the processor 110 may acquire a reference depth map for each of a plurality of regions by photographing the plurality of regions of the 3D model.
  • The processor 110 may store the reference depth map for each of the plurality of regions in the memory 130.
  • the computing device 100 can obtain a more accurate reference depth map and improve feature point matching accuracy by obtaining the reference depth map from a generated 3D model rather than estimating it from a 2D image.
  • Device 200 may refer to any type of nodes in the system that have a mechanism for communication with the computing device 100.
  • the device 200 may include a mobile terminal, a smart phone, etc.
  • the device 200 may include modules for generating a depth map. Accordingly, the device 200 may generate a first depth map and transmit it to the computing device 100. Additionally, the user of the device 200 may recognize the current location and/or pose based on device location information received from the computing device 100.
  • the network may include any wired or wireless communication network through which the computing device 100 and the device 200 can transmit and receive data and signals of any type to each other.
  • Figure 2 is a schematic diagram showing a network function according to an embodiment of the present disclosure.
  • a neural network can generally consist of a set of interconnected computational units, which can be referred to as nodes. These nodes may also be referred to as neurons.
  • a neural network consists of at least one node. Nodes (or neurons) that make up neural networks may be interconnected by one or more links.
  • one or more nodes connected through a link may form a relative input node and output node relationship.
  • the concepts of input node and output node are relative, and any node in an output node relationship with one node may be in an input node relationship with another node, and vice versa.
  • input node to output node relationships can be created around links.
  • One or more output nodes can be connected to one input node through a link, and vice versa.
  • the value of the data of the output node may be determined based on the data input to the input node.
  • The link connecting an input node and an output node may have a weight. The weight may be variable and may be varied by the user or an algorithm in order for the neural network to perform the desired function. For example, when one or more input nodes are connected to one output node by respective links, the value of the output node can be determined based on the values input to the input nodes connected to the output node and the weights set on the links corresponding to each input node.
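  • A minimal sketch of the relationship just described: the output node value is computed from the connected input values and the link weights; the ReLU activation applied here is a common, assumed choice rather than something specified by the patent.

```python
import numpy as np

def output_node_value(input_values: np.ndarray, link_weights: np.ndarray) -> float:
    """y = f(sum_i w_i * x_i), with f chosen here as ReLU for illustration."""
    weighted_sum = float(np.dot(input_values, link_weights))
    return max(0.0, weighted_sum)  # ReLU activation (illustrative)

print(output_node_value(np.array([0.2, 0.5, 1.0]), np.array([0.4, -0.1, 0.3])))
```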
  • one or more nodes are interconnected through one or more links to form an input node and output node relationship within the neural network.
  • the characteristics of the neural network can be determined according to the number of nodes and links within the neural network, the correlation between the nodes and links, and the value of the weight assigned to each link. For example, if the same number of nodes and links exist and two neural networks with different weight values of the links exist, the two neural networks may be recognized as different from each other.
  • a neural network may consist of a set of one or more nodes.
  • a subset of nodes that make up a neural network can form a layer.
  • Some of the nodes constituting the neural network may form one layer based on their distances from the initial input node.
  • For example, a set of nodes at a distance of n from the initial input node may constitute the n-th layer.
  • the distance from the initial input node can be defined by the minimum number of links that must be passed to reach the node from the initial input node.
  • this definition of a layer is arbitrary for explanation purposes, and the order of a layer within a neural network may be defined in a different way than described above.
  • a layer of nodes may be defined by distance from the final output node.
  • the initial input node may refer to one or more nodes in the neural network through which data is directly input without going through links in relationships with other nodes.
  • In other words, in the relationship between nodes based on links within a neural network, the initial input node may mean a node that does not have another input node connected to it by a link.
  • the final output node may refer to one or more nodes that do not have an output node in their relationship with other nodes among the nodes in the neural network.
  • hidden nodes may refer to nodes constituting a neural network other than the first input node and the last output node.
  • The neural network according to an embodiment of the present disclosure may be a neural network in which the number of nodes in the input layer is the same as the number of nodes in the output layer, and the number of nodes decreases and then increases again as it progresses from the input layer to the hidden layers.
  • The neural network according to another embodiment of the present disclosure may be a neural network in which the number of nodes in the input layer is less than the number of nodes in the output layer, and the number of nodes decreases as it progresses from the input layer to the hidden layers.
  • The neural network according to another embodiment of the present disclosure may be a neural network in which the number of nodes in the input layer is greater than the number of nodes in the output layer, and the number of nodes increases as it progresses from the input layer to the hidden layers.
  • a neural network according to another embodiment of the present disclosure may be a neural network that is a combination of the above-described neural networks.
  • A deep neural network (DNN) may refer to a neural network that includes multiple hidden layers in addition to the input layer and output layer. Deep neural networks make it possible to identify latent structures in data, in other words, the latent structure of a photo, text, video, voice, or music (e.g., what object is in the photo, what the content and emotion of the text are, what the content and emotion of the voice are, etc.). Deep neural networks may include a convolutional neural network (CNN), a recurrent neural network (RNN), an auto encoder, a restricted Boltzmann machine (RBM), a deep belief network (DBN), a Q network, a U network, a Siamese network, a generative adversarial network (GAN), etc. The description of the deep neural network described above is only an example, and the present disclosure is not limited thereto.
  • a neural network may be trained in at least one of supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning.
  • Learning of a neural network may be a process of applying knowledge for the neural network to perform a specific operation to the neural network.
  • Neural networks can be trained to minimize output errors.
  • In neural network learning, learning data is repeatedly input into the neural network, the error between the neural network's output for the learning data and the target is calculated, and the error is backpropagated from the output layer of the neural network toward the input layer in the direction of reducing it, updating the weight of each node in the neural network.
  • In supervised learning, learning data in which each item is labeled with the correct answer is used (i.e., labeled learning data), whereas in unsupervised learning, the correct answer may not be labeled in each item of learning data.
  • For example, the learning data may be data in which each item of learning data is labeled with a category.
  • Labeled training data is input to the neural network, and the error can be calculated by comparing the output (category) of the neural network with the label of the training data.
  • As another example, in the case of unsupervised learning, the error can be calculated by comparing the input training data with the neural network output. The calculated error is backpropagated in the reverse direction (i.e., from the output layer toward the input layer) in the neural network, and the connection weight of each node in each layer of the neural network can be updated according to the backpropagation. The amount of change in the connection weight of each updated node may be determined according to the learning rate.
  • the neural network's calculation of input data and backpropagation of errors can constitute a learning cycle (epoch).
  • the learning rate may be applied differently depending on the number of repetitions of the learning cycle of the neural network. For example, in the early stages of neural network training, a high learning rate can be used to increase efficiency by allowing the neural network to quickly achieve a certain level of performance, and in the later stages of training, a low learning rate can be used to increase accuracy.
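  • A minimal PyTorch sketch of the learning loop described above: labelled data is fed forward, the error against the target is computed, backpropagation updates the weights, and a higher learning rate early in training is reduced in later epochs. The tiny model and synthetic data are illustrative assumptions; they are not the feature point or depth acquisition models of the patent.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)                           # high initial learning rate
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)   # lower it in later epochs

inputs = torch.randn(64, 16)
labels = torch.randint(0, 4, (64,))

for epoch in range(30):             # each pass is one learning cycle (epoch)
    optimizer.zero_grad()
    outputs = model(inputs)         # forward calculation on the input data
    loss = loss_fn(outputs, labels) # error between output and target (label)
    loss.backward()                 # backpropagate the error toward the input layer
    optimizer.step()                # update the connection weights
    scheduler.step()                # decay the learning rate as training progresses
```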
  • Generally, the training data can be a subset of real data (i.e., the data to be processed using the learned neural network), and thus there may be a learning cycle in which the error on the training data decreases while the error on the real data increases.
  • Overfitting is a phenomenon in which errors on actual data increase due to excessive learning on training data. For example, a phenomenon in which a neural network that learned what a cat is from yellow cat images fails to recognize a non-yellow cat as a cat may be a type of overfitting. Overfitting can cause errors in machine learning algorithms to increase. To prevent such overfitting, various optimization methods can be used, such as increasing the training data, regularization, dropout that disables some of the network's nodes during the learning process, and use of a batch normalization layer.
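  • A minimal sketch of the overfitting countermeasures listed above: dropout that disables some nodes during training, a batch normalization layer, and weight decay as a simple form of regularization. The layer sizes are arbitrary illustrative choices.

```python
import torch
from torch import nn

regularized_model = nn.Sequential(
    nn.Linear(16, 64),
    nn.BatchNorm1d(64),   # batch normalization layer
    nn.ReLU(),
    nn.Dropout(p=0.5),    # randomly disable some nodes during the learning process
    nn.Linear(64, 4),
)
optimizer = torch.optim.SGD(regularized_model.parameters(), lr=0.01, weight_decay=1e-4)  # regularization
```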
  • a computer-readable medium storing a data structure is disclosed.
  • Data structure can refer to the organization, management, and storage of data to enable efficient access and modification of data.
  • Data structure can refer to the organization of data to solve a specific problem (e.g., retrieving data, storing data, or modifying data in the shortest possible time).
  • a data structure may be defined as a physical or logical relationship between data elements designed to support a specific data processing function.
  • Logical relationships between data elements may include connection relationships between user-defined data elements.
  • Physical relationships between data elements may include actual relationships between data elements that are physically stored in a computer-readable storage medium (e.g., a persistent storage device).
  • a data structure may specifically include a set of data, relationships between data, and functions or instructions applicable to the data. Effectively designed data structures allow computing devices to perform computations while minimizing the use of the computing device's resources. Specifically, computing devices can increase the efficiency of operations, reading, insertion, deletion, comparison, exchange, and search through effectively designed data structures.
  • Data structures can be divided into linear data structures and non-linear data structures depending on the type of data structure.
  • a linear data structure may be a structure in which only one piece of data is connected to another piece of data.
  • Linear data structures may include List, Stack, Queue, and Deque.
  • a list can refer to a set of data that has an internal order.
  • the list may include a linked list.
  • a linked list may be a data structure in which data is connected in such a way that each data is connected in a single line with a pointer. In a linked list, a pointer may contain connection information to the next or previous data.
  • a linked list can be expressed as a singly linked list, a doubly linked list, or a circularly linked list.
  • a stack may be a data listing structure that allows limited access to data.
  • a stack can be a linear data structure in which data can be processed (for example, inserted or deleted) at only one end of the data structure.
  • Data stored in the stack may have a data structure (LIFO-Last in First Out) where the later it enters, the sooner it comes out.
  • a queue is a data listing structure that allows limited access to data. Unlike the stack, it can be a data structure (FIFO-First in First Out) where data stored later is released later.
  • A deque (double-ended queue) can be a data structure that can process data at both ends of the data structure.
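  • A minimal sketch of the linear data structures above using Python built-ins: a list as a LIFO stack, and collections.deque as a FIFO queue and as a double-ended queue.

```python
from collections import deque

stack = []                     # stack: last in, first out (LIFO)
stack.append("a"); stack.append("b")
assert stack.pop() == "b"      # the element stored later comes out first

queue = deque()                # queue: first in, first out (FIFO)
queue.append("a"); queue.append("b")
assert queue.popleft() == "a"  # the element stored later is released later

dq = deque(["a", "b"])         # deque: data can be processed at both ends
dq.appendleft("front"); dq.append("back")
assert dq[0] == "front" and dq[-1] == "back"
```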
  • a non-linear data structure may be a structure in which multiple pieces of data are connected behind one piece of data.
  • Nonlinear data structures may include graph data structures.
  • a graph data structure can be defined by vertices and edges, and an edge can include a line connecting two different vertices.
  • Graph data structure may include a tree data structure.
  • a tree data structure may be a data structure in which there is only one path connecting two different vertices among a plurality of vertices included in the tree. In other words, it may be a data structure that does not form a loop in the graph data structure.
  • Data structures may include neural networks, and a data structure including a neural network may be stored in a computer-readable medium. A data structure including a neural network may also include data preprocessed for processing by the neural network, data input to the neural network, weights of the neural network, hyperparameters of the neural network, data acquired from the neural network, an activation function associated with each node or layer of the neural network, and a loss function for training the neural network.
  • a data structure containing a neural network may include any of the components disclosed above.
  • The data structure including the neural network may be configured to include all or any combination of data preprocessed for processing by the neural network, data input to the neural network, weights of the neural network, hyperparameters of the neural network, data acquired from the neural network, an activation function associated with each node or layer of the neural network, and a loss function for training the neural network.
  • a data structure containing a neural network may include any other information that determines the characteristics of the neural network.
  • the data structure may include all types of data used or generated in the computational process of a neural network and is not limited to the above.
  • Computer-readable media may include computer-readable recording media and/or computer-readable transmission media.
  • a neural network can generally consist of a set of interconnected computational units, which can be referred to as nodes. These nodes may also be referred to as neurons.
  • a neural network consists of at least one node.
  • the data structure may include data input to the neural network.
  • a data structure containing data input to a neural network may be stored in a computer-readable medium.
  • Data input to the neural network may include learning data input during the neural network learning process and/or input data input to the neural network on which training has been completed.
  • Data input to the neural network may include data that has undergone pre-processing and/or data subject to pre-processing.
  • Preprocessing may include a data processing process to input data into a neural network. Therefore, the data structure may include data subject to preprocessing and data generated by preprocessing.
  • the above-described data structure is only an example and the present disclosure is not limited thereto.
  • the data structure may include the weights of the neural network. (In this specification, weights and parameters may be used with the same meaning.) And the data structure including the weights of the neural network may be stored in a computer-readable medium.
  • A neural network may include multiple weights. Weights may be variable and may be varied by the user or an algorithm in order for the neural network to perform the desired function. For example, when one or more input nodes are connected to one output node by respective links, the data value output from the output node can be determined based on the values input to the input nodes connected to the output node and the weights set on the links corresponding to each input node.
  • the above-described data structure is only an example and the present disclosure is not limited thereto.
  • the weights may include weights that are changed during the neural network learning process and/or weights for which neural network learning has been completed.
  • Weights that change during the neural network learning process may include weights that change at the start of the learning cycle and/or weights that change during the learning cycle.
  • the above-described data structure is only an example and the present disclosure is not limited thereto.
  • the data structure including the weights of the neural network may be stored in a computer-readable storage medium (e.g., memory, hard disk) after going through a serialization process.
  • Serialization can be the process of converting a data structure into a form that can be stored on the same or a different computing device and later reorganized and used.
  • Computing devices can transmit and receive data over a network by serializing data structures.
  • Data structures containing the weights of a serialized neural network can be reconstructed on the same computing device or on a different computing device through deserialization.
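  • A minimal PyTorch sketch of the serialization described above: the data structure holding the neural network weights (a state dict) is serialized to a storage medium and later deserialized, possibly on a different computing device. The file name is an illustrative assumption.

```python
import torch
from torch import nn

model = nn.Linear(8, 2)
torch.save(model.state_dict(), "weights.pt")        # serialize the weight data structure

restored = nn.Linear(8, 2)
restored.load_state_dict(torch.load("weights.pt"))  # deserialize and reconstruct the weights
```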
  • the data structure including the weights of the neural network is not limited to serialization.
  • Additionally, the data structure including the weights of the neural network may include an appropriate data structure (e.g., among non-linear data structures, a B-Tree, a Trie, an m-way search tree, an AVL tree, or a Red-Black Tree) to increase computational efficiency while minimizing the use of computing device resources.
  • the data structure may include hyper-parameters of a neural network. And the data structure including the hyperparameters of the neural network can be stored in a computer-readable medium.
  • A hyperparameter may be a variable that can be changed by the user. Hyperparameters may include, for example, a learning rate, a cost function, the number of learning cycle repetitions, weight initialization (e.g., setting the range of weight values subject to weight initialization), and the number of hidden units (e.g., the number of hidden layers, the number of nodes in the hidden layers).
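  • A minimal sketch of the hyperparameters listed above, collected in a plain configuration dictionary; the particular values are arbitrary illustrative choices.

```python
hyperparameters = {
    "learning_rate": 1e-3,
    "cost_function": "cross_entropy",
    "num_learning_cycles": 30,         # number of learning cycle (epoch) repetitions
    "weight_init_range": (-0.1, 0.1),  # range of weight values subject to weight initialization
    "num_hidden_layers": 2,
    "num_hidden_units": 64,            # nodes per hidden layer
}
```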
  • the above-described data structure is only an example and the present disclosure is not limited thereto.
  • FIGS. 3 to 5 are diagrams for explaining a method of obtaining feature points of a depth map performed in a computing device according to some embodiments of the present disclosure.
  • the processor 110 of the computing device 100 may obtain a first depth map.
  • the processor 110 may obtain the first depth map from the device 200 (S110).
  • the processor 110 may receive the first depth map and device information from the device 200.
  • the first depth map may be an image showing the relative distances of each pixel present in the first image captured by the device 200. Accordingly, the first depth map may include information related to the distance from the location of the device 200 that captures the first image to the surface of the subject.
  • the processor 110 may receive an image from the device 200.
  • the image may be an image other than a depth map.
  • the image may include at least one of at least one red-green-blue (RGB) image and/or at least one grayscale image.
  • the processor 110 may obtain a first depth map from an image (an image other than the depth map) using a pre-learned depth acquisition model.
  • The processor 110 may obtain a second depth map by converting the first depth map, using a function determined based on a method of acquiring the first depth map or a pre-learned transformation model, so that the value of the first depth map corresponds to the value of at least one pre-stored reference depth map (S120).
  • For example, the processor 110 may obtain the second depth map by converting the first depth map, using a function corresponding to the device information, so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • As another example, the processor 110 may obtain the first depth map from the image using a pre-learned depth acquisition model, and may then use a pre-learned transformation model to obtain the second depth map by transforming the first depth map so that the value of the first depth map corresponds to the value of the at least one pre-stored reference depth map.
  • the processor 110 may obtain at least one candidate feature point from the second depth map using a pre-learned feature point acquisition model (S130).
  • the processor 110 may use a pre-learned feature point acquisition model to obtain at least one candidate feature point and/or at least one descriptor corresponding to each of the at least one candidate feature point from the second depth map.
  • At least one candidate feature point may be a coordinate for each feature portion in the second depth map.
  • At least one descriptor may include at least one of information about the directionality and size of each of the at least one candidate feature point, and/or the relationship between pixels surrounding each of the at least one candidate feature point.
  • the processor 110 may obtain the boundary area of the first depth map based on the depth range of the first depth map (S140).
  • the depth range may be determined by cutting off the initial depth range based on a predetermined range.
  • the depth range of the first depth map may be determined by cutting off the initial depth range of the first depth map based on a predetermined range. Cutoff may mean limiting the initial depth range to a predetermined range in order to represent a depth with an infinite value as a finite range.
  • the processor 110 may obtain the depth range of the first depth map by cutting off the initial depth range of the first depth map based on the predetermined range. Additionally, the processor 110 may obtain the boundary area of the first depth map based on the obtained depth range of the first depth map.
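  • A minimal sketch of the cutoff described above, assuming example bounds `d_min` and `d_max`; treating pixels saturated at those bounds as the boundary area is an assumption made for illustration, not the claimed rule.

```python
import numpy as np

def cutoff_depth_range(first_depth_map: np.ndarray,
                       d_min: float = 0.1,
                       d_max: float = 50.0):
    """Limit the (possibly unbounded) initial depth range to the predetermined
    finite range [d_min, d_max]; the bounds are assumed example values.
    """
    clipped = np.clip(first_depth_map.astype(np.float32), d_min, d_max)
    # Assumption for illustration: pixels saturated at the cutoff bounds mark
    # the boundary area of the depth map.
    boundary_area = (clipped <= d_min) | (clipped >= d_max)
    return clipped, boundary_area
```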
  • the depth range can be calculated using a mapping function that maps to a predetermined range.
  • the depth range of the first depth map can be calculated using a mapping function that maps to a predetermined range.
  • the processor 110 may obtain the depth range of the first depth map using a mapping function that maps to a predetermined range. Additionally, the processor 110 may obtain the boundary area of the first depth map based on the obtained depth range of the first depth map.
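  • One possible mapping function, shown for illustration only: d / (d + 1) maps depths in [0, inf) into the predetermined finite range [0, 1); the disclosure does not fix a particular mapping.

```python
import numpy as np

def map_depth_range(first_depth_map: np.ndarray) -> np.ndarray:
    """Assumed mapping function: d / (d + 1) sends depths in [0, inf) into the
    predetermined finite range [0, 1).
    """
    depth = np.maximum(first_depth_map.astype(np.float32), 0.0)
    return depth / (depth + 1.0)
```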
  • the processor 110 may obtain a final feature point by removing at least one boundary candidate feature point existing in the boundary area from among the at least one candidate feature point (S150).
  • the processor 110 may obtain at least one final feature point containing only real feature points by removing boundary candidate feature points that are fake feature points generated due to an error.
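  • A minimal sketch of step S150 under assumed data layouts: candidate feature points are an (N, 2) array of (x, y) coordinates and the boundary area is a boolean mask over the depth map.

```python
import numpy as np

def remove_boundary_candidates(candidates: np.ndarray,
                               boundary_area: np.ndarray) -> np.ndarray:
    """Drop every candidate feature point whose (x, y) coordinate falls inside
    the boundary area, keeping the rest as final feature points.

    Assumed layout: `candidates` is an (N, 2) array of (x, y) pixel
    coordinates; `boundary_area` is an (H, W) boolean mask.
    """
    h, w = boundary_area.shape
    xs = np.clip(np.round(candidates[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(candidates[:, 1]).astype(int), 0, h - 1)
    keep = ~boundary_area[ys, xs]
    return candidates[keep]
```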
  • the processor 110 may match the final feature point with the feature point of at least one reference depth map (S160).
  • the processor 110 may determine the reference depth map with the largest number of feature points matching the final feature point among at least one reference depth map as the matching reference depth map.
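  • The following sketch illustrates one way to pick the matching reference depth map, assuming binary descriptors and an OpenCV brute-force Hamming matcher as stand-ins for the matching procedure actually used.

```python
import cv2

def select_matching_reference(final_descriptors, reference_maps):
    """Return the index of the reference depth map with the most descriptor
    matches to the final feature points.

    Assumed layout: `reference_maps` is a list of dicts, each holding binary
    descriptors under the key "descriptors".
    """
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    best_index, best_count = -1, -1
    for index, reference in enumerate(reference_maps):
        matches = matcher.match(final_descriptors, reference["descriptors"])
        if len(matches) > best_count:
            best_index, best_count = index, len(matches)
    return best_index, best_count
```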
  • the processor 110 may determine at least one of the location or pose of the device based on a matching reference depth map that matches the final feature point among at least one reference depth map (S170).
  • the processor 110 may generate device location information including at least one of the location and/or pose of the device 200.
  • the processor 110 may transmit the generated device location information to the device 200 using the network unit 150. Accordingly, the computing device 100 may transmit device location information to the device 200, allowing the user of the device 200 to recognize the current location and/or pose.
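  • For steps S160 to S170, the sketch below shows one common way, a PnP solve used here only as an assumed example, to derive a device location and pose from 3D points associated with the matching reference depth map and their 2D matches; it is not presented as the claimed method.

```python
import cv2
import numpy as np

def estimate_device_pose(object_points_3d: np.ndarray,
                         image_points_2d: np.ndarray,
                         camera_matrix: np.ndarray):
    """Assumed example: a PnP solve between 3D points associated with the
    matching reference depth map and their 2D matches in the captured image.
    Returns a dict with the device location and a 3x3 rotation (pose).
    """
    ok, rvec, tvec = cv2.solvePnP(
        object_points_3d.astype(np.float32),
        image_points_2d.astype(np.float32),
        camera_matrix.astype(np.float32),
        distCoeffs=None,
    )
    if not ok:
        return None
    rotation, _ = cv2.Rodrigues(rvec)          # orientation of the device camera
    location = (-rotation.T @ tvec).ravel()    # camera center in world coordinates
    return {"location": location, "pose": rotation}
```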
  • The steps shown in FIGS. 3 to 5 are exemplary. Accordingly, it will be apparent to those skilled in the art that some of the steps in FIGS. 3 to 5 may be omitted or additional steps may be present without departing from the scope of the present disclosure. Additionally, specific details regarding the components (e.g., computing device 100, device 200, etc.) depicted in FIGS. 3 to 5 may be replaced with the content previously described with reference to FIGS. 1 and 2.
  • FIG. 6 is a brief, general schematic diagram of an example computing environment in which embodiments of the present disclosure may be implemented.
  • program modules include routines, programs, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • the described embodiments of the disclosure can also be practiced in distributed computing environments where certain tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote memory storage devices.
  • Computers typically include a variety of computer-readable media.
  • Computer-readable media can be any medium that can be accessed by a computer, and such computer-readable media includes volatile and non-volatile media, transitory and non-transitory media, and removable and non-removable media.
  • Computer-readable media may include computer-readable storage media and computer-readable transmission media.
  • Computer-readable storage media includes volatile and non-volatile, transitory and non-transitory, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital video disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be accessed by a computer and used to store the desired information.
  • a computer-readable transmission medium typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes all information delivery media.
  • modulated data signal refers to a signal in which one or more of the characteristics of the signal have been set or changed to encode information within the signal.
  • computer-readable transmission media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also intended to be included within the scope of computer-readable transmission media.
  • System bus 1108 couples system components, including but not limited to system memory 1106, to processing unit 1104.
  • Processing unit 1104 may be any of a variety of commercially available processors. Dual processors and other multiprocessor architectures may also be used as processing unit 1104.
  • System bus 1108 may be any of several types of bus structures that may further be interconnected to a memory bus, peripheral bus, and local bus using any of a variety of commercial bus architectures.
  • System memory 1106 includes read only memory (ROM) 1110 and random access memory (RAM) 1112.
  • the basic input/output system (BIOS) is stored in non-volatile memory 1110, such as ROM, EPROM, or EEPROM, and contains the basic routines that help transfer information between components within the computer 1102, such as during startup.
  • RAM 1112 may also include high-speed RAM, such as static RAM, for caching data.
  • Computer 1102 may also include an internal hard disk drive (HDD) 1114 (e.g., EIDE, SATA), which may also be configured for external use within a suitable chassis (not shown), a magnetic floppy disk drive (FDD) 1116, and an optical disk drive 1120 (e.g., for reading a CD-ROM disk 1122 or reading from or writing to other high-capacity optical media such as DVDs).
  • Hard disk drive 1114, magnetic disk drive 1116, and optical disk drive 1120 can be connected to system bus 1108 by a hard disk drive interface 1124, a magnetic disk drive interface 1126, and an optical drive interface 1128, respectively.
  • the interface 1124 for implementing an external drive includes at least one or both of Universal Serial Bus (USB) and IEEE 1394 interface technologies.
  • drives and their associated computer-readable media provide non-volatile storage of data, data structures, computer-executable instructions, and the like.
  • The drives and media correspond to the storage of any data in a suitable digital format.
  • Although the description of computer-readable media above refers to HDDs, removable magnetic disks, and removable optical media such as CDs or DVDs, other types of computer-readable media, such as zip drives, magnetic cassettes, flash memory cards, cartridges, and the like, may also be used in the example operating environment, and any such media may contain computer-executable instructions for performing the methods of the present disclosure.
  • a number of program modules may be stored in drives and RAM 1112, including an operating system 1130, one or more application programs 1132, other program modules 1134, and program data 1136. All or portions of the operating system, applications, modules and/or data may also be cached in RAM 1112. It will be appreciated that the present disclosure may be implemented on various commercially available operating systems or combinations of operating systems.
  • a user may enter commands and information into computer 1102 through one or more wired/wireless input devices, such as a keyboard 1138 and a pointing device such as mouse 1140.
  • Other input devices may include microphones, IR remote controls, joysticks, game pads, stylus pens, touch screens, etc.
  • These and other input devices are often connected to the system bus 1108 through an input device interface 1142, but may be connected by other interfaces, such as a parallel port, an IEEE 1394 serial port, a game port, a USB port, an IR interface, etc.
  • a monitor 1144 or other type of display device is also connected to system bus 1108 through an interface, such as a video adapter 1146.
  • computers typically include other peripheral output devices (not shown) such as speakers, printers, etc.
  • Computer 1102 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 1148, via wired and/or wireless communications.
  • Remote computer(s) 1148 may be a workstation, computing device computer, router, personal computer, portable computer, microprocessor-based entertainment device, peer device, or other conventional network node, and is generally connected to computer 1102.
  • the logical connections depicted include wired/wireless connections to a local area network (LAN) 1152 and/or a larger network, such as a wide area network (WAN) 1154.
  • LAN and WAN networking environments are common in offices and companies and facilitate enterprise-wide computer networks, such as intranets, all of which can be connected to a worldwide computer network, such as the Internet.
  • When used in a LAN networking environment, computer 1102 is connected to local network 1152 through wired and/or wireless communication network interfaces or adapters 1156. Adapter 1156 may facilitate wired or wireless communication to LAN 1152, which also includes a wireless access point installed thereon for communicating with the wireless adapter 1156.
  • When used in a WAN networking environment, the computer 1102 may include a modem 1158, be connected to a communicating computing device on the WAN 1154, or have other means for establishing communications over the WAN 1154, such as via the Internet. Modem 1158, which may be internal or external and a wired or wireless device, is coupled to system bus 1108 via the serial port interface 1142.
  • program modules described for computer 1102, or portions thereof may be stored in remote memory/storage device 1150. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communications link between computers may be used.
  • Computer 1102 operates to communicate with any wireless device or entity deployed and operating in wireless communication, such as a printer, a scanner, a desktop and/or portable computer, a portable data assistant (PDA), a communications satellite, any equipment or location associated with a wirelessly detectable tag, and a telephone. This includes at least Wi-Fi and Bluetooth wireless technologies. Accordingly, the communication may be a predefined structure as in a conventional network or may simply be an ad hoc communication between at least two devices.
  • Wi-Fi (Wireless Fidelity) is a wireless technology, similar to that used in cell phones, that allows devices such as computers to send and receive data indoors and outdoors, anywhere within the coverage area of a base station.
  • Wi-Fi networks use wireless technology called IEEE 802.11 (a, b, g, etc.) to provide secure, reliable, and high-speed wireless connections.
  • Wi-Fi can be used to connect computers to each other, the Internet, and wired networks (using IEEE 802.3 or Ethernet).
  • Wi-Fi networks can operate in the unlicensed 2.4 and 5 GHz wireless bands, for example, at data rates of 11 Mbps (802.11b) or 54 Mbps (802.11a), or in products that include both bands (dual band).
  • the various embodiments presented herein may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques.
  • article of manufacture includes a computer program, carrier, or media accessible from any computer-readable storage device.
  • computer-readable storage media include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips, etc.), optical disks (e.g., CDs, DVDs, etc.), smart cards, and flash memory devices (e.g., EEPROM, cards, sticks, key drives, etc.).
  • various storage media presented herein include one or more devices and/or other machine-readable media for storing information.
  • The present disclosure can be used in computing devices, systems, and the like to acquire feature points of a depth map.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

According to some embodiments of the present disclosure, a method for obtaining a feature point of a depth map by means of a computing device may comprise the steps of: obtaining a first depth map; obtaining a second depth map by converting the first depth map such that a value of the first depth map corresponds to a value of at least one pre-stored reference depth map, using a function determined on the basis of a method of obtaining the first depth map or a pre-trained conversion model; and obtaining at least one candidate feature point from the second depth map using a pre-trained feature point acquisition model. The representative drawing may be FIG. 1.
PCT/KR2023/009383 2022-05-19 2023-07-04 Procédé d'obtention d'un point caractéristique d'une carte de profondeur WO2023224457A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020220061402A KR102616081B1 (ko) 2022-05-19 2022-05-19 깊이 맵의 특징점을 획득하기 위한 방법
KR10-2022-0061402 2022-05-19

Publications (1)

Publication Number Publication Date
WO2023224457A1 true WO2023224457A1 (fr) 2023-11-23

Family

ID=88835828

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2023/009383 WO2023224457A1 (fr) 2022-05-19 2023-07-04 Procédé d'obtention d'un point caractéristique d'une carte de profondeur

Country Status (2)

Country Link
KR (2) KR102616081B1 (fr)
WO (1) WO2023224457A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160050372A1 (en) * 2014-08-15 2016-02-18 Qualcomm Incorporated Systems and methods for depth enhanced and content aware video stabilization
KR20160036985A (ko) * 2014-09-26 2016-04-05 삼성전자주식회사 3d 파노라마 이미지 생성을 위한 영상 생성 장치 및 방법
KR20170115757A (ko) * 2016-04-08 2017-10-18 한국과학기술원 깊이 정보 생성 장치 및 방법
US20180218511A1 (en) * 2015-07-31 2018-08-02 Versitech Limited Method and System for Global Motion Estimation and Compensation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102014093B1 (ko) 2016-03-28 2019-08-26 영남대학교 산학협력단 얼굴의 특징점 검출 시스템 및 방법
KR101852085B1 (ko) * 2016-08-16 2018-04-26 한국과학기술원 깊이 정보 획득 장치 및 깊이 정보 획득 방법
CN111340922A (zh) * 2018-12-18 2020-06-26 北京三星通信技术研究有限公司 定位与地图构建的方法和电子设备

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160050372A1 (en) * 2014-08-15 2016-02-18 Qualcomm Incorporated Systems and methods for depth enhanced and content aware video stabilization
KR20160036985A (ko) * 2014-09-26 2016-04-05 삼성전자주식회사 3d 파노라마 이미지 생성을 위한 영상 생성 장치 및 방법
US20180218511A1 (en) * 2015-07-31 2018-08-02 Versitech Limited Method and System for Global Motion Estimation and Compensation
KR20170115757A (ko) * 2016-04-08 2017-10-18 한국과학기술원 깊이 정보 생성 장치 및 방법

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LONG XIAOXIAO; LIU LINGJIE; LI WEI; THEOBALT CHRISTIAN; WANG WENPING: "Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks", 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 20 June 2021 (2021-06-20), pages 8254 - 8263, XP034008067, DOI: 10.1109/CVPR46437.2021.00816 *

Also Published As

Publication number Publication date
KR20230161710A (ko) 2023-11-28
KR20230173642A (ko) 2023-12-27
KR102616081B1 (ko) 2023-12-20

Similar Documents

Publication Publication Date Title
WO2019074195A1 (fr) Dispositif et procédé de comparaison d'images basée sur un apprentissage profond, et programme d'ordinateur stocké sur un support d'enregistrement lisible par ordinateur
WO2021261825A1 (fr) Dispositif et procédé de génération de données météorologiques reposant sur l'apprentissage automatique
WO2021040354A1 (fr) Procédé de traitement de données utilisant un réseau de neurones artificiels
WO2024080791A1 (fr) Procédé de génération d'ensemble de données
WO2020004815A1 (fr) Procédé de détection d'une anomalie dans des données
WO2022005091A1 (fr) Procédé et appareil de lecture de l'âge d'un os
WO2022149696A1 (fr) Procédé de classification utilisant un modèle d'apprentissage profond
WO2019039757A1 (fr) Dispositif et procédé de génération de données d'apprentissage et programme informatique stocké dans un support d'enregistrement lisible par ordinateur
WO2022119162A1 (fr) Méthode de prédiction de maladie basée sur une image médicale
WO2022265292A1 (fr) Procédé et dispositif de détection de données anormales
KR20190041961A (ko) 딥러닝 기반 이미지 비교 장치, 방법 및 컴퓨터 판독가능매체에 저장된 컴퓨터 프로그램
WO2024117708A1 (fr) Procédé de conversion d'image faciale à l'aide d'un modèle de diffusion
WO2022075678A2 (fr) Appareil et procédé de détection de symptômes anormaux d'un véhicule basés sur un apprentissage auto-supervisé en utilisant des données pseudo-normales
WO2023224350A2 (fr) Procédé et dispositif de détection de point de repère à partir d'une image de volume 3d
WO2023224457A1 (fr) Procédé d'obtention d'un point caractéristique d'une carte de profondeur
WO2023101417A1 (fr) Procédé permettant de prédire une précipitation sur la base d'un apprentissage profond
WO2023128349A1 (fr) Procédé d'imagerie très haute résolution à l'aide d'un apprentissage coopératif
WO2021251691A1 (fr) Procédé de détection d'objet à base de rpn sans ancrage
WO2024096683A1 (fr) Procédé de réalisation d'une occultation d'un objet virtuel
WO2023224456A1 (fr) Procédé de génération d'ensemble de données
WO2021194105A1 (fr) Procédé d'apprentissage de modèle de simulation d'expert, et dispositif d'apprentissage
WO2024014777A1 (fr) Procédé et dispositif de génération de données pour localisation visuelle
WO2024106929A1 (fr) Procédé et dispositif de détermination de la validité d'une pose de caméra à l'aide d'une localisation visuelle
WO2024106928A1 (fr) Procédé et dispositif pour déterminer la validité d'une pose de caméra à l'aide d'une localisation visuelle
WO2024010323A1 (fr) Procédé et dispositif de localisation visuelle

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23807976

Country of ref document: EP

Kind code of ref document: A1