WO2024085600A1 - Method and system for matching a three-dimensional model and street view data - Google Patents

Method and system for matching a three-dimensional model and street view data

Info

Publication number
WO2024085600A1
Authority
WO
WIPO (PCT)
Prior art keywords
street view
model
matching
information
view data
Prior art date
Application number
PCT/KR2023/016046
Other languages
English (en)
Korean (ko)
Inventor
김수정
백무열
Original Assignee
네이버랩스 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 네이버랩스 주식회사
Publication of WO2024085600A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras

Definitions

  • The present disclosure relates to a method and system for matching a 3D model and street view data, and more specifically, to a method and system for estimating location information and direction information for street view images using a 3D model containing high-accuracy location information.
  • map information services have been commercialized.
  • street view images may be provided.
  • a map information service provider can acquire images of a real space and then provide images taken at a specific point on an electronic map as a street view image.
  • Street view images contain high-quality texture information about the floor, buildings, and structures of real space, and can provide a user experience like looking around from a specific point in real space.
  • street view images have the problems that the accuracy of their location information is low and that they do not include 3D geometric information.
  • a 3D model based on aerial photography has the advantage of including 3D geometric information and accurate absolute coordinate location information.
  • 3D models based on aerial photography have a problem in that the quality of texture information is poor.
  • the present disclosure provides a method for matching a three-dimensional model and street view data, a computer-readable non-transitory recording medium storing instructions, and a device (system) for solving the above problems.
  • the present disclosure may be implemented in various ways, including a method, a device (system), or a computer-readable non-transitory recording medium recording instructions.
  • a method of matching a 3D model and street view data, performed by at least one processor, includes: receiving a 3D model of a specific area including 3D geometric information expressed in absolute coordinate positions and corresponding texture information; receiving street view data of the specific area including a plurality of street view images taken from a plurality of nodes in the area and absolute coordinate location information of a first accuracy for the plurality of street view images; and estimating absolute coordinate location information and direction information of a second accuracy for the plurality of street view images based on the 3D model and the street view data, where the second accuracy is higher than the first accuracy.
  • a computer-readable non-transitory recording medium recording instructions for executing a method according to an embodiment of the present disclosure on a computer is provided.
  • An information processing system is provided comprising a communication module, a memory, and at least one processor connected to the memory and configured to execute at least one computer-readable program included in the memory. The at least one program includes instructions for receiving a 3D model of a specific area including 3D geometric information expressed in absolute coordinate positions and corresponding texture information, receiving street view data of the specific area including a plurality of street view images and absolute coordinate location information of a first accuracy for the plurality of street view images, and estimating absolute coordinate location information and direction information of a second accuracy for the plurality of street view images based on the 3D model and the street view data, where the second accuracy is higher than the first accuracy.
  • By matching a 3D model, which includes 3D geometric information and high-accuracy location and direction information, with street view data, which includes high-quality texture information, a variety of services can be provided that use all of the 3D geometric information, the high-accuracy location and direction information, and the high-quality texture information.
  • limited resources can be used efficiently by automatically extracting matching information between a street view image and a 3D model without additional reference point surveying or tagging work.
  • FIG. 1 is a diagram illustrating an example of a method for matching a 3D model and street view data according to an embodiment of the present disclosure.
  • Figure 2 is a schematic diagram showing a configuration in which an information processing system according to an embodiment of the present disclosure is connected to enable communication with a plurality of user terminals.
  • Figure 3 is a block diagram showing the internal configuration of a user terminal and an information processing system according to an embodiment of the present disclosure.
  • Figure 4 is a diagram showing an example of a 3D model according to an embodiment of the present disclosure.
  • Figure 5 is a diagram illustrating an example of acquiring street view data according to an embodiment of the present disclosure.
  • FIG. 6 is a diagram illustrating an example of extracting map matching points and/or map matching lines by performing feature matching between a 3D model and street view data according to an embodiment of the present disclosure.
  • FIG. 7 is a diagram illustrating an example of extracting a plurality of feature point correspondence sets by performing feature matching between a plurality of street view images included in street view data according to an embodiment of the present disclosure.
  • FIG. 8 is a diagram illustrating an example in which 3D geometric information included in a 3D model is projected onto a street view image using a street view image and estimated high-precision location information according to an embodiment of the present disclosure.
  • Figure 9 is a flowchart illustrating an example of a method for matching a 3D model and street view data according to an embodiment of the present disclosure.
  • a 'module' or 'unit' refers to a software or hardware component, and the 'module' or 'unit' performs certain roles.
  • 'module' or 'unit' is not limited to software or hardware.
  • a 'module' or 'unit' may be configured to reside on an addressable storage medium and may be configured to run on one or more processors.
  • a 'module' or 'unit' may refer to components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, or variables.
  • Components and 'modules' or 'units' may be combined into fewer components and 'modules' or 'units', or further separated into additional components and 'modules' or 'units'.
  • a 'module' or 'unit' may be implemented with a processor and memory.
  • 'Processor' should be interpreted broadly to include general-purpose processors, central processing units (CPUs), microprocessors, digital signal processors (DSPs), controllers, microcontrollers, state machines, etc.
  • 'processor' may refer to an application-specific integrated circuit (ASIC), programmable logic device (PLD), field programmable gate array (FPGA), etc.
  • 'Processor' may also refer to a combination of processing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Additionally, 'memory' should be interpreted broadly to include any electronic component capable of storing electronic information.
  • 'Memory' may refer to various types of memory, such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), programmable read-only memory (PROM), and erasable-programmable read-only memory (EPROM).
  • a memory is said to be in electronic communication with a processor if the processor can read information from and/or write information to the memory.
  • the memory integrated into the processor is in electronic communication with the processor.
  • 'system' may include at least one of a server device and a cloud device, but is not limited thereto.
  • a system may consist of one or more server devices.
  • a system may consist of one or more cloud devices.
  • the system may be operated with a server device and a cloud device configured together.
  • 'display' may refer to any display device associated with a computing device, e.g., any display device capable of displaying information/data controlled by or provided by the computing device.
  • 'each of a plurality of A' may refer to each of all components included in the plurality of A, or to each of some components included in the plurality of A.
  • 'street view data' may refer to data including not only road view data, which includes images captured on the roadway together with location information, but also walk view data, which includes images captured on the sidewalk together with location information.
  • 'street view data' may further include images and location information captured at arbitrary points outdoors (or indoors facing the outdoors), beyond roadways and sidewalks.
  • FIG. 1 is a diagram illustrating an example of a method of matching a 3D model 110 and street view data 120 according to an embodiment of the present disclosure.
  • the information processing system may acquire/receive the 3D model 110 and street view data 120 for a specific area.
  • the 3D model 110 may include 3D geometric information expressed in absolute coordinate positions and texture information corresponding thereto.
  • the location information included in the 3D model 110 may be information of higher accuracy than the location information included in the street view data 120.
  • the texture information included in the 3D model 110 may be of lower quality (e.g., lower resolution) than the texture information included in the street view data 120.
  • 3D geometric information expressed as an absolute coordinate position may be generated based on an aerial photograph taken of a specific area from above the specific area.
  • the 3D model 110 for a specific area may include, for example, a 3D building model 112, a digital elevation model (DEM) 114, a true ortho image 116 of the specific area, a road layout, a road DEM, etc.
  • An example of the three-dimensional model 110 used in the present disclosure will be described in detail later with reference to FIG. 4 .
  • the street view data 120 may include a plurality of street view images captured at a plurality of nodes within a specific area and absolute coordinate location information for each of the plurality of street view images.
  • the location information included in the street view data 120 may be of lower accuracy than the location information included in the 3D model 110, and the texture information included in the street view images may be of higher quality (e.g., higher resolution) than the texture information included in the 3D model 110.
  • the location information included in the street view data 120 may be location information obtained using a GPS device at the time each street view image was captured at a node. Location information obtained using a vehicle's GPS equipment may have an error of about 5 to 10 meters.
  • street view data may include direction information (i.e., image shooting direction information) for each of a plurality of street view images.
  • An example of a method for obtaining street view data 120 used in the present disclosure will be described in detail later with reference to FIG. 5 .
  • the information processing system may perform map matching 130 between the 3D model 110 and street view data 120. Specifically, the information processing system may perform feature matching between texture information included in the 3D model 110 and a plurality of street view images included in the street view data 120. To perform map matching 130, the information processing system may convert at least some of the plurality of street view images included in the street view data 120 into a top view image. As a result of map matching 130, a plurality of map matching points/map matching lines 132 can be extracted.
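To make the top-view conversion step concrete, the sketch below projects an equirectangular street view panorama onto the ground plane so that it can be feature-matched against ground textures of the 3D model, such as an orthoimage tile. This is a minimal sketch, not the disclosed implementation: the camera height `cam_height_m`, the ground coverage `half_extent_m`, and the output resolution are assumed values, and a level camera over locally planar ground is assumed.

```python
import numpy as np
import cv2


def equirect_to_top_view(pano, cam_height_m=2.5, half_extent_m=15.0, out_px=512):
    """Resample an equirectangular panorama onto a ground-plane grid."""
    h, w = pano.shape[:2]
    # Ground-plane grid (x to the right, y forward) centered under the camera.
    xs = np.linspace(-half_extent_m, half_extent_m, out_px)
    ys = np.linspace(half_extent_m, -half_extent_m, out_px)
    gx, gy = np.meshgrid(xs, ys)
    # Ray from the camera (at height cam_height_m) to each ground point.
    yaw = np.arctan2(gx, gy)                              # azimuth
    pitch = np.arctan2(-cam_height_m, np.hypot(gx, gy))   # below the horizon
    # Equirectangular lookup: yaw -> u, pitch -> v.
    u = (yaw / (2 * np.pi) + 0.5) * (w - 1)
    v = (0.5 - pitch / np.pi) * (h - 1)
    return cv2.remap(pano, u.astype(np.float32), v.astype(np.float32),
                     interpolation=cv2.INTER_LINEAR)
```

Feature matching between the resulting top view and an orthoimage tile can then use any standard detector/matcher, as in the inter-image matching sketch given later.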
  • the map matching point may represent a corresponding pair of a point in the street view image and a point in the 3D model 110.
  • the type of map matching point may vary depending on the type of 3D model 110 used for map matching 130, the location of the point, etc.
  • map matching points may include Ground Control Points (GCPs), which are point correspondence pairs on the ground within a specific area, and Building Control Points (BCPs), which are point correspondence pairs on buildings within a specific area.
  • Map matching points can be extracted not only from the ground, buildings, and structures described above, but also from arbitrary areas of the street view image and the 3D model 110.
  • the map matching line may represent a corresponding pair of one line of the street view image and one line of the 3D model 110.
  • the type of map matching line may vary depending on the type of 3D model 110 used for map matching 130, the location of the line, etc.
  • map matching lines may include Ground Control Lines (GCLs), which are corresponding pairs of lines on the ground within a specific area, and Building Control Lines (BCLs), which are corresponding pairs of lines on buildings within a specific area.
  • Map matching lines can be extracted not only from the ground, buildings, structures, and lanes described above, but also from arbitrary areas of the street view image and the 3D model 110.
  • the information processing system may perform feature matching 150 between a plurality of street view images to extract a plurality of feature point correspondence sets 152.
  • feature matching 150 between a plurality of street view images may be performed using at least a portion of the 3D model 110.
  • feature matching 150 between street view images can be performed using the 3D building model 112 included in the 3D model 110.
  • An example in which the information processing system extracts a plurality of feature point correspondence sets 152 by performing feature matching 150 between a plurality of street view images included in the street view data 120 will be described in detail later with reference to FIG. 7.
  • the information processing system may estimate absolute coordinate position information and direction information for the plurality of street view images based on at least some of the plurality of map matching points/lines 132 and at least some of the plurality of feature point correspondence sets 152 (160).
  • the processor may estimate absolute coordinate position information and direction information for a plurality of street view images using a bundle adjustment technique (160).
  • the estimated absolute coordinate position information and direction information 162 is information in an absolute coordinate system representing the 3D model 110, and may be a parameter of 6 degrees of freedom (DoF).
  • the absolute coordinate location information and direction information 162 estimated through this process may be data with higher precision than the absolute coordinate location information and direction information included in the street view data 120.
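As a concrete illustration of this estimation step, the sketch below refines a single panorama's 6-DoF pose (rotation vector plus camera position) by nonlinear least squares over the angular reprojection error of 2D-3D map matching points, starting from the low-accuracy GPS pose. It is a minimal sketch under assumptions, not the disclosed implementation: function and parameter names are illustrative, the panorama is assumed equirectangular, and a full bundle adjustment would jointly optimize all panoramas and also use the map matching lines and inter-image feature point correspondence sets.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def pixel_to_ray(uv, width, height):
    """Unit rays in the camera frame for equirectangular pixels (u, v)."""
    yaw = (uv[..., 0] / width - 0.5) * 2 * np.pi
    pitch = (0.5 - uv[..., 1] / height) * np.pi
    return np.stack([np.cos(pitch) * np.sin(yaw),
                     np.cos(pitch) * np.cos(yaw),
                     np.sin(pitch)], axis=-1)


def residuals(pose, uv, xyz, width, height):
    """Angular reprojection error for one panorama.

    pose: [rx, ry, rz, tx, ty, tz] -- rotation vector + camera position.
    uv:   N x 2 map matching points observed in the panorama.
    xyz:  N x 3 corresponding absolute-coordinate points of the 3D model
          (fixed, which anchors the pose in absolute coordinates).
    """
    rot = Rotation.from_rotvec(pose[:3])
    cam = rot.inv().apply(xyz - pose[3:])             # world -> camera frame
    pred = cam / np.linalg.norm(cam, axis=1, keepdims=True)
    return (pred - pixel_to_ray(uv, width, height)).ravel()


def refine_pose(gps_pose, uv, xyz, width, height):
    """Refine a low-accuracy (e.g., GPS) pose to higher accuracy."""
    result = least_squares(residuals, gps_pose, method="lm",
                           args=(uv, xyz, width, height))
    return result.x  # refined 6-DoF pose in the model's absolute coordinates
```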
  • In this way, the 3D model 110, which includes 3D geometric information and high-accuracy location and direction information, can be matched with the street view data 120, which includes high-quality texture information.
  • Figure 2 is a schematic diagram showing a configuration in which the information processing system 230 according to an embodiment of the present disclosure is connected to communicate with a plurality of user terminals 210_1, 210_2, and 210_3.
  • a plurality of user terminals 210_1, 210_2, and 210_3 may be connected to an information processing system 230 capable of providing a map information service through a network 220.
  • the plurality of user terminals 210_1, 210_2, and 210_3 may include terminals of users receiving a map information service.
  • the plurality of user terminals 210_1, 210_2, and 210_3 may include vehicles that capture street view images at nodes.
  • the information processing system 230 may include one or more server devices and/or databases capable of storing, providing, and executing computer-executable programs (e.g., downloadable applications) and data related to providing a map information service, or may include one or more distributed computing devices and/or distributed databases based on cloud computing services.
  • the map information service provided by the information processing system 230 may be provided to the user through an application or web browser installed on each of the plurality of user terminals 210_1, 210_2, and 210_3.
  • the information processing system 230 may provide information corresponding to a street view image request, an image-based location recognition request, etc. received from the user terminals 210_1, 210_2, and 210_3 through an application, or may perform corresponding processing.
  • a plurality of user terminals 210_1, 210_2, and 210_3 may communicate with the information processing system 230 through the network 220.
  • the network 220 may be configured to enable communication between a plurality of user terminals 210_1, 210_2, and 210_3 and the information processing system 230.
  • the network 220 may consist of, for example, wired networks such as Ethernet, power line communication, telephone line communication, and RS serial communication; wireless networks such as a mobile communication network, wireless LAN (WLAN), Wi-Fi, Bluetooth, and ZigBee; or a combination thereof.
  • the communication method is not limited, and may include not only communication methods utilizing communication networks that the network 220 may include (e.g., mobile communication networks, wired Internet, wireless Internet, broadcasting networks, satellite networks, etc.), but also short-range wireless communication between the user terminals 210_1, 210_2, and 210_3.
  • In FIG. 2, the mobile phone terminal 210_1, tablet terminal 210_2, and PC terminal 210_3 are shown as examples of user terminals, but the user terminals are not limited thereto and may be any computing device capable of wired and/or wireless communication and of installing and executing an application or a web browser.
  • For example, user terminals may include AI speakers, smartphones, mobile phones, navigation devices, computers, laptops, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), tablet PCs, game consoles, wearable devices, internet of things (IoT) devices, virtual reality (VR) devices, augmented reality (AR) devices, set-top boxes, etc.
  • Additionally, in FIG. 2, three user terminals 210_1, 210_2, and 210_3 are shown communicating with the information processing system 230 through the network 220, but the number is not limited thereto; a different number of user terminals may be configured to communicate with the information processing system 230 through the network 220.
  • the information processing system 230 may receive data associated with a street view image request including information about a specific point on an electronic map from the user terminals 210_1, 210_2, and 210_3. Then, the information processing system 230 may transmit a street view image for a specific point on the received electronic map to the user terminals 210_1, 210_2, and 210_3. Additionally or alternatively, the information processing system 230 may receive data associated with a video-based location recognition request including a video or image captured at a specific point from the user terminals 210_1, 210_2, and 210_3.
  • the information processing system 230 can estimate accurate location information and direction information for the point where the video or image was captured, and transmit the estimated location information and direction information to the user terminals 210_1, 210_2, and 210_3.
  • the information processing system 230 may transmit various service-related data based on data created by matching the 3D model and street view data to the user terminals 210_1, 210_2, and 210_3.
  • FIG. 3 is a block diagram showing the internal configuration of the user terminal 210 and the information processing system 230 according to an embodiment of the present disclosure.
  • the user terminal 210 may refer to any computing device capable of executing an application or a web browser and capable of wired/wireless communication; for example, it may include the mobile phone terminal 210_1, tablet terminal 210_2, and PC terminal 210_3 of FIG. 2.
  • the user terminal 210 may include a memory 312, a processor 314, a communication module 316, and an input/output interface 318.
  • the information processing system 230 may include a memory 332, a processor 334, a communication module 336, and an input/output interface 338.
  • As shown in FIG. 3, the user terminal 210 and the information processing system 230 may be configured to communicate information and/or data through the network 220 using their respective communication modules 316 and 336. Additionally, the input/output device 320 may be configured to input information and/or data to the user terminal 210 through the input/output interface 318, or to output information and/or data generated by the user terminal 210.
  • Memories 312 and 332 may include any non-transitory computer-readable recording medium. According to one embodiment, the memories 312 and 332 may include permanent mass storage devices such as read-only memory (ROM), disk drives, solid state drives (SSD), and flash memory. As another example, a permanent mass storage device such as ROM, SSD, flash memory, or a disk drive may be included in the user terminal 210 or the information processing system 230 as a separate persistent storage device distinct from memory. Additionally, the memories 312 and 332 may store an operating system and at least one program code (e.g., code for an application installed and running on the user terminal 210).
  • These software components may be loaded from a computer-readable recording medium separate from the memories 312 and 332.
  • This separate computer-readable recording medium may include a recording medium directly connectable to the user terminal 210 and the information processing system 230, for example, computer-readable recording media such as a floppy drive, disk, tape, DVD/CD-ROM drive, and memory card.
  • As another example, software components may be loaded into the memories 312 and 332 through a communication module rather than from a computer-readable recording medium. For example, at least one program may be loaded into the memories 312 and 332 based on a computer program installed from files provided over the network 220 by developers or by a file distribution system that distributes application installation files.
  • the processors 314 and 334 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. Instructions may be provided to the processors 314 and 334 by memories 312 and 332 or communication modules 316 and 336. For example, processors 314 and 334 may be configured to execute received instructions according to program codes stored in recording devices such as memories 312 and 332.
  • the communication modules 316 and 336 may provide a configuration or function for the user terminal 210 and the information processing system 230 to communicate with each other through the network 220, and may provide a configuration or function for the user terminal 210 and/or the information processing system 230 to communicate with other user terminals or other systems (for example, a separate cloud system). For example, a request or data generated by the processor 314 of the user terminal 210 according to program code stored in a recording device such as the memory 312 (e.g., data associated with a street view image request for a specific area) may be transmitted to the information processing system 230 through the network 220 under the control of the communication module 316.
  • Conversely, a control signal or command provided under the control of the processor 334 of the information processing system 230 may be transmitted through the communication module 336 and the network 220 and received by the user terminal 210 through the communication module 316 of the user terminal 210. For example, the user terminal 210 may receive data related to a street view image for a specific area from the information processing system 230.
  • the input/output interface 318 may be a means for interfacing with the input/output device 320.
  • input devices may include devices such as cameras (including audio sensors and/or image sensors), keyboards, microphones, and mice;
  • output devices may include devices such as displays, speakers, and haptic feedback devices.
  • the input/output interface 318 may be a means for interfacing with a device, such as a touch screen, in which components or functions for performing input and output are integrated into one.
  • the processor 314 of the user terminal 210, when processing instructions of the computer program loaded in the memory 312, may configure a service screen or the like using information and/or data provided by the information processing system 230 or another user terminal, and display it through the input/output device 320.
  • In FIG. 3, the input/output device 320 is shown as not being included in the user terminal 210, but the present invention is not limited thereto, and the input/output device 320 may be configured as a single device with the user terminal 210. Additionally, the input/output interface 338 of the information processing system 230 may be a means for interfacing with a device (not shown) for input or output that may be connected to, or included in, the information processing system 230.
  • In FIG. 3, the input/output interfaces 318 and 338 are shown as elements configured separately from the processors 314 and 334, but the present invention is not limited thereto, and the input/output interfaces 318 and 338 may be configured to be included in the processors 314 and 334.
  • the user terminal 210 and the information processing system 230 may include more components than those shown in FIG. 3. However, most conventional components need not be shown explicitly. According to one embodiment, the user terminal 210 may be implemented to include at least some of the input/output devices 320 described above. Additionally, the user terminal 210 may further include other components such as a transceiver, a global positioning system (GPS) module, a camera, various sensors, and a database.
  • For example, components generally included in a smartphone, such as an acceleration sensor, a gyro sensor, an image sensor, a proximity sensor, a touch sensor, an illuminance sensor, a camera module, various physical buttons, buttons using a touch panel, input/output ports, and a vibrator for vibration, may be implemented to be further included in the user terminal 210.
  • the processor 314 of the user terminal 210 may be configured to operate an application that provides a map information service. At this time, code associated with the corresponding application and/or program may be loaded into the memory 312 of the user terminal 210.
  • the processor 314 may receive text, images, videos, voices, and/or actions input or selected through input devices such as a touch screen, a keyboard, a camera including an audio sensor and/or an image sensor, and a microphone connected to the input/output interface 318, and may store the received text, images, videos, voices, and/or actions in the memory 312 or provide them to the information processing system 230 through the communication module 316 and the network 220. For example, the processor 314 may receive a user's input requesting a street view image for a specific area and provide it to the information processing system 230 through the communication module 316 and the network 220.
  • the processor 314 of the user terminal 210 may be configured to manage, process, and/or store information and/or data received from the input/output device 320, other user terminals, the information processing system 230, and/or a plurality of external systems. Information and/or data processed by the processor 314 may be provided to the information processing system 230 via the communication module 316 and the network 220.
  • the processor 314 of the user terminal 210 may transmit information and/or data to the input/output device 320 through the input/output interface 318 and output the information. For example, the processor 314 may display the received information and/or data on the screen of the user terminal.
  • the processor 334 of the information processing system 230 may be configured to manage, process, and/or store information and/or data received from a plurality of user terminals 210 and/or a plurality of external systems. Information and/or data processed by the processor 334 may be provided to the user terminal 210 through the communication module 336 and the network 220.
  • Figure 4 is a diagram showing an example of a 3D model according to an embodiment of the present disclosure.
  • a 3D model for a specific area may include 3D geometric information expressed in absolute coordinate positions and texture information corresponding thereto.
  • the 3D geometric information included in a 3D model for a specific area does not need to include information about all areas of the specific area.
  • a 3D model may only include geometric information and corresponding texture information for some areas of a specific area, such as the ground, buildings, or structures.
  • the 3D model may be a model created based on a digital elevation model 410 including geometric information about the ground of a specific area and a precise orthoimage 420 corresponding thereto.
  • the 3D model is based on a digital elevation model 410 containing geometric information about the ground of a specific area, a plurality of corresponding aerial photographs, and absolute coordinate position information and direction information 430 of each aerial photograph. It may be a generated model. In one embodiment, a precise orthoimage may be generated based on a plurality of aerial photos and the absolute coordinate location information and direction information 430 of each aerial photo.
  • the 3D model may be a model created based on a 3D mesh model 440 containing geometric information about buildings in a specific area and atlas data 450 containing texture information corresponding to the 3D mesh model.
  • the 3D mesh model may be a 3D triangular mesh model.
  • the 3D model may be a model created based on a 3D mesh model 440 containing geometric information about buildings in a specific area, a plurality of corresponding aerial photographs, and the absolute coordinate location information and direction information 430 of each aerial photograph.
  • the 3D model may be a model created based on a 3D mesh model containing geometric information about structures such as signs and traffic lights in a specific area and the corresponding texture information.
  • FIG. 5 is a diagram illustrating an example of acquiring street view data according to an embodiment of the present disclosure.
  • Street view data may include a plurality of street view images captured at a plurality of nodes within a specific area 520 and absolute coordinate location information for the plurality of street view images.
  • the plurality of nodes may be virtual nodes arranged at predefined intervals (e.g., 5 m intervals, 10 m intervals, etc.) along the road 522 of a specific area 520.
  • a plurality of street view images can be obtained from a plurality of images taken of the specific area 520 in various directions while driving on the road 522 of the specific area 520 using a vehicle 510 equipped with at least one camera 512.
  • the at least one camera 512 may be, for example, four fisheye cameras.
  • each street view image may be a 360-degree panoramic image generated based on a plurality of images taken in various directions at each node.
  • the street view image for each node may be a 360-degree panoramic image created by stitching multiple images taken in various directions at each node. In this way, the street view image is generated based on an image taken from a relatively close distance of a space within a specific area, and may include high-quality texture information.
  • location information about a plurality of street view images included in street view data may be location information with relatively low accuracy.
  • the location information for a plurality of street view images included in the street view data may be location information acquired by a GPS device installed in the vehicle 510 at the node where each street view image was captured.
  • Location information obtained using the GPS equipment provided in the vehicle 510 may have an error of about 5 to 10 meters.
  • FIG. 6 is a diagram illustrating an example of extracting a map matching point 632 and/or a map matching line 634 by performing feature matching between a 3D model and street view data according to an embodiment of the present disclosure.
  • the information processing system may perform map matching between the 3D model and street view data to extract a plurality of map matching points 632 and/or a plurality of map matching lines 634.
  • the information processing system converts the street view image 610 into a top view image 612, and can extract a plurality of map matching points 632 and/or a plurality of map matching lines 634 by performing feature matching between texture information 620 of the 3D model corresponding to the street view image 610 (e.g., an orthoimage or aerial image of the 3D model) and the top-view-converted street view image 612.
  • the texture information 620 of the 3D model corresponding to the street view image 610 may be obtained using absolute coordinate position information associated with the street view image 610.
  • Each map matching point 632 may represent a corresponding pair of one point of the street view image 610 (e.g., (u1, v1)) and one point of the 3D model (e.g., (x1, y1, z1)).
  • the method of representing a point of the street view image 610 in each map matching point 632 may vary depending on the format of the street view image 610 (e.g., equirectangular format, cubic format, projection format, etc.).
  • the type of map matching point 632 may vary depending on the type of 3D model used for map matching, the location of the point, etc.
  • the map matching point 632 may include at least one of a ground control point, which is a corresponding pair of points on the ground within a specific area, a building control point, which is a corresponding pair of points on a building within a specific area, or a structure control point, which is a corresponding pair of points on a structure within a specific area.
  • the map matching point 632 can be extracted not only from the ground, buildings, and structures described above, but also from arbitrary areas of the street view image 610 and the 3D model.
  • Each map matching line 634 may represent a corresponding pair of one line of the street view image 610 and one line of the 3D model.
  • A line may be expressed by any method (e.g., two points, a line segment equation, or one point and a direction vector).
  • For example, each map matching line 634 may include a corresponding pair of two points (u1, v1), (u2, v2) of the street view image 610 and two points (x1, y1, z1), (x2, y2, z2) of the 3D model.
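As one possible concrete representation (an illustrative assumption, since the text allows any expression method), a map matching line pair could be stored with two endpoints on each side:

```python
from dataclasses import dataclass
from typing import Tuple

Pixel = Tuple[float, float]           # (u, v) in the street view image
Point3D = Tuple[float, float, float]  # (x, y, z) in absolute coordinates


@dataclass
class MapMatchingLine:
    """One line of the street view image paired with one line of the 3D model.

    Two-endpoint form; a segment equation or a point plus direction vector
    would work equally well, per the text above.
    """
    image_pts: Tuple[Pixel, Pixel]      # e.g., ((u1, v1), (u2, v2))
    model_pts: Tuple[Point3D, Point3D]  # e.g., ((x1, y1, z1), (x2, y2, z2))
    kind: str = "GCL"                   # ground, building, structure, or lane control line
```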
  • the type of map matching line 634 may vary depending on the type of 3D model used for map matching, the location of the line, etc.
  • the map matching line 634 may include at least one of a ground control line, which is a corresponding pair of lines on the ground within a specific area, a building control line, which is a corresponding pair of lines on buildings within a specific area, a structure control line, which is a corresponding pair of lines on structures within a specific area, or a lane control line, which is a corresponding pair of lines in a lane within a specific area.
  • the map matching line 634 can be extracted not only from the ground, buildings, structures, and lanes described above, but also from arbitrary areas of the street view image 610 and the 3D model.
  • Conventionally, in order to use the standard reference points provided by the country, ground reference point surveys were performed directly, or an operator manually tagged the locations of reference points in images.
  • Such reference point surveying or tagging work is inefficient, and costs increase as the scope of the work expands.
  • In contrast, according to the present disclosure, matching information between the street view image 610 and the 3D model can be automatically extracted without additional reference point surveying or tagging work, allowing efficient use of limited resources.
  • Additionally, by using not only the map matching point 632 but also the map matching line 634, more accurate map matching can be performed even in areas where map matching points 632 are not well extracted or the extraction quality is poor, such as expressways and city streets.
  • FIG. 7 is a diagram illustrating an example of extracting a plurality of feature point correspondence sets by performing feature matching between a plurality of street view images 710, 720, and 730 included in street view data according to an embodiment of the present disclosure.
  • the information processing system may perform feature matching between a plurality of street view images 710, 720, and 730 to extract a plurality of feature point correspondence sets.
  • the method of performing feature matching between the plurality of street view images 710, 720, and 730 is not limited to a specific method; any feature matching method may be used (e.g., conventional features such as Scale Invariant Feature Transform (SIFT), or deep-learning-based features such as SuperPoint/SuperGlue and R2D2).
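A minimal sketch of the classical option named above, SIFT matching with Lowe's ratio test between two street view images; the ratio threshold is an assumed value, and a production system would add geometric verification:

```python
import cv2


def match_street_view_pair(img_a, img_b, ratio=0.75):
    """SIFT keypoint matching with a ratio test between two images."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m, n in (p for p in matches if len(p) == 2)
            if m.distance < ratio * n.distance]
    # Each surviving match links (u, v) in image A to (u', v') in image B;
    # chaining matches across adjacent nodes yields feature point
    # correspondence sets such as {(u1, v1), (u2, v2), ..., (un, vn)}.
    return [(kp_a[m.queryIdx].pt, kp_b[m.trainIdx].pt) for m in good]
```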
  • the information processing system may extract a plurality of feature point correspondence sets by performing feature matching between a plurality of street view images 710, 720, and 730 for adjacent nodes 712, 722, and 732.
  • the nodes may be virtual nodes arranged at predefined intervals (d in the example of FIG. 7) within the road 700 in a specific area.
  • the information processing system may include a first street view image 710 for the first node 712 on a road 700 in a specific area, a second street view image 720 for the second node 722, and A plurality of feature point correspondence sets can be extracted by performing feature matching between the third street view image 730 for the third node 732.
  • Each feature point correspondence set is a set of points, across the plurality of street view images, estimated to correspond to the same physical point (e.g., {(u1, v1), (u2, v2), ..., (un, vn)}).
  • feature matching between a plurality of street view images 710, 720, and 730 may be performed using at least a portion of a 3D model.
  • the information processing system may perform feature matching between a plurality of street view images 710, 720, and 730 using at least a portion of a 3D mesh model for a building.
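One way a building mesh could assist such matching, sketched under assumptions rather than taken from the disclosure: lift a matched pixel in one panorama to 3D using a depth map rendered from the mesh (the rendering helper is hypothetical), reproject the 3D point into the neighboring panorama, and reject matches that land too far from their counterpart. The equirectangular conventions follow the earlier sketches.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def mesh_consistent(uv_a, uv_b, depth_a, pose_a, pose_b, wh, max_px=8.0):
    """Check a match (uv_a in panorama A, uv_b in panorama B) against the mesh.

    depth_a: depth map of panorama A rendered from the 3D building mesh
             (produced by a hypothetical render_depth_from_mesh helper).
    pose_a, pose_b: (rotation_vector, camera_position) per panorama.
    """
    w, h = wh
    # Pixel in A -> unit ray in A's camera frame.
    yaw = (uv_a[0] / w - 0.5) * 2 * np.pi
    pitch = (0.5 - uv_a[1] / h) * np.pi
    ray = np.array([np.cos(pitch) * np.sin(yaw),
                    np.cos(pitch) * np.cos(yaw),
                    np.sin(pitch)])
    # Lift to absolute coordinates using the mesh-rendered depth.
    d = depth_a[int(uv_a[1]), int(uv_a[0])]
    xyz = pose_a[1] + Rotation.from_rotvec(pose_a[0]).apply(ray * d)
    # Reproject into panorama B and compare with the matched pixel.
    cam = Rotation.from_rotvec(pose_b[0]).inv().apply(xyz - pose_b[1])
    u = (np.arctan2(cam[0], cam[1]) / (2 * np.pi) + 0.5) * w
    v = (0.5 - np.arctan2(cam[2], np.hypot(cam[0], cam[1])) / np.pi) * h
    return np.hypot(u - uv_b[0], v - uv_b[1]) < max_px
```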
  • FIG. 8 is a diagram illustrating an example in which 3D geometric information included in a 3D model is projected onto the street view image 810 using the street view image 810 and estimated high-precision location information and direction information according to an embodiment of the present disclosure.
  • high accuracy absolute coordinate location information and direction information for the street view image 810 may be estimated according to the process described in FIG. 1 .
  • the estimated position information and direction information are information in an absolute coordinate system representing a three-dimensional model of a specific area, and may be parameters of 6 degrees of freedom.
  • the information processing system can match the 3D model and street view data based on the estimated high-accuracy absolute coordinate position information and direction information for the plurality of street view images. Additionally, the information processing system can project the 3D geometric information included in the 3D model onto a street view image based on this estimated information.
  • For example, a street view image 820 may be provided in which street information, generated by projecting the 3D geometric information included in the 3D model onto the existing street view image 810, is displayed in color.
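A sketch of this projection step under the same assumed equirectangular conventions: given a panorama's refined 6-DoF pose, vertices of the 3D model (here a polyline, e.g., a road edge) are projected into the image and drawn in color. Names are illustrative, and the naive line drawing ignores the 360° wrap-around seam.

```python
import numpy as np
import cv2
from scipy.spatial.transform import Rotation


def project_to_equirect(xyz, rotvec, cam_pos, width, height):
    """Project absolute-coordinate 3D points into an equirectangular image."""
    cam = Rotation.from_rotvec(rotvec).inv().apply(np.asarray(xyz) - cam_pos)
    u = (np.arctan2(cam[:, 0], cam[:, 1]) / (2 * np.pi) + 0.5) * width
    v = (0.5 - np.arctan2(cam[:, 2], np.hypot(cam[:, 0], cam[:, 1])) / np.pi) * height
    return np.stack([u, v], axis=1)


def overlay_polyline(pano, polyline_xyz, rotvec, cam_pos, color=(0, 0, 255)):
    """Draw a projected 3D polyline (e.g., a road edge) over the panorama."""
    h, w = pano.shape[:2]
    uv = project_to_equirect(polyline_xyz, rotvec, cam_pos, w, h)
    out = pano.copy()
    for (u1, v1), (u2, v2) in zip(uv[:-1], uv[1:]):
        cv2.line(out, (int(u1), int(v1)), (int(u2), int(v2)), color, 2)
    return out
```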
  • Using the matching results, a variety of services can be provided, such as realistic street view rendering services that use 3D geometric information for street view images, and image-based location recognition (visual localization) services in outdoor environments.
  • FIG. 9 is a flowchart illustrating an example of a method 900 for matching a 3D model and street view data according to an embodiment of the present disclosure.
  • The method 900 may begin with a processor (e.g., at least one processor of an information processing system) receiving a 3D model of a specific area including 3D geometric information and texture information expressed in absolute coordinate positions (S910).
  • a 3D model may be created based on an aerial photo, and texture information included in the 3D model may be of lower quality than a street view image included in street view data.
  • a 3D model of a specific area including 3D geometric information and texture information expressed in absolute coordinate positions may be a 3D model of a portion of a specific area.
  • a 3D model may be a model created based on a digital elevation model and precise orthoimagery of a specific area.
  • the 3D model may be a model created based on a digital elevation model of a specific area, a plurality of aerial photographs, and the absolute coordinate location information and direction information of each aerial photograph.
  • the 3D model may be a model created based on a 3D mesh model for a building in a specific area and atlas data including texture information corresponding to the 3D mesh model.
  • the 3D model may be a model created based on a 3D mesh model of a building in a specific area, a plurality of aerial photos, and the absolute coordinate location information and direction information of each aerial photo.
  • the processor may receive street view data including a plurality of street view images (e.g., 360-degree panoramic images) taken from a plurality of nodes within the specific area and absolute coordinate location information of the first accuracy for the plurality of street view images (S920).
  • the plurality of nodes may be nodes arranged at predefined intervals within a road in a specific area.
  • the processor may estimate absolute coordinate position information and direction information of the second accuracy for the plurality of street view images based on the 3D model and street view data (S930).
  • the second accuracy may be higher than the first accuracy.
  • the processor may perform feature matching between the 3D model and street view data to extract at least one of a plurality of map matching points or a plurality of map matching lines.
  • the processor may convert at least some of the plurality of street view images into top view images, and may extract map matching points and/or map matching lines by performing feature matching between the 3D model and the top-view-converted street view images.
  • a map matching point may represent a corresponding pair of a point in a street view image and a point in a 3D model.
  • the types of map matching points may vary depending on the type of 3D model, location of the point, etc.
  • a map matching point may include at least one of a ground control point, which is a corresponding pair of points on the ground within a specific area, a building control point, which is a corresponding pair of points on a building within a specific area, or a structure control point, which is a corresponding pair of points on a structure within a specific area.
  • a map matching line may represent a corresponding pair of one line in a street view image and one line in a 3D model.
  • the type of map matching line can vary depending on the type of 3D model, the location of the line, etc.
  • map matching lines may include at least one of ground control lines, which are corresponding pairs of lines on the ground within a specific area, building control lines, which are corresponding pairs of lines on buildings within a specific area, structure control lines, which are corresponding pairs of lines on structures within a specific area, or lane control lines, which are corresponding pairs of lines in lanes within a specific area.
  • the processor may perform feature matching between a plurality of street view images to extract a plurality of feature point corresponding sets.
  • the 3D model of a specific area may include a 3D mesh model of a building in the specific area, and the processor may perform feature matching between the plurality of street view images using at least a portion of the 3D mesh model of the building.
  • the processor may estimate absolute coordinate position information and direction information of the second accuracy for the plurality of street view images based on at least one of the plurality of map matching points or the plurality of map matching lines, together with the plurality of feature point correspondence sets.
  • the processor may estimate absolute coordinate position information and direction information of a second accuracy for a plurality of street view images using a bundle adjustment technique.
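For orientation, a textbook bundle adjustment objective of this kind (a standard formulation, not quoted from the disclosure) minimizes robust reprojection error over the panorama poses $\{R_i, t_i\}$ and tie points $\{X_j\}$, where $\pi(\cdot)$ maps a camera-frame direction to panorama pixel coordinates, $x_{ij}$ is an observed image point, and $\rho$ is an optional robust loss:

$$\min_{\{R_i,\,t_i\},\,\{X_j\}} \sum_{i,j} \rho\!\left(\left\| \pi\!\left(R_i^{\top}(X_j - t_i)\right) - x_{ij} \right\|^2\right)$$

Points $X_j$ that come from map matching points or lines are fixed by the 3D model rather than optimized, which anchors the estimated poses in the model's absolute coordinate system.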
  • the absolute coordinate position information and direction information of the estimated second accuracy may be information of an absolute coordinate system representing a three-dimensional model.
  • the processor may match the 3D model and the street view data based on second accuracy absolute coordinate position information and direction information for the estimated plurality of street view images. According to one embodiment, the processor may project 3D geometric information included in the 3D model onto a plurality of street view images.
  • the above-described method may be provided as a computer program stored in a computer-readable recording medium for execution on a computer.
  • the medium may continuously store a computer-executable program, or may temporarily store it for execution or download.
  • the medium may be a variety of recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and media configured to store program instructions, including ROM, RAM, and flash memory. Additionally, examples of other media include recording or storage media managed by app stores that distribute applications, or by sites or servers that supply or distribute various other software.
  • the processing units used to perform the techniques may be implemented within one or more ASICs, DSPs, digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform the functions described in this disclosure, a computer, or a combination thereof.
  • the various illustrative logical blocks, modules, and circuits described in connection with this disclosure may be implemented or performed with a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
  • a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, such as a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other configuration.
  • When implemented in software, the techniques may be implemented as instructions stored on a computer-readable medium, such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), programmable read-only memory (PROM), erasable-programmable read-only memory (EPROM), electrically erasable PROM (EEPROM), flash memory, compact disc (CD), or magnetic or optical data storage devices. The instructions may be executable by one or more processors and may cause the processor(s) to perform certain aspects of the functionality described in this disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Remote Sensing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present disclosure relates to a method, performed by at least one processor, for matching a three-dimensional model and street view data. The method of matching a three-dimensional model and street view data comprises the steps of: receiving a three-dimensional model of a particular area, the model comprising texture information and three-dimensional geometric information expressed in absolute coordinate positions; receiving street view data of the particular area comprising a plurality of street view images captured at a plurality of nodes in the particular area and absolute coordinate location information of a first accuracy for the plurality of street view images; and estimating absolute coordinate location information and direction information of a second accuracy for the plurality of street view images based on the three-dimensional model and the street view data. The second accuracy is higher than the first accuracy.
PCT/KR2023/016046 2022-10-17 2023-10-17 Procédé et système de mise en correspondance de modèle tridimensionnel et de données de vue de rue WO2024085600A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2022-0133418 2022-10-17
KR1020220133418A KR20240053391A (ko) 2022-10-17 2022-10-17 3차원 모델과 거리뷰 데이터를 정합시키는 방법 및 시스템

Publications (1)

Publication Number Publication Date
WO2024085600A1 (fr)

Family

ID=90738152

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2023/016046 WO2024085600A1 (fr) 2022-10-17 2023-10-17 Procédé et système de mise en correspondance de modèle tridimensionnel et de données de vue de rue

Country Status (2)

Country Link
KR (1) KR20240053391A (fr)
WO (1) WO2024085600A1 (fr)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101863188B1 (ko) * 2017-10-26 2018-06-01 (주)아세아항측 3차원 문화재 모델 구축 방법
KR102147969B1 (ko) * 2018-01-22 2020-08-25 네이버 주식회사 파노라마 뷰에 대한 3차원 모델을 생성하는 방법 및 시스템
KR102318522B1 (ko) * 2018-12-31 2021-10-29 한국전자통신연구원 수치표고모델을 이용한 3차원 가상 환경 저작 시스템 및 그것의 동작 방법
KR20210010309A (ko) * 2019-07-19 2021-01-27 네이버랩스 주식회사 항공사진을 이용하여 3차원 지도를 생성하는 장치 및 방법
KR20220064524A (ko) * 2020-11-12 2022-05-19 네이버랩스 주식회사 이미지 기반 측위 방법 및 시스템

Also Published As

Publication number Publication date
KR20240053391A (ko) 2024-04-24

Similar Documents

Publication Publication Date Title
WO2015174729A1 (fr) Procédé et système de fourniture de réalité augmentée destinés à fournir des informations spatiales, ainsi que support d'enregistrement et système de distribution de fichier
US20140300637A1 (en) Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system
WO2015014018A1 (fr) Procédé de navigation et de positionnement en intérieur pour terminal mobile basé sur la technologie de reconnaissance d'image
WO2011139115A2 (fr) Procédé pour accéder à des informations sur des personnages à l'aide d'une réalité augmentée, serveur et support d'enregistrement lisible par ordinateur
WO2011031026A2 (fr) Système de délivrance de service d'avatar en 3 dimensions et procédé d'utilisation d'image d'arrière-plan
JP7164987B2 (ja) 映像通話を利用した道案内方法およびシステム
WO2021230466A1 (fr) Procédé et système de détermination d'emplacement de véhicule
WO2016035993A1 (fr) Dispositif et procédé d'établissement de carte intérieure utilisant un point de nuage
WO2021125578A1 (fr) Procédé et système de reconnaissance de position reposant sur un traitement d'informations visuelles
WO2011034305A2 (fr) Procédé et système de mise en correspondance hiérarchique d'images de bâtiments, et support d'enregistrement lisible par ordinateur
WO2024085600A1 (fr) Procédé et système de mise en correspondance de modèle tridimensionnel et de données de vue de rue
CN116858215B (zh) 一种ar导航地图生成方法及装置
WO2021086018A1 (fr) Procédé d'affichage de réalité augmentée tridimensionnelle
WO2024085628A1 (fr) Procédé et système d'acquisition automatique de point de commande au sol
WO2024106833A1 (fr) Procédé et système permettant d'acquérir automatiquement un point de commande de bâtiment
WO2024101833A1 (fr) Procédé et système de génération d'une carte de caractéristiques visuelles à l'aide d'un modèle tridimensionnel et d'une image de vue de rue
WO2024096717A1 (fr) Procédé et système d'acquisition automatique d'une paire de correspondance de points caractéristiques entre des images de vue de rue à l'aide d'un modèle tridimensionnel
WO2024085630A1 (fr) Procédé et système d'entraînement de modèle de réseau neuronal d'extraction de caractéristiques visuelles
WO2024085631A1 (fr) Procédé et système d'acquisition automatique de ligne de commande au sol
WO2024101776A1 (fr) Procédé et système de génération de modèle de vue de rue tridimensionnelle utilisant un modèle de construction tridimensionnel et un modèle de route
WO2023128045A1 (fr) Procédé et système de génération d'image de croquis à main levée pour apprentissage automatique
WO2021206200A1 (fr) Dispositif et procédé permettant de traiter des informations de nuage de points
WO2021075878A1 (fr) Procédé permettant de fournir un service d'enregistrement de réalité augmentée et terminal utilisateur
WO2024085455A1 (fr) Procédé et système de correction de pose d'objet
WO2016035992A1 (fr) Dispositif pour l'établissement de carte d'intérieur

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23880177

Country of ref document: EP

Kind code of ref document: A1