WO2023124948A1 - Method for creating a three-dimensional map and electronic device - Google Patents

Method for creating a three-dimensional map and electronic device

Info

Publication number
WO2023124948A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
image
target
frame
images
Prior art date
Application number
PCT/CN2022/138459
Other languages
English (en)
French (fr)
Inventor
温裕祥
何凯文
李江伟
唐忠伟
郑亚
罗巍
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023124948A1

Classifications

    • G PHYSICS; G06 COMPUTING, CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 1/20: Processor architectures; processor configuration, e.g. pipelining
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/74: Image or video pattern matching; proximity measures in feature spaces

Definitions

  • the present application relates to the technical field of electronic devices, and in particular to a method for creating a three-dimensional map and an electronic device.
  • the metaverse/virtual world is intended to elevate people's understanding and perception of the world to a new level.
  • the digital world is a new dimension that expands the way humans understand and transform the world, and it is becoming mainstream under existing technologies.
  • the key link between the digital world and the real world lies in high-precision, equal-scale, high-fidelity 3D maps. Therefore, the fast, efficient and accurate creation of high-precision 3D maps is a necessary precondition for entering the digital world.
  • the production of 3D maps may involve many fields such as geographic surveying and mapping, computer vision, etc.
  • the application of 3D maps covers various products related to the virtual world such as augmented reality (AR) games and virtual reality (VR) games.
  • one solution is to accurately measure real-world coordinates as starting points or control points through real-time kinematic (RTK) differential positioning or urban control points, scan the scene with laser scanners and panoramic cameras to obtain a colored 3D point cloud as map data, and then align the real-world coordinates with the 3D point cloud through a control-point alignment algorithm to generate a high-precision 3D map with absolute positions.
  • this solution requires a dedicated point cloud acquisition device, so its dependence on equipment is high, which makes 3D map production costly and hinders popularization.
  • the point cloud collected in this scheme is generally dense, which takes up large storage space, and processing the point cloud data takes a long time, so the efficiency of creating a 3D map is low.
  • another solution is to use a mobile phone to collect real-world images and upload them to a cloud server.
  • the cloud server processes the pictures uploaded by the mobile phone in sequence, generates sparse 3D points and camera viewpoints through sparse reconstruction, then combines the 3D points and camera viewpoints for dense reconstruction, and finally generates a 3D map.
  • the cloud server needs all the frame images uploaded by the mobile phone to create a 3D map, and its processing of each frame depends on the processing result of the previous frame, so the server must process the uploaded pictures sequentially, which leads to low processing efficiency.
  • the image acquisition process places high technical demands on collectors using mobile phones: users need to understand the concept of 3D reconstruction in order to collect images suitable for creating good 3D maps. Therefore, the solution has poor universality and is not conducive to promotion and use.
  • in short, current methods for creating 3D maps suffer from low processing efficiency and poor versatility, making it difficult to create a 3D map easily and efficiently.
  • the present application provides a method for creating a three-dimensional map and an electronic device, which are used to improve the efficiency of creating a three-dimensional map, reduce the difficulty of creating a three-dimensional map, and realize the creation of a three-dimensional map simply and quickly.
  • the present application provides a method for creating a three-dimensional map, which is applied to a distributed system including multiple computing nodes.
  • each computing node selects one unprocessed frame from the multiple frames of images as the target image and performs the target processing process on it, until all of the multiple frames of images have undergone the target processing process; the multiple frames of images are images taken by the electronic device of the same environment.
  • the target processing process includes the following steps: extracting the first feature point of the target image; acquiring feature points of at least one frame of image that has already undergone the target processing process; and selecting, among the feature points of the at least one frame of image, at least one second feature point that forms a feature matching pair with the first feature point, where the first feature point and the at least one second feature point correspond to the same point in the environment. The target computing node obtains the plurality of feature matching pairs produced after the multiple frames of images have undergone the target processing process, and creates a three-dimensional point cloud map according to the plurality of feature matching pairs; the target computing node is one of the multiple computing nodes.
  • the distributed system can complete the creation of a three-dimensional point cloud map according to the multi-frame images provided by the electronic device.
  • multiple computing nodes in the distributed system can independently process different images, so the processing of the multiple frames can be parallelized across nodes and scaled to large computations, thereby improving processing efficiency and realizing the creation of 3D maps simply and quickly. A sketch of the per-frame step follows below.
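  • for illustration, the per-image target processing step (extract the first feature points of the target image, then form feature matching pairs against a previously processed frame) might be sketched as follows. SIFT, the brute-force matcher, and the 0.75 ratio threshold are assumptions of this sketch; the patent does not prescribe a specific extractor or matcher.

```python
# Sketch of the per-image "target processing" step: extract local features
# from the target image and match them against features of a previously
# processed frame. SIFT and Lowe's ratio test are illustrative choices.
import cv2

sift = cv2.SIFT_create()

def extract_features(image_path):
    """Extract keypoints and descriptors (the 'first feature points')."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    keypoints, descriptors = sift.detectAndCompute(img, None)
    return keypoints, descriptors

def match_features(desc_target, desc_prev, ratio=0.75):
    """Form feature matching pairs between the target image and a previously
    processed frame; the ratio test keeps only matches that are likely to
    correspond to the same point in the environment."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(desc_target, desc_prev, k=2)
    return [m for m, n in pairs if m.distance < ratio * n.distance]
```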
  • the target computing node creates the three-dimensional point cloud map according to the multiple feature matching pairs as follows: the target computing node acquires a plurality of first pose information from the electronic device, where the plurality of first pose information corresponds one-to-one to the multiple frames of images, and each first pose information indicates the position and orientation of the electronic device in the first three-dimensional space (the three-dimensional space corresponding to the three-dimensional point cloud map) when capturing the corresponding image; the target computing node then determines a plurality of three-dimensional points in the first three-dimensional space corresponding to the plurality of feature matching pairs, thereby obtaining the three-dimensional point cloud map.
  • the first pose information indicates the pose, that is, the position and orientation in the three-dimensional map, of the electronic device when it captured each image. Feature points in different images that correspond to the same point in the environment (a feature matching pair) also correspond to the same point in the 3D map, so the poses associated with those images are views of the same 3D point from different perspectives. Based on this relationship, the target computing node can derive the position of each 3D point from the poses of the electronic device, obtaining the multiple 3D points that constitute the 3D point cloud map.
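  • as a concrete illustration of this relationship, linear triangulation is one common way to recover a 3D point from a feature matching pair plus the two device poses; the intrinsic matrix K and the pose-to-projection conversion below are assumptions, since the patent only states that 3D points are derived from the poses and the matches.

```python
# Sketch: recover a 3D point in the first three-dimensional space from one
# feature matching pair, given each frame's first pose information (R, t).
# K (camera intrinsics) is assumed known; the patent does not specify it.
import cv2
import numpy as np

def triangulate(K, R1, t1, R2, t2, pt1, pt2):
    """pt1/pt2: pixel coordinates of the same environment point in two
    frames; R1/t1, R2/t2: rotation and translation of each frame's pose."""
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])   # 3x4 projection matrix
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])
    X = cv2.triangulatePoints(P1, P2,
                              np.float32(pt1).reshape(2, 1),
                              np.float32(pt2).reshape(2, 1))
    return (X[:3] / X[3]).ravel()                # homogeneous -> 3D point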
  • the method further includes: the target computing node acquires a plurality of second pose information from the electronic device, where the plurality of second pose information corresponds one-to-one to the multiple frames of images, and each second pose information indicates the position and orientation of the electronic device in the second three-dimensional space (the three-dimensional space corresponding to the environment) when capturing the corresponding image; the target computing node adjusts the coordinates of the three-dimensional points in the first three-dimensional space according to the plurality of second pose information and the plurality of first pose information, obtaining a 3D point cloud map at the same scale as the environment.
  • the target computing node obtains second pose information reflecting the pose of the electronic device in the real environment as measured by the device, and adjusts the 3D point cloud map accordingly, so that the map can be aligned with the real environment; this yields a 3D point cloud map at the same scale as the real environment and improves its accuracy.
  • the target computing node adjusts the coordinates of the three-dimensional points in the first three-dimensional space according to the plurality of second pose information and the plurality of first pose information as follows: the target computing node determines a plurality of transformation matrices according to the plurality of second pose information and the plurality of first pose information, where the transformation matrices correspond one-to-one to the multiple frames of images and each transformation matrix characterizes the transformation relationship between the second pose information and the first pose information of the same image; the target computing node averages the multiple transformation matrices to obtain a target transformation matrix; and the target computing node uses the target transformation matrix to transform the coordinates of the three-dimensional points in the first three-dimensional space.
  • in this way, the target computing node adjusts the coordinates of the three-dimensional points in one of the three-dimensional spaces according to the conversion relationship between the device's poses in the two spaces, so that the two three-dimensional spaces and their coordinate systems can be aligned. The 3D point cloud map adjusted in this way is therefore aligned to the 3D space of the real environment, ensuring the fidelity and accuracy of the 3D point cloud map.
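  • one plausible reading of the averaging step is sketched below: express each per-frame transform as a rotation plus a translation, average the two components across frames to obtain the target transformation matrix, then apply it to every map point. The quaternion-style rotation mean is an assumption; the patent only says the matrices are "averaged".

```python
# Sketch: average per-frame map-to-world transforms into one target
# transform and apply it to the 3D point cloud. SciPy's Rotation.mean()
# provides a well-defined average of the rotation parts.
import numpy as np
from scipy.spatial.transform import Rotation

def average_transforms(transforms):
    """transforms: list of 4x4 matrices, each relating a frame's first pose
    (map space) to its second pose (real-world space)."""
    rotations = Rotation.from_matrix([T[:3, :3] for T in transforms])
    t_mean = np.mean([T[:3, 3] for T in transforms], axis=0)
    T_target = np.eye(4)
    T_target[:3, :3] = rotations.mean().as_matrix()
    T_target[:3, 3] = t_mean
    return T_target

def align_points(points, T_target):
    """Apply the target transform to Nx3 map-point coordinates."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ T_target.T)[:, :3]
```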
  • the position indicated by each second pose information is determined by the electronic device through GPS positioning, and the orientation indicated by each second pose information is determined by the electronic device through inertial measurement unit (IMU) measurements.
  • the pose information the target computing node obtains from the electronic device is thus the device's GPS and IMU information, which accurately reflects the device's pose in the real environment; this ensures that the target computing node obtains an accurate real-environment pose, and in turn the accuracy of subsequent processing.
  • before acquiring the feature points of at least one frame of image that has undergone the target processing process, the method further includes determining the at least one frame of image by: extracting the global feature of the target image; acquiring the global feature of each frame of image that has undergone the target processing process; selecting, among the acquired global features, at least one global feature with the highest similarity to the global feature of the target image; and determining the image(s) corresponding to that at least one global feature as the at least one frame of image. A sketch of this retrieval step follows below.
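  • a minimal sketch of this retrieval step, assuming global features are plain vectors compared by cosine similarity (the patent fixes neither the descriptor nor the similarity measure):

```python
# Sketch: select the processed frames whose global features are most
# similar to the target image's global feature.
import numpy as np

def top_k_similar(target_desc, frame_descs, k=5):
    """frame_descs: dict mapping frame_id -> global descriptor vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {fid: cosine(target_desc, d) for fid, d in frame_descs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```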
  • the distributed system further includes a queue node, and before each computing node selects a frame of image to be processed from the multiple frames of images as the target image, the method further includes: the queue node receives the multiple frames of images from the electronic device and adds them to a target image queue in the order in which they are received; each computing node then selects a frame to process by reading one frame of image from the queue node's target image queue and using it as the target image.
  • multiple computing nodes in the distributed system can thus sequentially read images from the queue node and process them, which ensures that different nodes process the multiple frames of images in an orderly manner and improves processing efficiency; see the sketch below.
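  • the queue interaction might be sketched as follows, with Redis standing in for the queue node; the patent names no specific queue implementation, and the host name "queue-node" and key name here are hypothetical.

```python
# Sketch: the queue node enqueues frames in arrival order; each computing
# node blocks on the queue and takes the next frame as its target image.
import redis

r = redis.Redis(host="queue-node", port=6379)   # hypothetical queue node

def enqueue_frames(frame_ids):
    """Queue node: add frames in the order received from the device."""
    for fid in frame_ids:
        r.rpush("target_image_queue", fid)

def next_target_image():
    """Computing node: pop one frame to process as the target image."""
    _, fid = r.blpop("target_image_queue")      # blocks until a frame exists
    return fid.decode()
```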
  • the target processing process further includes: after extracting the first feature point of the target image, saving the first feature point to a storage node; correspondingly, acquiring the feature points of at least one frame of image includes acquiring those feature points from the storage node.
  • each computing node saves the feature points to the storage node after extracting the feature points from the processed image.
  • the feature points determined by all nodes are stored in the storage node, so each computing node can obtain from it the feature points of other images extracted by other computing nodes and perform subsequent processing, which improves processing efficiency; a sketch follows below.
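  • continuing the sketch above, a computing node could persist extracted features to a shared storage node and fetch those written by other nodes like so; reusing the Redis connection r from the queue sketch and pickle serialization are purely illustrative choices.

```python
# Sketch: save/load a frame's feature points through a shared storage node.
import pickle

def save_features(frame_id, keypoint_coords, descriptors):
    """Store keypoint coordinates and descriptors under the frame's key."""
    payload = pickle.dumps({"pts": keypoint_coords, "desc": descriptors})
    r.set(f"features:{frame_id}", payload)

def load_features(frame_id):
    """Fetch features extracted by any computing node, or None if absent."""
    payload = r.get(f"features:{frame_id}")
    return pickle.loads(payload) if payload else None
```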
  • the method further includes: the target computing node sending the three-dimensional point cloud map to the electronic device.
  • after the distributed system creates the three-dimensional map, it sends the map to the electronic device for the device's use.
  • electronic devices only need to provide information such as multi-frame images and poses, while the complex 3D map creation process is completed by a distributed system, which can reduce the difficulty and cost of obtaining 3D maps on the electronic device side.
  • the present application provides an electronic device, where the electronic device includes a memory and one or more processors;
  • the memory is used to store computer program code including computer instructions; when the computer instructions are executed by the one or more processors, the electronic device is caused to perform the method executed by at least one of the multiple computing nodes in the first aspect or any possible design of the first aspect.
  • the present application provides an electronic device that includes modules/units for performing the method executed by at least one of the multiple computing nodes in the first aspect or any possible design of the first aspect.
  • the present application provides a distributed system that includes a plurality of electronic devices, where each electronic device may be implemented by the electronic device of the second aspect or the electronic device of the third aspect.
  • the present application provides a computer-readable storage medium storing a computer program; when the computer program runs on a computer, the computer is caused to execute the method performed by at least one of the multiple computing nodes in the first aspect or any possible design of the first aspect.
  • the present application provides a computer program product including a computer program or instructions; when the computer program or instructions run on a computer, the computer is caused to execute the method performed by at least one of the multiple computing nodes in the first aspect or any possible design of the first aspect.
  • FIG. 1 is a schematic diagram of a hardware architecture of an electronic device provided in an embodiment of the present application
  • FIG. 2 is a schematic diagram of a software architecture of an electronic device provided in an embodiment of the present application.
  • FIG. 3a is a schematic diagram of the architecture of an application system to which the solution provided by the embodiments of the present application is applicable;
  • FIG. 3b is a schematic structural diagram of a possible application system for the solution provided by the embodiments of the present application;
  • FIG. 4 is a schematic diagram of an interface of an electronic device mapping initialization provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a display interface when an electronic device scans an environment provided by an embodiment of the present application
  • FIG. 6a is a schematic diagram of an interface of an electronic device displaying a grid provided by an embodiment of the present application
  • FIG. 6b is a schematic diagram of an interface of an electronic device displaying a grid provided by an embodiment of the present application.
  • FIG. 6c is a schematic diagram of a display interface when an electronic device triggers mapping provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a three-dimensional map provided by an embodiment of the present application.
  • FIG. 8a is a schematic diagram of an interface of an electronic device displaying three-dimensional map information provided by an embodiment of the present application.
  • FIG. 8b is a schematic diagram of an interface of an electronic device displaying three-dimensional map information provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for creating a three-dimensional map provided in an embodiment of the present application.
  • FIG. 10a is a schematic diagram of a display interface during the use of a three-dimensional map by an electronic device according to an embodiment of the present application;
  • FIG. 10b is a schematic diagram of an interface of an electronic device displaying a real-scene image combined with virtual digital resources according to an embodiment of the present application;
  • FIG. 11 is a schematic diagram of a method for creating a three-dimensional map provided in an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may be a device with a wireless connection function.
  • the electronic device may be a device equipped with a display screen, a camera, and a sensor.
  • the electronic device may be a portable device, such as a mobile phone, a tablet computer, a wearable device with a wireless communication function (for example, a watch, a bracelet, a helmet, or a headset), a vehicle-mounted terminal device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), etc.
  • electronic devices may also be smart home devices (such as smart TVs and smart speakers), smart cars, smart robots, workshop equipment, wireless terminals in self-driving, wireless terminals in remote medical surgery, wireless terminals in smart grid, wireless terminals in transportation safety, wireless terminals in smart city, wireless terminals in smart home, or flight equipment (for example, intelligent robots, hot air balloons, drones, and airplanes).
  • the electronic device may also be a portable terminal device that further includes other functions such as personal digital assistant and/or music player functions.
  • portable terminal devices include, but are not limited to, portable terminal devices carrying various operating systems.
  • the above-mentioned portable terminal device may also be other portable terminal devices, such as a laptop computer (Laptop) with a touch-sensitive surface (such as a touch panel).
  • the above-mentioned electronic device may not be a portable terminal device, but a desktop computer with a touch-sensitive surface (such as a touch panel).
  • distributed processing refers to decomposing a task that requires large computing power into multiple small tasks, assigning the small tasks to multiple computing nodes for processing, and finally combining the processing results of the multiple computing nodes into the final result. This saves overall processing time and greatly improves computational efficiency; a sketch follows below.
  • a system composed of multiple computing nodes is a distributed system. Multiple computing nodes can be deployed on the same device, or separately deployed on multiple devices connected through the network.
  • a computing node may be an electronic device (such as a server, etc.), or may be software, a program, a service, a device, etc. (such as a central processing unit CPU, an image processor GPU, etc.) in the electronic device.
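  • the idea can be sketched with a process pool standing in for networked computing nodes; the tasks, per-task work, and combine step below are placeholders.

```python
# Sketch: decompose a job into small independent tasks, fan them out to
# workers, then combine the partial results into the final result.
from concurrent.futures import ProcessPoolExecutor

def process_one(task):
    return task * task                          # placeholder per-task work

def run_distributed(tasks, workers=4):
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_one, tasks))
    return sum(partial_results)                 # placeholder combine step
```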
  • Distributed storage is to store data dispersedly in multiple independent storage nodes.
  • the distributed storage system adopts a scalable system structure and uses multiple storage nodes to share the storage load, which not only solves the bottleneck problem of a single storage node in the traditional centralized storage system, but also improves the reliability, availability and scalability of the system.
  • a storage node may be an electronic device (such as a server), or may be a software program, service, device, etc. (such as a temporary storage, a permanent storage, etc.) in the electronic device.
  • Message middleware refers to a message-oriented system, which is the basic software for sending and receiving messages in a distributed system.
  • Message middleware can also be called a message queue, which can use an efficient and reliable message delivery mechanism for data transmission and communication, and integrate distributed systems based on data communication.
  • message middleware can extend the communication of system processes in a distributed environment.
  • local image features are the local expression of image features, reflecting local characteristics of an image.
  • Local image features may be features extracted from local areas of the image, such as features extracted from areas such as image edges, corners, lines, curves, and areas with special attributes.
  • local image features include scale-invariant feature transform (SIFT) features, speeded-up robust features (SURF), and others. These local features can be used to detect key points or feature points in an image. Local image features are rich in content, have low correlation between features, and, under occlusion, the disappearance of some features does not affect the detection and matching of others, so they are suitable for applications such as image matching and retrieval.
  • global image features refer to the overall attributes of an image and can represent features of the entire image.
  • the global features are relative to the local features of the image, and can be used to describe the overall features such as the color and shape of the image or target.
  • common global features include color features, texture features, and shape features, such as weighted residual sums of local-feature cluster centers, bag of words (BoW), and vector of locally aggregated descriptors (VLAD).
  • Global features have the characteristics of good invariance, simple calculation, and intuitive representation.
  • local image features are referred to as local features for short
  • global image features are referred to as global features for short.
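  • as one concrete example of a global feature from the list above, a bag-of-words descriptor can be sketched as follows; the visual vocabulary (cluster centers) is assumed to have been trained beforehand, e.g. by k-means over many local descriptors.

```python
# Sketch: build a whole-image (global) descriptor from local descriptors by
# assigning each local feature to its nearest visual word and normalizing
# the resulting histogram.
import numpy as np

def bow_descriptor(local_descs, vocab):
    """local_descs: NxD local features; vocab: KxD cluster centers."""
    dists = np.linalg.norm(local_descs[:, None, :] - vocab[None, :, :], axis=2)
    words = dists.argmin(axis=1)                # nearest word per feature
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)
```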
  • a voxel (short for volume element, or volume pixel) is the smallest unit of digital data in the segmentation of three-dimensional space, conceptually similar to the pixel, the smallest unit of two-dimensional space.
  • a voxel can represent a volumetric region with a constant scalar or vector. Volumes containing voxels can be represented by volume rendering or by extracting polygonal isosurfaces with given threshold contours.
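  • a minimal sketch of quantizing points into such a voxel grid (the 0.05 m voxel size is an arbitrary choice):

```python
# Sketch: map Nx3 points to the set of occupied integer voxel coordinates.
import numpy as np

def voxelize(points, voxel_size=0.05):
    return {tuple(v) for v in np.floor(points / voxel_size).astype(int)}
```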
  • a white model refers to the mesh of a model.
  • a 3D model is composed of polygons, and a complex polygon is actually composed of multiple triangles, so the surface of a 3D model is composed of multiple connected triangles. In three-dimensional space, the collection of the points and edges of these triangles is the white model.
  • "at least one" in the embodiments of the present application refers to one or more, and "multiple" refers to two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B, which can mean: A exists alone, A and B exist simultaneously, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the contextual objects are an “or” relationship.
  • "at least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items.
  • for example, at least one (item) of a, b or c can represent: a; b; c; a and b; a and c; b and c; or a, b and c, where each of a, b and c can be single or multiple.
  • Embodiments of the present application provide a method for creating a 3D map and an electronic device.
  • the solution can easily and quickly realize the creation of a 3D map, improve the efficiency of creating a 3D map, and reduce the difficulty of creating a 3D map.
  • the solutions provided by the embodiments of the present application can be applied to a system composed of electronic devices and a distributed mapping system.
  • electronic devices are used to collect information such as environmental images and positioning parameters and report them to the distributed mapping system.
  • the distributed mapping system includes multiple computing nodes, and multiple computing nodes are used to create three-dimensional maps based on the information reported by electronic devices. Multiple compute nodes may be deployed on the same electronic device or on different electronic devices.
  • referring to FIG. 1, the structure of an electronic device to which the method provided in the embodiments of the present application is applicable is introduced below.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a SIM card interface 195, etc.
  • the sensor module 180 may include a gyroscope sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, a temperature sensor, a pressure sensor, a distance sensor, a magnetic sensor, an ambient light sensor, an air pressure sensor, a bone conduction sensor, and the like.
  • the electronic device 100 shown in FIG. 1 is only an example and does not constitute a limitation on the electronic device; the electronic device may have more or fewer components than shown in the figure, may combine two or more components, or may have a different configuration of components.
  • the various components shown in FIG. 1 may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
  • the processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices or may be integrated in one or more processors. The controller may be the nerve center and command center of the electronic device 100; it can generate operation control signals according to instruction opcodes and timing signals, and complete the control of instruction fetching and execution.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • the memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory, which avoids repeated access and reduces the waiting time of the processor 110, thereby improving system efficiency.
  • the execution of the method for creating a three-dimensional map provided by the embodiments of the present application may be controlled by the processor 110 or completed by calling other components, for example by calling the processing program of this embodiment stored in the internal memory 121, or by calling, through the external memory interface 120, the processing program of this embodiment stored in a third-party device, to control the wireless communication module 160 to perform data communication with other devices, thereby improving the intelligence and convenience of the electronic device 100 and the user experience.
  • the processor 110 may include different devices. For example, when a CPU and a GPU are integrated, the CPU and the GPU may cooperate to execute the method for creating a three-dimensional map provided in the embodiments of the present application; for instance, some algorithms of the method are executed by the CPU and others by the GPU, for faster processing.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the display screen 194 can be used to display information input by the user or information provided to the user and various graphical user interfaces (graphical user interface, GUI). For example, the display screen 194 can display photos, videos, web pages, or files, etc.
  • the display screen 194 may be an integral flexible display screen, or a spliced display screen composed of two rigid screens and a flexible screen between the two rigid screens.
  • a camera 193 (either a front camera or a rear camera, or a camera that can function as both a front camera and a rear camera) is used to capture still images or video.
  • the camera 193 may include a photosensitive element such as a lens group and an image sensor, wherein the lens group includes a plurality of lenses (convex lens or concave lens), which are used to collect light signals reflected by objects to be photographed, and transmit the collected light signals to the image sensor .
  • the image sensor generates an original image of the object to be photographed according to the light signal.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data. Wherein, the stored program area can store the codes of the operating system and application programs (such as the function of creating a three-dimensional map, etc.).
  • the storage data area can store data created during the use of the electronic device 100 and the like.
  • the internal memory 121 may also store one or more computer programs corresponding to the three-dimensional map creation algorithm provided in the embodiment of the present application.
  • the one or more computer programs are stored in the above-mentioned internal memory 121 and are configured to be executed by the one or more processors 110, the one or more computer programs include instructions, and the above-mentioned instructions can be used to execute the following embodiments various steps.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the code of the 3D map creation algorithm provided in the embodiment of the present application may also be stored in an external memory.
  • the processor 110 may run the code of the three-dimensional map creation algorithm stored in the external memory through the external memory interface 120 .
  • the sensor module 180 may include a gyro sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, and the like.
  • the touch sensor is also known as a "touch panel".
  • the touch sensor can be arranged on the display screen 194, and the touch sensor and the display screen 194 form a touch display screen, also called “touch screen”.
  • the touch sensor is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor can also be disposed on the surface of the electronic device 100 , which is different from the position of the display screen 194 .
  • the display screen 194 of the electronic device 100 displays a main interface, and the main interface includes icons of multiple applications (such as a camera application, a WeChat application, etc.).
  • the display screen 194 displays an interface of the camera application, such as a viewfinder interface.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1 , the antenna 2 , the mobile communication module 150 , the wireless communication module 160 , a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves and radiate them through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device. In the embodiment of the present application, the mobile communication module 150 may also be used for information interaction with other devices.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent from the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the wireless communication module 160 is configured to establish a connection with other electronic devices for data interaction.
  • the wireless communication module 160 can be used to access the access point device, send control instructions to other electronic devices, or receive data sent from other electronic devices.
  • the electronic device 100 may implement audio functions through the audio module 170 , the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the electronic device 100 may receive an input of the key 190 and generate a key signal input related to user setting and function control of the electronic device 100 .
  • the electronic device 100 can use the motor 191 to generate a vibration prompt (such as a vibration prompt for an incoming call).
  • the indicator 192 in the electronic device 100 can be an indicator light, which can be used to indicate the charging status, the change of the battery capacity, and can also be used to indicate messages, missed calls, notifications and the like.
  • the SIM card interface 195 in the electronic device 100 is used for connecting a SIM card.
  • the SIM card can be connected and separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the electronic device 100 may include more or fewer components than those shown in FIG. 1 , which is not limited in this embodiment of the present application.
  • the illustrated electronic device 100 is only one example, and the electronic device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different configuration of components.
  • the various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • the Android system with layered architecture is taken as an example to illustrate the software structure of the electronic device.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces. As shown in Figure 2, the software architecture can be divided into four layers, from top to bottom are the application program layer, the application program framework layer (framework, FWK), Android runtime and system library, and the Linux kernel layer.
  • the application layer is the top layer of the operating system, including native applications of the operating system, such as camera, gallery, calendar, Bluetooth, music, video, information, and so on.
  • the application program involved in the embodiment of the present application is referred to as an application (application, APP), which is a software program capable of realizing one or more specific functions.
  • multiple applications can be installed in an electronic device.
  • the applications mentioned below may be system applications installed on the electronic device when it leaves the factory, or third-party applications downloaded from the Internet or obtained from other electronic devices by the user during the use of the electronic device.
  • the application program can be developed in the Java language by calling the application programming interface (API) provided by the application framework layer; through the application framework layer, developers can interact with the lower layers of the operating system (such as the kernel layer) and develop their own applications.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer can include some predefined functions.
  • Application framework layers can include window managers, content providers, view systems, telephony managers, resource managers, notification managers, and more.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • the data may include files (such as documents, videos, images, audios), texts and other information.
  • the view system includes visual controls, such as controls that display text, pictures, documents, etc.
  • the view system can be used to build applications.
  • the interface displayed in the window can be composed of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of electronic devices.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
  • the Android runtime includes core libraries and a virtual machine.
  • the Android runtime is responsible for the scheduling and management of the Android system.
  • the core library of the Android system includes two parts: one part is the functions that the Java language needs to call, and the other part is the core library of the Android system.
  • the application layer and the application framework layer run in virtual machines. Taking Java as an example, the virtual machine executes the Java files of the application program layer and the application program framework layer as binary files. The virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • a system library can include multiple function modules. For example: surface manager, media library, 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing, etc.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel (Kernel) layer provides the core system services of the operating system, such as security, memory management, process management, network protocol stack and driver model, etc. are all implemented based on the kernel layer.
  • the kernel layer also acts as an abstraction layer between the hardware and software stacks. There are many drivers related to electronic devices in this layer. The main drivers are: display driver; keyboard driver as an input device; Flash driver based on memory technology devices; camera driver; audio driver; Bluetooth driver; WiFi driver, etc.
  • Fig. 3a is a schematic structural diagram of a possible application system of the solution provided by the embodiment of the present application.
  • the application system provided by the present application includes electronic equipment and a distributed mapping system.
  • the electronic device may be a device equipped with a display screen, a camera, and a sensor. The electronic device captures images or videos of the environment, selects multiple frames of images, and reports them to the distributed mapping system, while also reporting the positioning parameters it collected for each of the multiple frames; the distributed mapping system then creates a 3D map corresponding to the environment.
  • after the electronic device collects images or videos of its environment and selects multiple frames of images, it can also obtain depth maps corresponding to the frames and perform processing such as voxel extraction and mesh conversion based on the depth maps, to obtain the mesh corresponding to the three-dimensional objects in the environment and upload it to the distributed mapping system; the distributed mapping system generates a white model based on the mesh, so that in subsequent processes the electronic device can use the white model to integrate the virtual digital world with the real world.
  • the distributed mapping system includes at least N computing nodes (computing node 1 to computing node N), where N is an integer greater than 1.
  • the N computing nodes include at least one CPU computing node and at least one GPU computing node.
  • the N computing nodes may be deployed in the same electronic device, or may be deployed in different electronic devices.
  • N computing nodes are connected through a communication network, and wireless communication can be performed between different computing nodes.
  • the distributed mapping system can be deployed in the cloud, and can be realized by one or more cloud servers.
  • the N computing nodes are used to create a three-dimensional map corresponding to the environment based on the distributed processing method and according to the multiple frames of images taken for the same environment (scene) reported by the electronic device and the positioning parameters corresponding to each frame of images.
  • different computing nodes can perform different processing tasks in the process of creating a 3D map, and N computing nodes jointly complete the entire process of creating a 3D map.
  • different computing nodes can perform the same type of processing on different images, so that the processing tasks of multiple frames of images are distributed to multiple computing nodes to perform synchronously, thereby speeding up image processing.
  • the computing node may include a white-model processing service, which is used to simplify the mesh uploaded by the electronic device and generate a white model from the simplified mesh.
  • the N computing nodes may include the CPU algorithm component and the GPU algorithm component shown in FIG. 3b.
  • there may be multiple CPU algorithm components and multiple GPU algorithm components in the distributed mapping system.
  • the GPU algorithm component can be used to perform image processing on multiple frames of images (such as feature extraction, matching, retrieval, etc.), and the CPU algorithm component can be used to generate a three-dimensional map according to the image processing results of the GPU algorithm component.
  • the computing node may also be implemented by other types of algorithm processing components, which are not specifically limited in this embodiment of the present application.
  • the distributed mapping system may further include task queue nodes.
  • the task queue node is used to cache, in a queue, the processing tasks generated during 3D map creation.
  • each computing node can read the tasks to be executed from the task queue node and process them accordingly, realizing orderly distributed execution of multiple processing tasks.
  • the task queue node can be implemented by using the message middleware shown in Fig. 3b.
  • the message middleware can be used to asynchronously cache 3D map creation instructions from electronic devices, instructions for processing tasks in the 3D map creation process, etc., and can be shared with or assigned to the N computing nodes so that they share the execution of tasks and balance the system load.
  • the distributed mapping system may further include at least one storage node.
  • At least one storage node is used for temporary storage or permanent storage of data related to the three-dimensional map creation process.
  • at least one storage node may store multiple frames of images, intermediate data and result data processed by multiple computing nodes, and the like.
  • the storage nodes in the distributed mapping system may include cloud databases, object storage services, elastic file services, cache-type message middleware, etc.
  • the cloud database can be used to store user information on the electronic device side, instruction information on task processing during the process of creating a three-dimensional map, modification information on the three-dimensional map, and other serialized content that takes up a small storage space.
  • the object storage service can be used to store non-serialized content such as 3D models, high-definition pictures, videos, and animations involved in electronic devices that takes up a large storage space.
  • the elastic file service can be used to store map data of a 3D map generated by a 3D map creation algorithm, and data such as intermediate variables of an algorithm that takes up a large storage space.
  • cache-type message middleware can be used to asynchronously cache serializable, small-footprint data such as intermediate variables produced while algorithms run, and to share them with the N computing nodes.
  • the distributed mapping system may further include at least one scheduling node. At least one scheduling node is used for overall management of the scheduling of some or all of the N computing nodes, task queue nodes, and at least one storage node.
  • the scheduling nodes in the distributed mapping system may include a cloud scheduling center and an algorithm scheduling center.
  • the cloud scheduling center can manage and schedule the algorithm scheduling center, storage nodes, task queue nodes, and other nodes, can exchange information and data with electronic devices, and can serve as an efficient message processing and distribution node; for example, the cloud scheduling center can provide the upload address for multi-frame pictures to the electronic device, schedule requests from the electronic device side, and issue requests to and return results from the cloud database.
  • the algorithm scheduling center is used to manage and schedule N computing nodes, and can also manage and schedule other algorithm services.
  • the distributed mapping system may further include a positioning node.
  • the positioning node is used for positioning the electronic device according to the three-dimensional map after the three-dimensional map is constructed.
  • the positioning node may include a global visual positioning system (global visual positioning system, GVPS) service and a vector retrieval system (vector retrieval system, VRS) service.
  • the GVPS service can be used for spatial positioning, determining the 6-degree-of-freedom coordinates, in the created three-dimensional map, of the position corresponding to the electronic device's current position.
  • the VRS service is used for vector searches.
  • GVPS service and VRS service can be used as sub-services of computing nodes.
  • FIG. 3a or FIG. 3b is only an exemplary description of the system applicable to the solution provided by the embodiment of the present application, and does not limit the system architecture applicable to the solution provided by the embodiment of the present application.
  • the system to which the scheme provided in the embodiment of the present application is applicable may also add, delete or adjust some nodes, which are not specifically limited in the embodiment of the present application.
  • the execution process of the solution provided by the embodiment of the present application includes at least three stages of mapping initialization, data collection, and mapping. After the three-dimensional map is created based on the method of these three stages, stages such as positioning and adding digital resources can be further included. The method of each stage will be described in detail below.
  • the creation of the three-dimensional map is triggered by the user on the electronic device side.
  • the user operates the electronic device to trigger the electronic device to start the creation of the 3D map.
  • the electronic device sends a map initialization command to the scheduling node of the distributed mapping system.
  • after the scheduling node of the distributed mapping system receives the mapping initialization instruction, it can assign a map identifier (identity document, ID) to the 3D map to be created for the current mapping task and indicate the ID to the electronic device.
  • by assigning map IDs and indicating them to electronic devices, the distributed mapping system can manage different 3D maps in a unified manner and keep 3D map information synchronized with electronic devices, avoiding problems in subsequent map processing or use caused by information inconsistencies between electronic devices and the distributed mapping system.
  • when performing mapping initialization, the electronic device may display an initialization control interface as shown in FIG. 4, on which prompt information indicating how to trigger mapping can be displayed, for example, "click the button to start recording"; the user can then trigger the mapping process by clicking the control displayed on the display screen of the electronic device according to the prompt information.
  • the electronic device can send the above-mentioned mapping initialization command to the cloud scheduling center in the distributed mapping system, and the cloud scheduling center requests the map ID from the cloud database and then indicates it to the electronic device.
  • after the electronic device triggers the mapping process, it can collect multiple frames of images of the environment and the positioning information corresponding to each frame of image and upload them to the distributed mapping system.
  • the distributed mapping system can determine the feature information of the multiple frames of images through image processing, so as to further create a three-dimensional map corresponding to the environment. This process mainly includes the following steps 1 to 4:
  • Step 1 The electronic device scans the environment where it is located and captures a video.
  • the electronic device can display the environment image currently scanned by the camera on the display screen in real time, and at the same time display prompt information prompting the user to continue scanning. The user can then continue to scan the environment by moving the electronic device according to the prompt information. During this process, the electronic device keeps collecting environment images until the user instructs it to stop scanning, after which the electronic device has obtained, through scanning, video data of the real scene within a certain space in the environment.
  • the electronic device may display the interface shown in FIG. 5.
  • the interface includes the environment image currently scanned by the camera of the electronic device, prompt information prompting the user to continue scanning, and a control for triggering the stop of scanning. The user can then scan and shoot a wider range of space by moving the electronic device.
  • when the user decides to stop scanning, the user can click the control shown in FIG. 5 that triggers the stop of scanning, and the electronic device can stop scanning and generate a video file from the scanned content.
  • Step 2 The electronic device extracts multiple frames of images meeting the key frame requirements from the video, and uploads the multiple frames of images to the distributed mapping system.
  • the electronic device may acquire the current pose information of the electronic device by running a simultaneous localization and mapping (SLAM) algorithm during the process of scanning the environment.
  • the pose information is used to represent the camera pose of the electronic device in the coordinate system of the electronic device (that is, the coordinate system created when the electronic device runs the SLAM algorithm), and the camera pose includes position and orientation.
  • when the electronic device scans different images, the shooting angles differ and the corresponding camera coordinate systems differ, so the pose information corresponding to each frame of image is different.
  • the pose information corresponding to each frame of image is the pose information measured by the SLAM algorithm when the frame of image is captured by the electronic device.
  • after the electronic device obtains the video, it can select images that meet the key frame requirements from the video in any of the following ways:
  • Way 1: for each frame of image in the video, the electronic device obtains the pose information recorded when the frame was collected and compares it with the pose information recorded when the previous frame meeting the key frame requirements was collected. If the offset between the camera coordinate systems indicated by the two pieces of pose information is greater than the set offset threshold, the frame is determined to be an image that meets the key frame requirements; otherwise, the frame is determined not to meet the key frame requirements and the next frame is judged, until every image in the video has been checked against the key frame requirements, so that the images meeting the key frame requirements are selected.
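As a hedged illustration of this first selection way, the sketch below keeps a frame as a key frame once the camera position has drifted beyond a threshold from the last key frame; it considers position only, whereas the offset in the text may also cover orientation, and the threshold value is an assumption:

```python
# Select key frames by camera-pose offset from the last key frame.
import numpy as np

POSITION_THRESHOLD = 0.3   # metres; assumed offset threshold

def select_keyframes(positions):
    """positions: list of (3,) arrays, one SLAM camera position per frame."""
    keyframes = [0]                        # the first frame is always kept
    last = positions[0]
    for i, pos in enumerate(positions[1:], start=1):
        if np.linalg.norm(pos - last) > POSITION_THRESHOLD:
            keyframes.append(i)
            last = pos
    return keyframes

positions = [np.array([0.1 * i, 0.0, 0.0]) for i in range(20)]
print(select_keyframes(positions))         # roughly every 4th frame is kept
```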
  • Way 2: the electronic device can extract the local features of each frame of image, determine the feature points in each frame according to the extracted local features, and then track the feature points using the optical flow tracking method, selecting the images that meet the key frame requirements according to the feature point tracking situation.
  • the optical flow tracking method can be used to determine whether the feature points in the current frame image exist in the next frame image, so based on the optical flow tracking method, the number of identical feature points contained in the two frames of images can be judged.
  • for each frame of image in the video, after the electronic device extracts the feature points in the frame, it determines the number of identical feature points contained in the frame and in the previous frame meeting the key frame requirements. If this number is less than a set number threshold, or the ratio of this number to the number of all feature points in the frame is less than a set ratio threshold, the frame is determined to be an image that meets the key frame requirements; otherwise, the frame is determined not to meet the key frame requirements and the next frame is judged, until every image in the video has been checked against the key frame requirements, so that the images meeting the key frame requirements are selected.
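A possible reading of this second way is sketched below with OpenCV's pyramidal Lucas-Kanade optical flow: feature points detected in the last key frame are tracked into the current frame, and the current frame becomes a new key frame when too few points survive; the detector and both thresholds are illustrative assumptions:

```python
# Decide whether the current frame is a new key frame by tracking feature
# points from the last key frame with optical flow.
import cv2

COUNT_THRESHOLD = 50     # assumed minimum number of shared feature points
RATIO_THRESHOLD = 0.5    # assumed minimum surviving-point ratio

def is_new_keyframe(last_kf_gray, cur_gray):
    """Both inputs are single-channel (grayscale) uint8 images."""
    pts = cv2.goodFeaturesToTrack(last_kf_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:                # nothing to track: start a new key frame
        return True
    _, status, _ = cv2.calcOpticalFlowPyrLK(last_kf_gray, cur_gray, pts, None)
    tracked = int(status.sum())    # points found again in the current frame
    return tracked < COUNT_THRESHOLD or tracked / len(pts) < RATIO_THRESHOLD
```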
  • the electronic device may use the first frame image in the video as the first image that meets the key frame requirements, and based on this image, continue to select images that meet the key frame requirements from the remaining frame images .
  • when the electronic device uploads images that meet the key frame requirements, it can upload them frame by frame; that is, every time the electronic device selects an image that meets the key frame requirements, it uploads that image to the distributed mapping system while continuing the process of selecting the next image that meets the key frame requirements.
  • the electronic device can also choose to obtain all the images meeting the key frame requirements, and then upload these images to the distributed mapping platform.
  • after the electronic device selects the multiple frames of images meeting the key frame requirements from the video and uploads them to the distributed mapping system, the distributed mapping system can store and manage the multi-frame images and perform image processing on them.
  • after the electronic device selects an image meeting the key frame requirements, it sends an image transmission request to the cloud scheduling center to request uploading the image. After receiving the image transmission request, the cloud scheduling center returns the URL for uploading the image to the electronic device, and the electronic device then uploads the image to the object storage service for storage according to the URL.
  • Step 3 The electronic device collects the positioning information corresponding to each frame of image and uploads it to the distributed mapping system.
  • the positioning information corresponding to each frame of image includes pose information, global positioning system (global positioning system, GPS) information and inertial measurement unit (inertial measurement unit, IMU) information when the electronic device acquires the frame of image.
  • the pose information is measured by using the SLAM algorithm when the frame of image is captured by the electronic device.
  • the GPS information is used to indicate the position of the electronic device in the real environment determined by GPS positioning when the electronic device captures the frame of image.
  • the IMU information is used to indicate the attitude feature of the electronic device measured based on the IMU sensor when the electronic device captures the frame of image, and the attitude feature may be, for example, an attitude angle.
  • the electronic device can also send the camera's intrinsic parameters to the distributed mapping system.
  • the electronic device may upload the collected positioning information to the distributed mapping system in the form of metadata.
  • electronic devices can send metadata to the cloud dispatching center.
  • after receiving the metadata, the cloud dispatching center sends the metadata to the cache-type message middleware for caching, for the CPU algorithm components or GPU algorithm components to use.
  • the cloud scheduling center can store metadata to elastic file services.
  • Step 4 The distributed mapping system performs image processing on each frame of image.
  • the computing nodes in the distributed mapping system each perform image processing on individual frames of images. Each computing node can select an unprocessed frame from the multiple frames of images and perform image processing on it, and after finishing, continue to select the next unprocessed frame for image processing, until all images have been processed.
  • when a computing node selects one frame from the multiple frames of images, it may choose randomly, or it may select according to the order of the multiple frames of images (for example, the order in which the images were uploaded to the distributed mapping system).
  • the above image processing process includes the following steps A1-A2:
  • A1 Feature extraction: the computing node extracts the local and global features of the image.
  • the computing node can perform local feature extraction and global feature extraction on the image.
  • the computing node can extract feature vectors from multi-scale grayscale features of each region in the image, obtain local features of the image, and extract feature points in the image.
  • the feature vector can be used to represent the texture feature of the local area in the image.
  • the computing node can use the trained network model to describe the features of the visually stable areas (such as buildings and roads) in the image to obtain the global features of the image.
  • the global feature can be used to characterize the overall structural features of the image.
  • A2 Serialization processing: the computing node selects, from the already processed images, the images that match the current image according to the current image's global features.
  • This step includes feature retrieval, feature matching and feature verification.
  • feature retrieval means that the computing node searches the global features of the processed images (that is, the images that have already undergone the above-mentioned image processing) using the global features of the current image, obtains a set number of global features closest in distance to the current image's global features, and takes the images corresponding to the retrieved global features as candidate frame images.
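A minimal sketch of this retrieval step, assuming global features are plain vectors compared by Euclidean distance (the patent does not fix a distance metric):

```python
# Retrieve the k processed images whose global features are closest to the
# current image's global feature; the indices identify candidate frames.
import numpy as np

def retrieve_candidates(query_feat, processed_feats, k=5):
    """query_feat: (D,) vector; processed_feats: (N, D) matrix."""
    dists = np.linalg.norm(processed_feats - query_feat, axis=1)
    return np.argsort(dists)[:k]

processed = np.random.rand(100, 128).astype(np.float32)  # cached global features
query = np.random.rand(128).astype(np.float32)
print(retrieve_candidates(query, processed))
```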
  • a set number of frames whose acquisition time is earlier than and closest to the acquisition time of the current image may also be taken as candidate frame images at the same time.
  • Feature matching means that the computing node matches the local features in the candidate frame images with the local features of the current image, so as to select at least one group of matching pairs that meet the set matching conditions, where each matching pair includes multiple feature points that match each other. When matching, the computing node can use the k-nearest neighbor (KNN) matching algorithm to select, from the local feature points of a candidate frame image, the feature points that match the local feature points in the current image; the matched feature points form matching pairs. A computing node can also train a deep learning model and use the trained model for matching to select matching pairs.
  • the set matching condition can be any of the following: the vector distance between the local feature descriptors of two feature points is less than or equal to a set threshold; the feature descriptors of the two feature points are the feature descriptors with the closest vector distance; or the ratio between the closest and the second-closest vector distance among feature descriptors of feature points is greater than or equal to a set threshold.
  • feature descriptors are used to describe local features.
  • Feature checking means that mismatched information is filtered out from the result of the feature matching processing; the feature verification processing may use algorithms such as random sample consensus (RANSAC) verification.
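The sketch below strings the matching and verification steps together using OpenCV, with ORB descriptors standing in for the patent's unspecified local features, a Lowe-style ratio test as one concrete form of the matching condition, and RANSAC epipolar checking as the feature verification; all parameter values are assumptions.

```python
# Match local features between the current image and one candidate frame,
# then verify the matches geometrically.
import cv2
import numpy as np

def match_and_verify(img1, img2, ratio=0.75):
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    if des1 is None or des2 is None:
        return []
    knn = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
    # One concrete "set matching condition": keep a match only if its closest
    # descriptor is clearly better than the second closest.
    good = [p[0] for p in knn
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    if len(good) < 8:
        return []
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
    # Feature checking: discard matches inconsistent with epipolar geometry.
    _, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
    if mask is None:
        return []
    return [m for m, keep in zip(good, mask.ravel()) if keep]
```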
  • each computing node can determine the matching pair relationships (matching or not matching) between the image it processes and the other processed images; therefore, after the image processing of all images is completed, the matching pair relationships between all images are obtained. The multiple feature points in the same group of matching pairs correspond to the same three-dimensional point in the real environment.
  • the cloud dispatching center can send a message to the queue-type message middleware to indicate the image processing task corresponding to each frame of image, and the queue-type message middleware caches the information of these tasks. Each GPU algorithm component reads an image processing task from the queue-type message middleware, reads the corresponding image from the object storage service, performs the above image processing on the image, stores the processing result (that is, the image matching information) in the elastic file service, and at the same time sends the processed identifier and the intermediate results of the processing (such as the global features of the image) to the cache-type message middleware for caching. During the image processing of subsequent GPU algorithm nodes, the global features of processed images can then be read from the cache-type message middleware for the serialization processing and the like.
  • the execution sequence of some of the above steps does not have strict timing requirements, and can be adjusted according to actual conditions.
  • the execution of the above steps 3 and 4 depends on the images selected in step 2, but steps 3 and 4 need not be ordered with respect to each other; that is, either step can be performed first and the other afterwards, or both steps can be performed at the same time.
  • each computing node performs the above processing independently of the other computing nodes, and any two computing nodes do not interfere with each other. For example, after the electronic device uploads two frames of images, computing node 1 can process the first frame while computing node 2 processes the second frame. For another example, after the electronic device uploads a frame of image, a computing node can perform the above image processing on that frame while the electronic device side continues the process of selecting the next frame of image.
  • the electronic device may also simultaneously display a grid covering the outline of the environment space on the display screen, so as to prompt and guide the user to complete the scanning process.
  • the electronic device may use a time of flight (TOF) method to collect a depth map for an image that meets the key frame requirements, or may obtain the corresponding depth map from the selected images meeting the key frame requirements using multi-view stereo matching (MVS).
  • after the electronic device obtains the depth map of each frame of image, it can use algorithms based on the truncated signed distance function (TSDF) to extract voxels from each frame of image and determine the depth value of each voxel in the three-dimensional voxel space. After obtaining the voxels, the electronic device can convert the voxels into grids using the marching cubes algorithm according to the depth value of each voxel, render the grids, and display them in the corresponding areas of the environment interface shown on the display screen.
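For illustration, the sketch below builds a synthetic TSDF volume and extracts a triangle mesh from it with the marching cubes algorithm (via scikit-image); the real pipeline would fuse the per-frame depth maps into the volume instead of using a synthetic sphere.

```python
# Illustrative only: a synthetic sphere TSDF stands in for the volume that
# would be fused from the per-frame depth maps.
import numpy as np
from skimage import measure

grid = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]     # 64^3 voxel grid
sdf = np.sqrt((grid ** 2).sum(axis=0)) - 0.6      # signed distance to a sphere
tsdf = np.clip(sdf, -0.1, 0.1)                    # truncation step of TSDF

# Marching cubes extracts the zero-crossing surface as a triangle mesh
# (vertices + faces) that can then be rendered as the on-screen grid.
verts, faces, normals, _ = measure.marching_cubes(tsdf, level=0.0)
print(verts.shape, faces.shape)
```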
  • the electronic device may perform voxel extraction and grid conversion on the depth map of the image corresponding to the interface shown in FIG. 5 to obtain a grid.
  • the electronic device can display the interface shown in FIG. 6a, in which the grid-covered area is the area that has already been scanned, and the area not covered by the grid is the area still to be scanned or an area for which a corresponding grid cannot be generated.
  • while the user operates the electronic device to scan the environment space, the electronic device can present the scanned and unscanned areas to the user in real time, so as to guide the user to continue operating the electronic device according to the grid prompts, enabling the grid to cover as many three-dimensional objects in the real environment space to be scanned as possible. This completes the scanning process simply and quickly, reduces the operational difficulty of collecting environment images, and improves the user experience.
  • a control that triggers the end of scanning is also displayed on the display screen of the electronic device, so when the user confirms that the grid covers the areas of interest, or determines that each area of interest is covered by a grid of a certain size, scanning can be stopped by operating this control.
  • when the grid displayed by the electronic device covers a certain area, the user can end the scan by clicking the control that triggers the end of scanning, and the electronic device can display the interface shown in FIG. 6c in response to the user's operation. This interface includes a control that prompts the user to start generating a map; the user can operate the control to trigger the task of creating a 3D map, so as to further generate a 3D map corresponding to the environment space based on the content scanned by the user's electronic device.
  • the shape of the grid generated by the above-mentioned electronic device may be any shape, such as regular shapes such as triangles and rectangles shown in FIG. 6b, or other irregular shapes, which are not specifically limited in this embodiment of the present application.
  • the electronic device can upload the grid obtained in the above process to the distributed mapping system, and the computing nodes in the distributed mapping system simplify the grid and correspondingly generate a white model, so that this information can be used in subsequent processing.
  • the corresponding white model can be generated through algorithms such as plane extraction, intersection calculation, polyhedron topology construction, and surface optimization.
  • when an electronic device uploads a grid, it can send a grid upload request to the cloud dispatching center, and after receiving the request the cloud dispatching center returns grid upload address information to the electronic device. The electronic device can then upload the grid obtained at the end of scanning to the object storage service for storage according to the grid upload address. After uploading the grid, the electronic device can send a grid upload completion notification message to the cloud scheduling center, and the cloud scheduling center can send a grid simplification task to the queue-type message middleware after receiving the notification message.
  • the white model processing service in the CPU algorithm component can listen to the grid simplification tasks cached in the queue-type message middleware and execute the corresponding grid simplification tasks after receiving the tasks.
  • the result obtained from the grid simplification task performed by the white model processing service can be sent to the elastic file storage service for storage, and the corresponding grid simplification completion notification message is sent to the queue-type message middleware.
  • after the cloud service center listens to the grid simplification completion notification message, it can obtain the simplified grid from the elastic file service and send it to the object storage service for storage, and at the same time store the grid simplification result (that is, the simplified grid) in the cloud database.
  • the electronic device can guide the user to collect images of real-world scenes through the visual grid, which reduces the difficulty of image collection on the user side, thereby improving the user experience.
  • the distributed mapping system can then be triggered to start the mapping task and create the three-dimensional map according to the images meeting the key frame requirements uploaded by the electronic device and the positioning information corresponding to each frame of image.
  • after the computing nodes of the distributed mapping system have performed the image processing described in step 4 above on all the images meeting the key frame requirements uploaded by the electronic device, the three-dimensional map can be created according to the following steps B1-B4:
  • B1 The computing node generates a scene matching relationship graph (scene graph) according to the multiple frames of images meeting the key frame requirements, where the scene matching relationship graph is used to represent the matching pair relationships of local feature points between the multiple frames of images.
  • the computing node can determine the common-view relationships of the multiple frames of images according to the matching relationships between all images meeting the key frame requirements, and then obtain the scene matching relationship graph after optimizing the common-view relationships.
  • the scene matching relationship graph can be regarded as an abstract network composed of "vertices" and "edges". Each vertex in the network can represent a frame of image, and each edge represents a matching pair of feature points between images. Different vertices can be connected by edges; two vertices connected by an edge have an association relationship, that is, there is a matching relationship between the two frames of images represented by the two vertices.
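A minimal sketch of such a vertices-and-edges structure, using a plain adjacency dictionary (the patent does not prescribe any particular data structure):

```python
# Scene matching relationship graph: vertices are frame indices, edges carry
# the feature matching pairs between two frames.
from collections import defaultdict

scene_graph = defaultdict(dict)   # scene_graph[i][j] -> matching pairs i<->j

def add_matches(frame_i, frame_j, matching_pairs):
    """matching_pairs: list of (feature_idx_in_i, feature_idx_in_j)."""
    if matching_pairs:                     # only matched frames get an edge
        scene_graph[frame_i][frame_j] = matching_pairs
        scene_graph[frame_j][frame_i] = [(b, a) for a, b in matching_pairs]

add_matches(0, 1, [(12, 40), (13, 41)])
add_matches(1, 2, [(40, 7)])
print(sorted(scene_graph[1]))   # frames sharing matches with frame 1: [0, 2]
```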
  • B2 The computing node determines, according to the scene matching relationship graph, the three-dimensional point in three-dimensional space corresponding to each feature point in the multiple frames of images.
  • after the computing node generates the scene matching relationship graph, it can determine the three-dimensional point corresponding to each feature point in the multiple frames of images in three-dimensional space according to the scene matching relationship graph, the pose information (camera pose) from the electronic device, and the camera intrinsic parameters from the electronic device.
  • the coordinate system corresponding to the three-dimensional space is the above-mentioned coordinate system of the electronic device.
  • the computing node can use algorithms such as direct linear transformation (DLT), combined with the camera pose and intrinsic parameters of the electronic device, to solve for the position corresponding to the feature point in three-dimensional space (that is, triangulation), and determine the point at that position as the three-dimensional point corresponding to the feature point in three-dimensional space.
  • after the computing node determines the three-dimensional points corresponding to all feature points in the scene matching relationship graph, a three-dimensional map composed of these three-dimensional points is obtained; this three-dimensional map is a three-dimensional point cloud map.
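A hedged sketch of this triangulation, using OpenCV's linear triangulation with assumed intrinsics and two illustrative camera poses (the patent names DLT but does not fix an implementation):

```python
# Triangulate one matched feature point into a 3D point from two camera
# poses ([R|t] matrices) and the camera intrinsic matrix K.
import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed intrinsics

def triangulate(pose1, pose2, pt1, pt2):
    P1, P2 = K @ pose1, K @ pose2                       # projection matrices
    X = cv2.triangulatePoints(P1, P2,
                              np.float32(pt1).reshape(2, 1),
                              np.float32(pt2).reshape(2, 1))
    return (X[:3] / X[3]).ravel()                       # homogeneous -> 3D

pose1 = np.hstack([np.eye(3), np.zeros((3, 1))])             # reference camera
pose2 = np.hstack([np.eye(3), np.array([[0.5], [0], [0]])])  # shifted camera
print(triangulate(pose1, pose2, (320, 240), (400, 240)))     # ~ [0, 0, 5]
```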
  • B3 The computing node optimizes the coordinates of the 3D points in three-dimensional space.
  • the computing node can perform bundle adjustment (BA) optimization on the 3D points obtained from the above solution, that is, jointly optimize the camera poses and the coordinates of the 3D points.
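The toy sketch below conveys the shape of BA with SciPy's least-squares solver: a camera translation and a 3D point are refined together by minimizing reprojection error. Rotations are fixed to the identity and the first camera is pinned at the origin to remove some gauge freedom; a real bundle adjuster optimizes full 6-degree-of-freedom poses over many cameras and points.

```python
# Toy bundle adjustment: jointly refine one camera translation and one 3D
# point by minimizing the reprojection error over all observations.
import numpy as np
from scipy.optimize import least_squares

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])

def project(cam_t, X):
    x = K @ (X - cam_t)                   # camera at cam_t, rotation = identity
    return x[:2] / x[2]

def residuals(params, observations):
    cam1_t, X = params[:3], params[3:]    # free camera 1 and the 3D point
    cams = [np.zeros(3), cam1_t]          # camera 0 pinned at the origin
    return np.concatenate([project(cams[ci], X) - uv
                           for ci, uv in observations])

observations = [(0, np.array([320.0, 240.0])),    # point as seen by camera 0
                (1, np.array([400.0, 240.0]))]    # point as seen by camera 1
x0 = np.concatenate([np.array([-0.5, 0.0, 0.0]),  # initial camera 1 translation
                     np.array([0.2, 0.1, 4.0])])  # noisy initial 3D point
sol = least_squares(residuals, x0, args=(observations,))
print("refined 3D point:", sol.x[3:])
```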
  • B4 The computing node generates a 3D map based on the optimized 3D points.
  • the computing node can smooth and denoise the GPS information and IMU information corresponding to each frame of image uploaded by the electronic device to obtain the real-world camera pose corresponding to the image, that is, the position and orientation of the electronic device in the environment when the image was taken (the orientation being that of the electronic device's camera relative to the environment). It can then align the coordinates of the 3D points in three-dimensional space with the real-world camera poses, so as to adjust the coordinate system of the three-dimensional space to be consistent with the coordinate system of the real-world environment space, thereby obtaining a three-dimensional point cloud map at the same scale as the real environment; this point cloud map is the point cloud map corresponding to the real environment scene.
  • in the process of aligning the coordinates of the 3D points in three-dimensional space with the real-world camera poses, the computing node first determines, for each frame of image, the transformation matrix between the pose of the electronic device in the electronic device's coordinate system and its real-world pose, then averages the transformation matrices corresponding to the multiple frames of images to obtain a target transformation matrix, and finally uses the target transformation matrix to transform the coordinates of the point cloud in three-dimensional space.
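Under the assumption that both poses are expressed as 4x4 rigid-motion matrices, the alignment could look like the sketch below; the element-wise average is re-orthonormalized with an SVD so the result stays a valid rigid transform (the patent does not specify how the averaging is carried out):

```python
# Estimate a per-frame transform from SLAM coordinates to real-world
# coordinates, average the transforms, and apply the target transform
# to the point cloud.
import numpy as np

def alignment_transform(slam_poses, world_poses):
    """Each pose is a 4x4 homogeneous matrix; returns the target transform."""
    Ts = [wp @ np.linalg.inv(sp) for sp, wp in zip(slam_poses, world_poses)]
    T = np.mean(Ts, axis=0)                 # naive element-wise average
    U, _, Vt = np.linalg.svd(T[:3, :3])     # re-orthonormalize the rotation
    T[:3, :3] = U @ Vt
    return T

def transform_points(T, points):
    """points: (N, 3) cloud in SLAM coordinates -> world coordinates."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ T.T)[:, :3]
```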
  • FIG. 7 is a schematic diagram of a three-dimensional map.
  • each three-dimensional point in the three-dimensional map shown in FIG. 7 is used to represent the position, in the real environment, of the real-environment point corresponding to that three-dimensional point.
  • after the user triggers the task of creating a three-dimensional map by operating the mapping-trigger control displayed on the electronic device, the electronic device requests the distributed mapping system to create the three-dimensional map based on the content uploaded by the electronic device, and can display prompt information indicating the progress of map creation.
  • the electronic device may display the interface shown in diagram (a) of FIG. 8a, which may include information such as the map ID of the 3D map to be created, a thumbnail, prompt information indicating that the map is being built, and the time.
  • the prompt information may be "under construction", and at the same time, the thumbnail page corresponding to the 3D map may be in an unlit state to remind the user that the map is being constructed and the 3D map is not editable.
  • the distributed mapping system can complete the creation of the three-dimensional map according to the above steps, and send the information of the created three-dimensional map to the electronic device.
  • after the map is created, the thumbnail shown in diagram (a) of FIG. 8a can switch to a lit state, and the interface shown in diagram (b) of FIG. 8a reminds the user that map construction has been completed and the current map is editable. Alternatively, the electronic device may first display the interface shown in diagram (a) of FIG. 8b and, after determining that the creation of the map is completed, display the interface shown in diagram (b) of FIG. 8b.
  • the electronic device may also display the progress of map creation in percentage or other ways.
  • when the electronic device triggers mapping, it can send an instruction requesting mapping, the number of image scans for this mapping (that is, the number of images meeting the key frame requirements), and other information to the cloud dispatching center; the cloud dispatching center then sends the mapping task to the queue-type message middleware and at the same time saves the basic information and user attribute information of this mapping in the cloud database.
  • Each CPU algorithm component can monitor and process the mapping tasks in the queue-type message middleware, and finally generate a 3D map file and store it in the elastic file service.
  • each CPU algorithm component can send information such as the map building progress, map building success or failure, and the map alignment matrix (the transformation matrix between the SLAM coordinate system and the real-world coordinate system) to the queue-type message middleware, and save the mapping result (that is, the created 3D map) to the elastic file service.
  • the cloud scheduling center can monitor the information of the map construction progress in the queue-type message middleware, so as to obtain the processing progress, status, map alignment matrix and other information of the current map construction task and store these information in the cloud database.
  • in the above process, the electronic device side provides the initial data related to the real-world scene required to create the 3D map, and the distributed mapping system completes the creation of the 3D map according to the initial data provided by the electronic device. Therefore, the difficulty and cost of creating a map on the electronic device side are relatively low, which facilitates popularization and promotion.
  • multiple computing nodes can create the map simultaneously, and multiple nodes can support large-scale computing, so the mapping efficiency is high.
  • SLAM pose, GPS, IMU and other information are fully used in the mapping process, which further improves the accuracy of the created 3D map.
  • FIG. 9 is a schematic flowchart of a method for creating a three-dimensional map provided in an embodiment of the present application. As shown in FIG. 9, the process of the method may include:
  • S901 While moving, the electronic device scans the real-world scene to capture a video and selects the images meeting the key frame requirements; it also runs a SLAM algorithm and provides the positioning information corresponding to each frame of image.
  • the positioning information includes pose information, GPS information, and IMU information.
  • the moving process of the electronic device is controlled by the user.
  • S902 The electronic device uploads images satisfying key frame requirements to different computing nodes in the distributed mapping system.
  • S903 Each computing node in the distributed mapping system extracts the global features of the images from the electronic device.
  • S904 Each computing node in the distributed mapping system performs feature retrieval, matching and verification according to the extracted global features, and determines the matching relationships between the image it processes and other images.
  • S905 A computing node in the distributed mapping system generates a scene matching relationship graph according to the matching relationships between the images.
  • S906 The computing node in the distributed mapping system generates, in three-dimensional space and according to the scene matching relationship graph, the three-dimensional points corresponding to the feature points in the images uploaded by the electronic device.
  • the feature points can be obtained by extracting local features of the image.
  • S907 The computing nodes in the distributed mapping system optimize the coordinates of the three-dimensional points in the three-dimensional space through the BA algorithm.
  • S908 The computing nodes in the distributed mapping system perform smoothing and denoising on the positioning information corresponding to each frame of image uploaded by the electronic device.
  • S909 The computing node in the distributed mapping system performs coordinate alignment between the camera poses of the electronic device and the three-dimensional points based on the processed positioning information, and obtains a three-dimensional map based on the aligned three-dimensional points.
  • S910 The computing node of the distributed mapping system instructs the electronic device on information of the created three-dimensional map.
  • after step S901, the electronic device may also execute the following steps S911-S915 to generate a grid and upload it to the distributed mapping system for processing:
  • S911 The electronic device acquires a depth map corresponding to an image meeting a key frame requirement.
  • S912 The electronic device performs TSDF fusion on the depth image by combining the feature points in the image meeting the key frame requirements to generate voxels.
  • S913 The electronic device performs grid extraction and rendering on the fused voxels to obtain a grid used to cover a three-dimensional object in a real-world scene.
  • S914 The electronic device uploads the grid to the computing nodes of the distributed mapping platform.
  • S915 The computing nodes of the distributed mapping platform simplify the grid, generate and save a white model according to the simplified grid.
  • after the electronic device obtains the three-dimensional map created by the distributed mapping system, it can use the three-dimensional map for positioning.
  • the electronic device can, in response to the user's operation of selecting the three-dimensional map, collect at least one frame of image of the current environment and upload it to the distributed mapping system. The distributed mapping system uses the GVPS method to determine the corresponding position of the electronic device in the three-dimensional map based on the at least one frame of image and the previously created three-dimensional map corresponding to the environment, thereby obtaining the location of the electronic device in that environment.
  • after the distributed mapping platform determines the position of the electronic device, it indicates the position to the electronic device, and the electronic device can then display the interface shown in FIG. 10a.
  • the content in this interface is the environment image currently scanned by the electronic device. At this point, the position of the electronic device in the environment has been determined, so the user can add digital resources to the environment image currently scanned by the electronic device.
  • after the user triggers positioning, the electronic device sends a positioning request and at least one currently scanned frame of image to the cloud dispatching center, and the cloud dispatching center sends the received positioning request and the at least one frame of image to the GVPS positioning service.
  • the GVPS positioning service reads the map data of the three-dimensional map stored in the elastic file service, determines the position in the three-dimensional map corresponding to the current pose of the electronic device according to the map data and the at least one frame of image (this position is used to indicate the location of the electronic device in the environment), and sends the position information to the cloud dispatching center.
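Purely as an assumption about how a GVPS-style service might work internally, one common realization is a RANSAC PnP solve on 2D-3D correspondences between the query image and the stored map points (the correspondence search itself is omitted here):

```python
# Recover the 6-degree-of-freedom camera pose of the query image in the map
# from matched 3D map points and their 2D projections in the query image.
import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed intrinsics

def locate_in_map(map_points_3d, image_points_2d):
    """map_points_3d: (N, 3) map coordinates; image_points_2d: (N, 2) pixels;
    requires at least 4 correspondences."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(map_points_3d), np.float32(image_points_2d), K, None)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)        # rotation matrix + translation = pose
    return R, tvec
```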
  • the cloud dispatching center queries the cloud database for the POI information related to the current map, and sends the POI information together with the position information from the GVPS service to the electronic device.
  • the electronic device can download the three-dimensional digital resource model from the object storage service according to the received POI information, render it, and add it to the digital world scene displayed by the electronic device.
  • the electronic device displays the materials including the three-dimensional digital resource model in the interface shown in FIG. 10a, from which the user can select materials and add them to the digital world scene shown in FIG. 10a.
  • the electronic device may display the interface shown in Figure 10b after the 3D digital resource model is added.
  • the interface includes not only the map of the real environment scene but also the virtual resource model added by the user, realizing the fused display of the real-world scene and the virtual digital scene.
  • the electronic device may display a corresponding white model in the area to guide the user to select a suitable area to place the material.
  • the electronic device determines the position corresponding to the placement area in the three-dimensional map, and uses this position as the corresponding position of the material added by the user in the real environment scene.
  • the electronic device may request the cloud dispatch center for a list of digital resources corresponding to the three-dimensional digital resource model.
  • the cloud scheduling center obtains the digital resource list corresponding to the current user by querying the cloud database, and sends it to the electronic device.
  • the electronic device downloads the three-dimensional digital resource model from the object storage service through the URL, and adds it to the digital world scene.
  • the user can trigger the electronic device to upload the size, pose and other information of the current 3D digital resource model to the cloud dispatching center by clicking the save control displayed on the electronic device, and the cloud dispatching center sends the information to the cloud database for saving.
  • the distributed mapping platform can save the information of the 3D digital resource model added by the user.
  • when the electronic device is in the area of the model and displays the map corresponding to that area, it also displays the 3D digital resource model placed there, so that the user can still see the 3D resource model previously placed in the area.
  • the electronic device can provide the user with the function of using and editing the created 3D map, and at the same time allow the user to add virtual digital resources to the created 3D map, realizing the fusion application of the real environment scene and the virtual digital scene.
  • an embodiment of the present application also provides a method for creating a three-dimensional map, applied to a distributed system including multiple computing nodes. As shown in FIG. 11, the method includes:
  • Each of the multiple computing nodes respectively selects one frame of image to be processed from the multiple frames of images from the electronic device as a target image, and performs a target processing process on the target image, until the target processing process has been performed on all of the multiple frames of images; the multiple frames of images are images taken by the electronic device of the same environment. The target processing process includes the following steps: extracting a first feature point of the target image; acquiring feature points of at least one frame of image on which the target processing process has been performed; and selecting at least one second feature point from the feature points of the at least one frame of image to form a feature matching pair with the first feature point, where the first feature point and the at least one second feature point correspond to the same point in the environment.
  • A target computing node acquires the multiple feature matching pairs obtained after the target processing process is performed on the multiple frames of images, and creates a 3D point cloud map according to the multiple feature matching pairs; the target computing node is any one of the multiple computing nodes.
  • an embodiment of the present application further provides an electronic device, which is used to implement the method performed by one or more of the multiple computing nodes provided in the embodiments of the present application, or to implement the method performed by the electronic device provided in the embodiments of the present application.
  • an electronic device 1200 may include: a memory 1201 , one or more processors 1202 , and one or more computer programs (not shown in the figure). The various components described above may be coupled through one or more communication buses 1203 .
  • the electronic device 1200 may further include a display screen 1204 .
  • one or more computer programs are stored in the memory 1201, and the one or more computer programs include computer instructions; the one or more processors 1202 invoke the computer instructions stored in the memory 1201 so that the electronic device 1200 executes the method for creating a three-dimensional map provided in the embodiments of the present application.
  • the display screen 1204 is used to display relevant user interfaces such as images, videos, and application interfaces.
  • the memory 1201 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices or other non-volatile solid-state storage devices.
  • the memory 1201 may store an operating system (hereinafter referred to as the system), such as embedded operating systems such as ANDROID, IOS, WINDOWS, or LINUX.
  • the memory 1201 can be used to store the implementation program of the embodiment of the present application.
  • the memory 1201 can also store a network communication program, which can be used to communicate with one or more additional devices, one or more user devices, and one or more network devices.
  • One or more processors 1202 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling program execution.
  • FIG. 12 is only an implementation of the electronic device 1200 provided in the embodiment of the present application. In practical applications, the electronic device 1200 may include more or fewer components, which is not limited here.
  • the embodiments of the present application further provide an electronic device, which includes modules/units for executing the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments.
  • an embodiment of the present application further provides a distributed system, where the distributed system includes a plurality of electronic devices, wherein each electronic device can be realized by the above-mentioned electronic device.
  • the embodiments of the present application also provide a computer-readable storage medium storing a computer program; when the computer program runs on a computer, the computer executes the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments, or the method performed by the electronic device in the methods provided in the above embodiments.
  • the embodiments of the present application also provide a computer program product including a computer program or instructions; when the computer program or instructions run on a computer, the computer is made to execute the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments, or the method performed by the electronic device in the methods provided in the above embodiments.
  • the methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented using software, the methods may be implemented in whole or in part in the form of a computer program product.
  • a computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part.
  • a computer can be a general purpose computer, special purpose computer, computer network, network equipment, user equipment, or other programmable apparatus.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. A computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media. Available media can be magnetic media (for example, floppy disks, hard disks, tapes), optical media (for example, digital video discs (DVD)), or semiconductor media (for example, SSD), etc.


Abstract

This application provides a method for creating a three-dimensional map and an electronic device, applied to a distributed system including multiple computing nodes. The method includes: each computing node selects one frame of image from multiple frames of images from an electronic device as a target image and performs a target processing process on the target image, until the target processing process has been performed on all of the multiple frames of images; the multiple frames of images are images taken of the same environment; the target processing process includes: extracting a first feature point of the target image; acquiring feature points of at least one frame of image on which the target processing process has been performed; and selecting at least one second feature point from the feature points of the at least one frame of image to form a feature matching pair with the first feature point, where the first feature point and the at least one second feature point correspond to the same point in the environment; a target computing node acquires the multiple feature matching pairs obtained after the target processing process is performed on the multiple frames of images and creates a three-dimensional point cloud map according to the multiple feature matching pairs; the target computing node is any one of the computing nodes.

Description

A method for creating a three-dimensional map and an electronic device
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 31, 2021 with application number 202111665748.X and entitled "A method for creating a three-dimensional map and an electronic device", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of electronic devices, and in particular to a method for creating a three-dimensional map and an electronic device.
Background
The metaverse/virtual world concept is intended to raise people's understanding and perception of the world to a new level. The digital world is, under existing technology, the mainstream new dimension for expanding the ways in which humans understand and transform things. The key link between the digital world and the real world lies in high-precision, equal-scale, high-fidelity three-dimensional maps. Therefore, creating three-dimensional (high-precision) maps quickly, efficiently and accurately is the first and necessary condition for entering the digital world. The production of three-dimensional maps may involve many fields such as geographic surveying and mapping and computer vision, and the applications of three-dimensional maps cover various products related to the virtual world, such as augmented reality (AR) games and virtual reality (VR) games.
When creating a three-dimensional map at present, one solution is to accurately measure real-world coordinates as starting points or control points through real-time kinematic (RTK) differential positioning or urban control points, obtain a colored three-dimensional point cloud as map data by scanning with laser scanners and panoramic cameras, and then align the real-world coordinates with the three-dimensional point cloud through a control-point alignment algorithm to generate a high-precision three-dimensional map with absolute positions. This solution requires dedicated point cloud collection equipment and therefore depends heavily on such equipment, which makes the cost of producing three-dimensional maps high and hinders widespread use. In addition, the point clouds collected in this solution are generally dense, occupy large storage space, and take a long time to process, so the efficiency of creating three-dimensional maps is low.
Another solution is to collect images of the real world with a mobile phone and upload them to a cloud server; the cloud server processes the uploaded images in sequence, produces sparse three-dimensional points and camera views through sparse reconstruction, performs dense reconstruction by combining the three-dimensional points and camera views, and finally generates a three-dimensional map. In this solution, the cloud server can create the three-dimensional map only after it has all the frames uploaded by the mobile phone, and the processing of each frame depends on the processing result of the previous frame, so the computing server must process the uploaded images in order, which results in low processing efficiency. In addition, the image collection process places high technical demands on the person using the mobile phone: the user needs to understand the concept of three-dimensional reconstruction in order to collect suitable images from which a good three-dimensional map can be built. This solution therefore has poor generality and is not conducive to widespread use.
In summary, current methods for creating three-dimensional maps suffer from low processing efficiency and poor generality, making it difficult to create three-dimensional maps simply and efficiently.
Summary
This application provides a method for creating a three-dimensional map and an electronic device, which are used to improve the efficiency of creating three-dimensional maps and reduce the difficulty of creating them, so that three-dimensional maps can be created simply and quickly.
In a first aspect, this application provides a method for creating a three-dimensional map, applied to a distributed system including multiple computing nodes. The method includes: each computing node respectively selects one frame of image to be processed from multiple frames of images from an electronic device as a target image, and performs a target processing process on the target image, until the target processing process has been performed on all of the multiple frames of images, where the multiple frames of images are images taken by the electronic device of the same environment; the target processing process includes the following steps: extracting a first feature point of the target image; acquiring feature points of at least one frame of image on which the target processing process has been performed; and selecting at least one second feature point from the feature points of the at least one frame of image to form a feature matching pair with the first feature point, where the first feature point and the at least one second feature point correspond to the same point in the environment; and a target computing node acquires multiple feature matching pairs obtained after the target processing process is performed on the multiple frames of images, and creates a three-dimensional point cloud map according to the multiple feature matching pairs, where the target computing node is any one of the multiple computing nodes.
In this method, the distributed system can complete the creation of the three-dimensional point cloud map based on the multiple frames of images provided by the electronic device, and the multiple computing nodes in the distributed system can process different images independently. Large-scale computing can therefore be supported across multiple nodes, which improves processing efficiency and enables the three-dimensional map to be created simply and quickly.
In one possible design, the target computing node creating the three-dimensional point cloud map according to the multiple feature matching pairs includes: the target computing node acquires multiple pieces of first pose information from the electronic device, where the multiple pieces of first pose information correspond one-to-one to the multiple frames of images, each piece of first pose information is used to indicate the position and orientation of the electronic device in a first three-dimensional space when it took the image corresponding to that piece of first pose information, and the first three-dimensional space is the three-dimensional space corresponding to the three-dimensional point cloud map; and the target computing node determines, according to the multiple pieces of first pose information, the multiple three-dimensional points corresponding to the multiple feature matching pairs in the first three-dimensional space, so as to obtain the three-dimensional point cloud map.
In this method, the first pose information indicates the pose, that is, the position and orientation, of the electronic device in the three-dimensional map when taking different images, while feature points in different images that correspond to the same point in the environment (the feature points in a feature matching pair) also correspond to the same point in the three-dimensional map. The poses of the electronic device when taking different images are therefore poses of the same point in the three-dimensional map seen from different viewing angles. Based on this relationship, the target computing node can derive the positions of the three-dimensional points from the poses of the electronic device, and thus obtain the multiple three-dimensional points that constitute the three-dimensional point cloud map.
In one possible design, the method further includes: the target computing node acquires multiple pieces of second pose information from the electronic device, where the multiple pieces of second pose information correspond one-to-one to the multiple frames of images, each piece of second pose information is used to indicate the position and orientation of the electronic device in a second three-dimensional space when it took the image corresponding to that piece of second pose information, and the second three-dimensional space is the three-dimensional space corresponding to the environment; and the target computing node adjusts the coordinates of the three-dimensional points in the first three-dimensional space according to the multiple pieces of second pose information and the multiple pieces of first pose information, so as to obtain a three-dimensional point cloud map at the same scale as the environment.
In this method, the first pose information measured by the electronic device may deviate from the electronic device's pose in the real environment, so the three-dimensional point cloud map determined from the first pose information may deviate from the map of the real environment. By acquiring second pose information that reflects the electronic device's pose in the real environment and adjusting the three-dimensional point cloud map according to it, the target computing node can align the three-dimensional point cloud map with the map of the real environment and thus obtain a three-dimensional point cloud map at the same scale as the real environment, improving the accuracy of the resulting map.
In one possible design, the target computing node adjusting the coordinates of the three-dimensional points in the first three-dimensional space according to the multiple pieces of second pose information and the multiple pieces of first pose information includes: the target computing node determines multiple transformation matrices according to the multiple pieces of second pose information and the multiple pieces of first pose information, where the multiple transformation matrices correspond one-to-one to the multiple frames of images, and each transformation matrix characterizes the transformation relationship between the second pose information and the first pose information corresponding to the same image; the target computing node averages the multiple transformation matrices to obtain a target transformation matrix; and the target computing node uses the target transformation matrix to transform the coordinates of the three-dimensional points in the first three-dimensional space.
In this method, the target computing node adjusts the coordinates of the three-dimensional points in one of the three-dimensional spaces according to the transformation relationship between the electronic device's poses in the two three-dimensional spaces, thereby aligning the two three-dimensional spaces and their corresponding coordinate systems. A three-dimensional point cloud map adjusted in this way can be aligned to the three-dimensional space of the real environment, which guarantees the fidelity and accuracy of the three-dimensional point cloud map.
In one possible design, the position indicated by each piece of second pose information is determined by the electronic device through global positioning system (GPS) positioning, and the orientation indicated by each piece of second pose information is determined by the electronic device through inertial measurement unit (IMU) measurement.
In this method, the pose information that the target computing node acquires from the electronic device is the electronic device's GPS information and IMU information, which can accurately reflect the electronic device's pose in the real environment. This method therefore ensures that the target computing node can obtain the accurate pose of the electronic device in the real environment, guaranteeing the accuracy of subsequent processing.
In one possible design, before acquiring the feature points of the at least one frame of image on which the target processing process has been performed, the method further includes determining the at least one frame of image, which includes: extracting the global feature of the target image; acquiring the global feature of each frame of image on which the target processing process has been performed, and selecting, from the acquired global features, at least one global feature with the highest similarity to the global feature of the target image; and determining the images corresponding to the at least one global feature as the at least one frame of image.
In this method, images with high global feature similarity share many identical or similar feature points, so such images are likely to correspond to the same or adjacent areas of the environment. Based on these images, more numerous and more accurate feature matching pairs can be obtained, so that the three-dimensional map can be created from the feature matching pairs.
In one possible design, the distributed system further includes a queue node. Before each computing node respectively selects one frame of image to be processed from the multiple frames of images as the target image, the method further includes: the queue node receives the multiple frames of images from the electronic device and adds them to a target image queue in the order in which they are received from the electronic device. Each computing node respectively selecting one frame of image to be processed from the multiple frames of images as the target image includes: each computing node reads one frame of image from the target image queue of the queue node and takes the read image as the target image.
In this method, the multiple computing nodes of the distributed system can read images from the queue node in sequence and process the read images, which ensures that the different nodes process the multiple frames of images separately and in an orderly manner, improving processing efficiency.
In one possible design, the target processing process further includes: after extracting the first feature point of the target image, saving the first feature point to a storage node. Acquiring the feature points of the at least one frame of image on which the target processing process has been performed includes: acquiring the feature points of the at least one frame of image from the storage node.
In this method, after each computing node extracts feature points from the image it processes, it saves them to the storage node, so the feature points determined by all nodes are kept in the storage node. Each computing node can then obtain from the storage node the feature points of other images extracted by other computing nodes and carry out subsequent processing, which improves processing efficiency.
In one possible design, the method further includes: the target computing node sends the three-dimensional point cloud map to the electronic device.
In this method, after the distributed system creates the three-dimensional map, it sends the map to the electronic device for the electronic device to use. In the mapping process, the electronic device only needs to provide information such as the multiple frames of images and poses, while the complex three-dimensional map creation process is completed by the distributed system, which reduces the difficulty and cost of obtaining a three-dimensional map on the electronic device side.
In a second aspect, this application provides an electronic device, which includes a memory and one or more processors, where the memory is used to store computer program code, and the computer program code includes computer instructions; when the computer instructions are executed by the one or more processors, the electronic device is caused to execute the method performed by at least one of the multiple computing nodes in the method described in the first aspect or any possible design of the first aspect.
In a third aspect, this application provides an electronic device, which includes modules/units for executing the method performed by at least one of the multiple computing nodes in the method described in the first aspect or any possible design of the first aspect.
In a fourth aspect, this application provides a distributed system, which includes multiple electronic devices, where each electronic device can be implemented by the electronic device of the second aspect or the electronic device of the third aspect.
In a fifth aspect, this application provides a computer-readable storage medium storing a computer program; when the computer program runs on a computer, the computer is caused to execute the method performed by at least one of the multiple computing nodes in the method described in the first aspect or any possible design of the first aspect.
In a sixth aspect, this application provides a computer program product, which includes a computer program or instructions; when the computer program or instructions run on a computer, the computer is caused to execute the method performed by at least one of the multiple computing nodes in the method described in the first aspect or any possible design of the first aspect.
Brief description of the drawings
FIG. 1 is a schematic diagram of the hardware architecture of an electronic device according to an embodiment of this application;
FIG. 2 is a schematic diagram of the software architecture of an electronic device according to an embodiment of this application;
FIG. 3a is a schematic architecture diagram of an application system to which the solution provided in the embodiments of this application is applicable;
FIG. 3b is a schematic architecture diagram of a possible application system of the solution provided in the embodiments of this application;
FIG. 4 is a schematic diagram of a mapping initialization interface of an electronic device according to an embodiment of this application;
FIG. 5 is a schematic diagram of a display interface of an electronic device while scanning an environment according to an embodiment of this application;
FIG. 6a is a schematic diagram of an interface of an electronic device displaying a grid according to an embodiment of this application;
FIG. 6b is a schematic diagram of an interface of an electronic device displaying a grid according to an embodiment of this application;
FIG. 6c is a schematic diagram of a display interface of an electronic device when mapping is triggered according to an embodiment of this application;
FIG. 7 is a schematic diagram of a three-dimensional map according to an embodiment of this application;
FIG. 8a is a schematic diagram of an interface of an electronic device displaying three-dimensional map information according to an embodiment of this application;
FIG. 8b is a schematic diagram of an interface of an electronic device displaying three-dimensional map information according to an embodiment of this application;
FIG. 9 is a schematic flowchart of a method for creating a three-dimensional map according to an embodiment of this application;
FIG. 10a is a schematic diagram of a display interface of an electronic device in the process of using a three-dimensional map according to an embodiment of this application;
FIG. 10b is a schematic diagram of an interface of an electronic device displaying a real-scene image combined with virtual digital resources according to an embodiment of this application;
FIG. 11 is a schematic diagram of a method for creating a three-dimensional map according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of this application.
Detailed description of embodiments
To make the objectives, technical solutions and advantages of the embodiments of this application clearer, the embodiments of this application are described in further detail below with reference to the accompanying drawings. In the description of the embodiments of this application, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to; thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features.
For ease of understanding, explanations of concepts related to this application are given below by way of example for reference.
1) An electronic device may be a device with a wireless connection function. In some embodiments of this application, the electronic device may be a device equipped with a display screen, a camera, and sensors.
In some embodiments of this application, the electronic device may be a portable device, such as a mobile phone, a tablet computer, a wearable device with a wireless communication function (for example, a watch, a wristband, a helmet, or earphones), a vehicle-mounted terminal device, an augmented reality (AR)/virtual reality (VR) device, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA). The electronic device may also be a smart home device (for example, a smart TV or a smart speaker), a smart car, a smart robot, workshop equipment, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, or flying equipment (for example, a smart robot, a hot-air balloon, a drone, or an airplane).
In some embodiments of this application, the electronic device may also be a portable terminal device that further includes other functions such as personal digital assistant and/or music player functions. Exemplary embodiments of portable terminal devices include, but are not limited to, portable terminal devices running the operating systems indicated in Figure PCTCN2022138459-appb-000001 or other operating systems. The above portable terminal device may also be another portable terminal device, such as a laptop with a touch-sensitive surface (for example, a touch panel). It should also be understood that, in some other embodiments of this application, the electronic device may not be a portable terminal device but a desktop computer with a touch-sensitive surface (for example, a touch panel).
2) Distributed processing (distributed computing) refers to decomposing a task that would require large computing power into multiple small tasks, assigning the small tasks to multiple computing nodes for processing, and finally combining the processing results of the multiple computing nodes to obtain the final result. This saves overall processing time and greatly improves computing efficiency. A system composed of multiple computing nodes is a distributed system. The multiple computing nodes may be deployed on the same device or on multiple devices connected through a network.
In the embodiments of this application, a computing node may be an electronic device (for example, a server), or software, a program, a service, or a component in an electronic device (for example, a central processing unit (CPU) or a graphics processing unit (GPU)).
Distributed storage means storing data dispersedly across multiple independent storage nodes. A distributed storage system adopts a scalable system structure and uses multiple storage nodes to share the storage load, which not only solves the bottleneck problem of a single storage node in traditional centralized storage systems, but also improves the reliability, availability and scalability of the system.
In the embodiments of this application, a storage node may be an electronic device (for example, a server), or a software program, service, or component in an electronic device (for example, temporary memory or permanent memory).
3) Message middleware refers to message-oriented systems and is the basic software that completes the sending and receiving of messages in a distributed system. Message middleware, which may also be called a message queue, can use an efficient and reliable message-passing mechanism for data transmission and exchange, and integrate distributed systems based on data communication. By providing message passing and message queue models, message middleware can extend inter-process communication in a distributed environment.
4) Local image features are local expressions of image features and reflect local characteristics of an image. Local image features can be features extracted from local regions of an image, such as edges, corner points, lines, curves, and regions with special attributes. Local image features include features such as the scale-invariant feature transform (SIFT) and speeded up robust features (SURF). These local features can be used to detect key points or feature points in an image. Local image features are abundant in an image and have low correlation between features, and under occlusion the disappearance of some features does not affect the detection and matching of other features, which makes them suitable for application processing such as image matching and retrieval.
Global image features refer to the overall attributes of an image and can represent features of the entire image. Global features are defined relative to local image features and can be used to describe overall characteristics such as the color and shape of an image or target. Common global features include color features, texture features and shape features, for example the weighted residual sum of local feature cluster centers, the bag of words (BoW) model, and the vector of locally aggregated descriptors (VLAD). Global features have good invariance, are simple to compute, and are intuitive to represent.
For ease of description, in the embodiments of this application, local image features are simply called local features, and global image features are simply called global features.
5) A voxel, short for volume pixel, is the smallest unit of digital data in the partitioning of three-dimensional space, conceptually similar to the pixel, the smallest unit in two-dimensional space. A voxel can represent a solid region with a constant scalar or vector. A solid containing voxels can be rendered by volume rendering or by extracting a polygonal isosurface of a given threshold contour.
6) A white model (mesh) refers to the mesh of a model. A three-dimensional model is assembled from polygons, and a complex polygon is actually assembled from multiple triangular faces, so the surface of a three-dimensional model is composed of multiple connected triangular faces. In three-dimensional space, the set of the points constituting these triangular faces and of the triangles' edges is the white model.
应理解,本申请实施例中“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A、B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一(项)个”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a、b或c中的至少一项(个),可以表示:a,b,c,a和b,a和c,b和c,或a、b和c,其中a、b、c可以是单个,也可以是多个。
The embodiments of this application provide a method for creating a three-dimensional map and an electronic device. The solution can implement the creation of a three-dimensional map simply and quickly, improve the efficiency of creating a three-dimensional map, and reduce the difficulty of creating a three-dimensional map.

The solution provided in the embodiments of this application can be applied to a system composed of an electronic device and a distributed mapping system. The electronic device collects information such as environment images and positioning parameters and reports it to the distributed mapping system; the distributed mapping system includes multiple computing nodes, which create a three-dimensional map based on the information reported by the electronic device. The multiple computing nodes may be deployed on the same electronic device or on different electronic devices.
The following first describes, with reference to FIG. 1, the structure of an electronic device to which the method provided in the embodiments of this application is applicable.

As shown in FIG. 1, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display 194, a SIM card interface 195, and the like.

The sensor module 180 may include a gyroscope sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, a temperature sensor, a pressure sensor, a distance sensor, a magnetic sensor, an ambient light sensor, a barometric pressure sensor, a bone conduction sensor, and the like.

It can be understood that the electronic device 100 shown in FIG. 1 is merely an example and does not constitute a limitation on the electronic device; the electronic device may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration. The various components shown in FIG. 1 may be implemented in hardware including one or more signal processing and/or application-specific integrated circuits, in software, or in a combination of hardware and software.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent components or may be integrated into one or more processors. The controller may be the nerve center and command center of the electronic device 100. The controller may generate operation control signals based on instruction operation codes and timing signals, to complete the control of instruction fetching and execution.

A memory may further be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. The memory may store instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving system efficiency.

Execution of the method for creating a three-dimensional map provided in the embodiments of this application may be controlled by the processor 110 or completed by calling other components, for example, by calling the processing program of the embodiments of this application stored in the internal memory 121, or by calling, through the external memory interface 120, a processing program of the embodiments of this application stored in a third-party device, to control the wireless communication module 160 to perform data communication with other devices, improving the intelligence and convenience of the electronic device 100 and the user experience. The processor 110 may include different components. For example, when a CPU and a GPU are integrated, the CPU and the GPU may cooperate to perform the method for creating a three-dimensional map provided in the embodiments of this application; for example, some algorithms of the method are executed by the CPU and others by the GPU, to achieve higher processing efficiency.
The display 194 is configured to display images, videos, and the like. The display 194 includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N displays 194, where N is a positive integer greater than 1. The display 194 may be configured to display information entered by the user or information provided to the user, as well as various graphical user interfaces (GUIs). For example, the display 194 may display photos, videos, web pages, files, and the like.

In the embodiments of this application, the display 194 may be an integrated flexible display, or a spliced display composed of two rigid screens and a flexible screen located between the two rigid screens.

The camera 193 (a front-facing camera or a rear-facing camera, or a camera that can serve as both) is configured to capture static images or videos. Generally, the camera 193 may include photosensitive elements such as a lens group and an image sensor, where the lens group includes multiple lenses (convex or concave) for collecting the light signals reflected by the object to be photographed and transferring the collected light signals to the image sensor. The image sensor generates an original image of the object to be photographed based on the light signals.
The internal memory 121 may be configured to store computer-executable program code, where the executable program code includes instructions. By running the instructions stored in the internal memory 121, the processor 110 executes various functional applications and data processing of the electronic device 100. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the code of application programs (for example, the three-dimensional map creation function). The data storage area may store data created during use of the electronic device 100.

The internal memory 121 may further store one or more computer programs corresponding to the three-dimensional map creation algorithm provided in the embodiments of this application. The one or more computer programs are stored in the internal memory 121 and configured to be executed by the one or more processors 110; the one or more computer programs include instructions, and the instructions may be used to perform the steps in the following embodiments.

In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).

Certainly, the code of the three-dimensional map creation algorithm provided in the embodiments of this application may alternatively be stored in an external memory. In this case, the processor 110 may run, through the external memory interface 120, the code of the three-dimensional map creation algorithm stored in the external memory.
The sensor module 180 may include a gyroscope sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, and the like.

The touch sensor is also referred to as a "touch panel". The touch sensor may be disposed on the display 194; the touch sensor and the display 194 form a touchscreen, also referred to as a "touch screen". The touch sensor is configured to detect a touch operation performed on or near it. The touch sensor may transfer the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through the display 194. In some other embodiments, the touch sensor may alternatively be disposed on a surface of the electronic device 100 at a location different from that of the display 194.

For example, the display 194 of the electronic device 100 displays a home screen that includes icons of multiple applications (for example, a camera application and a WeChat application). The user taps the icon of the camera application on the home screen via the touch sensor, triggering the processor 110 to start the camera application and turn on the camera 193. The display 194 then displays an interface of the camera application, for example, a viewfinder interface.
The wireless communication function of the electronic device 100 may be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.

The antenna 1 and the antenna 2 are configured to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single communication band or multiple communication bands. Different antennas may also be multiplexed to improve antenna utilization. For example, the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In some other embodiments, an antenna may be used in combination with a tuning switch.

The mobile communication module 150 can provide wireless communication solutions applied to the electronic device 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves via the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transfer them to the modem processor for demodulation. The mobile communication module 150 may also amplify signals modulated by the modem processor and convert them into electromagnetic waves for radiation via the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be disposed in the same component as at least some modules of the processor 110. In the embodiments of this application, the mobile communication module 150 may also be used for information exchange with other devices.

The modem processor may include a modulator and a demodulator. The modulator is configured to modulate a low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is configured to demodulate a received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transfers the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, and the like), or displays an image or video through the display 194. In some embodiments, the modem processor may be an independent component. In some other embodiments, the modem processor may be independent of the processor 110 and disposed in the same component as the mobile communication module 150 or other functional modules.

The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite systems (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 160 may be one or more components integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive signals to be sent from the processor 110, perform frequency modulation and amplification on them, and convert them into electromagnetic waves for radiation via the antenna 2. In the embodiments of this application, the wireless communication module 160 is configured to establish connections with other electronic devices for data exchange; alternatively, the wireless communication module 160 may be configured to access an access point device, send control instructions to other electronic devices, or receive data sent from other electronic devices.

In addition, the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like. The electronic device 100 can receive input from the buttons 190 and generate key signal input related to user settings and function control of the electronic device 100. The electronic device 100 can use the motor 191 to generate vibration alerts (for example, incoming-call vibration alerts). The indicator 192 in the electronic device 100 may be an indicator light, which may be used to indicate the charging state and battery level changes, and may also be used to indicate messages, missed calls, notifications, and the like. The SIM card interface 195 in the electronic device 100 is configured to connect a SIM card. The SIM card can be inserted into or removed from the SIM card interface 195 to come into contact with or be separated from the electronic device 100.
It should be understood that in actual applications, the electronic device 100 may include more or fewer components than shown in FIG. 1; this is not limited in the embodiments of this application. The illustrated electronic device 100 is merely an example; the electronic device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration. The various components shown in the figure may be implemented in hardware including one or more signal processing and/or application-specific integrated circuits, in software, or in a combination of hardware and software.
The software system of the electronic device 100 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of this application use the Android system with a layered architecture as an example to describe the software structure of the electronic device.

The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. As shown in FIG. 2, the software architecture can be divided into four layers, from top to bottom: the application layer, the application framework layer (framework, FWK), Android runtime and system libraries, and the Linux kernel layer.

The application layer is the top layer of the operating system and includes the native applications of the operating system, such as Camera, Gallery, Calendar, Bluetooth, Music, Videos, and Messages. An application involved in the embodiments of this application is referred to as an app for short, which is a software program capable of implementing one or more specific functions. Generally, multiple applications may be installed in an electronic device, for example, a camera application, a mailbox application, or a smart home control application. The applications mentioned below may be system applications installed when the electronic device is shipped from the factory, or third-party applications downloaded from the network or obtained from other electronic devices by the user during use of the electronic device.

Certainly, developers can write applications and install them on this layer. In a possible implementation, applications may be developed in the Java language and completed by calling the application programming interfaces (APIs) provided by the application framework layer; developers can interact with the underlying layers of the operating system (for example, the kernel layer) through the application framework to develop their own applications.
The application framework layer provides application programming interfaces (APIs) and a programming framework for the applications at the application layer. The application framework layer may include some predefined functions. The application framework layer may include a window manager, content providers, a view system, a telephony manager, a resource manager, a notification manager, and the like.

The window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, take screenshots, and so on.

Content providers are used to store and retrieve data and make it accessible to applications. The data may include files (for example, documents, videos, images, and audio), text, and other information.

The view system includes visual controls, such as controls for displaying text, pictures, and documents. The view system can be used to build applications. An interface in a display window may be composed of one or more views. For example, a display interface including an SMS notification icon may include a view for displaying text and a view for displaying pictures.

The telephony manager is used to provide the communication functions of the electronic device. The notification manager enables applications to display notification information in the status bar; it can be used to convey notification-type messages, which can disappear automatically after a short stay without requiring user interaction.
Android runtime includes core libraries and a virtual machine. Android runtime is responsible for the scheduling and management of the Android system.

The core libraries of the Android system consist of two parts: one part is the functional functions that the Java language needs to call, and the other part is the core libraries of the Android system. The application layer and the application framework layer run in the virtual machine. Taking Java as an example, the virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.

The system libraries may include multiple functional modules, for example, a surface manager, media libraries, a three-dimensional graphics processing library (for example, OpenGL ES), and a two-dimensional graphics engine (for example, SGL). The surface manager is used to manage the display subsystem and provides the fusion of two-dimensional and three-dimensional layers for multiple applications. The media libraries support playback and recording of multiple common audio and video formats, as well as static image files. The media libraries can support multiple audio and video encoding formats, such as MPEG-4, H.264, MP3, AAC, AMR, JPG, and PNG. The three-dimensional graphics processing library is used to implement three-dimensional graphics drawing, image rendering, compositing, layer processing, and the like. The two-dimensional graphics engine is a drawing engine for two-dimensional drawing.
The kernel layer provides the core system services of the operating system; security, memory management, process management, the network protocol stack, the driver model, and the like are all implemented based on the kernel layer. The kernel layer also serves as an abstraction layer between the hardware and the software stack. This layer has many drivers related to the electronic device; the main drivers include the display driver, the keyboard driver as an input device, the Flash driver based on memory technology devices, the camera driver, the audio driver, the Bluetooth driver, the Wi-Fi driver, and the like.

It should be understood that the functional services described above are merely examples. In actual applications, the electronic device may be divided into more or fewer functional services according to other factors, the functions of each service may be divided in other ways, or the functional services may not be divided at all and the device may instead work as a whole.

It should be noted that the electronic device described in the embodiments of this application can be implemented with the above hardware architecture and software architecture.

The following describes in detail the solution provided in the embodiments of this application.
FIG. 3a is a schematic architectural diagram of a possible application system of the solution provided in the embodiments of this application. As shown in FIG. 3a, the application system to which the solution of this application is applicable includes an electronic device and a distributed mapping system.

In the embodiments of this application, the electronic device may be a device having a display, a camera, and sensors. After shooting images or a video of the environment in which it is located, the electronic device selects multiple frames of images and reports them to the distributed mapping system, and at the same time reports to the distributed mapping system the positioning parameters, collected by the electronic device itself, corresponding to each of the multiple frames of images; the distributed mapping system then creates the three-dimensional map corresponding to the environment.

In some embodiments of this application, after collecting images or a video of the environment and selecting multiple frames of images, the electronic device may further, after obtaining the depth maps corresponding to the multiple frames of images, perform processing such as voxel extraction and mesh conversion based on the depth maps to obtain a mesh corresponding to the three-dimensional objects in the environment, and upload the mesh to the distributed mapping system; the distributed mapping system generates a white model based on the mesh, so that in subsequent processes the electronic device can fuse the virtual digital world with the real world based on the white model.
The distributed mapping system includes at least N computing nodes (computing node 1 to computing node N), where N is an integer greater than 1. Optionally, the N computing nodes include at least one CPU computing node and at least one GPU computing node.

The N computing nodes may be deployed in the same electronic device or in different electronic devices. The N computing nodes are connected through a communication network, and different computing nodes can communicate wirelessly with each other.

For example, the distributed mapping system may be deployed in the cloud and implemented by one or more cloud servers.

The N computing nodes are configured to create, in a distributed processing manner, the three-dimensional map corresponding to an environment (scene) based on the multiple frames of images shot by the electronic device for that same environment and the positioning parameters corresponding to each frame, as reported by the electronic device. Different computing nodes can perform different processing tasks in the three-dimensional map creation process, and the N computing nodes jointly complete the entire creation process. For example, different computing nodes can perform the same type of processing on different images, so that the processing tasks for the multiple frames of images are spread across multiple computing nodes and executed concurrently, speeding up image processing.
In some embodiments of this application, a computing node may include a white-model processing service, which is used to simplify the mesh uploaded by the electronic device and generate a white model from the simplified mesh.

For example, the N computing nodes may include the CPU algorithm components and GPU algorithm components shown in FIG. 3b. There may be multiple CPU algorithm components and multiple GPU algorithm components in the distributed mapping system. The GPU algorithm components may be used to perform image processing (such as feature extraction, matching, and retrieval) on the multiple frames of images, and the CPU algorithm components may be used to generate the three-dimensional map based on the image processing results of the GPU algorithm components. Certainly, the computing nodes may also be implemented by other types of algorithm processing components, which is not specifically limited in the embodiments of this application.
In some embodiments of this application, the distributed mapping system may further include a task queue node. The task queue node caches, in queues, the processing tasks in the three-dimensional map creation process; each computing node can read a pending task from the task queue node and then perform the corresponding processing, thereby implementing distributed, ordered execution of multiple processing tasks.

For example, the task queue node may be implemented with the message middleware shown in FIG. 3b. The message middleware may be used to asynchronously cache the three-dimensional map creation instructions from the electronic device, the instructions of the processing tasks in the three-dimensional map creation process, and the like, and these may be shared with or assigned to the N computing nodes, so that the N computing nodes share the execution of tasks and the system load is balanced.
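A minimal sketch of such queue-backed work distribution follows; it is illustrative only, and the Redis backend, the queue name mapping_tasks, and the task payload fields are assumptions rather than details disclosed by this application. A producer enqueues per-image tasks, and each computing node blocks on the queue and processes whatever it receives:

```python
# Sketch: message middleware as a shared task queue (assumed Redis backend).
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def enqueue_task(image_id: str, image_url: str) -> None:
    # Producer side (e.g. a scheduling center): cache one processing task.
    r.lpush("mapping_tasks", json.dumps({"image_id": image_id, "url": image_url}))

def worker_loop() -> None:
    # Consumer side (a computing node): block until a task arrives, then process it.
    while True:
        _, payload = r.brpop("mapping_tasks")
        task = json.loads(payload)
        print(f"processing image {task['image_id']} from {task['url']}")
```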
In some embodiments of this application, the distributed mapping system may further include at least one storage node. The at least one storage node temporarily or permanently stores the data related to the three-dimensional map creation process. For example, the at least one storage node may store the multiple frames of images, and the intermediate and result data of the corresponding processing performed by the multiple computing nodes.

For example, when the distributed mapping system is deployed in the cloud, as shown in FIG. 3b, the storage nodes in the distributed mapping system may include a cloud database, an object storage service, an elastic file service, a cache-type message middleware, and the like. The cloud database may be used to store serialized content that occupies little storage space, such as user information on the electronic device side, indication information about task processing during three-dimensional map creation, and modification information of the three-dimensional map. The object storage service may be used to store non-serialized content that occupies a large amount of storage space, such as the three-dimensional models, high-definition pictures, videos, and animations involved in the electronic device. The elastic file service may be used to store data such as the map data of the three-dimensional map generated by the three-dimensional map creation algorithm and the intermediate variables of algorithms that occupy a large amount of storage space. The cache-type message middleware may be used to asynchronously cache data such as serializable intermediate variables that occupy little storage space during algorithm processing, and these can be shared with the N computing nodes.
In some embodiments of this application, the distributed mapping system may further include at least one scheduling node. The at least one scheduling node performs overall management of the scheduling of some or all of the N computing nodes, the task queue node, and the at least one storage node.

For example, as shown in FIG. 3b, the scheduling nodes in the distributed mapping system may include a cloud scheduling center and an algorithm scheduling center. The cloud scheduling center can manage and schedule nodes such as the algorithm scheduling center, the storage nodes, and the task queue node, can exchange information and data with the electronic device, and can serve as an efficient message processing and distribution node; for example, the cloud scheduling center can provide the electronic device with upload addresses for the multiple frames of images, schedule requests from the electronic device side, and handle requests to and returns from the cloud database. The algorithm scheduling center manages and schedules the N computing nodes, and can also manage and schedule some other algorithm services.
In some embodiments of this application, the distributed mapping system may further include a positioning node. After the three-dimensional map is constructed, the positioning node positions the electronic device based on the three-dimensional map.

For example, as shown in FIG. 3b, the positioning node may include a global visual positioning system (GVPS) service and a vector retrieval system (VRS) service. The GVPS service may be used for spatial positioning, determining the 6-degree-of-freedom coordinates of the position in the created three-dimensional map corresponding to the current location of the electronic device. The VRS service is used for vector search. Optionally, the GVPS service and the VRS service may serve as sub-services of a computing node.

The specific functions of the nodes, services, and components in the above system are described below with reference to specific embodiments and are not detailed here.
It should be noted that the system shown in FIG. 3a or FIG. 3b is merely an exemplary description of a system to which the solution provided in the embodiments of this application is applicable, and does not limit the system architecture to which the solution is applicable. Compared with the structure shown in FIG. 3a or FIG. 3b, some nodes may be added, deleted, or adjusted in the system to which the solution provided in the embodiments of this application is applicable; this is not specifically limited in the embodiments of this application.

The following describes the solution provided in the embodiments of this application with reference to specific embodiments.

The execution process of the solution provided in the embodiments of this application includes at least three phases: mapping initialization, data collection, and map construction. After the three-dimensional map is created based on the methods of these three phases, the process may further include phases such as positioning and adding digital resources. The methods of each phase are described in detail below.
1. Mapping initialization
In the embodiments of this application, the creation of the three-dimensional map is triggered by the user on the electronic device side. By operating the electronic device, the user triggers the electronic device to start three-dimensional map creation. In response to the user's operation, the electronic device sends a mapping initialization instruction to the scheduling node of the distributed mapping system. After receiving the mapping initialization instruction, the scheduling node of the distributed mapping system can assign a map identity document (ID) to the three-dimensional map to be created by the current mapping task and indicate it to the electronic device. By assigning a map ID and indicating it to the electronic device, the distributed mapping system can uniformly manage different three-dimensional maps and can synchronize the three-dimensional map information with the electronic device, avoiding problems in subsequent map processing or use caused by inconsistent information between the electronic device and the distributed mapping system.

For example, taking a mobile phone as the electronic device, during mapping initialization the electronic device may display on its screen the initialization control interface shown in FIG. 4. The interface displays a control for triggering the mapping process and may also display prompt information indicating how mapping is triggered, for example, "Tap the button to start recording"; following this prompt, the user can trigger the mapping process by tapping the control displayed on the screen of the electronic device.

For example, based on the system shown in FIG. 3b, after the user triggers the mapping process, the electronic device may send the above mapping initialization instruction to the cloud scheduling center in the distributed mapping system, and the cloud scheduling center requests a map ID from the cloud database and then indicates it to the electronic device.
2. Data collection
After triggering the mapping process, the electronic device can collect multiple frames of images of the environment in which it is located together with the positioning information corresponding to each frame, and upload them to the distributed mapping system; the distributed mapping system can determine the feature information of the multiple frames of images through image processing, to further create the three-dimensional map corresponding to the environment. This process mainly includes the following steps 1 to 4:

Step 1: The electronic device scans and shoots a video of the environment in which it is located.

Specifically, the electronic device can display the environment image currently scanned by the camera on the display in real time, while displaying prompt information prompting the user to continue scanning. Following the prompt, the user can continue scanning the environment by moving the electronic device; during this process the electronic device continues to collect environment images until the user instructs it to stop scanning, at which point the electronic device has obtained, through scanning, video data reflecting the real scene within a space of a certain size in the environment.

For example, after the user taps the control for triggering the mapping process shown in FIG. 4, the electronic device may display the interface shown in FIG. 5. The interface includes the environment image currently scanned by the camera of the electronic device, prompt information prompting the user to continue scanning, and a control for triggering the stop of scanning. The user can move the electronic device to scan and shoot a larger space; when the user decides to stop scanning, the user can tap the control for triggering the stop of scanning shown in FIG. 5, and the electronic device stops scanning and generates a video file from the scanned content.
Step 2: The electronic device extracts, from the video, multiple frames of images that meet the keyframe requirement, and uploads the multiple frames of images to the distributed mapping system.
In some embodiments of this application, during scanning of the environment, the electronic device can obtain its current pose information by running a simultaneous localization and mapping (SLAM) algorithm. The pose information represents the camera pose of the electronic device in the electronic device's coordinate system (that is, the coordinate system created when the electronic device runs the SLAM algorithm); the camera pose includes position and orientation. Since the shooting angle differs for different images obtained by scanning, the corresponding camera coordinate systems differ, so the pose information corresponding to each frame differs. The pose information corresponding to each frame is the pose information measured with the SLAM algorithm when the electronic device shot that frame.

After obtaining the video, the electronic device can select the images that meet the keyframe requirement from the video in either of the following ways:

1) Selecting images that meet the keyframe requirement based on the change constraints between the poses corresponding to the frames in the video.
In this method, for each frame in the video, the electronic device obtains the pose information at the time the frame was captured and compares it with the pose information at the time the previous frame meeting the keyframe requirement was captured. If the offset between the camera coordinate systems indicated by the two pieces of pose information is greater than a set offset threshold, the frame is determined to meet the keyframe requirement; otherwise, the frame is determined not to meet the keyframe requirement and the judgment moves on to the next frame, until it has been determined for all frames in the video whether they meet the keyframe requirement, thereby selecting the images that meet the keyframe requirement.
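The following is a minimal sketch of this pose-delta test; it is an illustration under assumptions (4x4 homogeneous pose matrices from SLAM, and thresholds invented for the example rather than values disclosed by this application):

```python
# Sketch: keyframe selection by pose change between a frame and the last keyframe.
import numpy as np

T_TRANS = 0.3               # assumed translation threshold (meters)
T_ROT_DEG = 15.0            # assumed rotation threshold (degrees)

def is_keyframe(pose_last_kf: np.ndarray, pose_cur: np.ndarray) -> bool:
    """Both poses are 4x4 camera-to-world matrices from the SLAM algorithm."""
    delta = np.linalg.inv(pose_last_kf) @ pose_cur      # relative transform
    trans_offset = np.linalg.norm(delta[:3, 3])         # translation magnitude
    cos_angle = (np.trace(delta[:3, :3]) - 1.0) / 2.0   # rotation angle from trace
    rot_offset = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return trans_offset > T_TRANS or rot_offset > T_ROT_DEG
```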
2) Selecting images that meet the keyframe requirement based on the local features of the frames in the video.

In this method, the electronic device can extract the local features of each frame, determine the feature points in each frame from the extracted local features, track the local feature points with an optical-flow tracking method, and select the images that meet the keyframe requirement based on how the feature points are tracked. Optical-flow tracking can determine whether a feature point in the current frame still exists in the next frame, so it can be used to count the identical feature points contained in two frames. For each frame in the video, after extracting the feature points of the frame, the electronic device determines the number of identical feature points contained in this frame and the previous frame meeting the keyframe requirement; if this number is less than a set number threshold, or the ratio of this number to the number of all features in the frame is less than a set ratio threshold, the frame is determined to meet the keyframe requirement; otherwise, the frame is determined not to meet the keyframe requirement and the judgment moves on to the next frame, until it has been determined for all frames in the video whether they meet the keyframe requirement, thereby selecting the images that meet the keyframe requirement.
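A compact sketch of this tracking-based test is shown below; it is illustrative only, and the corner detector and the 0.5 survival-ratio threshold are assumptions made for the example:

```python
# Sketch: keyframe selection by optical-flow tracking of feature points.
import cv2
import numpy as np

def is_keyframe(gray_last_kf: np.ndarray, gray_cur: np.ndarray,
                ratio_thresh: float = 0.5) -> bool:
    # Detect feature points in the last keyframe (Shi-Tomasi corners here).
    pts = cv2.goodFeaturesToTrack(gray_last_kf, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:
        return True
    # Track them into the current frame with pyramidal Lucas-Kanade optical flow.
    _, status, _ = cv2.calcOpticalFlowPyrLK(gray_last_kf, gray_cur, pts, None)
    survived = int(status.sum())
    # Few surviving tracks means the view changed enough to start a new keyframe.
    return survived / len(pts) < ratio_thresh
```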
Optionally, in the above two methods, the electronic device may take the first frame of the video as the first image meeting the keyframe requirement, and based on this image continue to select images meeting the keyframe requirement from the remaining frames.

In some embodiments of this application, when uploading images that meet the keyframe requirement, the electronic device may upload them frame by frame: each time the electronic device selects an image meeting the keyframe requirement, it uploads that image to the distributed mapping platform while continuing the selection process for the next such image. Certainly, the electronic device may alternatively upload all the images meeting the keyframe requirement together after they have all been selected.

After selecting from the video the multiple frames of images that meet the keyframe requirement, the electronic device uploads them to the distributed mapping system; the distributed mapping system can store and manage the multiple frames of images and perform image processing on them.

For example, based on the system shown in FIG. 3b, after selecting an image meeting the keyframe requirement, the electronic device sends an image transmission request to the cloud scheduling center to request to upload the image. After receiving the image transmission request, the cloud scheduling center returns to the electronic device a URL for uploading the image, and the electronic device then uploads the image to the object storage service for storage according to the URL.
Step 3: The electronic device collects the positioning information corresponding to each frame and uploads it to the distributed mapping system.

The positioning information corresponding to each frame includes the pose information, global positioning system (GPS) information, and inertial measurement unit (IMU) information at the time the electronic device captured the frame. The pose information is measured with the SLAM algorithm when the electronic device shot the frame; for details, refer to the descriptions in the above embodiments, which are not repeated here. The GPS information indicates the position of the electronic device in the real environment determined through GPS positioning when the frame was shot. The IMU information indicates the attitude characteristics of the electronic device measured by the IMU sensor when the frame was shot; the attitude characteristics may be, for example, attitude angles. The electronic device may also send the camera intrinsic parameters of its camera to the distributed mapping system.

For example, the electronic device may upload the collected positioning information to the distributed mapping system in the form of metadata. Based on the distributed mapping system shown in FIG. 3b, the electronic device may send the metadata to the cloud scheduling center; after receiving the metadata, the cloud scheduling center sends it to the cache-type message middleware for caching, for use by the CPU algorithm components or GPU algorithm components. At the same time, the cloud scheduling center may store the metadata in the elastic file service.
Step 4: The distributed mapping system performs an image processing procedure on each frame separately.

After the electronic device uploads the multiple frames of images meeting the keyframe requirement to the distributed mapping system, the multiple computing nodes in the distributed mapping system each perform the image processing procedure on individual frames. Specifically, each computing node can select an unprocessed frame from the multiple frames, perform the image processing procedure on it, and, after finishing, continue to select the next unprocessed frame, until it is determined that all images have been processed. When selecting a frame from the multiple frames, a computing node may select randomly or follow the order of the multiple frames (for example, the order in which the images were uploaded to the distributed mapping system).

The image processing procedure includes the following steps A1 and A2:
A1. Feature extraction: the computing node extracts the local and global features of the image.

In this step, the computing node can perform local feature extraction and global feature extraction on the image. In local feature extraction, the computing node can extract feature vectors from the multi-scale grayscale features of each region of the image, obtaining the local features of the image and the feature points in the image; the feature vectors can represent the texture features of local regions of the image. In global feature extraction, the computing node can use a trained network model to describe the visually stable regions of the image (such as buildings and roads), obtaining the global features of the image; the global features can represent the overall structural features of the image.
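A minimal sketch of this two-level extraction follows; it is illustrative only, with SIFT standing in for the multi-scale local descriptor and a simple mean of local descriptors standing in for a learned global descriptor such as the VLAD/BoW descriptors described above:

```python
# Sketch: per-image local feature extraction plus a crude global descriptor.
import cv2
import numpy as np

def extract_features(image_bgr: np.ndarray):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    # Local features: keypoints plus one 128-d descriptor per keypoint.
    keypoints, local_desc = sift.detectAndCompute(gray, None)
    # Global feature: here simply the L2-normalized mean of the local
    # descriptors, standing in for a learned whole-image descriptor.
    global_desc = local_desc.mean(axis=0)
    global_desc /= np.linalg.norm(global_desc)
    return keypoints, local_desc, global_desc
```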
A2. Serialization processing: the computing node selects, from the processed images, the images that match this image, based on its global features.

This step includes feature retrieval, feature matching, and feature verification. Feature retrieval means that the computing node searches, based on the global features of this image, the global features of the processed images (that is, images that have already gone through the above image processing procedure), obtains a set number of global features closest in distance to the global features of this image, and takes the images corresponding to the retrieved global features as candidate frames. Optionally, a set number of frames captured earlier than and closest in time to the capture time of this image may also be taken as candidate frames.
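The retrieval step reduces to a nearest-neighbor search over global descriptors; a brute-force top-k sketch follows (illustrative only; a real deployment would use an index such as the vector retrieval service mentioned above):

```python
# Sketch: retrieve the k processed images whose global features are closest.
import numpy as np

def retrieve_candidates(query_desc: np.ndarray, processed_descs: np.ndarray,
                        k: int = 5) -> np.ndarray:
    # processed_descs: (num_processed, dim) matrix of global descriptors.
    dists = np.linalg.norm(processed_descs - query_desc, axis=1)
    return np.argsort(dists)[:k]      # indices of the k nearest candidate frames
```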
Feature matching means that the computing node matches the local features of the candidate frames with the local features of this image, selecting at least one group of matching pairs that satisfy a set matching condition, where each matching pair contains multiple feature points satisfying the set matching condition. In matching, the computing node may use the k-nearest neighbor (KNN) matching algorithm to select, from the local feature points of a candidate frame, the feature points matching the local feature points of this image, and form matching pairs with the local feature points of this image. The computing node may alternatively select matching pairs by training a deep learning model and then using it for matching. For example, the set matching condition may be any of the following: the vector distance between the local feature descriptors of the feature points is less than or equal to a set threshold; the feature descriptors of the feature points are mutually nearest to each other in vector distance; or the ratio between the nearest and second-nearest vector distances of the feature descriptors of the feature points is greater than or equal to a set threshold. A feature descriptor is used to describe a local feature.
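A sketch of KNN matching with a descriptor-distance ratio test follows; it is illustrative, and the 0.75 ratio is the conventional value rather than one disclosed by this application:

```python
# Sketch: KNN matching of local descriptors between a candidate frame and the image.
import cv2

def knn_match(desc_query, desc_candidate, ratio: float = 0.75):
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc_query, desc_candidate, k=2)
    # Keep a match only when the best neighbor is clearly better than the second.
    return [m for m, n in knn if m.distance < ratio * n.distance]
```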
Feature verification means that the computing node filters incorrectly matched information out of the feature matching results. Optionally, the computing node may use algorithms such as random sample consensus (RANSAC) verification for feature verification.
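A geometric verification sketch using RANSAC over the fundamental matrix follows; this is one common instantiation of such random-sample-consensus checking, and the specific model and thresholds are assumptions made for the example:

```python
# Sketch: RANSAC-based verification that discards wrong matches.
import cv2
import numpy as np

def verify_matches(pts_img: np.ndarray, pts_cand: np.ndarray):
    # pts_img, pts_cand: (N, 2) arrays of matched pixel coordinates.
    _, inlier_mask = cv2.findFundamentalMat(pts_img, pts_cand,
                                            cv2.FM_RANSAC, 3.0, 0.99)
    inliers = inlier_mask.ravel().astype(bool)
    return pts_img[inliers], pts_cand[inliers]   # matches surviving verification
```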
Based on the above image processing procedure, each computing node can determine the matching-pair relationships (matched or unmatched) between the image it processes and the other processed images; therefore, after the image processing procedure ends for all images, the matching-pair relationships among all images are obtained. The multiple feature points of the same matching pair correspond to the same three-dimensional point in the real environment.

For example, based on the distributed mapping system shown in FIG. 3b, after the electronic device uploads an image to the object storage service according to the URL, the cloud scheduling center can send an image-transfer message to the queue-type message middleware to indicate the image-transfer task corresponding to each frame, and the queue-type message middleware caches the task information. Each GPU algorithm component reads an image-transfer task from the queue-type message middleware, reads the corresponding image from the object storage service, performs the above image processing procedure on the image, saves the processing result (that is, the matching information of the image) to the elastic file service, and sends the processing-completed identifier and the intermediate results of the processing (for example, the global features of the image) to the cache-type message middleware for caching. Subsequent GPU algorithm nodes can then, during their image processing, read the global features of the processed images from the cache-type message middleware for serialization processing and the like.

It should be noted that the execution order of some of the above steps is not strictly sequential and may be adjusted as needed. For example, for the same computing node, the execution of steps 3 and 4 depends on the image selected in step 2, but steps 3 and 4 may be unordered with respect to each other: either step may be executed first and then the other, or both may be executed at the same time. For different computing nodes, each node executes steps 1 to 4 independently of the others, and any two computing nodes do not interfere with each other. For example, after the electronic device uploads 2 frames, computing node 1 can process the first frame according to steps 1 to 4 while computing node 2 processes the second frame according to steps 1 to 4. As another example, after the electronic device uploads one frame, a computing node can perform the processing of steps 1 to 4 on that frame while the electronic device side continues the process of selecting the next frame.
In some embodiments of this application, while scanning and shooting the environment, the electronic device may also display on the screen a mesh covering the contours of the environment space, to prompt and guide the user through the scanning process. Specifically, the electronic device may collect the depth maps of the images meeting the keyframe requirement using the time-of-flight (TOF) method, or obtain the corresponding depth maps from the selected images meeting the keyframe requirement using multi-view stereo (MVS). After obtaining the depth map of each frame, the electronic device may use algorithms such as the truncated signed distance function (TSDF) to extract voxels from each frame and determine the depth value of each voxel in the three-dimensional voxel space. After obtaining the voxels, the electronic device may use the marching cubes algorithm to convert the voxels into a mesh based on the depth values of the voxels, render the mesh, and display it in the corresponding region of the environment interface shown on the display.
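A sketch of the final volume-to-mesh step follows; it is illustrative only, assumes a TSDF volume already fused into a dense grid, and uses scikit-image's marching cubes as a stand-in for the on-device implementation:

```python
# Sketch: extract a renderable triangle mesh from a fused TSDF volume.
import numpy as np
from skimage import measure

def tsdf_to_mesh(tsdf_volume: np.ndarray, voxel_size: float = 0.02):
    # tsdf_volume: (X, Y, Z) grid of truncated signed distances; the surface
    # lies where the signed distance crosses zero.
    verts, faces, normals, _ = measure.marching_cubes(tsdf_volume, level=0.0)
    verts *= voxel_size          # grid units -> metric units
    return verts, faces, normals
```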
For example, when the content currently scanned by the electronic device is the content shown in FIG. 5, the electronic device can perform voxel extraction and mesh conversion on the depth map of the image corresponding to the interface shown in FIG. 5 to obtain a mesh. After overlaying the mesh on the corresponding position in the display interface shown in FIG. 5, the electronic device can display the interface shown in FIG. 6a; in this interface, the mesh-covered region is the region that has been scanned, and the region not covered by the mesh is a region to be scanned or a region for which no mesh can be generated. In this way, while the user operates the electronic device to scan the environment space, the electronic device can present the scanned and unscanned regions to the user in real time, guiding the user to continue scanning the unscanned environment regions according to the mesh prompts, so that the mesh covers as many of the three-dimensional objects in the real environment space to be scanned as possible; the scanning process is thereby completed simply and quickly, the operational difficulty of collecting environment images is lowered, and the user experience is improved. A control for triggering the end of scanning is displayed on the screen of the electronic device at the same time; therefore, when the user determines that the mesh covers the regions of interest, or that each region of interest is covered by a mesh of a certain size, the user can stop scanning by operating this control. For example, as shown in FIG. 6b, after the mesh displayed by the electronic device covers a certain range, the user can end scanning by tapping the control for triggering the end of scanning; in response to the user's operation, the electronic device can display the interface shown in FIG. 6c, which includes a control prompting the user to start generating the map. The user can trigger the three-dimensional map creation task by operating this control, so that the three-dimensional map corresponding to the environment space is further generated from the content scanned by the user operating the electronic device.

The mesh generated by the electronic device may have any shape, for example, regular shapes such as the triangles and rectangles shown in FIG. 6b, or other irregular shapes; this is not specifically limited in the embodiments of this application.

In some embodiments of this application, after scanning ends, the electronic device can upload the mesh obtained in the above process to the distributed mapping system, and a computing node in the distributed mapping system simplifies the mesh and generates the corresponding white model, so that this information can be used in subsequent processing. After mesh simplification, the computing node can generate the corresponding white model through algorithms such as plane extraction, intersection computation, polyhedron topology construction, and surface optimization.

For example, based on the distributed mapping system shown in FIG. 3b, when uploading the mesh, the electronic device can send a mesh upload request to the cloud scheduling center; after receiving the request, the cloud scheduling center returns the mesh upload address information to the electronic device. The electronic device can upload the mesh obtained at the end of scanning to the object storage service for storage according to the mesh upload address. After finishing the upload, the electronic device can send a mesh-upload-completed notification message to the cloud scheduling center; after receiving the notification, the cloud scheduling center can send a mesh simplification task to the queue-type message middleware. The white-model processing service in a CPU algorithm component can listen for the mesh simplification tasks cached in the queue-type message middleware, claim a task, and execute the corresponding mesh simplification. The result of the mesh simplification task executed by the white-model processing service can be sent to the elastic file service for storage, and a corresponding mesh-simplification-completed notification message is sent to the queue-type message middleware. After listening to this notification message, the cloud scheduling center can obtain the simplified mesh from the elastic file service, send it to the object storage service for storage, and store the mesh simplification result (that is, the simplified mesh) in the cloud database.

In this way, the electronic device can guide the user to collect images of real-world scenes through the visualized mesh, lowering the difficulty of image collection on the user side and thereby improving the user experience.
3. Map construction
After the electronic device ends scanning, it can trigger the distributed mapping system to start the mapping task and create the three-dimensional map based on the images meeting the keyframe requirement uploaded by the electronic device and the positioning information corresponding to each frame. Specifically, after the computing nodes of the distributed mapping system have performed the image processing procedure described in step 4 above on all the images meeting the keyframe requirement uploaded by the electronic device, the three-dimensional map can be created according to the following steps B1 to B4:
B1. The computing node generates a scene graph from the multiple frames of images meeting the keyframe requirement, where the scene graph represents the matching-pair relationships of local feature points among the multiple frames of images.

The computing node can determine the co-visibility relationships of the multiple frames of images based on the matching relationships among all the frames meeting the keyframe requirement, and then obtain the scene graph after optimizing the co-visibility relationships.

The scene graph can be regarded as an abstract network composed of "vertices" and "edges": each vertex in the network represents a frame, and each edge represents a pair of matched feature points between images. Different vertices can be connected by edges; an edge indicates that the two connected vertices are associated, that is, that the two frames represented by the two vertices match.
B2. The computing node determines, based on the scene graph, the three-dimensional points in three-dimensional space corresponding to the feature points in the multiple frames of images.

After generating the scene graph, the computing node can determine the three-dimensional points in three-dimensional space corresponding to the feature points in the multiple frames of images based on the scene graph, the pose information (camera poses) from the electronic device, and the camera intrinsic parameters from the electronic device. The coordinate system of this three-dimensional space is the above-mentioned coordinate system of the electronic device.

For the different viewing angles of the same feature point in the scene graph, the computing node can solve for the position of the feature point in three-dimensional space (that is, triangulation) using algorithms such as the direct linear transformation (DLT), combined with the camera poses and camera intrinsic parameters of the electronic device, and determine the point at that position as the three-dimensional point corresponding to the feature point. After the computing node determines the three-dimensional points corresponding to all feature points in the scene graph, the three-dimensional map composed of these three-dimensional points is obtained; this three-dimensional map is a three-dimensional point cloud map.
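For two views, this triangulation step can be sketched as follows; the sketch is illustrative and assumes 3x4 projection matrices already formed from the reported camera poses and intrinsics:

```python
# Sketch: triangulate one matched feature point observed in two keyframes.
import cv2
import numpy as np

def triangulate(P1: np.ndarray, P2: np.ndarray,
                pt1: np.ndarray, pt2: np.ndarray) -> np.ndarray:
    # P1, P2: 3x4 projection matrices (intrinsics @ [R | t]) of the two views.
    # pt1, pt2: matched pixel coordinates, shape (2,).
    X_h = cv2.triangulatePoints(P1, P2, pt1.reshape(2, 1), pt2.reshape(2, 1))
    return (X_h[:3] / X_h[3]).ravel()    # homogeneous -> 3D point
```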
B3. The computing node optimizes the coordinates of the three-dimensional points in three-dimensional space.

The computing node can perform bundle adjustment (BA) optimization on the three-dimensional points obtained above: based on the position errors of reprojecting the three-dimensional points back onto the images according to the camera model, it optimizes the camera poses, the positions of the three-dimensional points, and the camera intrinsic matrix of the electronic device's camera, thereby obtaining accurate camera poses, camera intrinsics, and three-dimensional point coordinates, and hence the optimized three-dimensional point cloud map.
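A toy sketch of the reprojection residual that BA minimizes is shown below; it is illustrative only, since a real bundle adjustment jointly optimizes all poses, points, and intrinsics with a sparse solver, whereas this refines only the 3D points for fixed cameras:

```python
# Sketch: refine 3D points by minimizing reprojection error (a slice of BA).
import numpy as np
from scipy.optimize import least_squares

def reproj_residuals(points_flat, observations):
    # observations: list of (point_index, P, uv) with P a 3x4 projection matrix
    # and uv the observed pixel coordinates of that point in that view.
    points = points_flat.reshape(-1, 3)
    res = []
    for pt_idx, P, uv in observations:
        proj = P @ np.append(points[pt_idx], 1.0)
        res.extend(proj[:2] / proj[2] - uv)       # predicted minus observed pixel
    return res

def refine_points(points_init, observations):
    sol = least_squares(reproj_residuals, points_init.ravel(),
                        args=(observations,))
    return sol.x.reshape(-1, 3)
```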
B4. The computing node generates the three-dimensional map from the optimized three-dimensional points.

The computing node can perform smoothing and denoising using the GPS information and IMU information corresponding to each frame uploaded by the electronic device, obtaining the real-world camera pose corresponding to each image, that is, the position and orientation of the electronic device in the environment when it shot the image (the orientation of the electronic device's camera relative to the environment). Then, based on the position and orientation corresponding to the images, it aligns the coordinates of the three-dimensional points in three-dimensional space with the real-world camera poses, adjusting the coordinate system of the three-dimensional space to be consistent with that of the real environment space, thereby obtaining a three-dimensional point cloud map at the same scale as the real environment; this point cloud map is the point cloud map corresponding to the real environment scene.

In the process of aligning the coordinates of the three-dimensional points with the real-world camera poses, the computing node first determines, for each frame, the transformation matrix between the pose of the electronic device in the electronic device's coordinate system and its pose in the real world, then averages the transformation matrices corresponding to the multiple frames to obtain the target transformation matrix, and then transforms the coordinates of the point cloud in three-dimensional space using the target transformation matrix.
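A sketch of this per-frame transform averaging follows; it is illustrative only, and averaging the rotations with a quaternion mean and the translations with an arithmetic mean is one reasonable reading of "averaging the transformation matrices", not a detail disclosed by this application:

```python
# Sketch: average per-frame SLAM-to-world transforms into one target transform.
import numpy as np
from scipy.spatial.transform import Rotation

def average_transforms(transforms):
    # transforms: per-frame 4x4 matrices mapping SLAM coordinates to world coordinates.
    rots = Rotation.from_matrix(np.stack([T[:3, :3] for T in transforms]))
    mean_rot = rots.mean().as_matrix()               # rotation average
    mean_t = np.mean([T[:3, 3] for T in transforms], axis=0)
    target = np.eye(4)
    target[:3, :3] = mean_rot
    target[:3, 3] = mean_t
    return target                                    # apply to all map points
```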
For example, FIG. 7 is a schematic diagram of a three-dimensional map. The three-dimensional points in the three-dimensional map shown in FIG. 7 correspond to three-dimensional points in the real environment scanned by the electronic device, and the position of each three-dimensional point in three-dimensional space represents the position in the real environment of the corresponding real-environment point.

For example, in FIG. 6c above, after the user triggers the three-dimensional map creation task by operating the mapping-trigger control displayed by the electronic device, the electronic device requests the distributed mapping system to create the three-dimensional map from the content uploaded by the electronic device, and may display prompt information indicating the progress of map creation. For example, the electronic device may display the interface shown in diagram (a) of FIG. 8a, which may include information such as the map ID of the three-dimensional map to be created, a thumbnail, prompt information indicating that mapping is in progress, and the time. For example, the prompt information may be "Mapping in progress", and the thumbnail page corresponding to the three-dimensional map may be in an unlit state, to indicate to the user that mapping is in progress and the three-dimensional map is not editable. After receiving the request from the electronic device, the distributed mapping system can complete the creation of the three-dimensional map according to the above steps and send the information of the created three-dimensional map to the electronic device. After determining that the creation of the three-dimensional map is complete, the electronic device can light up the thumbnail page shown in diagram (a) of FIG. 8a and remove the prompt information indicating that mapping is in progress, that is, display the interface shown in diagram (b) of FIG. 8a, to indicate to the user that mapping is complete and the current map is editable. As another example, the electronic device may first display the interface shown in diagram (a) of FIG. 8b, which may display an image of the environment in which the regions corresponding to some three-dimensional objects are in an unlit state; after determining that map creation is complete, the electronic device displays the interface shown in diagram (b) of FIG. 8b, in which the regions of the three-dimensional objects in the environment become lit. As yet another example, the electronic device may display the map creation progress as a percentage or in other ways.

For example, based on the distributed mapping system shown in FIG. 3b, when the electronic device triggers mapping, it can send a mapping request instruction and information such as the number of scanned images for this mapping (that is, the number of images meeting the keyframe requirement) to the cloud scheduling center; the cloud scheduling center sends the mapping task to the queue-type message middleware, and at the same time sends the basic information of this mapping and the user attribute information to the cloud database for saving. Each CPU algorithm component can listen for the mapping tasks in the queue-type message middleware and process them, finally generating the three-dimensional map file and storing it in the elastic file service. During mapping, each CPU algorithm component can send information such as the mapping progress, mapping success or failure, and the map alignment matrix (the transformation matrix between the SLAM coordinate system and the real-world coordinate system) to the queue-type message middleware, and save the mapping result (that is, the created three-dimensional map) to the elastic file service. The cloud scheduling center can listen for the mapping-progress information in the queue-type message middleware, thereby obtaining information such as the processing progress, status, and map alignment matrix of the current mapping task, and store this information in the cloud database.

In the above embodiments, during the creation of the three-dimensional map, the electronic device side provides the initial data related to the real-world scene required for creating the map, and the distributed mapping system can complete the creation process based on the initial data provided by the electronic device. The difficulty and cost of map creation on the electronic device side are therefore low, which facilitates popularization and promotion. The multiple computing nodes in the distributed mapping system can perform map creation tasks concurrently and support large-scale computation across nodes, so the mapping efficiency is high. In addition, information such as SLAM poses, GPS, and IMU is fully used during mapping, further improving the accuracy of the created three-dimensional map.
FIG. 9 is a schematic flowchart of a method for creating a three-dimensional map according to an embodiment of this application. As shown in FIG. 9, the flow of the method may include:
S901: During movement, the electronic device scans to obtain a video of a real-world scene and selects from it the images meeting the keyframe requirement; it also runs the SLAM algorithm and provides the positioning information corresponding to each frame.

The positioning information includes pose information, GPS information, and IMU information. The movement of the electronic device is controlled by the user.

S902: The electronic device uploads the images meeting the keyframe requirement to different computing nodes in the distributed mapping system.

S903: Each computing node in the distributed mapping system extracts the global features of the images from the electronic device.

S904: Each computing node in the distributed mapping system performs feature retrieval, matching, and verification based on the extracted global features, and determines the matching relationships between the image it processes and the other images.

S905: After the computing nodes in the distributed mapping system have determined the matching relationships for all images uploaded by the electronic device, the computing nodes generate the scene graph from the determined matching relationships.

S906: The computing nodes in the distributed mapping system generate, based on the scene graph, the three-dimensional points in three-dimensional space corresponding to the feature points in the images uploaded by the electronic device.

The feature points can be obtained by extracting the local features of the images.

S907: The computing nodes in the distributed mapping system optimize the coordinates of the three-dimensional points in three-dimensional space using the BA algorithm.

S908: The computing nodes in the distributed mapping system smooth and denoise the positioning information corresponding to each frame uploaded by the electronic device.

S909: The computing nodes in the distributed mapping system align the coordinates of the camera poses of the electronic device and the three-dimensional points based on the processed positioning information, and obtain the three-dimensional map from the aligned three-dimensional points.

S910: The computing nodes of the distributed mapping system indicate the information of the created three-dimensional map to the electronic device.
Optionally, after performing step S901, the electronic device may further perform the following steps S911 to S915 to generate the mesh and upload it to the distributed mapping system for processing:

S911: The electronic device obtains the depth maps corresponding to the images meeting the keyframe requirement.

S912: The electronic device performs TSDF fusion on the depth maps in combination with the feature points in the images meeting the keyframe requirement, generating voxels.

S913: The electronic device performs mesh extraction and rendering on the fused voxels, obtaining a mesh for covering the three-dimensional objects in the real-world scene.

S914: The electronic device uploads the mesh to a computing node of the distributed mapping platform.

S915: The computing node of the distributed mapping platform simplifies the mesh, generates the white model from the simplified mesh, and saves it.

For the specific execution of the above steps, refer to the related descriptions in the above embodiments; details are not repeated in this example.

It should be noted that the specific implementation flow provided in the above example is merely an illustration of the method flow applicable to the embodiments of this application; the execution order of the steps may be adjusted according to actual requirements, other steps may be added, or some steps may be omitted.
4. Positioning
In the embodiments of this application, after obtaining the three-dimensional map created by the distributed mapping system, the electronic device can use the three-dimensional map for positioning.

For example, for the three-dimensional map shown in diagram (b) of FIG. 8a, when the user and the electronic device are again in the environment corresponding to the map, the electronic device can, in response to the user's operation of selecting the map, collect at least one frame of image of the current environment and upload it to the distributed mapping system; the distributed mapping system uses the GVPS method to determine, based on the at least one frame of image and the previously created three-dimensional map of the environment, the position of the electronic device in the three-dimensional map, thereby obtaining the position of the electronic device in the environment. After determining the position of the electronic device, the distributed mapping platform indicates the position to the electronic device, and the electronic device can display the interface shown in FIG. 10a. The content of this interface is the environment image currently scanned by the electronic device. At this point, the position of the electronic device in the environment has been determined, so the user can add digital resources to the environment image currently scanned by the electronic device.

For example, based on the distributed mapping system shown in FIG. 3b, after the user triggers positioning, the electronic device sends a positioning request and at least one currently scanned frame of image to the cloud scheduling center, which sends the received positioning request and the at least one frame of image to the GVPS positioning service. The GVPS positioning service reads the map data of the three-dimensional map stored in the elastic file service, determines, based on the map data and the at least one frame of image, the position in the three-dimensional map corresponding to the current pose of the electronic device (this position represents the position of the electronic device in the environment), and sends the position information to the cloud scheduling center. After querying the point of interest (POI) information related to the current map from the cloud database, the cloud scheduling center sends the POI information and the position information from the GVPS service to the electronic device. Based on the received POI information, the electronic device can download the three-dimensional digital resource models from the object storage service, render them, and add them to the digital world scene displayed by the electronic device.
5. Adding digital resources
After the above positioning is complete, the interface shown in FIG. 10a displayed by the electronic device contains materials of three-dimensional digital resource models, from which the user can select materials to add to the digital world scene shown in FIG. 10a. For example, after the user selects a material to add, the electronic device can display the interface shown in FIG. 10b with the three-dimensional digital resource model added; this interface contains both the map of the real environment scene and the virtual resource model added by the user, achieving a fused display of the real-world scene and the virtual digital scene. During the adding of materials, when the user selects a material and moves it to a region, the electronic device can display the corresponding white model in that region to guide the user in selecting a suitable region to place the material. When the user decides to place the material in a region, the electronic device determines the position in the three-dimensional map corresponding to the placement region and takes that position as the position in the real environment scene corresponding to the material added by the user.

For example, based on the distributed mapping system shown in FIG. 3b, after positioning is complete, the electronic device can request from the cloud scheduling center the digital resource list corresponding to the three-dimensional digital resource models. The cloud scheduling center obtains the digital resource list corresponding to the current user by querying the cloud database and sends it to the electronic device. After the user selects a material, the electronic device downloads the three-dimensional digital resource model from the object storage service through the URL and adds it to the digital world scene. After the three-dimensional digital resource has been added, the user can tap the save control displayed by the electronic device to trigger the electronic device to upload information such as the size and pose of the current three-dimensional digital resource model to the cloud scheduling center, which sends the information to the cloud database for saving.

After the user adds a three-dimensional digital resource model, the distributed mapping platform can save the information of the model added by the user. When the user opens the created three-dimensional map again and re-performs positioning, if positioned in a region where a three-dimensional digital resource model was previously placed, the electronic device, when displaying the map corresponding to that region, also displays the three-dimensional digital resource model placed in it, so that the user can still see the three-dimensional resource model previously placed in that region.

In this way, the electronic device can provide the user with functions for using and editing the created three-dimensional map, while allowing the user to add virtual digital resources to the created three-dimensional map, achieving a fused application of the real environment scene and the virtual digital scene.
Based on the above embodiments and the same concept, an embodiment of this application further provides a method for creating a three-dimensional map, applied to a distributed system including multiple computing nodes. As shown in FIG. 11, the method includes:

S1101: Each of the multiple computing nodes separately selects, from multiple frames of images from an electronic device, a frame to be processed as a target image, and performs a target processing procedure on the target image, until the target processing procedure has been performed on all the multiple frames of images. The multiple frames of images are images shot by the electronic device for a same environment. The target processing procedure includes the following steps: extracting a first feature point of the target image; obtaining feature points of at least one frame of image on which the target processing procedure has been performed; and selecting, from the feature points of the at least one frame of image, at least one second feature point to form a feature matching pair with the first feature point, where the first feature point and the at least one second feature point correspond to a same point in the environment.

S1102: A target computing node obtains the multiple feature matching pairs obtained after the target processing procedure has been performed on the multiple frames of images, and creates a three-dimensional point cloud map based on the multiple feature matching pairs, where the target computing node is any one of the multiple computing nodes.

Specifically, for the specific implementation of the method performed by each computing node in this method, refer to the related descriptions in the foregoing embodiments; details are not repeated here.
Based on the above embodiments and the same concept, an embodiment of this application further provides an electronic device configured to implement the method performed by one or more of the multiple computing nodes provided in the embodiments of this application, or configured to implement the method performed by the electronic device provided in the embodiments of this application.

As shown in FIG. 12, the electronic device 1200 may include a memory 1201, one or more processors 1202, and one or more computer programs (not shown in the figure). The above components may be coupled through one or more communication buses 1203. Optionally, when the electronic device 1200 is configured to implement the method performed by the electronic device provided in the embodiments of this application, the electronic device 1200 may further include a display 1204.

The memory 1201 stores one or more computer programs (code), and the one or more computer programs include computer instructions; the one or more processors 1202 call the computer instructions stored in the memory 1201, causing the electronic device 1200 to perform the method for creating a three-dimensional map provided in the embodiments of this application. The display 1204 is configured to display images, videos, application interfaces, and other related user interfaces.

In specific implementation, the memory 1201 may include a high-speed random access memory, and may also include a nonvolatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other nonvolatile solid-state storage devices. The memory 1201 may store an operating system (hereinafter referred to as the system), for example, an embedded operating system such as ANDROID, IOS, WINDOWS, or LINUX. The memory 1201 may be used to store the implementation programs of the embodiments of this application. The memory 1201 may also store network communication programs that can be used to communicate with one or more additional devices, one or more user devices, and one or more network devices. The one or more processors 1202 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the solution of this application.

It should be noted that FIG. 12 is merely one implementation of the electronic device 1200 provided in the embodiments of this application; in actual applications, the electronic device 1200 may include more or fewer components, which is not limited here.
Based on the above embodiments and the same concept, an embodiment of this application further provides an electronic device, the electronic device including modules/units for performing the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments.

Based on the above embodiments and the same concept, an embodiment of this application further provides a distributed system including multiple electronic devices, where each electronic device can be implemented by the electronic device described above.

Based on the above embodiments and the same concept, an embodiment of this application further provides a computer-readable storage medium storing a computer program which, when run on a computer, causes the computer to perform the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments, or to perform the method performed by the electronic device in the methods provided in the above embodiments.

Based on the above embodiments and the same concept, an embodiment of this application further provides a computer program product including a computer program or instructions which, when run on a computer, cause the computer to perform the method performed by at least one of the multiple computing nodes in the methods provided in the above embodiments, or to perform the method performed by the electronic device in the methods provided in the above embodiments.
The methods provided in the embodiments of this application may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).

Obviously, those skilled in the art can make various changes and modifications to this application without departing from the scope of this application. Thus, if these modifications and variations of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to include these changes and modifications.

Claims (13)

  1. A method for creating a three-dimensional map, applied to a distributed system comprising multiple computing nodes, wherein the method comprises:
    each computing node separately selecting, from multiple frames of images from an electronic device, a frame to be processed as a target image, and performing a target processing procedure on the target image, until the target processing procedure has been performed on all of the multiple frames of images; wherein the multiple frames of images are images shot by the electronic device for a same environment; the target processing procedure comprises the following steps: extracting a first feature point of the target image; obtaining feature points of at least one frame of image on which the target processing procedure has been performed; and selecting, from the feature points of the at least one frame of image, at least one second feature point to form a feature matching pair with the first feature point; wherein the first feature point and the at least one second feature point correspond to a same point in the environment; and
    a target computing node obtaining multiple feature matching pairs obtained after the target processing procedure is performed on the multiple frames of images, and creating a three-dimensional point cloud map based on the multiple feature matching pairs; wherein the target computing node is any one of the multiple computing nodes.
  2. The method according to claim 1, wherein the target computing node creating the three-dimensional point cloud map based on the multiple feature matching pairs comprises:
    the target computing node obtaining multiple pieces of first pose information from the electronic device; wherein the multiple pieces of first pose information are in one-to-one correspondence with the multiple frames of images, each piece of first pose information indicates the position and orientation of the electronic device in a first three-dimensional space when shooting the image corresponding to that piece of first pose information, and the first three-dimensional space is the three-dimensional space corresponding to the three-dimensional point cloud map; and
    the target computing node determining, based on the multiple pieces of first pose information, multiple three-dimensional points corresponding to the multiple feature matching pairs in the first three-dimensional space, to obtain the three-dimensional point cloud map.
  3. The method according to claim 2, wherein the method further comprises:
    the target computing node obtaining multiple pieces of second pose information from the electronic device; wherein the multiple pieces of second pose information are in one-to-one correspondence with the multiple frames of images, each piece of second pose information indicates the position and orientation of the electronic device in a second three-dimensional space when shooting the image corresponding to that piece of second pose information, and the second three-dimensional space is the three-dimensional space corresponding to the environment; and
    the target computing node adjusting, based on the multiple pieces of second pose information and the multiple pieces of first pose information, the coordinates of the three-dimensional points in the first three-dimensional space, to obtain a three-dimensional point cloud map at the same scale as the environment.
  4. The method according to claim 3, wherein the target computing node adjusting, based on the multiple pieces of second pose information and the multiple pieces of first pose information, the coordinates of the three-dimensional points in the first three-dimensional space comprises:
    the target computing node determining multiple transformation matrices based on the multiple pieces of second pose information and the multiple pieces of first pose information; wherein the multiple transformation matrices are in one-to-one correspondence with the multiple frames of images, and each transformation matrix represents the transformation relationship between the second pose information and the first pose information corresponding to a same image;
    the target computing node averaging the multiple transformation matrices to obtain a target transformation matrix; and
    the target computing node transforming the coordinates of the three-dimensional points in the first three-dimensional space using the target transformation matrix.
  5. The method according to claim 3 or 4, wherein
    the position indicated by each piece of second pose information is determined by the electronic device through global positioning system GPS positioning; and
    the orientation indicated by each piece of second pose information is determined by the electronic device through inertial measurement unit IMU measurement.
  6. The method according to any one of claims 1 to 5, wherein before obtaining the feature points of the at least one frame of image on which the target processing procedure has been performed, the method further comprises:
    determining the at least one frame of image, comprising:
    extracting a global feature of the target image;
    obtaining a global feature of each frame of image on which the target processing procedure has been performed, and selecting, from the obtained global features, at least one global feature with the highest similarity to the global feature of the target image; and
    determining the image corresponding to the at least one global feature as the at least one frame of image.
  7. The method according to any one of claims 1 to 6, wherein the distributed system further comprises a queue node, and before each computing node separately selects a frame to be processed from the multiple frames of images as the target image, the method further comprises:
    the queue node receiving the multiple frames of images from the electronic device and adding the multiple frames of images to a target image queue in the order in which the images are received from the electronic device; and
    each computing node separately selecting a frame to be processed from the multiple frames of images as the target image comprises:
    each computing node reading a frame of image from the target image queue of the queue node and taking the read image as the target image.
  8. The method according to any one of claims 1 to 7, wherein the target processing procedure further comprises:
    after extracting the first feature point of the target image, saving the first feature point to a storage node; and
    obtaining the feature points of the at least one frame of image on which the target processing procedure has been performed comprises:
    obtaining the feature points of the at least one frame of image from the storage node.
  9. The method according to any one of claims 1 to 8, wherein the method further comprises:
    the target computing node sending the three-dimensional point cloud map to the electronic device.
  10. An electronic device, wherein the electronic device comprises a memory and one or more processors;
    wherein the memory is configured to store computer program code, the computer program code comprising computer instructions; and when the computer instructions are executed by the one or more processors, the electronic device is caused to perform the method performed by at least one of the multiple computing nodes in the method according to any one of claims 1 to 9.
  11. A distributed system, wherein the system comprises multiple electronic devices according to claim 10.
  12. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when run on a computer, causes the computer to perform the method performed by at least one of the multiple computing nodes in the method according to any one of claims 1 to 9.
  13. A computer program product, wherein the computer program product comprises a computer program or instructions which, when run on a computer, cause the computer to perform the method performed by at least one of the multiple computing nodes in the method according to any one of claims 1 to 9.
PCT/CN2022/138459 2021-12-31 2022-12-12 Method for creating three-dimensional map, and electronic device WO2023124948A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111665748.X 2021-12-31
CN202111665748.XA CN116433830A (zh) 2021-12-31 2021-12-31 Method for creating three-dimensional map, and electronic device

Publications (1)

Publication Number Publication Date
WO2023124948A1

Family

ID=86997688

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138459 WO2023124948A1 (zh) 2021-12-31 2022-12-12 Method for creating three-dimensional map, and electronic device

Country Status (2)

Country Link
CN (1) CN116433830A (zh)
WO (1) WO2023124948A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200226794A1 (en) * 2017-09-29 2020-07-16 Panasonic Intellectual Property Corporation Of America Three-dimensional data creation method, client device, and server
CN111046125A (zh) * 2019-12-16 2020-04-21 视辰信息科技(上海)有限公司 Visual positioning method and system, and computer-readable storage medium
CN111368759A (zh) * 2020-03-09 2020-07-03 河海大学常州校区 Monocular-vision-based semantic map construction system for a mobile robot
CN111652934A (zh) * 2020-05-12 2020-09-11 Oppo广东移动通信有限公司 Positioning method, map construction method, apparatus, device, and storage medium
CN111833447A (zh) * 2020-07-13 2020-10-27 Oppo广东移动通信有限公司 Three-dimensional map construction method, three-dimensional map construction apparatus, and terminal device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932802A (zh) * 2023-07-10 2023-10-24 上海鱼微阿科技有限公司 Image retrieval method
CN116932802B (zh) * 2023-07-10 2024-05-14 玩出梦想(上海)科技有限公司 Image retrieval method

Also Published As

Publication number Publication date
CN116433830A (zh) 2023-07-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22914203

Country of ref document: EP

Kind code of ref document: A1