WO2023131089A1 - Augmented reality system, augmented reality scene positioning method, and device - Google Patents

Augmented reality system, augmented reality scene positioning method, and device

Info

Publication number
WO2023131089A1
WO2023131089A1 PCT/CN2022/144272 CN2022144272W WO2023131089A1 WO 2023131089 A1 WO2023131089 A1 WO 2023131089A1 CN 2022144272 W CN2022144272 W CN 2022144272W WO 2023131089 A1 WO2023131089 A1 WO 2023131089A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
target
panorama
server
relative position
Prior art date
Application number
PCT/CN2022/144272
Other languages
English (en)
French (fr)
Inventor
温裕祥
何凯文
郑亚
李江伟
唐忠伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22918530.1A (EP4414941A1)
Publication of WO2023131089A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality

Definitions

  • the present application relates to the field of augmented reality technology, and in particular to an augmented reality system, a method for positioning an augmented reality scene, and a device.
  • Augmented reality is a technology that integrates real world information and virtual world information.
  • AR technology can simulate physical information that is difficult to experience in the real world to obtain virtual information, and apply the virtual information to the real world, so that the real environment and virtual objects are superimposed on the same screen or space in real time and perceived by the user at the same time, achieving a sensory experience beyond reality.
  • the user can simultaneously observe the real world and virtual items in the AR scene displayed on the display screen of the AR device; for example, the user can observe real-world items such as the ground and tables on the display screen of the AR device, and can also observe virtual items, such as anime characters, placed on the ground.
  • the present application provides an augmented reality system, a method and a device for positioning an augmented reality scene.
  • the present application provides an augmented reality system, which includes an electronic device and a server;
  • the electronic device is configured to: send a first user image and an identifier of a target scene to the server in response to a first instruction triggered by the user, where the first instruction is an instruction triggered by the user to select the target scene from multiple candidate scenes, and the first user image is an image of the current environment captured by the electronic device; receive a first relative position sent by the server, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene; and determine a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location;
  • the server is configured to: receive the first user image and the identifier of the target scene sent by the electronic device; determine multiple panoramas of the target scene according to the identifier of the target scene; determine the first relative position according to the first user image and the multiple panoramas of the target scene; and send the first relative position to the electronic device.
  • after the electronic device receives the first instruction, triggered by the user, to select the target scene, the electronic device sends the first user image and the identifier of the target scene to the server, so that the server can accurately locate the electronic device according to the first user image and determine the first relative position of the electronic device relative to the geographic location of the real environment corresponding to the target scene.
  • the electronic device can determine the target route from the current position to the geographic location corresponding to the target scene according to the first relative position, and can thus accurately guide the user holding the electronic device to find the target scene and re-enter it to play.
  • the server is specifically configured to: perform feature extraction on the first user image to determine global feature information of the first user image; determine, according to the global feature information of the first user image, multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene, where the similarity between the global feature information of each frame of target panorama slice and the global feature information of the first user image is greater than a preset threshold; obtain target pose information of each target panorama slice, where the target pose information of each target panorama slice represents the position and azimuth angle of the electronic device in the real world when it captured that panorama slice; and determine the first relative position according to the target pose information of the multiple target panorama slices.
  • the server can determine, from the panorama slices of the multiple panoramas of the target scene, the target panorama slices whose global feature information is similar to that of the first user image, and then, through visual triangulation positioning based on the target pose information of the target panorama slices, accurately determine the first relative position of the electronic device relative to the target geographic location at the time the electronic device captured the first user image.
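The slice-selection step described above can be sketched as a thresholded similarity search. This is only an illustrative sketch: the source does not say how global feature information is represented or which similarity measure is used, so the fixed-length descriptor vectors, the cosine similarity, and the names `cosine_similarity` and `select_target_slices` are all assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two descriptor vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def select_target_slices(
    query_descriptor: Sequence[float],
    slice_descriptors: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    """Return the ids of panorama slices whose similarity to the query
    (the first user image) is strictly greater than the preset threshold."""
    return [
        slice_id
        for slice_id, descriptor in slice_descriptors.items()
        if cosine_similarity(query_descriptor, descriptor) > threshold
    ]


# Hypothetical descriptors for three panorama slices of one target scene.
slices = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [0.9, 0.2]}
matches = select_target_slices([1.0, 0.0], slices, threshold=0.9)
```

The ids in `matches` would then be used to look up the stored target pose information of each selected slice.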
  • the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location; the electronic device is specifically configured to: select any angle value from the angle range and any distance value from the distance range; and determine the target route according to the selected angle value and the selected distance value.
  • the first relative position includes the angle range and distance range of the electronic device relative to the target geographic location.
  • the electronic device can select an angle value and a distance value to determine the target route; after moving with the electronic device along the target route, the user arrives in the real environment area corresponding to the target scene.
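As a rough illustration of turning an angle range and a distance range into a route target, the sketch below picks one value from each range and converts the pair into a planar offset. The source only says that any value in each range may be selected; choosing the midpoint, measuring the angle in degrees counterclockwise from the x axis of a local planar frame, and the names `pick_value` and `target_offset` are assumptions made for this example.

```python
import math
from typing import Tuple


def pick_value(value_range: Tuple[float, float]) -> float:
    """Pick one value from a (low, high) range; the midpoint is an arbitrary choice.

    Assumes low <= high and no wrap-around (e.g. not 350 to 10 degrees).
    """
    low, high = value_range
    return (low + high) / 2.0


def target_offset(
    angle_range_deg: Tuple[float, float],
    distance_range_m: Tuple[float, float],
) -> Tuple[float, float]:
    """Convert the selected angle and distance into an (x, y) offset in metres
    from the device's current position toward the target geographic location."""
    angle = math.radians(pick_value(angle_range_deg))
    distance = pick_value(distance_range_m)
    return (distance * math.cos(angle), distance * math.sin(angle))


# Hypothetical first relative position: 80 to 100 degrees, 8 to 12 metres away.
dx, dy = target_offset((80.0, 100.0), (8.0, 12.0))
```

Because every value in the ranges is acceptable, any other selection rule (for example the nearest distance) would fit the same interface.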
  • the electronic device is further configured to: render a navigation arrow according to the target route, and display an environment image currently captured by the electronic device and the navigation arrow.
  • the electronic device can display the navigation arrow corresponding to the target route, guide the user to move with the electronic device in the form of images, and improve user experience.
  • the electronic device is further configured to: receive the three-dimensional map of the target scene created by the server, and collect multiple panoramas of the target scene, each panorama including multiple frames of panorama slices; and send the multiple panoramas of the target scene to the server;
  • the server is further configured to: receive the multiple panoramas uploaded by the electronic device; perform feature extraction on the multiple frames of panorama slices of each panorama to determine global feature information of each frame of panorama slice; and determine the target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice.
  • after the electronic device receives the three-dimensional map created by the server, it can collect multiple panoramas of the real environment corresponding to the target scene, and the target pose information of each panorama slice can be determined, so that after the user uploads the first user image, the electronic device can be located by visual triangulation positioning based on the target pose information of the panorama slices of the target scene, thereby accurately determining the position of the electronic device relative to the target geographic location of the target scene.
  • the server is specifically configured to: obtain multiple frames of environment images corresponding to the target scene, together with the global feature information and feature points of each frame of environment image; extract global feature information and feature points of a first panorama slice, where the first panorama slice is any frame of panorama slice in the multiple panoramas; determine, according to the global feature information of the first panorama slice, at least one frame of environment image matching the first panorama slice; determine the three-dimensional points in the three-dimensional map that correspond to the feature points in the at least one matching frame of environment image, and use the determined three-dimensional points as the three-dimensional points corresponding to the feature points of the first panorama slice; and determine target pose information of the first panorama slice according to the feature points of the first panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.
  • the server can determine the target pose information of each frame of panorama slice, so that after the first user image is received, the electronic device can be located by a visual triangulation positioning method.
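Estimating a camera pose from 2D feature points, their corresponding 3D points, and the camera intrinsic parameters is commonly known as the Perspective-n-Point problem. The source does not name the solver it uses, so the sketch below does not solve for the pose; it only shows the pinhole reprojection error that such solvers typically minimize, which can be used to check a candidate pose. The world-to-camera convention `X_cam = R * X_world + t` and the names `project` and `mean_reprojection_error` are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def project(
    point_world: Sequence[float],
    rotation: Matrix3,
    translation: Sequence[float],
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float]:
    """Project a 3D map point into pixel coordinates with a pinhole model."""
    # Transform into the camera frame: X_cam = R * X_world + t (assumed convention).
    cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0.0:
        raise ValueError("point is behind the camera")
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def mean_reprojection_error(
    points_3d: List[Sequence[float]],
    points_2d: List[Tuple[float, float]],
    rotation: Matrix3,
    translation: Sequence[float],
    fx: float, fy: float, cx: float, cy: float,
) -> float:
    """Mean pixel distance between observed feature points and projected 3D points."""
    errors = []
    for point_3d, (u, v) in zip(points_3d, points_2d):
        pu, pv = project(point_3d, rotation, translation, fx, fy, cx, cy)
        errors.append(math.hypot(pu - u, pv - v))
    return sum(errors) / len(errors)


# Hypothetical candidate pose (identity) and intrinsics for one correspondence.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project([1.0, 0.0, 5.0], identity, [0.0, 0.0, 0.0], 500.0, 500.0, 320.0, 240.0)
error = mean_reprojection_error(
    [[1.0, 0.0, 5.0]], [(420.0, 243.0)], identity, [0.0, 0.0, 0.0],
    500.0, 500.0, 320.0, 240.0,
)
```

A low mean error over the matched feature points indicates that the candidate pose is consistent with the three-dimensional map.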
  • the electronic device is further configured to: collect a second user image after moving to the target geographic location, where the second user image is an image of the environment captured by the electronic device after it has moved to the target geographic location; send the second user image to the server, and receive a second relative position returned by the server, where the second relative position is the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map; and display the target scene according to the second relative position, the three-dimensional map, and the environment image collected by the electronic device in real time;
  • the server is further configured to: receive the second user image uploaded by the electronic device; determine the second relative position of the electronic device according to the second user image based on a GVPS algorithm; and send the second relative position to the electronic device.
  • the electronic device can collect the second user image and send it to the server, and the server can determine the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map based on the GVPS algorithm; the electronic device can then display the target scene based on that position, so that the real environment captured by the electronic device and the virtual items placed by the user appear more realistic when the user plays in the target scene, improving user experience.
  • the present application provides a method for positioning an augmented reality scene, which is applied to an augmented reality system, where the augmented reality system includes an electronic device and a server, and the method includes:
  • the electronic device sends a first user image and an identifier of a target scene to the server in response to a first instruction triggered by the user, where the first instruction is an instruction triggered by the user to select the target scene from multiple candidate scenes, and the first user image is an image of the current environment captured by the electronic device;
  • the server receives the first user image and the identifier of the target scene sent by the electronic device;
  • the server determines multiple panoramas of the target scene according to the identifier of the target scene;
  • the server determines a first relative position according to the first user image and the multiple panoramas of the target scene, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene;
  • the server sends the first relative position to the electronic device;
  • the electronic device determines a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location.
  • the server determining the first relative position according to the first user image and the multiple panoramas of the target scene includes: the server performs feature extraction on the first user image to determine global feature information of the first user image; the server determines, according to the global feature information of the first user image, multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene, where the similarity between the global feature information of each frame of target panorama slice and the global feature information of the first user image is greater than a preset threshold; the server obtains target pose information of each target panorama slice, where the target pose information of each target panorama slice represents the position and azimuth angle of the electronic device in the real world when it captured that panorama slice; and the server determines the first relative position according to the target pose information of the multiple target panorama slices.
  • the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location; the electronic device determining the target route according to the first relative position includes: the electronic device selects any angle value from the angle range and any distance value from the distance range; and the electronic device determines the target route according to the selected angle value and the selected distance value.
  • the method further includes: the electronic device rendering a navigation arrow according to the target route, and displaying the environment image currently captured by the electronic device and the navigation arrow.
  • the method further includes: the electronic device receives the three-dimensional map of the target scene created by the server, and collects multiple panoramas of the target scene, each panorama including multiple frames of panorama slices; the electronic device sends the multiple panoramas of the target scene to the server; the server performs feature extraction on the multiple frames of panorama slices of each panorama to determine global feature information of each frame of panorama slice; and the server determines the target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice.
  • the server determining the target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice includes: the server obtains multiple frames of environment images corresponding to the target scene, together with the global feature information and feature points of each frame of environment image; the server extracts global feature information and feature points of a first panorama slice, where the first panorama slice is any frame of panorama slice in the multiple panoramas; the server determines, according to the global feature information of the first panorama slice, at least one frame of environment image matching the first panorama slice, and determines the three-dimensional points in the three-dimensional map that correspond to the feature points in the at least one matching frame of environment image; the server uses the determined three-dimensional points as the three-dimensional points corresponding to the feature points of the first panorama slice; and the server determines target pose information of the first panorama slice according to the feature points of the first panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.
  • the method further includes: after moving to the target geographic location, the electronic device collects a second user image, where the second user image is an image of the environment captured by the electronic device after it has moved to the target geographic location; the electronic device sends the second user image to the server; the server determines a second relative position of the electronic device according to the second user image based on the GVPS algorithm, and sends the second relative position to the electronic device, where the second relative position is the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map of the target scene; and the electronic device displays the target scene according to the second relative position, the three-dimensional map, and the environment image collected by the electronic device in real time.
  • the present application provides a method for positioning an augmented reality scene, which is applied to a server, and the method includes:
  • receiving a first user image and an identifier of a target scene sent by an electronic device, where the first user image is an image of the current environment captured by the electronic device, the target scene is determined by the electronic device in response to a first instruction, and the first instruction is an instruction triggered by a user to select the target scene from multiple candidate scenes;
  • determining multiple panoramas of the target scene according to the identifier of the target scene; determining a first relative position according to the first user image and the multiple panoramas of the target scene, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene; and sending the first relative position to the electronic device, so that the electronic device determines a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location.
  • the determining the first relative position according to the first user image and the multiple panoramas of the target scene includes: performing feature extraction on the first user image to determine global feature information of the first user image; determining, according to the global feature information of the first user image, multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene, where the similarity between the global feature information of each frame of target panorama slice and the global feature information of the first user image is greater than a preset threshold; obtaining target pose information of each target panorama slice, where the target pose information of each target panorama slice represents the position and azimuth angle of the electronic device in the real world when it captured that panorama slice; and determining the first relative position according to the target pose information of the multiple target panorama slices.
  • the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location; the target route is determined by the electronic device according to any angle value in the angle range and any distance value in the distance range.
  • before receiving the first user image and the identifier of the target scene sent by the electronic device, the method further includes: sending the three-dimensional map of the target scene created by the server to the electronic device; receiving multiple panoramas of the target scene sent by the electronic device, each panorama including multiple frames of panorama slices; performing feature extraction on the multiple frames of panorama slices of each panorama to determine global feature information of each frame of panorama slice; and determining the target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice.
  • the determining the target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice includes: obtaining multiple frames of environment images corresponding to the target scene, together with the global feature information and feature points of each frame of environment image; extracting global feature information and feature points of a first panorama slice, where the first panorama slice is any frame of panorama slice in the multiple panoramas; determining, according to the global feature information of the first panorama slice, at least one frame of environment image matching the first panorama slice; determining the three-dimensional points in the three-dimensional map that correspond to the feature points in the at least one matching frame of environment image, and using the determined three-dimensional points as the three-dimensional points corresponding to the feature points of the first panorama slice; and determining target pose information of the first panorama slice according to the feature points of the first panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.
  • the method further includes: receiving a second user image sent by the electronic device, where the second user image is an image of the environment captured by the electronic device after it has moved to the target geographic location; determining a second relative position of the electronic device according to the second user image based on the GVPS algorithm, and sending the second relative position to the electronic device, where the second relative position is the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map of the target scene, so that the electronic device displays the target scene according to the second relative position, the three-dimensional map, and the environment image collected by the electronic device in real time.
  • the present application provides a method for positioning an augmented reality scene, which is applied to an electronic device, and the method includes:
  • sending a first user image and an identifier of a target scene to a server in response to a first instruction triggered by a user, where the first instruction is an instruction triggered by the user to select the target scene from multiple candidate scenes, and the first user image is an image of the current environment captured by the electronic device;
  • receiving a first relative position sent by the server, where the first relative position is the position of the electronic device relative to a target geographic location, the target geographic location is the geographic location of the real environment corresponding to the target scene, and the first relative position is determined by the server according to the first user image and multiple panoramas of the target scene; and determining a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location.
  • the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location; the determining the target route according to the first relative position includes: selecting any angle value from the angle range and any distance value from the distance range; and determining the target route according to the selected angle value and the selected distance value.
  • the method further includes: rendering a navigation arrow according to the target route, and displaying the environment image currently captured by the electronic device and the navigation arrow.
  • before sending the first user image and the identifier of the target scene to the server in response to the first instruction triggered by the user, the method further includes: receiving the three-dimensional map of the target scene created by the server, and collecting multiple panoramas of the target scene, each panorama including multiple frames of panorama slices; and sending the multiple panoramas of the target scene to the server, so that the server determines the global feature information and the target pose information of each frame of panorama slice, where the target pose information of the multiple frames of panorama slices of the multiple panoramas is used to determine the first relative position.
  • the method further includes: after moving to the target geographic location, collecting a second user image, where the second user image is an image of the environment captured by the electronic device after it has moved to the target geographic location; sending the second user image to the server, so that the server determines a second relative position of the electronic device according to the second user image based on the GVPS algorithm, where the second relative position is the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map of the target scene; and receiving the second relative position sent by the server, and displaying the target scene according to the second relative position, the three-dimensional map, and the environment image collected by the electronic device in real time.
  • the present application provides an electronic device, where the electronic device includes a plurality of functional modules; the plurality of functional modules interact to implement the methods performed by the electronic device in any of the above aspects and implementations thereof.
  • the multiple functional modules can be implemented based on software, hardware or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • the present application provides an electronic device, including at least one processor and at least one memory, where computer program instructions are stored in the at least one memory, and when the electronic device is running, the at least one processor executes the methods performed by the electronic device in any of the above aspects and implementations thereof.
  • the present application provides a server, where the server includes a plurality of functional modules; the plurality of functional modules interact to implement the method executed by the server in any of the above aspects and implementations thereof.
  • the multiple functional modules can be implemented based on software, hardware or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • the present application provides a server, including at least one processor and at least one memory, where computer program instructions are stored in the at least one memory, and when the server is running, the at least one processor executes the methods performed by the server in any of the above aspects and implementations thereof.
  • the present application further provides a computer program, which, when running on a computer, causes the computer to execute the method performed by the electronic device or the server in any of the above aspects and implementations thereof.
  • the present application also provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a computer, the computer executes the method performed by the electronic device or the server in any of the above aspects and implementations thereof.
  • the present application further provides a chip, which is used to read the computer program stored in the memory, and execute the method executed by the electronic device or the server in any one of the above-mentioned aspects and each implementation manner thereof.
  • the present application further provides a system-on-a-chip, where the system-on-a-chip includes a processor, configured to support a computer device to implement the method executed by the electronic device or the server in any one of the above-mentioned aspects and various implementation manners thereof.
  • the chip system further includes a memory, and the memory is used to store necessary programs and data of the computer device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • FIG. 1 is a schematic diagram of an AR scene displayed by an electronic device provided in an embodiment of the present application
  • FIG. 2 is a schematic diagram of an augmented reality system provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
  • FIG. 4 is a software structural block diagram of an electronic device provided in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of an augmented reality scene positioning method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of collecting panorama slices provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a candidate scenario provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of determining a first relative position of an electronic device based on a VTPS service provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a process flow of an augmented reality scene positioning method provided by an embodiment of the present application.
  • FIG. 11 is a flowchart of an augmented reality scene positioning method provided by an embodiment of the present application.
  • Augmented reality is a technology that integrates real world information and virtual world information.
  • AR technology can simulate physical information that is difficult to experience in the real world to obtain virtual information, and apply the virtual information to the real world, so that the real environment and virtual objects are superimposed on the same screen or space in real time and perceived by the user at the same time, achieving a sensory experience beyond reality.
  • the user can control the electronic device to collect real-world environment images to generate a 3D map corresponding to the real world, and the 3D map can be used to construct an AR scene.
  • when the user shoots the real world in real time through the camera device of the electronic device, the user can add virtual items to the AR scene displayed on the display screen of the electronic device, and the electronic device adds the virtual items to the AR scene based on the three-dimensional map corresponding to the real world. The user can then observe the real world and the added virtual items on the same screen.
  • FIG. 1 is a schematic diagram of an AR scene provided by an embodiment of the present application. Referring to FIG. 1, the ground and the road are real-world pictures actually captured by the camera device of the electronic device, and the cartoon characters on the road are virtual items added by the user in the current AR scene. The real-world ground and road and the virtual cartoon characters can be observed simultaneously on the display screen.
  • The user can control the electronic device to generate three-dimensional maps for multiple regions of the real world, so as to experience AR scenes in different regions.
  • When the user wants to play again in an AR scene for which a 3D map has already been generated, how to find and re-enter that AR scene becomes a problem to be solved.
  • In one approach, professional equipment may be used to collect a real-world panoramic image sequence corresponding to an AR scene, where the panoramic image sequence includes the real geographic location information and the sequence information of multiple images.
  • A database containing panoramic image sequences is established in the server, and the relationship between the three-dimensional map and the panoramic image sequences is saved.
  • The electronic device can find the location of the user's point of interest in the electronic map of the real world according to the search term entered by the user, and search the database for the panoramic image sequence corresponding to the location information of the point of interest. The AR scene is then displayed on the display screen of the electronic device.
  • However, this approach relies on professional equipment to collect the panoramic image sequences, and the cost is relatively high.
  • In another approach, when the electronic device generates the three-dimensional map, it extracts features from the environment images collected by its camera and stores the feature information of each image in the database.
  • When the user is at the current location, the electronic device performs feature extraction on the user image collected by its camera and calculates the similarity between the feature information of the user image and the feature information of each image in the database. It selects the image in the database whose feature information has the highest similarity to that of the user image, uses the location information of the selected image as the user's location information, and then displays the AR scene according to the user's location.
  • However, the limited density of the pictures collected by the electronic device when generating the three-dimensional map results in low search accuracy, so the current location of the user cannot be determined accurately.
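  • The similarity-based lookup described above can be illustrated with a minimal sketch. It assumes each image is summarized by a fixed-length feature vector and that cosine similarity is the similarity measure; the source does not name a specific measure, and the names `cosine_similarity`, `locate_by_best_match` and the sample database are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, treat as dissimilar
    return dot / (norm_a * norm_b)


def locate_by_best_match(query_features, database):
    """Return the stored location of the most similar database image.

    `database` is a list of (features, location) pairs (assumed layout).
    """
    _, best_location = max(
        database,
        key=lambda entry: cosine_similarity(query_features, entry[0]),
    )
    return best_location


# Hypothetical database built while the 3D map was generated.
database = [
    ([1.0, 0.0, 0.0], "entrance"),
    ([0.0, 1.0, 0.0], "fountain"),
    ([0.0, 0.0, 1.0], "bench"),
]

print(locate_by_best_match([0.1, 0.9, 0.2], database))  # prints: fountain
```

  • Because the answer is always the location of one stored image, the accuracy of this lookup is bounded by how densely the stored images cover the scene, which is the limitation noted above.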
  • Fig. 2 is a schematic diagram of an augmented reality system provided by an embodiment of the present application.
  • the augmented reality system includes an electronic device 11 and a server 12, wherein there may be multiple servers 12. Multiple servers 12 can work together and interact with the electronic device 11 to implement the augmented reality scene positioning method provided in the embodiment of the present application.
  • In response to a first instruction triggered by the user to select a target scene from multiple candidate scenes, the electronic device 11 sends a first user image captured by its camera to the server 12. The server 12 determines multiple first panoramas from the multiple panoramas corresponding to the target scene according to the global feature information of the first user image, and determines a first relative position of the electronic device 11 with respect to the target scene according to the global feature information of the first user image and the local feature information of the multiple first panoramas. The server 12 sends the first relative position to the electronic device 11, and the electronic device 11 determines, according to the first relative position, a target route from the current position of the electronic device 11 to the position corresponding to the target scene.
  • In this way, the server 12 can accurately locate the electronic device 11 according to the first user image captured by the camera of the electronic device and the panoramas of the target scene, and the route of the electronic device 11 from its current position to the position corresponding to the target scene can then be determined, so that the user holding the electronic device 11 can be accurately guided to find and re-enter the AR scene.
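  • As a rough illustration of the first server-side step only (selecting the multiple first panoramas by global features), the sketch below keeps the k panoramas whose global descriptors are closest to that of the user image. The use of squared Euclidean distance, the value of k, and all names (`select_first_panoramas`, `panoramas`) are assumptions made for illustration; the subsequent computation of the first relative position is not modeled here.

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_first_panoramas(query_global, panoramas, k):
    """Return the ids of the k panoramas closest to the query descriptor.

    `panoramas` maps a panorama id to its global descriptor (assumed layout).
    """
    nearest = heapq.nsmallest(
        k,
        panoramas.items(),
        key=lambda item: squared_distance(query_global, item[1]),
    )
    return [pano_id for pano_id, _ in nearest]


# Hypothetical global descriptors of the panoramas of one target scene.
panoramas = {
    "pano_a": [0.9, 0.1],
    "pano_b": [0.2, 0.8],
    "pano_c": [0.8, 0.3],
    "pano_d": [0.1, 0.1],
}

first_panoramas = select_first_panoramas([1.0, 0.2], panoramas, k=2)
print(first_panoramas)
```

  • Narrowing the candidates first keeps the more expensive local-feature step confined to a small number of panoramas.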
  • The electronic device of the embodiment of the present application may have a camera and a display device. For example, it may be a tablet computer, a mobile phone, a vehicle-mounted device, an augmented reality (AR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a wearable device, etc. The embodiments of the present application do not impose any restriction on the specific type of the electronic device.
  • FIG. 3 is a schematic structural diagram of an electronic device 100 provided in an embodiment of the present application.
  • The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc.
  • The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors. The controller may be the nerve center and command center of the electronic device 100. The controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • The memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
  • the USB interface 130 is an interface conforming to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • The USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices.
  • The charging management module 140 is configured to receive a charging input from a charger.
  • The power management module 141 is used for connecting the battery 142, the charging management module 140 and the processor 110.
  • The power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
  • The wireless communication function of the electronic device 100 can be realized by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • The mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, applied on the electronic device 100.
  • The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • The mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • The mobile communication module 150 can also amplify the signals modulated by the modem processor, convert them into electromagnetic waves, and radiate them through the antenna 1.
  • At least part of the functional modules of the mobile communication module 150 may be set in the processor 110.
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device.
  • The wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • The wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
  • The wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency-modulate and amplify it, and convert it into electromagnetic waves that are radiated through the antenna 2.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc.
  • The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • the display screen 194 is used for displaying a display interface of an application, for example, displaying a display page of an application installed on the electronic device 100 .
  • the display screen 194 includes a display panel.
  • The display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the display screen 194 may be used to display an AR scene, and the AR scene displayed on the display screen 194 may include images captured by the camera 193 in real time and virtual items placed by the user in the AR scene.
  • Camera 193 is used to capture still images or video.
  • the object generates an optical image through the lens and projects it to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other image signals.
  • The electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
  • The camera 193 can collect images for building a three-dimensional map of the AR scene, and can also be used to capture panoramas, such as a panorama corresponding to the location of the electronic device 100.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data. Wherein, the storage program area can store an operating system, software codes of at least one application program, and the like.
  • the data storage area can store data generated during use of the electronic device 100 (such as captured images, recorded videos, etc.) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, save pictures, videos and other files in the external memory card.
  • The electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
  • the sensor module 180 may include a pressure sensor 180A, an acceleration sensor 180B, a touch sensor 180C and the like.
  • the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • The pressure sensor 180A may be disposed on the display screen 194.
  • The touch sensor 180C is also called a "touch panel".
  • The touch sensor 180C can be disposed on the display screen 194, and the touch sensor 180C and the display screen 194 together form what is called a "touch screen".
  • the touch sensor 180C is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor 180C may also be disposed on the surface of the electronic device 100, which is different from the position of the display screen 194.
  • the keys 190 include a power key, a volume key and the like.
  • The key 190 may be a mechanical key or a touch key.
  • the electronic device 100 may receive key input and generate key signal input related to user settings and function control of the electronic device 100 .
  • the motor 191 can generate a vibrating reminder.
  • the motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback. For example, touch operations applied to different applications (such as taking pictures, playing audio, etc.) may correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, and can also be used to indicate messages, missed calls, notifications, and the like.
  • The SIM card interface 195 is used for connecting a SIM card. The SIM card can be connected to or separated from the electronic device 100 by inserting it into or pulling it out of the SIM card interface 195.
  • The components shown in FIG. 3 do not constitute a specific limitation on the electronic device 100; the electronic device may include more or fewer components than shown in the figure, combine some components, split some components, or use a different arrangement of components.
  • the combination/connection relationship between the components in FIG. 3 can also be adjusted and modified.
  • FIG. 4 is a software structural block diagram of an electronic device provided by an embodiment of the present application.
  • the software structure of the electronic device may be a layered architecture, for example, the software may be divided into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
  • The operating system is divided into four layers, which are, from top to bottom, the application layer, the application framework layer (framework, FWK), the runtime and system library, and the kernel layer.
  • The application layer can include a series of application packages. As shown in FIG. 4, the application layer may include a camera, settings, a skin module, a user interface (UI), third-party applications, and the like. The third-party applications may include gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message and so on.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer can include some predefined functions. As shown in Figure 4, the application framework layer can include a window manager, a content provider, a view system, a phone manager, a resource manager, and a notification manager.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make it accessible to applications. Said data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebook, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on.
  • the view system can be used to build applications.
  • a display interface can consist of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of electronic devices. For example, the management of call status (including connected, hung up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify the download completion, message reminder, etc.
  • A notification may also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or appear on the screen in the form of a dialog window. Examples include prompting text information in the status bar, sounding a prompt tone, vibrating the electronic device, and flashing the indicator light.
  • the runtime includes the core library and virtual machine.
  • the runtime is responsible for the scheduling and management of the operating system.
  • The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of the operating system.
  • The application layer and the application framework layer run in virtual machines.
  • The virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • A system library can include multiple function modules, for example: a surface manager, media libraries, a 3D graphics processing library (e.g., OpenGL ES), a 2D graphics engine (e.g., SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing, etc.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
  • the hardware layer may include various types of sensors, such as acceleration sensors, gyroscope sensors, touch sensors, and the like.
  • FIG. 5 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server provided in the embodiment of the present application may have a distributed structure.
  • The server may include a plurality of computing nodes (as shown in FIG. 5, the server includes N computing nodes, where N is a positive integer), at least one storage node, a task queue node, at least one scheduling node, and a positioning node.
  • the functions of each node in the server shown in Figure 5 are introduced below:
  • the computing node is configured to create a three-dimensional map corresponding to the environment based on the distributed processing method and according to the multiple frames of images uploaded by the electronic device and the positioning parameters corresponding to each frame of the image in the real world.
  • different computing nodes can perform different processing tasks in the process of creating a 3D map, and N computing nodes jointly complete the entire process of creating a 3D map.
  • different computing nodes can perform the same type of processing on different images, so that the processing tasks of multiple frames of images are distributed to multiple computing nodes to perform synchronously, thereby speeding up image processing.
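  • The idea of distributing the same type of processing over different frames can be sketched on a single machine with a worker pool standing in for the N computing nodes. This is only an analogy under stated assumptions: `process_frame` is a hypothetical placeholder for per-frame processing such as feature extraction, and the frame layout is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_frame(frame):
    """Placeholder for per-frame work such as feature extraction."""
    return {"frame_id": frame["frame_id"], "num_pixels": len(frame["pixels"])}


def process_frames_in_parallel(frames, num_nodes):
    """Apply the same processing to every frame using `num_nodes` workers.

    `Executor.map` returns results in the order of the input frames,
    regardless of which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        return list(pool.map(process_frame, frames))


# Hypothetical frames uploaded by the electronic device.
frames = [
    {"frame_id": 0, "pixels": [0] * 4},
    {"frame_id": 1, "pixels": [0] * 9},
    {"frame_id": 2, "pixels": [0] * 16},
]

results = process_frames_in_parallel(frames, num_nodes=2)
print([r["frame_id"] for r in results])
```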
  • The N computing nodes may include the CPU algorithm components and GPU algorithm components shown in FIG. 5.
  • There may be multiple CPU algorithm components in the server, and there may also be multiple GPU algorithm components.
  • the GPU algorithm component can be used to perform image processing on multiple frames of images (such as feature extraction, matching, retrieval, etc.), and the CPU algorithm component can be used to generate a three-dimensional map according to the image processing results of the GPU algorithm component.
  • The GPU algorithm component and the CPU algorithm component can automatically process the map construction instructions queued in the message middleware.
  • The computing node may also include a white model processing service, which is used to simplify the mesh uploaded by the electronic device and to generate a white model from the simplified mesh.
  • the computing node may also be implemented by other types of algorithm processing components, which are not specifically limited in this embodiment of the present application.
  • the task queue node is used to cache the processing tasks in the process of creating the 3D map by queue.
  • Each computing node can read the tasks to be executed from the task queue node and perform the corresponding processing, so that multiple processing tasks are executed in a distributed and orderly manner.
  • The task queue node can be implemented by the queue-type message middleware shown in FIG. 5.
  • The queue-type message middleware can be used to asynchronously cache 3D map creation instructions from multiple electronic devices, instructions for processing tasks in the 3D map creation process, and the like, and these can be shared among or assigned to the N computing nodes, so that the N computing nodes share the execution of tasks and balance the system load.
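  • The role of the task queue node can be sketched with an in-process queue and a few worker threads that stand in for computing nodes. This is a simplified analogy, not the message middleware itself; the function name `run_workers`, the task strings and the handler are hypothetical.

```python
import queue
import threading


def run_workers(tasks, num_workers, handler):
    """Let `num_workers` workers drain a shared task queue.

    All tasks are enqueued before the workers start, so an empty queue
    means there is no work left and the worker can exit.
    """
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return
            outcome = handler(task)
            with results_lock:  # protect the shared result list
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Hypothetical map-creation tasks; the completion order may vary between runs.
done = run_workers(
    ["extract:frame0", "extract:frame1", "match:0-1"],
    num_workers=2,
    handler=lambda task: task.upper(),
)
print(sorted(done))
```

  • Since each worker takes the next task only when it is free, faster workers naturally handle more tasks, which is one simple way a shared queue balances load.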
  • At least one storage node is used for temporarily or permanently storing data related to the three-dimensional map creation process.
  • at least one storage node may store multiple frames of images, intermediate data and result data processed by multiple computing nodes, and the like.
  • the storage nodes may include cloud databases, object storage services, elastic file services, cached message middleware, and the like.
  • The cloud database can be used to store serialized content that takes up little storage space, such as user information on the electronic device side, instruction information for task processing during the creation of a three-dimensional map, and modification information of the three-dimensional map.
  • The object storage service can be used to store non-serialized content that takes up a large amount of storage space, such as 3D models, high-definition pictures, videos, and animations involved in electronic devices.
  • The elastic file service can be used to store map data of a 3D map generated by a 3D map creation algorithm, and data that takes up a large amount of storage space, such as intermediate variables of an algorithm.
  • The cache-type message middleware can be used to asynchronously cache data that can be serialized and occupies little storage space, such as intermediate variables produced during algorithm processing, and this data can be shared among the N computing nodes.
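  • The division of labor among the four storage services can be summarized as a small routing rule. The sketch below is one possible reading of the description: the three boolean flags, the function name `choose_store`, and the handling of combinations the text does not mention are all assumptions.

```python
from enum import Enum


class Store(Enum):
    CLOUD_DATABASE = "cloud database"
    OBJECT_STORAGE = "object storage service"
    ELASTIC_FILE = "elastic file service"
    CACHE_MIDDLEWARE = "cache message middleware"


def choose_store(serializable, large, algorithm_output):
    """Pick a storage service for one piece of data.

    serializable:     the content can be serialized
    large:            the content takes up a large amount of storage space
    algorithm_output: map data or intermediate variables produced by an algorithm
    """
    if algorithm_output:
        # Small serializable intermediates are cached and shared with the
        # computing nodes; other algorithm output goes to the file service.
        if serializable and not large:
            return Store.CACHE_MIDDLEWARE
        return Store.ELASTIC_FILE
    if serializable and not large:
        return Store.CLOUD_DATABASE
    # Assumption: anything else (e.g. 3D models, videos) is object storage.
    return Store.OBJECT_STORAGE


print(choose_store(serializable=True, large=False, algorithm_output=False).value)
print(choose_store(serializable=False, large=True, algorithm_output=False).value)
```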
  • At least one scheduling node is used for overall management of the scheduling of some or all of the N computing nodes, task queue nodes, and at least one storage node.
  • the scheduling nodes in the server may include a cloud scheduling center and an algorithm scheduling center.
  • The cloud scheduling center can manage and schedule the algorithm scheduling center, storage nodes, task queue nodes and other nodes, can exchange information and data with electronic devices, and can serve as an efficient message processing and distribution node. For example, the cloud scheduling center can provide the electronic device with the upload address for multi-frame pictures, schedule requests from the electronic device side, and handle requests to and returns from the cloud database.
  • the algorithm scheduling center is used to manage and schedule N computing nodes, and can also manage and schedule other algorithm services.
  • the positioning node is configured to locate the electronic device according to the image uploaded by the electronic device, so as to determine the position of the electronic device relative to the AR scene or the position of the electronic device in the coordinate system of the three-dimensional map.
  • The positioning node may include a visual triangulation positioning system (VTPS) service, a global visual positioning system (GVPS) service, and a vector retrieval system (VRS) service.
  • the VTPS service can be used to determine the position of the electronic device relative to the AR scene according to the characteristic information of the panorama of the AR scene.
  • The GVPS service can be used for spatial positioning, and determines the six-degree-of-freedom (6-DoF) coordinates, in the created three-dimensional map, of the position corresponding to the current position of the electronic device.
  • the VRS service is used for vector searches.
  • VTPS service, GVPS service and VRS service can be used as sub-services of computing nodes.
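  • For readers unfamiliar with the term, six-degree-of-freedom coordinates combine three translational and three rotational components. The sketch below shows one way such a result could be represented; the class name `Pose6DoF` and the choice of roll, pitch and yaw angles for the rotation are assumptions, since the source does not specify the representation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose6DoF:
    """Position (x, y, z) plus orientation (roll, pitch, yaw in radians)."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float

    def distance_to(self, other):
        """Straight-line distance between the positions of two poses."""
        return math.dist((self.x, self.y, self.z), (other.x, other.y, other.z))


# Hypothetical poses expressed in the coordinate system of the 3D map.
device_pose = Pose6DoF(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
scene_anchor = Pose6DoF(3.0, 4.0, 0.0, 0.0, 0.0, math.pi / 2)

print(device_pose.distance_to(scene_anchor))  # prints: 5.0
```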
  • the server shown in FIG. 5 is only an exemplary description of the server provided by the embodiment of the present application, and does not limit the architecture of the server to which the solution provided by the embodiment of the present application is applicable. Compared with the structure shown in FIG. 5 , the applicable server of the solution provided by the embodiment of the present application may also add, delete or adjust some nodes, which are not specifically limited in the embodiment of the present application.
  • FIG. 6 is a flow chart of an augmented reality scene positioning method provided in the embodiment of the present application.
  • The augmented reality scene positioning method shown in FIG. 6 can be applied to the augmented reality system shown in FIG. 2. Referring to FIG. 6, the method includes the following steps:
  • S601 When requesting the server to create a three-dimensional map of the augmented reality scene, the electronic device collects multiple panoramas of the augmented reality scene.
  • Before the user operates the electronic device to play in the AR scene, the electronic device needs to collect images of the real-world environment corresponding to the AR scene. The electronic device can send the collected environment images to the server, and the server can obtain a three-dimensional map of the AR scene by processing the environment images.
  • the three-dimensional map is composed of multiple three-dimensional points, and each three-dimensional point corresponds to a feature point in the environment image.
  • the three-dimensional map can be used to construct the AR scene.
  • An electronic device can display images captured by its camera in real time on the display screen, and the user can place virtual items in the displayed image. The electronic device can determine the position of each virtual item in the AR scene according to the three-dimensional map, so that the user sees the real environment and the virtual objects superimposed on the same screen in real time.
  • the virtual item placed by the user in the AR scene may be a three-dimensional digital resource model.
  • the virtual cartoon character in the AR scene shown in FIG. 1 is a three-dimensional digital resource model rendered according to the digital resource.
  • a variety of digital resources can be sent to electronic devices for users to choose to add to the AR scene for play.
  • After the electronic device collects the environment images and uploads them to the server, the server generates a three-dimensional map according to the uploaded environment images.
  • The electronic device can display a message prompting the user to take panoramas, and the user can operate the electronic device to collect panoramas at multiple locations. The multiple panoramas of the AR scene can be used to locate the electronic device, so that the user can move with the electronic device to the real-world geographic location corresponding to the AR scene, after which the electronic device can display the AR scene on the display screen for the user to play.
  • the electronic device may determine multiple locations where the panoramic image needs to be collected according to the moving path when the user operates the electronic device to collect the environment image used to generate the three-dimensional map. For example, the electronic device may determine multiple locations on the moving path of the user collecting the environmental image, and guide the user to operate the electronic device at the multiple locations to collect the panoramic image.
  • FIG. 7 is a schematic diagram of capturing panorama slices provided by an embodiment of the present application.
  • the electronic device may capture multiple frames of panorama slices for each panorama.
  • S602 The electronic device sends the multiple panoramas of the augmented reality scene to the server.
  • the electronic device may send the collected multi-frame panorama slices of each panorama to the server.
  • the electronic device can also send the location information of each panorama to the server.
  • The location information of each panorama may include the geographic location of the electronic device when it collects the panorama, for example a position determined using positioning technologies such as the global positioning system (GPS), wireless fidelity (Wi-Fi) positioning, or base station positioning.
  • The location information of each panorama may also include the pose information and inertial measurement unit (IMU) information of each frame of panorama slice. The pose information of each frame of panorama slice may be determined by the electronic device based on a simultaneous localization and mapping (SLAM) algorithm.
  • S603 The server determines the global feature information of each frame of panorama slice in each panorama and the target pose information of each frame of panorama slice.
  • the global feature information of each frame of the panorama slice in each panorama may be a global feature vector of each frame of the panorama slice, and the global feature vector may be used to represent the overall structural features of the image.
  • the server can extract and cluster the local features of the region with better image feature invariance, and then calculate the weighted residual sum of each local vector and the cluster center to obtain the global feature vector.
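The aggregation described above (cluster the local features, then sum the weighted residuals between each local vector and its cluster center) resembles a VLAD-style descriptor. The sketch below illustrates that idea with NumPy. The function name `vlad_descriptor`, the precomputed `centers`, the optional `weights`, and the final L2 normalization are assumptions made for illustration; the source does not name a specific algorithm.

```python
import numpy as np

def vlad_descriptor(local_features, centers, weights=None):
    """Aggregate local descriptors into one global vector (VLAD-style sketch).

    local_features: (N, D) local descriptors of one image (hypothetical input).
    centers:        (K, D) cluster centers, assumed to be precomputed.
    weights:        optional (N,) per-descriptor weights; defaults to 1.
    """
    local_features = np.asarray(local_features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    if weights is None:
        weights = np.ones(len(local_features))

    # Assign every local descriptor to its nearest cluster center.
    dists = np.linalg.norm(
        local_features[:, None, :] - centers[None, :, :], axis=2
    )
    nearest = dists.argmin(axis=1)

    # Accumulate the weighted residual (descriptor - center) per cluster.
    residual_sums = np.zeros_like(centers)
    for feat, k, w in zip(local_features, nearest, weights):
        residual_sums[k] += w * (feat - centers[k])

    # Flatten to a single vector and L2-normalize (normalization is an assumption).
    vec = residual_sums.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```

The resulting vector has length K × D, so two images can be compared with a single dot product regardless of how many local features each one contains.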
  • the server may splice the multi-frame panorama slices to obtain the panorama.
  • the server may determine the target pose information of each frame of the panorama slice according to the 3D map of the AR scene, and the target pose information is the pose of the electronic device relative to the 3D coordinate system of the 3D map when the panorama slice is taken.
  • the target pose information can represent the position and azimuth angle of the electronic device in the real world when shooting the panorama slice.
  • the server may store multiple frames of environment images used for building the three-dimensional map and global feature information of each frame of environment images.
  • The server can determine the environment images similar to a panorama slice according to the global feature information of the panorama slice and the global feature information of the multiple frames of environment images corresponding to the AR scene, and determine the matching feature points between the panorama slice and those environment images.
  • According to the matching feature points, the server determines the three-dimensional points in the three-dimensional map that correspond to the feature points in the panorama slice. The target pose information of the panorama slice can then be determined from the feature points in the panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.
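Recovering a camera pose from 2D feature points, their matching 3D map points, and the camera intrinsics is commonly called the perspective-n-point (PnP) problem. The sketch below uses a plain linear DLT solution in NumPy to show the computation. It is not the method the source specifies: the function name `estimate_pose_dlt` is hypothetical, at least six non-coplanar correspondences are assumed, and a production system would typically add outlier rejection and nonlinear refinement.

```python
import numpy as np

def estimate_pose_dlt(points_3d, points_2d, K):
    """Estimate rotation R and translation t such that x_cam = R @ X + t.

    points_3d: (N, 3) map points, N >= 6 and not coplanar (assumption).
    points_2d: (N, 2) pixel coordinates of the matching feature points.
    K:         (3, 3) camera intrinsic matrix.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)

    # Remove the intrinsics: convert pixels to normalized image coordinates.
    pix_h = np.column_stack([points_2d, np.ones(len(points_2d))])
    norm = (np.linalg.inv(K) @ pix_h.T).T
    norm = norm[:, :2] / norm[:, 2:3]

    # Each correspondence gives two linear equations in the 12 entries of [R|t].
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, norm):
        P = [X, Y, Z, 1.0]
        rows.append(P + [0.0] * 4 + [-u * p for p in P])
        rows.append([0.0] * 4 + P + [-v * p for p in P])
    A = np.array(rows)

    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    M = vt[-1].reshape(3, 4)

    # M equals s * [R|t] for an unknown scale s; project the left block onto a rotation.
    U, S, Vt = np.linalg.svd(M[:, :3])
    R = U @ Vt
    t = M[:, 3] / S.mean()
    if np.linalg.det(R) < 0:  # s was negative: flip the sign of both parts
        R, t = -R, -t
    return R, t
```

The camera center in map coordinates follows as `-R.T @ t`, which together with the orientation in `R` gives the position and azimuth angle referred to as target pose information.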
  • The server can store the multiple panoramas of the AR scene, the location information of each panorama, the global feature information of each panorama slice in each panorama, and the target pose information of each panorama slice in each panorama.
  • S604 The electronic device sends the first user image and the identifier of the target scene to the server in response to the first instruction triggered by the user.
  • the first instruction may be an instruction triggered by a user to select a target scene from multiple candidate scenes.
  • the first user image may be an environmental image captured in real time by the camera device of the electronic device.
  • FIG. 8 is a schematic diagram of a candidate scenario provided in the embodiment of the present application.
  • The server can save the three-dimensional map of each AR scene, and the identifiers of multiple AR scenes can be displayed as candidate scenes in the user interface. When the user selects a target scene from the candidate scenes, this indicates that the user wants to re-enter the target scene with the electronic device to play. In this manner, the user can conveniently re-enter an AR scene whose three-dimensional map has already been generated, without the electronic device repeating the steps for generating the three-dimensional map.
  • The electronic device may determine its own geographic location, for example based on positioning technologies such as GPS positioning, Wi-Fi positioning, or base station positioning. When the distance between the geographic location of the electronic device and the geographic location corresponding to any panorama of the target scene is greater than a preset threshold, the electronic device may determine a first route according to these two geographic locations and display the first route to the user, so as to guide the user to move with the electronic device, along the first route, to the vicinity of the target geographic location of the real environment corresponding to the target scene. After the user has moved to the vicinity of the target geographic location, the electronic device captures the first user image through its camera device and sends the first user image and the identifier of the target scene to the server.
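The distance check that decides whether the first route is needed can be sketched with the haversine formula for great-circle distance. The function names, the spherical Earth radius, and the choice to compare against the nearest panorama are assumptions for illustration; the source only states that a distance is compared with a preset threshold.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def needs_first_route(device_latlon, panorama_latlons, threshold_m):
    """Return True if the device is farther than threshold_m from the nearest panorama.

    Using the nearest panorama is one possible reading of the source (assumption).
    """
    nearest = min(
        haversine_m(device_latlon[0], device_latlon[1], lat, lon)
        for lat, lon in panorama_latlons
    )
    return nearest > threshold_m
```

When the function returns True, the device would plan and display the coarse first route before capturing the first user image.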
  • S605 The server determines global feature information of the first user image.
  • the global feature information of the first user image may be a global feature vector of the first user image.
  • the manner in which the server determines the global feature information of the first user image can be implemented by referring to the manner in which the server determines the global feature information of the panorama slice in the above embodiment, and details are not repeated here.
  • S606 The server determines multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene according to the global feature information of the first user image.
  • The server may select the multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene according to the global feature information of the first user image, such that the similarity between the global feature information of each target panorama slice and the global feature information of the first user image is greater than a preset threshold.
  • the server may determine multi-frame target panorama slices through the VRS service to improve efficiency.
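The threshold-based selection can be sketched as a brute-force similarity search over global feature vectors. Cosine similarity is an assumption here (the source does not name the similarity measure), as is the function name `select_target_slices`. A dedicated vector retrieval service such as the VRS would typically replace the exhaustive comparison with an index.

```python
import numpy as np

def select_target_slices(query_vec, slice_vecs, threshold):
    """Return indices of slices whose cosine similarity to the query exceeds threshold.

    query_vec:  (D,) global feature vector of the first user image.
    slice_vecs: (N, D) global feature vectors of the stored panorama slices.
    Indices are sorted from most to least similar; all similarities are also returned.
    """
    q = np.asarray(query_vec, dtype=float)
    S = np.asarray(slice_vecs, dtype=float)
    q = q / np.linalg.norm(q)
    S = S / np.linalg.norm(S, axis=1, keepdims=True)

    sims = S @ q                          # cosine similarity per slice
    keep = np.flatnonzero(sims > threshold)
    order = np.argsort(-sims[keep])       # most similar first
    return keep[order], sims
```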
  • S607 The server determines a first relative position of the electronic device relative to the target geographic location according to the target pose information of the multiple frames of target panorama slices.
  • the server may determine, based on the VTPS service, the first relative position of the electronic device relative to the target geographic location according to target pose information of multiple frames of target panorama slices.
  • the target pose information of each frame of the target panorama slice includes the position and azimuth angle of the electronic device when shooting the frame of the panorama slice.
  • the server may locate the electronic device according to the target pose information of multiple frames of target panorama slices, and determine the first relative position of the electronic device relative to the target geographic location.
  • Because the first user image is similar to the target panorama slices, the azimuth angles corresponding to the first user image and to each target panorama slice may be similar. The possible position of the electronic device when capturing the first user image can therefore be determined from the target pose information of each target panorama slice, and the electronic device can be located according to the multiple frames of target panorama slices to determine its first relative position. The first relative position may be an angle range and a distance range of the electronic device relative to the target geographic location.
  • FIG. 9 is a schematic diagram of determining a first relative position of an electronic device based on VTPS provided by an embodiment of the present application.
  • The server can determine the target panorama slices whose similarity to the global feature vector of the first user image is greater than a preset threshold. For example, the similarity between the global feature information of slice a of panorama A and the global feature vector of the first user image is greater than the preset threshold, the similarity between the global feature information of slice b of panorama B and the global feature vector of the first user image is greater than the preset threshold, and the similarity between the global feature information of slice c of panorama C and the global feature vector of the first user image is greater than the preset threshold.
  • The first relative position of the first electronic device may be determined as the shaded area shown in FIG. 9. The first relative position may be the angle range and the distance range of the electronic device relative to the target geographic location.
  • The location information of a panorama stored in the server may be high-precision positioning information obtained using technologies such as Wi-Fi positioning and GNSS positioning. Because high-precision positioning information is highly reliable, when determining the first relative position of the electronic device, panorama slices of panoramas that have high-precision positioning information may be preferentially selected as target panorama slices, thereby improving the accuracy of the determined first relative position.
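One simple way to turn the capture positions of the matched slices into an angle range and a distance range is to take the spread of their bearings and distances as seen from the target geographic location. The sketch below does this in a 2D plane. It is only an illustration under stated assumptions: the function name, the planar coordinates, and the premise that the device lies within the spread of the matched capture positions are not taken from the source, which describes the result only as the shaded area in FIG. 9.

```python
import math

def relative_position_range(slice_positions, target_xy=(0.0, 0.0)):
    """Return ((angle_min, angle_max), (dist_min, dist_max)) relative to target_xy.

    slice_positions: (x, y) capture positions of the matched target panorama slices.
    Angles are in radians, measured counter-clockwise from the +x axis.
    """
    bearings, dists = [], []
    for x, y in slice_positions:
        dx, dy = x - target_xy[0], y - target_xy[1]
        bearings.append(math.atan2(dy, dx))
        dists.append(math.hypot(dx, dy))

    # Use the circular mean as reference so ranges straddling +/-pi stay contiguous.
    mean = math.atan2(sum(math.sin(b) for b in bearings),
                      sum(math.cos(b) for b in bearings))
    offsets = [math.atan2(math.sin(b - mean), math.cos(b - mean)) for b in bearings]

    angle_range = (mean + min(offsets), mean + max(offsets))
    distance_range = (min(dists), max(dists))
    return angle_range, distance_range
```

Weighting slices from panoramas with high-precision positioning more heavily, as the text suggests, could be added by filtering `slice_positions` before the call.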
  • S608 The server sends the first relative position to the electronic device.
  • S609 The electronic device determines a target route from the current position of the electronic device to the geographic location corresponding to the target scene according to the first relative position.
  • The electronic device can select any angle in the angle range of the first relative position and any distance in the distance range to determine the target route, and can display the target route on the display screen. The target route guides the user to move with the electronic device to the target geographic location, that is, into the real-world area corresponding to the target scene.
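As a sketch of how an angle and a distance picked from the two ranges could yield a route, the code below takes the midpoint of each range and computes the heading the user would walk. The midpoint choice, the function name, and the 2D frame centered on the target geographic location are assumptions; the source allows any value within each range.

```python
import math

def target_route(angle_range, distance_range):
    """Return (heading, distance) of a straight route toward the target location.

    angle_range, distance_range: the first relative position, i.e. where the
    device lies as seen from the target (radians and meters).
    heading: direction of travel in radians, in the same frame as angle_range.
    """
    theta = 0.5 * (angle_range[0] + angle_range[1])        # midpoint angle
    dist = 0.5 * (distance_range[0] + distance_range[1])   # midpoint distance

    # Device position relative to the target, then the opposite vector to walk back.
    device_x, device_y = dist * math.cos(theta), dist * math.sin(theta)
    heading = math.atan2(-device_y, -device_x)
    return heading, dist
```

Because each positioning round can narrow the ranges, recomputing the route after every new user image keeps the guidance consistent with the latest estimate.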
  • the electronic device may render navigation arrows according to the target route, and superimpose and display the navigation arrows on the environmental image captured by the camera device displayed on the electronic device in real time, so as to guide the user to move to the target geographic location with the electronic device.
  • While moving to the real-world area corresponding to the target scene, the electronic device may send user images collected by its camera device to the server multiple times, so that the first electronic device is located repeatedly according to S605-S609 and the target route is adjusted, thereby accurately guiding the user to move with the electronic device to the target geographic location.
  • the electronic device may display the target scene according to the three-dimensional map corresponding to the target scene.
  • The server may determine the second relative position of the electronic device in the coordinate system of the three-dimensional map, and the electronic device may display the target scene based on this second relative position, so that the virtual items placed by the user are superimposed on the real-world environment image shown on the display screen of the electronic device.
  • the electronic device may determine the second relative position of the electronic device in the coordinate system of the three-dimensional map through the GVPS service in the server.
  • The electronic device may collect the second user image through its camera and send it to the server, and the server determines the environment images in the database that contain the same two-dimensional feature points as the second user image.
  • The server determines the three-dimensional point cloud corresponding to the two-dimensional feature points contained in the second user image, and determines the second relative position of the electronic device in the three-dimensional coordinate system of the three-dimensional map according to these two-dimensional feature points and the corresponding three-dimensional point cloud.
  • the electronic device that collects the environment image for generating the three-dimensional map of the AR scene and the electronic device that requests the server to locate to enter the AR scene may be the same electronic device, or may be different electronic devices.
  • the electronic device executing S601-S602 and the electronic device executing S604 and S609 may be different electronic devices. In this way, users can operate electronic devices to play in AR scenes created by other users.
  • FIG. 10 is a schematic diagram of a flow of an augmented reality scene positioning method provided by an embodiment of the present application. Referring to Figure 10, the method comprises the following steps:
  • the cloud dispatching center receives a panorama slice upload request sent by the electronic device.
  • S1002 The cloud dispatching center sends the panorama slice upload link to the electronic device.
  • the object storage service receives the panorama slice sent by the electronic device through the panorama slice upload link.
  • S1006 The GPU algorithm component performs feature extraction and matching on the panorama slice.
  • the CPU algorithm component acquires multiple frames of panorama slices from the object storage service.
  • the CPU algorithm component determines the target pose information of each frame of panorama slice.
  • the target pose information of each frame of the panorama slice is the pose of the electronic device relative to the three-dimensional coordinate system of the three-dimensional map when shooting the panorama slice.
  • the CPU algorithm component splices the panorama slices through the target pose information and matching pair information of multiple frames of panorama slices to obtain a panorama.
  • the GPU algorithm component monitors the panorama processing task of the queue-shaped message middleware.
  • the GPU algorithm component obtains the panorama from the elastic file service.
  • the GPU algorithm component extracts global features from the panorama to obtain global feature information of the panorama.
  • the GPU algorithm component stores the global feature information of the panorama in the elastic file service.
  • S1018 The electronic device sends a positioning request to the VTPS service.
  • The positioning request includes the identifier of the target scene, M first user images, and sensor information. The sensor information may be the position information recorded when the electronic device collects the first user images. M is a positive integer and is generally less than or equal to 3.
  • the VTPS service extracts global feature information of the first user image.
  • the VTPS service sends the global feature information and sensor information of the first user image to the VRS service.
  • the VRS service returns to the VTPS service M panorama slices whose global feature information is most similar to the global feature information of the first user image.
  • the VTPS service determines the first relative position of the electronic device according to the target pose information of the M panoramic image slices.
  • the first relative position may be a maximum likelihood estimation position obtained by clustering target pose information of M panoramic image slices.
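The source does not describe the clustering used to obtain the maximum likelihood estimate. As one possible stand-in, the sketch below picks the densest group of capture positions (the point with the most neighbors within a radius) and returns the mean of that group. The function name, the `radius` parameter, and the use of positions alone (ignoring orientation) are assumptions made for illustration.

```python
import numpy as np

def densest_cluster_center(positions, radius):
    """Return the mean of the largest group of positions lying within radius of one point.

    positions: (M, 2) or (M, 3) capture positions from the target pose information.
    radius:    neighborhood radius, in the same units as positions (hypothetical parameter).
    """
    P = np.asarray(positions, dtype=float)

    # Pairwise distances, then a boolean neighbor mask per point.
    dists = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    neighbours = dists <= radius

    # The point with the most neighbors is treated as the mode of the distribution.
    best = neighbours.sum(axis=1).argmax()
    return P[neighbours[best]].mean(axis=0)
```

Outlying matches, such as a visually similar slice captured far away, then have no influence on the estimate as long as they fall outside the chosen radius.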
  • the VTPS service sends the first relative position to the electronic device.
  • the electronic device determines the target route according to the first relative position, and renders a navigation arrow according to the target route for display to the user.
  • The electronic device may collect a second user image and sensor information each time it has traveled a certain distance or after a period of time, and send the second user image and the sensor information to the GVPS service, which locates the electronic device. If the positioning succeeds, the GVPS service obtains the precise relative position of the electronic device in the coordinate system of the AR scene and sends this relative position to the electronic device, and the electronic device displays the AR scene to the user according to the relative position. If the GVPS service fails to locate the electronic device, it forwards the second user image and the sensor information to the VTPS service, and the VTPS service performs the next round of positioning on the electronic device.
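The fallback between the two services can be sketched as a small dispatch function. The callables `gvps_locate` and `vtps_locate` are hypothetical stand-ins for the GVPS and VTPS services, and the convention that a failed GVPS positioning returns `None` is an assumption.

```python
from typing import Any, Callable, Optional, Tuple

def locate_device(
    image: Any,
    sensor_info: Any,
    gvps_locate: Callable[[Any, Any], Optional[Any]],
    vtps_locate: Callable[[Any, Any], Any],
) -> Tuple[str, Any]:
    """Try precise GVPS positioning first; fall back to coarse VTPS positioning.

    Returns a (source, result) pair so the caller knows whether it received a
    precise pose in the map coordinate system or a coarse relative position.
    """
    pose = gvps_locate(image, sensor_info)
    if pose is not None:
        # Success: the device can display the AR scene from this pose.
        return "gvps", pose
    # Failure: forward the same inputs to VTPS for the next navigation round.
    return "vtps", vtps_locate(image, sensor_info)
```

A caller would keep invoking this function as new images arrive, switching from navigation arrows to AR scene display once the result source is `"gvps"`.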
  • the present application further provides an augmented reality scene positioning method.
  • the method can be executed by the electronic device and the server in the augmented reality system shown in FIG. 2 .
  • The electronic device may have the structure shown in FIG. 3 and/or FIG. 4 of the present application, and the server may have the structure shown in FIG. 5 of the present application.
  • FIG. 11 is a flowchart of an augmented reality scene positioning method provided by an embodiment of the present application. Referring to Figure 11, the method comprises the following steps:
  • S1101 The electronic device sends a first user image and an identifier of a target scene to a server in response to a first instruction triggered by a user.
  • The first instruction is an instruction triggered by the user to select the target scene from multiple candidate scenes, and the first user image is an image captured by the electronic device of its current environment.
  • S1102 The server determines multiple panoramas of the target scene according to the identifier of the target scene.
  • S1103 The server determines a first relative position according to the first user image and multiple panoramic images of the target scene.
  • The first relative position is the position of the electronic device relative to the target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene.
  • S1104 The server sends the first relative position to the electronic device.
  • S1105 The electronic device determines a target route according to the first relative position.
  • the target route is a route from the current position of the electronic device to the target geographic location.
  • The present application also provides an electronic device that includes multiple functional modules; the multiple functional modules interact to implement the functions performed by the electronic device in the methods described in the embodiments of the present application. For example, S601-S602, S604, and S609 in the embodiment shown in FIG. 6 are executed.
  • the multiple functional modules can be implemented based on software, hardware or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • The present application also provides an electronic device that includes at least one processor and at least one memory, where computer program instructions are stored in the at least one memory, and when the electronic device is running, the at least one processor performs the functions performed by the electronic device in the methods described in the embodiments of the present application. For example, S601-S602, S604, and S609 in the embodiment shown in FIG. 6 are executed.
  • the present application further provides a server, the server includes multiple functional modules; the multiple functional modules interact to implement the functions performed by the server in the methods described in the embodiments of the present application. For example, execute S603, S605-S608 in the embodiment shown in FIG. 6 .
  • the multiple functional modules can be implemented based on software, hardware or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • The present application also provides a server that includes at least one processor and at least one memory, where computer program instructions are stored in the at least one memory, and when the server is running, the at least one processor performs the functions performed by the server in the methods described in the embodiments of the present application. For example, S603 and S605-S608 in the embodiment shown in FIG. 6 are executed.
  • the present application further provides a computer program that, when the computer program is run on a computer, causes the computer to execute the methods described in the embodiments of the present application.
  • The present application also provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a computer, the computer executes the methods described in the embodiments of the present application.
  • the present application also provides a chip, the chip is used to read the computer program stored in the memory, and implement the methods described in the embodiments of the present application.
  • the present application provides a system-on-a-chip, where the system-on-a-chip includes a processor, configured to support a computer device to implement the methods described in the embodiments of the present application.
  • the chip system further includes a memory, and the memory is used to store necessary programs and data of the computer device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

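The present application provides an augmented reality system, an augmented reality scene positioning method, and a device. In the method, an electronic device sends a first user image and an identifier of a target scene to a server in response to a first instruction triggered by a user; the server determines a first relative position according to the first user image and multiple panoramas of the target scene, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene; the server sends the first relative position to the electronic device; and the electronic device determines, according to the first relative position, a target route from its current position to the target geographic location. With this solution, the server can accurately locate the electronic device according to the multiple panoramas of the target scene and the first user image to determine the first relative position, and the electronic device can determine the target route according to the first relative position, thereby accurately guiding the user to carry the electronic device back into the target scene to play.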

Description

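An augmented reality system, augmented reality scene positioning method, and device
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 202210010548.9, entitled "An augmented reality system, augmented reality scene positioning method, and device", filed with the Intellectual Property Office of the People's Republic of China on January 6, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of augmented reality technology, and in particular to an augmented reality system, an augmented reality scene positioning method, and a device.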
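Background
Augmented reality (AR) is a technology that integrates and displays real-world information and virtual-world information. AR technology can simulate physical information that is otherwise difficult to experience in the real world to obtain virtual information, and apply the virtual information to the real world, so that the real environment and virtual objects are superimposed in real time on the same picture or in the same space and perceived by the user at the same time, achieving a sensory experience that goes beyond reality.
In the AR scene displayed on the display screen of an AR device, the user can observe the real world and virtual items at the same time. For example, the user can see real-world objects such as the ground and a table on the display screen of the AR device, together with virtual items such as an animated character placed on the ground.
The user can carry the AR device and play in multiple areas of the real world to experience AR scenes in different areas.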
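Summary
The present application provides an augmented reality system, an augmented reality scene positioning method, and a device.
In a first aspect, the present application provides an augmented reality system, which includes an electronic device and a server.
The electronic device is configured to: in response to a first instruction triggered by a user, send a first user image and an identifier of a target scene to the server, where the first instruction is an instruction triggered by the user to select the target scene from multiple candidate scenes, and the first user image is an image captured by the electronic device of its current environment; receive a first relative position sent by the server, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene; and determine a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location.
The server is configured to: receive the first user image and the identifier of the target scene sent by the electronic device; determine multiple panoramas of the target scene according to the identifier of the target scene; determine the first relative position according to the first user image and the multiple panoramas of the target scene; and send the first relative position to the electronic device.
Based on the above system, after receiving the first instruction triggered by the user to select the target scene, the electronic device sends the first user image and the identifier of the target scene to the server, so that the server can accurately locate the electronic device according to the multiple panoramas of the target scene and the first user image, and determine the first relative position of the electronic device with respect to the geographic location of the real environment corresponding to the target scene. The electronic device can determine, according to the first relative position, a target route from its current position to the geographic location corresponding to the target scene, and can thereby accurately guide the user to find the target scene with the electronic device and re-enter it to play.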
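In a possible design, the server is specifically configured to: perform feature extraction on the first user image to determine global feature information of the first user image; determine multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene according to the global feature information of the first user image, where the similarity between the global feature information of each frame of target panorama slice and the global feature information of the first user image is greater than a preset threshold; obtain target pose information of each target panorama slice, where the target pose information of each target panorama slice represents the position and azimuth angle of the electronic device in the real world when the panorama slice was captured; and determine the first relative position according to the target pose information of the multiple target panorama slices.
With this design, the server can determine, from the panorama slices of the multiple panoramas of the target scene and according to the global feature information of the first user image, target panorama slices whose global feature information is similar to that of the first user image, and then, through visual triangulation positioning based on the target pose information of the target panorama slices, accurately determine the first relative position of the electronic device with respect to the target geographic location at the time the first user image was captured.
In a possible design, the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location;
the electronic device is specifically configured to: select any angle value from the angle range and any distance value from the distance range; and determine the target route according to the selected angle value and the selected distance value.
With this design, the first relative position includes the angle range and the distance range of the electronic device relative to the target geographic location, and the electronic device can choose any angle value and any distance value to determine the target route. After moving along the target route with the electronic device, the user can reach the real-environment area corresponding to the target scene.
In a possible design, the electronic device is further configured to: render a navigation arrow according to the target route, and display the environment image currently captured by the electronic device together with the navigation arrow.
With this design, the electronic device can display the navigation arrow corresponding to the target route, guiding the user in visual form to move with the electronic device and improving the user experience.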
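In a possible design, the electronic device is further configured to: receive the three-dimensional map of the target scene created by the server, and collect multiple panoramas of the target scene, where each panorama includes multiple frames of panorama slices; and send the multiple panoramas of the target scene to the server;
the server is further configured to: receive the multiple panoramas uploaded by the electronic device, perform feature extraction on the multiple frames of panorama slices of each panorama, and determine global feature information of each frame of panorama slice; and determine target pose information of each frame of panorama slice according to the three-dimensional map of the target scene and the global feature information of each frame of panorama slice.
With this design, after receiving the three-dimensional map created by the server, the electronic device can collect multiple panoramas of the real environment corresponding to the target scene, and the target pose information of each frame of panorama slice is determined. After the user uploads the first user image, visual triangulation positioning can then be performed on the electronic device based on the target pose information of the panorama slices of the target scene, to accurately determine the position of the electronic device relative to the target geographic location of the target scene.
In a possible design, the server is specifically configured to: obtain multiple frames of environment images corresponding to the target scene, together with the global feature information and feature points of each frame of environment image; extract global feature information and feature points of a first panorama slice, where the first panorama slice is any frame of panorama slice in the multiple panoramas; determine, according to the global feature of the first panorama slice, at least one frame of environment image matching the first panorama slice, determine the three-dimensional points in the three-dimensional map that correspond to the feature points in the at least one matching frame of environment image, and use the determined three-dimensional points as the three-dimensional points corresponding to the feature points of the first panorama slice; and determine the target pose information of the first panorama slice according to the feature points of the first panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.
With this design, the server can determine the target pose information of each frame of panorama slice, so that after receiving the first user image, it can locate the first electronic device by the visual triangulation positioning method.
In a possible design, the electronic device is further configured to: after moving to the target geographic location, collect a second user image, where the second user image is an image captured by the electronic device of its environment after moving to the target geographic location; send the second user image to the server, and receive a second relative position returned by the server, where the second relative position is the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map; and display the target scene according to the second relative position, the three-dimensional map, and the environment image collected by the electronic device in real time;
the server is further configured to: receive the second user image uploaded by the electronic device; and determine the second relative position of the electronic device from the second user image based on a GVPS algorithm, and send the second relative position to the electronic device.
With this design, after the user carries the electronic device into the real-environment area corresponding to the target scene, the electronic device can collect a second user image and send it to the server, and the server can determine, based on the GVPS algorithm, the position of the electronic device in the three-dimensional coordinate system of the three-dimensional map. The electronic device can then display the target scene based on that position, so that the real environment captured by the electronic device and the virtual items placed by the user appear more realistic while the user plays in the target scene, improving the user experience.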
第二方面,本申请提供一种增强现实场景定位方法,应用于增强现实系统,所述增强现实系统包括电子设备和服务器,该方法包括:
所述电子设备响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器;所述第一指令为所述用户触发的从多个候选场景中选择目标场景的指令,所述第一用户图像为所述电子设备对当前所处环境拍摄得到的图像;所述服务器接收所述电子设备发送的所述第一用户图像和所述目标场景的标识;所述服务器根据所述目标场景的标识确定所述目标场景的多张全景图;所述服务器根据所述第一用户图像和所述目标场景的多张全景图确定第一相对位置,所述第一相对位置为所述电子设备相对于目标地理位置的位置,所述目标地理位置为所述目标场景对应的真实环境的地理位置;所述服务器将所述第一相对位置发送给所述电子设备;所述电子设备根据所述第一相对位置确定目标路线,所述目标路线为从所述电子设备当前位置到所述目标地理位置的路线。
在一个可能的设计中,所述服务器根据所述第一用户图像和所述目标场景的多张全景图确定第一相对位置,包括:所述服务器对所述第一用户图像进行特征提取,确定所述第一用户图像的全局特征信息;所述服务器根据所述第一用户图像的全局特征信息从所述目标场景的多张全景图的全景图切片中确定多帧目标全景图切片;每帧目标全景图切片的全局特征信息和所述第一用户图像的全局特征信息之间的相似度大于预设阈值;所述服务器获取每张目标全景图切片的目标位姿信息,每张目标全景图切片的目标位姿信息用于表示电子设备在拍摄全景图切片时在真实世界中的位置和方位角;所述服务器根据所述多张目标全景图切片的目标位姿信息确定所述第一相对位置。
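In a possible design, the first relative position includes an angle range and a distance range of the electronic device relative to the target geographic location; and the determining, by the electronic device, of a target route according to the first relative position includes: selecting, by the electronic device, any angle value from the angle range and any distance value from the distance range; and determining, by the electronic device, the target route according to the selected angle value and the selected distance value.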
在一个可能的设计中,所述方法还包括:所述电子设备根据所述目标路线渲染导航箭头,并显示所述电子设备当前拍摄的环境图像和所述导航箭头。
在一个可能的设计中,所述方法还包括:所述电子设备接收所述服务器创建的所述目标场景的三维地图,采集所述目标场景的多张全景图,每张全景图包括多帧全景图切片;所述电子设备将所述目标场景的多张全景图发送给所述服务器;所述服务器对每张全景图的多帧全景图切片进行特征提取,确定每帧全景图切片的全局特征信息;所述服务器根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息。
在一个可能的设计中,所述服务器根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息,包括:所述服务器获取所述目标场景对应的多帧环境图像以及每帧环境图像的全局特征信息和特征点;所述服务器提取第一全景图切片的全局特征信息和特征点,所述第一全景图切片为多张全景图中的任一帧全景图切片;所述服务器根据第一全景图切片的全局特征确定与第一全景图切片匹配的至少一帧环境图像,并确定与所述第一全景图切片匹配的至少一帧环境图像中的特征点在三维地图中对应的三维点,所述服务器将确定出的三维点作为所述第一全景图切片的特征点对应的三维点;所述服务器根据所述第一全景图切片的特征点、所述第一全景图切片的特征点对应的三维点以及所述电子设备的相机内参确定所述第一全景图切片的目标位姿信息。
在一个可能的设计中,所述方法还包括:所述电子设备在移动至所述目标地理位置之后,采集第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;所述电子设备将所述第二用户图像发送给所述服务器;所述服务器基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,并将所述第二相对位置发送给所述电子设备,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;所述电子设备根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景。
第三方面,本申请提供一种增强现实场景定位方法,应用于服务器,该方法包括:
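receiving a first user image and an identifier of a target scene sent by an electronic device, where the first user image is an image captured by the electronic device of the environment it is currently in, the target scene is determined by the electronic device in response to a first instruction, and the first instruction is a user-triggered instruction to select the target scene from multiple candidate scenes; determining multiple panoramas of the target scene according to the identifier of the target scene; determining a first relative position according to the first user image and the multiple panoramas of the target scene, where the first relative position is the position of the electronic device relative to a target geographic location, and the target geographic location is the geographic location of the real environment corresponding to the target scene; and sending the first relative position to the electronic device, so that the electronic device determines a target route according to the first relative position, where the target route is a route from the current position of the electronic device to the target geographic location.
In a possible design, the determining of the first relative position according to the first user image and the multiple panoramas of the target scene includes: performing feature extraction on the first user image to determine global feature information of the first user image; determining, according to the global feature information of the first user image, multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene, where the similarity between the global feature information of each frame of target panorama slice and the global feature information of the first user image is greater than a preset threshold; obtaining target pose information of each frame of target panorama slice, where the target pose information of each frame of target panorama slice represents the position and azimuth of the electronic device in the real world when that panorama slice was captured; and determining the first relative position according to the target pose information of the multiple frames of target panorama slices.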
在一个可能的设计中,所述第一相对位置包括所述电子设备相对于所述目标地理位置的角度范围和距离范围;所述目标路线为所述电子设备根据所述角度范围中的任一角度值和所述距离范围中的任一距离值确定的。
在一个可能的设计中,在接收所述电子设备发送的第一用户图像和目标场景的标识之前,所述方法还包括:将所述服务器创建的目标场景的三维地图发送给所述电子设备,接收所述电子设备发送的所述目标场景的多张全景图;每张全景图包括多帧全景图切片;对每张全景图的多帧全景图切片进行特征提取,确定每帧全景图切片的全局特征信息;根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息。
在一个可能的设计中,所述根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息,包括:获取所述目标场景对应的多帧环境图像以及每帧环境图像的全局特征信息和特征点;提取第一全景图切片的全局特征信息和特征点,所述第一全景图切片为多张全景图中的任一帧全景图切片;根据第一全景图切片的全局特征确定与第一全景图切片匹配的至少一帧环境图像,并确定与所述第一全景图切片匹配的至少一帧环境图像中的特征点在三维地图中对应的三维点,将确定出的三维点作为所述第一全景图切片的特征点对应的三维点;根据所述第一全景图切片的特征点、所述第一全景图切片的特征点对应的三维点以及所述电子设备的相机内参确定所述第一全景图切片的目标位姿信息。
在一个可能的设计中,所述方法还包括:接收所述电子设备发送的第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,并将所述第二相对位置发送给所述电子设备,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;以使所述电子设备根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景。
第四方面,本申请提供一种增强现实场景定位方法,应用于电子设备,该方法包括:
响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器;所述第一指令为所述用户触发的从多个候选场景中选择目标场景的指令,所述第一用户图像为所述电子设备对当前所处环境拍摄得到的图像;接收所述服务器发送的所述第一相对位置,所述第一相对位置为所述电子设备相对于目标地理位置的位置,所述目标地理位置为所述目标场景对应的真实环境的地理位置,所述第一相对位置为所述服务器根据所述第一用户图像和所述目标场景的多张全景图确定的;根据所述第一相对位置确定目标路线,所述目标路线为从所述电子设备当前位置到所述目标地理位置的路线。
在一个可能的设计中,所述第一相对位置包括所述电子设备相对于所述目标地理位置的角度范围和距离范围;所述根据所述第一相对位置确定目标路线,包括:从所述角度范围中选择任一角度值,以及从所述距离范围中选择任一距离值;所述电子设备根据选择的角度值和选择的距离值确定所述目标路线。
在一个可能的设计中,所述方法还包括:根据所述目标路线渲染导航箭头,并显示所述电子设备当前拍摄的环境图像和所述导航箭头。
在一个可能的设计中,在响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器之前,所述方法还包括:所述电子设备接收所述服务器创建的所述目标场景的三维地图,采集所述目标场景的多张全景图,每张全景图包括多帧全景图切片;所述电子设备将所述目标场景的多张全景图发送给所述服务器,以使所述服务器确定每帧全景图切片的全局特征信息和每帧全景图切片的目标位姿信息;所述多张全景图的多帧全景图切片的目标位姿信息用于确定所述第一相对位置。
在一个可能的设计中,所述方法还包括:在移动至所述目标地理位置之后,采集第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;将所述第二用户图像发送给所述服务器,以使所述服务器基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;接收所述服务器发送的所述第二相对位置,根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景。
第五方面,本申请提供一种电子设备,所述电子设备包括多个功能模块;所述多个功能模块相互作用,实现上述任一方面及其各实施方式中电子设备所执行的方法。所述多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
第六方面,本申请提供一种电子设备,包括至少一个处理器和至少一个存储器,所述至少一个存储器中存储计算机程序指令,所述电子设备运行时,所述至少一个处理器执行上述任一方面及其各实施方式中电子设备执行的方法。
第七方面,本申请提供一种服务器,所述服务器包括多个功能模块;所述多个功能模块相互作用,实现上述任一方面及其各实施方式中服务器所执行的方法。所述多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
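In an eighth aspect, this application provides a server, including at least one processor and at least one memory, where the at least one memory stores computer program instructions, and when the server runs, the at least one processor performs the method performed by the server in any one of the foregoing aspects and their implementations.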
第九方面,本申请还提供一种计算机程序,当所述计算机程序在计算机上运行时,使得所述计算机执行上述任一方面及其各实施方式中电子设备或服务器执行的方法。
第十方面,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当所述计算机程序被计算机执行时,使得所述计算机执行上述任一方面及其各实施方式中电子设备或服务器执行的方法。
第十一方面,本申请还提供一种芯片,所述芯片用于读取存储器中存储的计算机程序,执行上述任一方面及其各实施方式中电子设备或服务器执行的方法。
第十二方面,本申请还提供一种芯片系统,该芯片系统包括处理器,用于支持计算机装置实现上述任一方面及其各实施方式中电子设备或服务器执行的方法。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器用于保存该计算机装置必要的程序和数据。该芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。
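Brief Description of the Drawings
FIG. 1 is a schematic diagram of an AR scene displayed by an electronic device according to an embodiment of this application;
FIG. 2 is a schematic diagram of an augmented reality system according to an embodiment of this application;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of this application;
FIG. 4 is a block diagram of the software structure of an electronic device according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of a server according to an embodiment of this application;
FIG. 6 is a flowchart of an augmented reality scene positioning method according to an embodiment of this application;
FIG. 7 is a schematic diagram of capturing panorama slices according to an embodiment of this application;
FIG. 8 is a schematic diagram of candidate scenes according to an embodiment of this application;
FIG. 9 is a schematic diagram of determining the first relative position of an electronic device based on the VTPS service according to an embodiment of this application;
FIG. 10 is a schematic diagram of the procedure of an augmented reality scene positioning method according to an embodiment of this application;
FIG. 11 is a flowchart of an augmented reality scene positioning method according to an embodiment of this application.
Detailed Description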
增强现实(augmented reality,AR)是一种将真实世界信息和虚拟世界信息集成显示的技术。AR技术可以将原本在现实世界难以体验的实体信息进行模拟仿真得到虚拟信息,并将虚拟信息应用到真实世界,以使真实环境和虚拟物体实时叠加到同一个画面或空间同时被用户感知,以达到超越现实的感官体验。
本申请实施例中,用户可以控制电子设备采集真实世界的环境图像,以生成真实世界对应的三维地图,三维地图可以用于构建AR场景。当用户通过电子设备的摄像装置实时拍摄真实世界时,用户可以在电子设备的显示屏中显示的AR场景中添加虚拟物品,电子设备基于真实世界对应的三维地图将虚拟物体添加到AR场景中,用户可以在同一个画面中观察到真实世界和用户添加的虚拟物品。例如,图1为本申请实施例提供的一种AR场景示意图。参考图1,图1中地面、道路为电子设备的摄像装置实际拍摄到的真实世界的画面,道路上的卡通人物则为用户在当前的AR场景中添加的虚拟物品,用户可以在电子设备的显示屏上同时观察到真实世界中的地面、道路和虚拟的卡通人物。
可选地,用户可以控制电子设备对真实世界的多个区域生成三维地图,以在不同区域体验AR场景。当用户想要再次进入已生成三维地图的AR场景游玩时,如何查找并进入已生成三维地图的AR场景成为一个待解决的问题。
一种实施方式中,可以采用专业设备采集AR场景对应的真实世界的全景图像序列,该全景图像序列包含真实地理位置信息以及多张图像的序列信息。在服务器中建立包含全景图像序列的数据库,并保存三维地图与全景图像序列的关系。当用户查找AR场景时,电子设备可以根据用户输入的搜索词在真实世界的电子地图中查找用户关注点的位置,并根据关注点的位置信息从数据库中查找该位置信息对应的全景图像序列,进而在电子设备的显示屏上展示AR场景。而这种实施方式需要依赖专业设备采集全景图像序列,成本较高。
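In another implementation, when generating the three-dimensional map, the electronic device performs feature extraction on the environment images captured by its camera and stores the feature information of each image in a database. When the user wants to re-enter the AR scene corresponding to the three-dimensional map, the electronic device performs feature extraction on the user image captured by its camera at the user's current position, computes the similarity between the feature information of the user image and the feature information of the images in the database, selects the position information of the database image whose feature information is most similar to that of the user image, uses that position information as the user's position, and then presents the AR scene according to the user's position. In this implementation, however, the density of images captured while generating the three-dimensional map is limited, so the search precision is low and the user's current position cannot be determined accurately.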
基于上述问题,本申请提供一种增强现实系统,用以提供一种高效且准确的查找并进入已生成三维地图的AR场景的方法。图2为本申请实施例提供的一种增强现实系统的示意图。参考图2,该增强现实系统包括电子设备11和服务器12,其中,服务器12可以有多个。多个服务器12可以协同工作并与电子设备11交互,以实现本申请实施例提供的增强现实场景定位方法。
在本申请实施例中,电子设备11响应于用户触发的在多个候选场景中选择目标场景的第一指令,电子设备11将摄像装置拍摄得到的第一用户图像发送给服务器12,服务器12根据第一用户图像的全局特征信息从目标场景对应的多个全景图中确定多个第一全景图,并根据第一用户图像的全局特征信息和多个第一全景图的局部特征信息确定电子设备11相对于目标场景的第一相对位置,服务器12将第一相对位置发送给电子设备11,电子设备11根据第一相对位置确定从电子设备11当前位置到目标场景对应的位置的目标路线。通过本方案,服务器12可以根据电子设备的摄像装置拍摄得到的第一用户图像以及目标场景的全景图对电子设备11进行准确定位,进而确定电子设备11从当前位置到目标场景对应的位置的路线,可以准确引导用户持电子设备11查找AR场景并重新进入AR场景。
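The following describes the electronic device, the server, and embodiments for using them. The electronic device in the embodiments of this application may have a camera and a display. For example, it may be a tablet computer, a mobile phone, an in-vehicle device, an augmented reality (AR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a wearable device, or the like. The embodiments of this application place no limitation on the specific type of the electronic device.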
图3为本申请实施例提供的一种电子设备100的结构示意图。如图3所示,电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。
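The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent components or may be integrated into one or more processors. The controller may be the nerve center and command center of the electronic device 100. The controller may generate operation control signals according to instruction opcodes and timing signals, to control instruction fetching and instruction execution. A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.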
USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为电子设备100充电,也可以用于电子设备100与外围设备之间传输数据。充电管理模块140用于从充电器接收充电输入。电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,外部存储器,显示屏194,摄像头193,和无线通信模块160等供电。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
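In some embodiments, antenna 1 of the electronic device 100 is coupled to the mobile communication module 150 and antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The display 194 is used to show the display interface of an application, for example the display pages of applications installed on the electronic device 100. The display 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flex light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N displays 194, where N is a positive integer greater than 1. In the embodiments of this application, the display 194 may be used to display an AR scene, and the AR scene shown on the display 194 may include the images captured in real time by the camera 193 and the virtual items placed in the AR scene by the user.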
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。在本申请实施例中,摄像头193可以采集用于构建AR场景的三维地图的图像,摄像头193还可以用于拍摄全景图像,如用户持电子设备100水平旋转360度,摄像头193可以采集到一张电子设备100所处位置对应的全景图。
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行电子设备100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,以及至少一个应用程序的软件代码等。存储数据区可存储电子设备100使用过程中所产生的数据(例如拍摄的图像、录制的视频等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将图片,视频等文件保存在外部存储卡中。
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
其中,传感器模块180可以包括压力传感器180A,加速度传感器180B,触摸传感器180C等。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。
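The touch sensor 180C is also called a "touch panel". The touch sensor 180C may be disposed on the display 194, and the touch sensor 180C and the display 194 together form a touchscreen. The touch sensor 180C is used to detect touch operations on or near it. The touch sensor may pass a detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation may be provided through the display 194. In some other embodiments, the touch sensor 180C may also be disposed on the surface of the electronic device 100 at a position different from that of the display 194.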
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。马达191可以产生振动提示。马达191可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现与电子设备100的接触和分离。
可以理解的是,图3所示的部件并不构成对电子设备100的具体限定,电子设备还可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。此外,图3中的部件之间的组合/连接关系也是可以调整修改的。
图4为本申请实施例提供的一种电子设备的软件结构框图。如图4所示,电子设备的软件结构可以是分层架构,例如可以将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将操作系统分为四层,从上至下分别为应用程序层,应用程序框架层(framework,FWK),运行时(runtime)和系统库,以及内核层。
应用程序层可以包括一系列应用程序包(application package)。如图4所示,应用程序层可以包括相机、设置、皮肤模块、用户界面(user interface,UI)、三方应用程序等。其中,三方应用程序可以包括图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层可以包括一些预先定义的函数。如图4所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。
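The notification manager allows an application to display notification information in the status bar. It can be used to convey informational messages, which may disappear automatically after a short stay without user interaction. For example, the notification manager is used to announce that a download has completed or to give message reminders. A notification may also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, for example a notification from an application running in the background, or appear on the screen in the form of a dialog window. Examples include showing text in the status bar, playing a prompt tone, vibrating the electronic device, and blinking the indicator light.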
运行时包括核心库和虚拟机。运行时负责操作系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是操作系统的核心库。应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(media libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。
硬件层可以包括各类传感器,例如加速度传感器、陀螺仪传感器、触摸传感器等。
图5为本申请实施例提供的一种服务器的结构示意图。本申请实施例中提供的服务器可以具有分布式结构,参考图5,该服务器可以包括多个计算节点(如图5所示服务器包括N个计算节点,N为正整数)、至少一个存储节点、任务队列节点、至少一个调度节点以及定位节点。下面对图5所示的服务器中各个节点的功能进行介绍:
计算节点,用于基于分布式处理方式,根据电子设备上传的对真实世界的环境拍摄的多帧图像以及每帧图像对应的定位参数,创建该环境对应的三维地图。其中,不同计算节点可以执行三维地图创建过程中的不同处理任务,N个计算节点共同完成整个三维地图的创建过程。例如,不同计算节点可分别对不同的图像进行相同类型的处理,从而将多帧图像的处理任务分散到多个计算节点中同步进行,进而加快图像处理的速度。
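Referring to FIG. 5, the N compute nodes may include the CPU algorithm components and GPU algorithm components shown in FIG. 5. The server may contain multiple CPU algorithm components and multiple GPU algorithm components. The GPU algorithm components may be used to perform image processing on the multiple frames of images (such as feature extraction, matching, and retrieval), and the CPU algorithm components may be used to generate the three-dimensional map from the image processing results of the GPU algorithm components. The GPU algorithm components and the CPU algorithm components may listen for map construction instructions in the queue-type message middleware and process them automatically.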
本申请实施例中计算节点中还可以包括白模处理服务,白模处理服务用于对电子设备上传的网格进行简化,并根据简化后的网格生成白模。
当然,计算节点也可以通过其它类型的算法处理组件实现,本申请实施例中不做具体限制。
任务队列节点,用于按队列缓存三维地图创建过程中的处理任务,每个计算节点可以从任务队列节点读取待执行的任务后进行相应处理,从而实现多处理任务的分布式按序执行。
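For example, the task queue node may be implemented with the queue-type message middleware shown in FIG. 5. The queue-type message middleware may be used to asynchronously buffer three-dimensional map creation instructions from multiple electronic devices, instructions for processing tasks in the three-dimensional map creation process, and so on, and may share or distribute them to the N compute nodes, so that the N compute nodes share the work and the system load is balanced.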
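The queue-and-workers arrangement described above can be illustrated with a minimal in-process sketch. It is only an analogy: threads stand in for compute nodes and a standard-library `queue.Queue` stands in for the queue-type message middleware. The function name `run_workers` and the `handler` callback are hypothetical and do not come from the application.

```python
import queue
import threading


def run_workers(tasks, handler, num_workers=3):
    """Distribute tasks over worker threads through one FIFO queue."""
    task_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = task_queue.get()
            if item is None:  # sentinel: no more work for this worker
                break
            task_id, payload = item
            outcome = handler(payload)
            with lock:  # results is shared between workers
                results[task_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for task_id, payload in enumerate(tasks):
        task_queue.put((task_id, payload))
    for _ in threads:
        task_queue.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    # Return results in submission order, whichever worker handled them.
    return [results[i] for i in range(len(tasks))]
```

Because each worker pulls the next task only when it is free, the load balances itself without a central assignment step, which is the property the paragraph above attributes to the middleware.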
至少一个存储节点,用于对三维地图创建过程相关的数据进行临时存储或永久性存储。例如,至少一个存储节点可以存储多帧图像、多个计算节点进行相应处理的中间数据和结果数据等。
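Optionally, referring to FIG. 5, the storage nodes may include a cloud database, an object storage service, an elastic file service, a cache-type message middleware, and so on. The cloud database may be used to store serialized content that occupies little storage space, such as user information from the electronic device side, indication information about task processing status during three-dimensional map creation, and modification information for the three-dimensional map. The object storage service may be used to store non-serialized content that occupies a large amount of storage space, such as three-dimensional models, high-definition pictures, videos, and animations involved on the electronic device. The elastic file service may be used to store the map data of three-dimensional maps generated by the three-dimensional map creation algorithm, as well as algorithm intermediate variables that occupy a large amount of storage space. The cache-type message middleware may be used to asynchronously buffer serializable intermediate variables that occupy little storage space during algorithm processing, and may share them with the N compute nodes.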
至少一个调度节点,用于对N个计算节点、任务队列节点、至少一个存储节点中的部分或全部节点的调度进行统筹管理。
示例性的,如图5中所示,服务器中的调度节点可以包括云端调度中心和算法调度中心。其中,云端调度中心可以对算法调度中心、存储节点、任务队列节点等节点进行管理和调度,并可以与电子设备进行信息和数据交互,可以作为高效的消息处理及分发节点,例如,云端调度中心能够向电子设备提供多帧图片的上传地址,进行电子设备侧的请求调度,云端数据库的请求及返回等。算法调度中心用于对N个计算节点进行管理和调度,还可以对其它的一些算法服务进行管理和调度。
定位节点,用于根据电子设备上传的图像对电子设备进行定位,以确定电子设备相对于AR场景的位置或电子设备在三维地图的坐标系中的位置。
可选地,定位节点可以包括视觉三角定位系统(visual triangulation positioning system,VTPS)服务、全局视觉定位系统(global visual positioning system,GVPS)服务和向量检索系统(vector retrieval system,VRS)服务。其中,VTPS服务可以用于根据AR场景的全景图的特征信息确定电子设备相对于AR场景的位置。GVPS服务可以用于进行空间定位,确定电子设备当前所处位置在创建的三维地图中对应位置的6自由度坐标。VRS服务用于进行向量搜索。可选的,VTPS服务、GVPS服务和VRS服务可以作为计算节点的子服务。
关于上述系统中各节点、服务或组件的具体功能,下文中会结合具体实施例进行说明,这里暂不详述。
需要说明的是,图5所示的服务器仅是对本申请实施例提供的服务器的一种示例性说明,并不对本申请实施例提供的方案适用的服务器的架构造成限制。本申请实施例提供的方案适用的服务器与图5所示的结构相比,也可以增加、删除或调整部分节点,本申请实施例中不进行具体限定。
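The augmented reality scene positioning method provided in the embodiments of this application is described further below. FIG. 6 is a flowchart of an augmented reality scene positioning method according to an embodiment of this application. The method shown in FIG. 6 may be applied to the augmented reality system shown in FIG. 2. Referring to FIG. 6, the method includes the following steps: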
S601:电子设备在请求服务器创建增强现实场景的三维地图时,采集增强现实场景的多张全景图。
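In an optional implementation, before the user operates the electronic device to play in an AR scene, the electronic device needs to capture environment images of the real world corresponding to the AR scene. The electronic device may send the captured environment images to the server, and the server processes the environment images to obtain a three-dimensional map of the AR scene. The three-dimensional map consists of multiple three-dimensional points, each corresponding to a feature point in an environment image, and can be used to construct the AR scene. For example, the electronic device may show on its display the images captured by the camera in real time, the user may place virtual items in the displayed image, and the electronic device may determine the position of each virtual item in the AR scene according to the three-dimensional map, so that the user sees the real environment and the virtual objects superimposed in the same picture in real time. The virtual items placed by the user in the AR scene may be three-dimensional digital resource models. For example, the virtual cartoon character in the AR scene shown in FIG. 1 is a three-dimensional digital resource model rendered from a digital resource. In the embodiments of this application, the server may send multiple kinds of digital resources to the electronic device for the user to choose from and add to the AR scene.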
在本申请实施例中,电子设备在采集环境图像并上传给服务器后,服务器根据电子设备上传的环境图像生成三维地图。服务器将三维地图发送给电子设备后,电子设备可以显示提示用户拍摄全景图的消息,用户可以操作电子设备在多个位置采集全景图,AR场景的多张全景图可以用于对电子设备进行定位,以使用户可以操作电子设备移动至AR场景对应的真实世界中的地理位置处,进而电子设备可以在显示屏上显示AR场景供用户游玩。可选地,电子设备可以根据用户操作电子设备采集用于生成三维地图的环境图像时的移动路径,确定多个需要采集全景图的位置。例如,电子设备可以在用户采集环境图像的移动路径上确定多个位置,并引导用户在这多个位置操作电子设备采集全景图。
一种可选的实施方式中,电子设备在采集全景图时,可以针对每张全景图拍摄多帧全景图切片。例如,图7为本申请实施例提供的一种采集全景图切片的示意图,用户在操作电子设备拍摄全景图时,需要用户持电子设备站在固定位置,旋转电子设备拍摄全景图,电子设备在旋转过程中可以采集多帧全景图切片,多帧全景图切片可以用于拼接得到全景图。
S602:电子设备将增强现实场景的多张全景图发送给服务器。
可选地,电子设备可以将采集到的每张全景图的多帧全景图切片发送给服务器。电子设备还可以将每张全景图的位置信息发送给服务器。其中,每张全景图的位置信息可以包括电子设备采集全景图时的地理位置,如基于全球定位系统(global positioning system,GPS)、无线保真(wireless fidelity,Wi-Fi)定位或基站定位等定位技术对电子设备定位确定出的位置。每张全景图的位置信息还可以包括每帧全景图切片的位姿信息和惯性测量单元(inertial measurement unit,IMU)信息,每帧全景图切片的位姿信息可以为电子设备基于同步地图构建与定位(simultaneous localization and mapping,SLAM)算法确定的。
S603:服务器确定每张全景图中每帧全景图切片的全局特征信息和每帧全景图切片的目标位姿信息。
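In an optional implementation, the global feature information of each frame of panorama slice in each panorama may be a global feature vector of that panorama slice, and the global feature vector may be used to represent the overall structural features of the image. In implementation, the server may extract and cluster the local features of image regions that have good feature invariance, and then compute the weighted sum of the residuals between each local vector and the cluster centers to obtain the global feature vector.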
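The residual aggregation described above resembles VLAD-style descriptors. The following is a minimal sketch under stated assumptions, not the application's implementation: the cluster centers are taken as given, the weights are hard assignments (weight 1 for the nearest center, 0 otherwise), and the result is L2-normalized. The name `global_descriptor` is hypothetical.

```python
import math


def global_descriptor(local_descriptors, centers):
    """Aggregate local descriptors into one L2-normalised global vector."""
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in centers]
    for desc in local_descriptors:
        # Assumption: hard assignment, i.e. weight 1 for the nearest centre.
        nearest = min(
            range(len(centers)),
            key=lambda k: sum((d - c) ** 2 for d, c in zip(desc, centers[k])),
        )
        for i in range(dim):
            residual_sums[nearest][i] += desc[i] - centers[nearest][i]
    # Concatenate the per-centre residual sums into a single vector.
    flat = [value for row in residual_sums for value in row]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat
```

A soft-assignment variant would replace the hard weight with a value that decays with the distance to each center; the source only says the residual sum is weighted and does not specify the weights.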
可选地,服务器在接收每张全景图的多帧全景图切片后,可以对多帧全景图切片进行拼接,得到全景图。
本申请一些实施例中,服务器可以根据AR场景的三维地图确定每帧全景图切片的目标位姿信息,目标位姿信息为电子设备拍摄全景图切片时相对于三维地图的三维坐标系的位姿信息,由于AR场景的三维地图与真实环境是对应的,则目标位姿信息可以表示电子设备在拍摄全景图切片时在真实世界中的位置和方位角。
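In a specific implementation, when constructing the three-dimensional map, the server may store the multiple frames of environment images used to construct it and the global feature information of each frame of environment image. After receiving a panorama slice sent by the electronic device, the server may determine the environment images similar to the panorama slice according to the global feature information of the panorama slice and the global feature information of the multiple frames of environment images corresponding to the AR scene, and determine the matching feature points between the panorama slice and the environment images. The server determines, according to the matching feature points, the three-dimensional points in the three-dimensional map that correspond to the feature points in the panorama slice. The target pose information of the panorama slice can then be determined from the feature points in the panorama slice, the three-dimensional points corresponding to those feature points, and the camera intrinsic parameters of the electronic device.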
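Estimating a pose from 2D feature points, their 3D map points, and camera intrinsics is commonly framed as a perspective-n-point problem. The source does not name a solver, so the sketch below shows only the relation that any candidate pose has to satisfy: projecting the 3D points with the pose and intrinsics should land close to the observed feature points. It assumes a pinhole camera without distortion; `project` and `mean_reprojection_error` are hypothetical helper names.

```python
import math


def project(point_3d, rotation, translation, fx, fy, cx, cy):
    """Project a 3D map point into pixel coordinates with a pinhole model."""
    # Camera-frame coordinates: X_c = R * X_w + t
    xc, yc, zc = (
        sum(rotation[r][c] * point_3d[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def mean_reprojection_error(points_2d, points_3d, rotation, translation,
                            fx, fy, cx, cy):
    """Average pixel distance between observed and projected feature points."""
    total = 0.0
    for (u, v), point in zip(points_2d, points_3d):
        pu, pv = project(point, rotation, translation, fx, fy, cx, cy)
        total += math.hypot(u - pu, v - pv)
    return total / len(points_2d)
```

A solver would search for the rotation and translation that minimise this error over all matched points; a low residual error indicates that the recovered pose is consistent with the three-dimensional map.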
通过S601-S603,服务器可以存储AR场景的多张全景图、每张全景图的位置信息、每张全景图中每帧全景图切片的全局特征信息以及每张全景图中每帧全景图切片的目标位姿信息。
S604:电子设备响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器。
其中,第一指令可以为用户触发的从多个候选场景中选择目标场景的指令。第一用户图像可以为电子设备的摄像装置实时拍摄到的环境图像。
例如,图8为本申请实施例提供的一种候选场景的示意图。参考图8,当用户操作电子设备生成AR场景的三维地图后,服务器可以保存AR场景的三维地图,并在用户界面中显示多个AR场景的标识作为候选场景,用户可以从多个候选场景中选择目标场景,表示用户想要操作电子设备再次进入目标场景游玩。通过该方式,可以便于用户再次进入已生成三维地图的AR场景游玩,而无需电子设备重复执行生成三维地图的步骤。
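In some embodiments, after receiving the first instruction and before sending the first user image to the server, the electronic device may determine its own geographic location, for example by GPS positioning, wireless fidelity (Wi-Fi) positioning, base station positioning, or a similar positioning technology. When the distance between the geographic location of the electronic device and the geographic location corresponding to any panorama of the target scene is greater than a preset threshold, the electronic device may determine a first route according to its own geographic location and the geographic location corresponding to that panorama, and display the first route to the user, to guide the user to move with the electronic device along the first route to the vicinity of the target geographic location of the real environment corresponding to the target scene. After the user has moved with the electronic device to the vicinity of the target geographic location, the electronic device captures the first user image with its camera and sends the first user image and the identifier of the target scene to the server.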
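The distance check described above can be sketched with the haversine great-circle formula. This is an illustrative assumption: the source does not say how the distance is computed, and it leaves open whether "any panorama" means the nearest one. The sketch compares against the nearest panorama. The function names and the mean Earth radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def needs_first_route(device, panorama_locations, threshold_m):
    """True when even the nearest panorama is farther away than threshold_m.

    Assumption: the nearest panorama is the one used for the comparison.
    """
    nearest = min(
        haversine_m(device[0], device[1], lat, lon)
        for lat, lon in panorama_locations
    )
    return nearest > threshold_m
```

When `needs_first_route` returns `True`, the device would show a coarse first route based on geographic coordinates; otherwise it can capture the first user image straight away.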
S605:服务器确定第一用户图像的全局特征信息。
可选的,第一用户图像的全局特征信息可以为第一用户图像的全局特征向量。
需要说明的是服务器确定第一用户图像的全局特征信息的方式可以参见上述实施例中服务器确定全景图切片的全局特征信息的方式实施,此处不再赘述。
S606:服务器根据第一用户图像的全局特征信息从目标场景的多张全景图的全景图切片中确定多帧目标全景图切片。
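In an optional implementation, the server may, according to the global feature information of the first user image, select multiple frames of target panorama slices from the panorama slices of the multiple panoramas of the target scene, where the similarity between the global feature information of each target panorama slice and the global feature information of the first user image is greater than a preset threshold.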
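The threshold filter described above can be sketched as follows. Cosine similarity is an assumption made for this example, since the source does not name a similarity measure, and `select_target_slices` is a hypothetical name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_target_slices(query_vector, slice_vectors, threshold):
    """Return ids of slices whose similarity exceeds threshold, best first.

    slice_vectors maps a slice id to that slice's global feature vector.
    """
    scored = [
        (cosine_similarity(query_vector, vector), slice_id)
        for slice_id, vector in slice_vectors.items()
    ]
    return [slice_id for score, slice_id in sorted(scored, reverse=True)
            if score > threshold]
```

This linear scan is fine for a handful of panoramas. For large collections, a dedicated vector index replaces the scan, which is the role the source assigns to the VRS service.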
可选地,当目标场景的全景图数量较多时,服务器可以通过VRS服务确定多帧目标全景图切片,以提升效率。
S607:服务器根据多帧目标全景图切片的目标位姿信息确定电子设备相对于目标地理位置的第一相对位置。
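In an optional implementation, the server may use the VTPS service to determine the first relative position of the electronic device relative to the target geographic location according to the target pose information of the multiple frames of target panorama slices. In implementation, the target pose information of each frame of target panorama slice includes the position and azimuth of the electronic device when it captured that panorama slice. The server may locate the electronic device according to the target pose information of the multiple frames of target panorama slices and determine the first relative position of the electronic device relative to the target geographic location. It can be understood that, because the global feature information of the first user image is similar to that of the target panorama slices, the azimuths corresponding to the first user image and the target panorama slices are likely to be similar. The position where the electronic device may have been when it captured the first user image can therefore be determined from the target pose information of a target panorama slice, and locating the electronic device with multiple frames of target panorama slices yields the first relative position, which may be an angle range and a distance range of the electronic device relative to the target geographic location.
For example, FIG. 9 is a schematic diagram of determining the first relative position of an electronic device based on VTPS according to an embodiment of this application. Referring to FIG. 9, the server may determine the target panorama slices whose similarity to the global feature vector of the first user image is greater than the preset threshold. For instance, the similarity between the global feature information of slice a of panorama A and the global feature vector of the first user image is greater than the preset threshold, and the same holds for slice b of panorama B and slice c of panorama C. The first relative position of the electronic device, determined from the target pose information of slice a, slice b, and slice c, may be as shown by the shaded area in FIG. 9, and may be an angle range and a distance range of the electronic device relative to the target geographic location.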
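One way to picture how several candidate positions turn into an angle range and a distance range is sketched below. It is a simplified 2D illustration and not the VTPS algorithm, which the source does not specify. It assumes that each matched slice contributes one candidate device position in a planar map frame, and that angles are bearings from the candidate position toward the target. `relative_range` is a hypothetical name.

```python
import math


def relative_range(candidate_positions, target):
    """Return ((min_angle_deg, max_angle_deg), (min_dist, max_dist))."""
    bearings, distances = [], []
    for x, y in candidate_positions:
        dx, dy = target[0] - x, target[1] - y
        bearings.append(math.atan2(dy, dx))
        distances.append(math.hypot(dx, dy))
    # A circular mean avoids a bogus wide range when bearings straddle
    # the +/-180 degree boundary.
    mean = math.atan2(sum(math.sin(b) for b in bearings),
                      sum(math.cos(b) for b in bearings))
    offsets = [math.atan2(math.sin(b - mean), math.cos(b - mean))
               for b in bearings]
    angle_range = (math.degrees(mean + min(offsets)),
                   math.degrees(mean + max(offsets)))
    return angle_range, (min(distances), max(distances))
```

With candidates at `(0, 0)` and `(0, 10)` and a target at `(10, 0)`, the bearings are 0 and -45 degrees and the distances are 10 and about 14.14, so the shaded region of FIG. 9 would correspond to a wedge between those bounds.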
需要说明的是,服务器中存储的全景图的位置信息可以为基于Wi-Fi定位、GNSS定位等技术得到的高精度定位信息,由于全景图的高精度定位信息可信度较高,在确定电子设备的第一相对位置时,可以优先选择具有高精度定位信息的全景图的全景图切片作为目标全景图切片,进而提升确定出的第一相对位置的准确性。
S608:服务器将第一相对位置发送给电子设备。
S609:电子设备根据第一相对位置确定从电子设备的当前位置到目标场景对应的地理位置的目标路线。
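In an optional implementation, the electronic device may select any angle from the angle range in the first relative position and any distance from the distance range to determine the target route. The electronic device may show the target route on its display. The target route can guide the user to move with the electronic device to the target geographic location, that is, into the real-world area corresponding to the target scene.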
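The selection step described above can be sketched as follows. The source allows any value inside each range, so taking the midpoint is just one possible choice, made here as an assumption. The planar offset convention (angle measured counter-clockwise from the x axis) and the name `target_route` are also assumptions for illustration.

```python
import math


def target_route(angle_range_deg, distance_range_m):
    """Pick the midpoint of each range and return (heading, distance, offset).

    offset is the (dx, dy) displacement from the current position to the
    route's end point in a planar frame.
    """
    heading = (angle_range_deg[0] + angle_range_deg[1]) / 2.0
    distance = (distance_range_m[0] + distance_range_m[1]) / 2.0
    rad = math.radians(heading)
    return heading, distance, (distance * math.cos(rad),
                               distance * math.sin(rad))
```

Since the device is re-localized repeatedly while the user walks, a coarse first choice is acceptable: later rounds narrow the ranges and correct the heading.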
例如,电子设备可以根据目标路线渲染导航箭头,并在电子设备显示的摄像装置实时拍摄的环境图像中叠加显示导航箭头,从而引导用户持电子设备移动至目标地理位置。
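It should be noted that, while moving toward the real-world area corresponding to the target scene, the electronic device may send the user images captured by its camera to the server multiple times, so that the electronic device is located multiple times according to S605-S609 and the target route is adjusted accordingly, which accurately guides the user to move with the electronic device to the target geographic location.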
本申请一些实施例中,用户持电子设备进入目标场景对应的真实世界中的区域后,电子设备可以根据目标场景对应的三维地图显示目标场景。可选地,服务器可以确定电子设备在三维地图的坐标系中的第二相对位置,电子设备可以基于在三维地图的坐标系中的第二相对位置显示目标场景,从而使得用户在目标场景中放置的虚拟物品可以与真实世界的环境图像叠加显示在电子设备的显示屏中。实施中,电子设备可以通过服务器中的GVPS服务确定电子设备在三维地图的坐标系中的第二相对位置。例如,电子设备可以通过摄像装置采集第二用户图像,电子设备将第二用户图像发送给服务器,服务器确定数据库中与第二用户图像包含相同二维特征点的环境图像。服务器根据该环境图像的二维特征点与三维地图中三维点云的关系,确定第二用户图像包含的二维特征点对应的三维点云,并根据第二用户图像包含的二维特征点对应的三维点云确定电子设备在三维地图的三维坐标系中的第二相对位置。
需要说明的是,采集用于生成AR场景的三维地图的环境图像的电子设备,与请求服务器定位以进入AR场景的电子设备可以为相同的电子设备,也可以为不同的电子设备。本申请图6所示实施例中执行S601-S602的电子设备与执行S604、S609的电子设备可以为不同的电子设备。通过该方式,用户可以操作电子设备进入其他用户创建的AR场景中游玩。
下面基于图5所示的服务器的结构,对本申请实施例提供的增强现实场景定位方法中服务器执行的功能进行进一步介绍。图10为本申请实施例提供的一种增强现实场景定位方法流程的示意图。参考图10,该方法包括以下步骤:
S1001:云端调度中心接收电子设备发送的全景图切片上传请求。
S1002:云端调度中心向电子设备发送全景图切片上传链接。
S1003:对象存储服务接收电子设备通过全景图切片上传链接发送的全景图切片。
S1004:电子设备完成单张全景图的全景图切片的上传后,云端调度中心向队列形消息中间件发送全景图切片处理消息。
S1005:GPU算法组件监听到队列形消息中间件的全景图切片处理任务。
S1006:GPU算法组件对全景图切片进行特征提取及匹配。
S1007:电子设备完成单张全景图全部全景图切片的上传后,云端调度中心向队列形消息中间件发送全景图处理消息。
S1008:CPU算法组件监听到队列形消息中间件的全景图切片处理任务。
S1009:CPU算法组件从对象存储服务中获取多帧全景图切片。
S1010:CPU算法组件确定每帧全景图切片的目标位姿信息。
其中,每帧全景图切片的目标位姿信息为电子设备在拍摄全景图切片时相对于三维地图的三维坐标系的位姿。
S1011:CPU算法组件通过多帧全景图切片的目标位姿信息及匹配对信息对全景图切片进行拼接,得到全景图。
S1012:CPU算法组件将全景图存储到弹性文件服务。
S1013:CPU算法组件发送全景图处理任务给队列形消息中间件。
S1014:GPU算法组件监听到队列形消息中间件的全景图处理任务。
S1015:GPU算法组件从弹性文件服务中获取全景图。
S1016:GPU算法组件对全景图进行全局特征提取,得到全景图的全局特征信息。
S1017:GPU算法组件将全景图的全局特征信息存储到弹性文件服务中。
S1018:电子设备向VTPS服务发送定位请求。
其中,定位请求中包括目标场景的标识、M张第一用户图像和传感器信息,传感器信息可以为电子设备采集第一用户图像时的位置信息,M为正整数,一般M小于等于3。
S1019:VTPS服务提取第一用户图像的全局特征信息。
S1020:VTPS服务将第一用户图像的全局特征信息和传感器信息发送给VRS服务。
S1021:VRS服务向VTPS服务返回全局特征信息与第一用户图像的全局特征信息最相似的M张全景图切片。
S1022:VTPS服务根据M张全景图切片的目标位姿信息确定电子设备的第一相对位置。
其中,第一相对位置可以为对M张全景图切片的目标位姿信息进行聚类后得到的最大似然估计位置。
S1023:VTPS服务将第一相对位置发送给电子设备。
S1024:电子设备根据第一相对位置确定目标路线,并根据目标路线渲染导航箭头显示给用户。
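Optionally, as the user keeps moving in the direction of the arrow while holding the electronic device, the electronic device may, after every certain distance travelled or every certain time interval, capture a second user image and sensor information and send them to the GVPS service. The GVPS service locates the electronic device. If positioning succeeds, a precise relative position of the electronic device in the coordinate system of the AR scene is obtained and sent to the electronic device, and the electronic device displays the AR scene to the user according to that relative position. If the GVPS service fails to locate the electronic device, the GVPS service forwards the second user image and the sensor information to the VTPS service, and the VTPS service performs the next round of positioning for the electronic device.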
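The fallback between the two positioning services can be summarized in a small dispatcher sketch. The two callables are hypothetical stand-ins for the GVPS and VTPS services. In particular, the convention that a failed GVPS attempt returns `None` is an assumption made for this example.

```python
def locate(image, sensor_info, gvps_locate, vtps_locate):
    """Try precise GVPS positioning first; fall back to coarse VTPS guidance.

    gvps_locate returns a pose in the map coordinate system, or None on
    failure (assumed convention). vtps_locate returns a coarse relative
    position such as an angle range and a distance range.
    """
    pose = gvps_locate(image, sensor_info)
    if pose is not None:
        # Precise pose available: the device can display the AR scene.
        return ("show_scene", pose)
    # Not yet inside the mapped area: keep guiding with a coarse estimate.
    return ("keep_navigating", vtps_locate(image, sensor_info))
```

Calling `locate` at each sampling interval reproduces the loop described above: navigation continues on coarse estimates until the precise positioning succeeds.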
基于以上实施例,本申请还提供一种增强现实场景定位方法。该方法可以由图2所示的增强现实系统中的电子设备和服务器执行。其中,电子设备可以具有本申请图3和/或图4所示的结构,服务器可以具有本申请图5所示的结构。图11为本申请实施例提供的一种增强现实场景定位方法的流程图。参考图11,该方法包括以下步骤:
S1101:电子设备响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器。
其中,第一指令为用户触发的从多个候选场景中选择目标场景的指令,第一用户图像为电子设备对当前所处环境拍摄得到的图像。
S1102:服务器根据目标场景的标识确定目标场景的多张全景图。
S1103:服务器根据第一用户图像和目标场景的多张全景图确定第一相对位置。
其中,第一相对位置为电子设备相对于目标地理位置的位置,目标地理位置为目标场景对应的真实环境的地理位置。
S1104:服务器将第一相对位置发送给电子设备。
S1105:电子设备根据第一相对位置确定目标路线。
其中,目标路线为从电子设备当前位置到目标地理位置的路线。
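It should be noted that, for the specific implementation of the augmented reality scene positioning method shown in FIG. 11 of this application, reference may be made to the foregoing embodiments. Repeated content is not described again.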
基于以上实施例,本申请还提供一种电子设备,所述电子设备包括多个功能模块;所述多个功能模块相互作用,实现本申请实施例所描述的各方法中电子设备所执行的功能。如执行图6所示实施例中的S601-S602、S604、S609。所述多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
基于以上实施例,本申请还提供一种电子设备,该电子设备包括至少一个处理器和至少一个存储器,所述至少一个存储器中存储计算机程序指令,所述电子设备运行时,所述至少一个处理器执行本申请实施例所描述的各方法中电子设备所执行的功能。如执行图6所示实施例中的S601-S602、S604、S609。
基于以上实施例,本申请还提供一种服务器,所述服务器包括多个功能模块;所述多个功能模块相互作用,实现本申请实施例所描述的各方法中服务器所执行的功能。如执行图6所示实施例中的S603、S605-S608。所述多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
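Based on the foregoing embodiments, this application further provides a server. The server includes at least one processor and at least one memory, where the at least one memory stores computer program instructions, and when the server runs, the at least one processor performs the functions performed by the server in the methods described in the embodiments of this application, for example S603 and S605-S608 in the embodiment shown in FIG. 6.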
基于以上实施例,本申请还提供一种计算机程序,当所述计算机程序在计算机上运行时,使得所述计算机执行本申请实施例所描述的各方法。
基于以上实施例,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当所述计算机程序被计算机执行时,使得所述计算机执行本申请实施例所描述的各方法。
基于以上实施例,本申请还提供了一种芯片,所述芯片用于读取存储器中存储的计算机程序,实现本申请实施例所描述的各方法。
基于以上实施例,本申请提供了一种芯片系统,该芯片系统包括处理器,用于支持计算机装置实现本申请实施例所描述的各方法。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器用于保存该计算机装置必要的程序和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件。
本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本申请是参照根据本申请的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
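These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, where the instruction apparatus implements the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.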
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
显然,本领域的技术人员可以对本申请进行各种改动和变型而不脱离本申请的保护范围。这样,倘若本申请的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。

Claims (21)

  1. 一种增强现实系统,其特征在于,所述增强现实系统包括电子设备和服务器;
    所述电子设备,用于响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器;所述第一指令为所述用户触发的从多个候选场景中选择目标场景的指令,所述第一用户图像为所述电子设备对当前所处环境拍摄得到的图像;接收所述服务器发送的第一相对位置,所述第一相对位置为所述电子设备相对于目标地理位置的位置,所述目标地理位置为所述目标场景对应的真实环境的地理位置;根据所述第一相对位置确定目标路线,所述目标路线为从所述电子设备当前位置到所述目标地理位置的路线;
    所述服务器,用于接收所述电子设备发送的所述第一用户图像和所述目标场景的标识;根据所述目标场景的标识确定所述目标场景的多张全景图;根据所述第一用户图像和所述目标场景的多张全景图确定所述第一相对位置,并将所述第一相对位置发送给所述电子设备。
  2. 如权利要求1所述的系统,其特征在于,所述服务器具体用于:
    对所述第一用户图像进行特征提取,确定所述第一用户图像的全局特征信息;
    根据所述第一用户图像的全局特征信息从所述目标场景的多张全景图的全景图切片中确定多帧目标全景图切片;每帧目标全景图切片的全局特征信息和所述第一用户图像的全局特征信息之间的相似度大于预设阈值;
    获取每张目标全景图切片的目标位姿信息,每张目标全景图切片的目标位姿信息用于表示电子设备在拍摄目标全景图切片时在真实世界中的位置和方位角;
    根据所述多张目标全景图切片的目标位姿信息确定所述第一相对位置。
  3. 如权利要求2所述的系统,其特征在于,所述第一相对位置包括所述电子设备相对于所述目标地理位置的角度范围和距离范围;
    所述电子设备具体用于:
    从所述角度范围中选择任一角度值,以及从所述距离范围中选择任一距离值;
    根据选择的角度值和选择的距离值确定所述目标路线。
  4. 如权利要求1-3任一项所述的系统,其特征在于,所述电子设备还用于:
    根据所述目标路线渲染导航箭头,并显示所述电子设备当前拍摄的环境图像和所述导航箭头。
  5. 如权利要求1-4任一项所述的系统,其特征在于,
    所述电子设备还用于:
    接收所述服务器创建的所述目标场景的三维地图,采集所述目标场景的多张全景图,每张全景图包括多帧全景图切片;
    将所述目标场景的多张全景图发送给所述服务器;
    所述服务器还用于:
    接收所述电子设备上传的多张全景图,对每张全景图的多帧全景图切片进行特征提取,确定每帧全景图切片的全局特征信息;
    根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息。
  6. 如权利要求5所述的系统,其特征在于,所述服务器具体用于:
    获取所述目标场景对应的多帧环境图像以及每帧环境图像的全局特征信息和特征点;
    提取第一全景图切片的全局特征信息和特征点,所述第一全景图切片为多张全景图中的任一帧全景图切片;
    根据所述第一全景图切片的全局特征信息确定与所述第一全景图切片匹配的至少一帧环境图像,并确定与所述第一全景图切片匹配的至少一帧环境图像中的特征点在三维地图中对应的三维点,将确定出的三维点作为所述第一全景图切片的特征点对应的三维点;
    根据所述第一全景图切片的特征点、所述第一全景图切片的特征点对应的三维点以及所述电子设备的相机内参确定所述第一全景图切片的目标位姿信息。
  7. 如权利要求1-6任一项所述的系统,其特征在于,
    所述电子设备还用于:
    在移动至所述目标地理位置之后,采集第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;
    将所述第二用户图像发送给所述服务器,并接收所述服务器返回的第二相对位置,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;
    根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景;
    所述服务器还用于:
    接收所述电子设备上传的所述第二用户图像;
    基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,并将所述第二相对位置发送给所述电子设备。
  8. 一种增强现实场景定位方法,其特征在于,应用于服务器,所述方法包括:
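    receiving a first user image and an identifier of a target scene sent by an electronic device, wherein the first user image is an image captured by the electronic device of the environment it is currently in, the target scene is determined by the electronic device in response to a first instruction, and the first instruction is a user-triggered instruction to select the target scene from multiple candidate scenes;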
    根据所述目标场景的标识确定所述目标场景的多张全景图;
    根据所述第一用户图像和所述目标场景的多张全景图确定第一相对位置,所述第一相对位置为所述电子设备相对于目标地理位置的位置,所述目标地理位置为所述目标场景对应的真实环境的地理位置;
    将所述第一相对位置发送给所述电子设备,以使所述电子设备根据所述第一相对位置确定目标路线,所述目标路线为从所述电子设备当前位置到所述目标地理位置的路线。
  9. 如权利要求8所述的方法,其特征在于,所述根据所述第一用户图像和所述目标场景的多张全景图确定第一相对位置,包括:
    对所述第一用户图像进行特征提取,确定所述第一用户图像的全局特征信息;
    根据所述第一用户图像的全局特征信息从所述目标场景的多张全景图的全景图切片中确定多帧目标全景图切片;每帧目标全景图切片的全局特征信息和所述第一用户图像的全局特征信息之间的相似度大于预设阈值;
    获取每张目标全景图切片的目标位姿信息,每张目标全景图切片的目标位姿信息用于表示电子设备在拍摄全景图切片时在真实世界中的位置和方位角;
    根据所述多张目标全景图切片的目标位姿信息确定所述第一相对位置。
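  10. The method according to claim 9, wherein the first relative position comprises an angle range and a distance range of the electronic device relative to the target geographic location; and the target route is determined by the electronic device according to any angle value in the angle range and any distance value in the distance range.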
  11. 如权利要求8-10任一项所述的方法,其特征在于,在接收所述电子设备发送的第一用户图像和目标场景的标识之前,所述方法还包括:
    将所述服务器创建的目标场景的三维地图发送给所述电子设备,接收所述电子设备发送的所述目标场景的多张全景图;每张全景图包括多帧全景图切片;
    对每张全景图的多帧全景图切片进行特征提取,确定每帧全景图切片的全局特征信息;
    根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息。
  12. 如权利要求11所述的方法,其特征在于,所述根据所述目标场景的三维地图和每帧全景图切片的全局特征信息确定每帧全景图切片的目标位姿信息,包括:
    获取所述目标场景对应的多帧环境图像以及每帧环境图像的全局特征信息和特征点;
    提取第一全景图切片的全局特征信息和特征点,所述第一全景图切片为多张全景图中的任一帧全景图切片;
    根据所述第一全景图切片的全局特征确定与所述第一全景图切片匹配的至少一帧环境图像,并确定与所述第一全景图切片匹配的至少一帧环境图像中的特征点在三维地图中对应的三维点,将确定出的三维点作为所述第一全景图切片的特征点对应的三维点;
    根据所述第一全景图切片的特征点、所述第一全景图切片的特征点对应的三维点以及所述电子设备的相机内参确定所述第一全景图切片的目标位姿信息。
  13. 如权利要求8-12任一项所述的方法,其特征在于,所述方法还包括:
    接收所述电子设备发送的第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;
    基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,并将所述第二相对位置发送给所述电子设备,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;以使所述电子设备根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景。
  14. 一种增强现实场景定位方法,其特征在于,应用于电子设备,所述方法包括:
    响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器;所述第一指令为所述用户触发的从多个候选场景中选择目标场景的指令,所述第一用户图像为所述电子设备对当前所处环境拍摄得到的图像;
    接收所述服务器发送的第一相对位置,所述第一相对位置为所述电子设备相对于目标地理位置的位置,所述目标地理位置为所述目标场景对应的真实环境的地理位置,所述第一相对位置为所述服务器根据所述第一用户图像和所述目标场景的多张全景图确定的;
    根据所述第一相对位置确定目标路线,所述目标路线为从所述电子设备当前位置到所述目标地理位置的路线。
  15. 如权利要求14所述的方法,其特征在于,所述第一相对位置包括所述电子设备相对于所述目标地理位置的角度范围和距离范围;
    所述根据所述第一相对位置确定目标路线,包括:
    从所述角度范围中选择任一角度值,以及从所述距离范围中选择任一距离值;所述电子设备根据选择的角度值和选择的距离值确定所述目标路线。
  16. 如权利要求14或15所述的方法,其特征在于,所述方法还包括:
    根据所述目标路线渲染导航箭头,并显示所述电子设备当前拍摄的环境图像和所述导航箭头。
  17. 如权利要求14-16任一项所述的方法,其特征在于,在响应于用户触发的第一指令,将第一用户图像和目标场景的标识发送给服务器之前,所述方法还包括:
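    receiving the three-dimensional map of the target scene created by the server, and capturing multiple panoramas of the target scene, wherein each panorama comprises multiple frames of panorama slices;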
    将所述目标场景的多张全景图发送给所述服务器,以使所述服务器确定每帧全景图切片的全局特征信息和每帧全景图切片的目标位姿信息;所述多张全景图的多帧全景图切片的目标位姿信息用于对所述电子设备进行定位。
  18. 如权利要求17所述的方法,其特征在于,所述方法还包括:
    在移动至所述目标地理位置之后,采集第二用户图像,所述第二用户图像为所述电子设备移动至所述目标地理位置之后对所处环境拍摄得到的图像;
    将所述第二用户图像发送给所述服务器,以使所述服务器基于GVPS算法根据所述第二用户图像确定所述电子设备的第二相对位置,所述第二相对位置为所述电子设备在所述目标场景的三维地图的三维坐标系中的位置;
    接收所述服务器发送的所述第二相对位置,根据所述第二相对位置、所述三维地图和所述电子设备实时采集的环境图像显示所述目标场景。
  19. 一种服务器,其特征在于,包括至少一个处理器,所述至少一个处理器与至少一个存储器耦合,所述至少一个处理器用于读取所述至少一个存储器所存储的计算机程序,以执行如权利要求8-13中任一所述的方法。
  20. 一种电子设备,其特征在于,包括至少一个处理器,所述至少一个处理器与至少一个存储器耦合,所述至少一个处理器用于读取所述至少一个存储器所存储的计算机程序,以执行如权利要求14-18中任一所述的方法。
  21. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行如权利要求8-13中任一所述的服务器所执行的方法,或执行如权利要求14-18中任一所述的电子设备所执行的方法。
PCT/CN2022/144272 2022-01-06 2022-12-30 一种增强现实系统、增强现实场景定位方法及设备 WO2023131089A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22918530.1A EP4414941A1 (en) 2022-01-06 2022-12-30 Augmented reality system, augmented reality scenario positioning method, and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210010548.9A CN116452777A (zh) 2022-01-06 2022-01-06 一种增强现实系统、增强现实场景定位方法及设备
CN202210010548.9 2022-01-06

Publications (1)

Publication Number Publication Date
WO2023131089A1 true WO2023131089A1 (zh) 2023-07-13

Family

ID=87073206

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/144272 WO2023131089A1 (zh) 2022-01-06 2022-12-30 一种增强现实系统、增强现实场景定位方法及设备

Country Status (3)

Country Link
EP (1) EP4414941A1 (zh)
CN (1) CN116452777A (zh)
WO (1) WO2023131089A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180342100A1 (en) * 2017-05-25 2018-11-29 Onsiteiq, Inc. Interactive Image Based 3D Panograph
CN111551188A (zh) * 2020-06-07 2020-08-18 上海商汤智能科技有限公司 一种导航路线生成的方法及装置
CN112598732A (zh) * 2020-12-10 2021-04-02 Oppo广东移动通信有限公司 目标设备定位方法、地图构建方法及装置、介质、设备
CN112729327A (zh) * 2020-12-24 2021-04-30 浙江商汤科技开发有限公司 一种导航方法、装置、计算机设备及存储介质
CN113672756A (zh) * 2020-05-14 2021-11-19 华为技术有限公司 一种视觉定位方法及电子设备

Also Published As

Publication number Publication date
EP4414941A1 (en) 2024-08-14
CN116452777A (zh) 2023-07-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22918530

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022918530

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022918530

Country of ref document: EP

Effective date: 20240508

NENP Non-entry into the national phase

Ref country code: DE