WO2023035829A1 - Method and device for determining and presenting target marker information

Method and device for determining and presenting target marker information (一种确定和呈现目标标记信息的方法与设备)

Info

Publication number
WO2023035829A1
Authority
WO
WIPO (PCT)
Prior art keywords
information, target, scene, real, image
Application number
PCT/CN2022/110472
Other languages
English (en)
French (fr)
Inventor
唐荣兴
蒋建平
侯晓辉
刘理想
Original Assignee
亮风台(上海)信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 亮风台(上海)信息科技有限公司
Publication of WO2023035829A1

Classifications

    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06T 19/006: Mixed reality
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10028: Range image; Depth image; 3D point clouds
    • G06T 2207/30244: Camera pose

Definitions

  • The present application relates to the field of communications, and in particular to a technology for determining and presenting target marker information.
  • Augmented Reality (AR) is a technology that ingeniously integrates virtual information with the real world. Using technical means such as multimedia, 3D modeling, real-time tracking and registration, intelligent interaction and sensing, it applies computer-generated virtual information such as text, images, 3D models, music and video to the real world, where the two kinds of information complement each other, thereby realizing an "enhancement" of the real world.
  • AR is a relatively new technology that promotes the integration of real-world information and virtual-world information: virtual information content is superimposed on the real world, and in this process it can be perceived by human senses, so as to achieve a sensory experience beyond reality.
  • In AR applications, it is usually necessary to mark a specific location or a specific target to make it easier for the user to obtain interactive content and the like.
  • An object of the present application is to provide a method and device for determining and presenting target marker information.
  • According to one aspect of the present application, a method for determining target marker information is provided, which is applied to a first user equipment and includes: capturing an initial scene image of the current scene by a camera device, and performing three-dimensional tracking initialization on the current scene according to the initial scene image; capturing real-time scene images of the current scene by the camera device, obtaining real-time pose information of the camera device and 3D point cloud information corresponding to the current scene through three-dimensional tracking, and obtaining the marker information input by the user in the real-time scene image of the current scene and the target image position corresponding to the marker information; and determining the corresponding target spatial position according to the real-time pose information and the target image position, and generating corresponding target marker information according to the target spatial position and the marker information, wherein the target marker information is used to superimpose and present the marker information at the target spatial position of the 3D point cloud information.
  • According to another aspect of the present application, a method for presenting target marker information is provided, which is applied to a second user equipment and includes: acquiring target scene information matching the current scene, wherein the target scene information includes corresponding target marker information, and the target marker information includes corresponding marker information and a target spatial position; capturing a current scene image of the current scene by a camera device; and determining the corresponding target image position according to the current pose information of the camera device and the target spatial position, and superimposing and presenting the marker information at the target image position of the current scene image.
  • According to another aspect of the present application, a method for determining and presenting target marker information is provided, comprising:
  • the first user equipment captures an initial scene image about the current scene by a camera device, and performs three-dimensional tracking initialization on the current scene according to the initial scene image;
  • the first user equipment captures real-time scene images of the current scene through the camera device, obtains real-time pose information of the camera device and 3D point cloud information corresponding to the current scene through three-dimensional tracking, and obtains the marker information input by the user in the real-time scene image of the current scene and the target image position corresponding to the marker information;
  • the first user equipment determines a corresponding target spatial position according to the real-time pose information and the target image position, and generates corresponding target marker information according to the target spatial position and the marker information, wherein the target marker information is used to superimpose and present the marker information at the target spatial position of the 3D point cloud information;
  • the second user equipment acquires the target scene information that matches the current scene, where the target scene information includes corresponding target tag information, and the target tag information includes corresponding tag information and a target spatial position;
  • the second user equipment captures a current scene image of the current scene by the camera device;
  • the second user equipment determines a corresponding target image position according to the current pose information of the camera device and the target spatial position, and superimposes and presents the marker information on the target image position of the current scene image.
  • According to one aspect of the present application, a first user equipment for determining target marker information is provided, wherein the equipment includes:
  • a one-one module, configured to capture an initial scene image of the current scene by a camera device, and perform three-dimensional tracking initialization on the current scene according to the initial scene image;
  • a one-two module, configured to capture real-time scene images of the current scene through the camera device, obtain real-time pose information of the camera device and 3D point cloud information corresponding to the current scene through three-dimensional tracking, and obtain the marker information input by the user in the real-time scene image of the current scene and the target image position corresponding to the marker information;
  • a one-three module, configured to determine the corresponding target spatial position according to the real-time pose information and the target image position, and generate corresponding target marker information according to the target spatial position and the marker information, wherein the target marker information is used to superimpose and present the marker information at the target spatial position of the 3D point cloud information.
  • According to another aspect of the present application, a second user equipment for presenting target marker information is provided, wherein the equipment includes:
  • a two-one module, configured to acquire target scene information matching the current scene, wherein the target scene information includes corresponding target marker information, and the target marker information includes corresponding marker information and a target spatial position;
  • a two-two module, configured to capture a current scene image of the current scene through the camera device;
  • a two-three module, configured to determine the corresponding target image position according to the current pose information of the camera device and the target spatial position, and superimpose and present the marker information at the target image position of the current scene image.
  • According to one aspect of the present application, a computer device is provided, wherein the device includes:
  • a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the steps of any one of the methods described above.
  • According to one aspect of the present application, a computer-readable storage medium is provided, on which a computer program/instructions are stored, wherein the computer program/instructions, when executed, cause a system to perform the steps of any one of the methods described above.
  • a computer program product including computer programs/instructions, which is characterized in that, when the computer program/instructions are executed by a processor, the steps of any one of the methods described above are implemented.
  • This application obtains the 3D point cloud information corresponding to the current scene and the real-time pose of the camera device by performing three-dimensional tracking on the current scene, and uses them to determine the corresponding spatial position of the marker information input by the user in the real-time scene image of the current scene. This makes it convenient for the user to edit target objects in the scene so as to provide related interactive information, and also facilitates scene collaboration among multiple users, thereby improving the user experience.
  • FIG. 1 shows a flowchart of a method for determining target tag information according to an embodiment of the present application
  • FIG. 2 shows a flow chart of a method for presenting target tag information according to another embodiment of the present application
  • Fig. 3 shows a flow chart of a system method for determining and presenting target marker information according to an embodiment of the present application
  • FIG. 4 shows functional modules of a first user equipment according to an embodiment of the present application
  • FIG. 5 shows functional modules of a second user equipment according to an embodiment of the present application
  • FIG. 6 illustrates an exemplary system that may be used to implement various embodiments described in this application.
  • The terminal, the device serving the network, and the trusted party each include one or more processors (for example, a Central Processing Unit, CPU), an input/output interface, a network interface and memory.
  • Memory may include volatile memory in computer-readable media, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or Flash Memory. Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and can implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of a program, or other data.
  • Examples of computer storage media include, but are not limited to, Phase-Change Memory (PCM), Programmable Random Access Memory (PRAM), Static Random-Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • the equipment referred to in this application includes, but is not limited to, user equipment, network equipment, or equipment formed by integrating user equipment and network equipment through a network.
  • The user equipment includes, but is not limited to, any mobile electronic product capable of human-computer interaction with the user (for example, via a touchpad), such as a smart phone, a tablet computer or smart glasses; the mobile electronic product may use any operating system, such as the Android operating system or the iOS operating system.
  • The network device includes an electronic device that can automatically perform numerical calculation and information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded devices, and the like.
  • The network equipment includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers; here, the cloud is composed of a large number of computers or network servers based on Cloud Computing, where cloud computing is a kind of distributed computing, namely a virtual supercomputer composed of a group of loosely coupled computers.
  • The network includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network (Ad Hoc network) and the like.
  • The device may also be a program running on the user equipment, on the network equipment, or on a device formed by integrating the user equipment and the network equipment, the touch terminal, or the network equipment and the touch terminal through a network.
  • Fig. 1 shows a method for determining target tag information according to one aspect of the present application, which is applied to a first user equipment, wherein the method includes step S101, step S102 and step S103.
  • In step S101, an initial scene image of the current scene is captured by the camera device, and three-dimensional tracking initialization is performed on the current scene according to the initial scene image;
  • in step S102, real-time scene images of the current scene are captured by the camera device, the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene are obtained through three-dimensional tracking, and the marker information input by the user in the real-time scene image of the current scene and the target image position corresponding to the marker information are obtained;
  • in step S103, the corresponding target spatial position is determined according to the real-time pose information and the target image position, and corresponding target marker information is generated according to the target spatial position and the marker information, wherein the target marker information is used to superimpose and present the marker information at the target spatial position of the 3D point cloud information.
  • The first user equipment includes, but is not limited to, a human-computer interaction device with a camera device, such as a personal computer, a smart phone, a tablet computer, a projector, smart glasses or a smart helmet, wherein the first user equipment may itself contain a camera device or may be connected to an external camera device.
  • the first user equipment further includes a display device (such as a display screen, a projector, etc.), configured to superimpose and display corresponding marker information and the like in the presented real-time scene image.
  • an initial scene image of the current scene is captured by a camera device, and three-dimensional tracking initialization is performed on the current scene according to the initial scene image.
  • the user holds the first user equipment, and collects an initial scene image about the current scene through a camera of the first user equipment, and the number of the initial scene images may be one or more.
  • The first user equipment may perform three-dimensional tracking initialization on the current scene based on the one or more initial scene images, such as SLAM (Simultaneous Localization and Mapping) initialization.
  • the specific initialization methods include, but are not limited to, double-frame initialization, single-frame initialization, 2D recognition initialization, 3D model initialization, and the like.
  • the first user equipment may complete the 3D tracking initialization locally, or may upload the collected initial scene image to other devices (such as a network server), and other devices may perform 3D tracking initialization, which is not limited here.
  • the initial pose information of the camera device can be obtained by performing three-dimensional tracking initialization.
  • the initial map point information corresponding to the current scene can also be obtained.
  • the first user equipment may present a prompt message indicating that the initialization is complete to the user, so as to remind the user to add marker information (such as AR marker information) and the like.
  • the initialization method includes but is not limited to: completing initialization by identifying preset 2D markers in the current scene; single-frame initialization; double-frame initialization; 3D model initialization, etc. Specific examples are as follows:
  • For 2D recognition initialization, after the first user equipment obtains an image containing the 2D recognition map through the camera device, it extracts image features and matches them against a stored feature library to determine the pose of the camera device relative to the current scene, and the information obtained by the recognition algorithm (the camera device pose) is sent to the SLAM algorithm to complete the tracking initialization.
  • Single-frame initialization uses the homography matrix to obtain the corresponding rotation and translation matrices under the condition that the image sensor (that is, the camera device) observes an approximately planar scene, thereby initializing the map points and the pose of the camera device.
  • For 3D model initialization, it is necessary to obtain a 3D model of the tracking target, use the 3D model to obtain 3D edge features, and then obtain the initial pose of the tracking target. The 3D edge features are rendered in the initial pose on the application interface, a video containing the tracking target is shot, and the image frames of the video are read. The user aligns the target in the scene with the 3D model edge features for tracking and matching; once the target is tracked, the current pose and feature point information are obtained and passed to the SLAM algorithm for initialization.
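  • As an illustration of the single-frame (homography-based) initialization described above, the following is a minimal sketch assuming an approximately planar scene, a known camera intrinsic matrix K, and 2D feature matches between a reference view and the current view (the function name and inputs are assumptions, not taken from this application):

```python
import numpy as np
import cv2

def single_frame_homography_init(pts_ref, pts_cur, K):
    """Estimate candidate rotations/translations from a plane-induced homography.

    pts_ref, pts_cur: Nx2 arrays of matched pixel coordinates between a
    reference frame and the current frame of an approximately planar scene.
    K: 3x3 camera intrinsic matrix.
    """
    # Robustly estimate the homography mapping reference points to current points
    H, inlier_mask = cv2.findHomography(pts_ref, pts_cur, cv2.RANSAC, 3.0)
    # Decompose the homography into up to four (R, t, plane normal) hypotheses
    num, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    # A complete system would disambiguate the hypotheses (e.g. keep the one for
    # which triangulated map points lie in front of both views) and then use the
    # chosen R, t to initialize the map points and the camera pose.
    return list(zip(rotations, translations, normals)), inlier_mask
```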
  • In step S102, real-time scene images of the current scene are captured by the camera device, the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene are obtained through three-dimensional tracking, and the marker information input by the user in the real-time scene image of the current scene and the corresponding target image position are obtained.
  • After the first user equipment completes the three-dimensional tracking initialization, it continues to scan the current scene through the camera device to obtain corresponding real-time scene images, processes the real-time scene images through the tracking thread, the local mapping thread and other threads of the three-dimensional tracking algorithm, and obtains the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene.
  • The 3D point cloud information includes 3D map points determined after image matching and depth information acquisition. Point cloud data is a data type in which scan data is recorded in the form of points; each point contains three-dimensional coordinates, and some points may also contain color information (R, G, B) or the reflection intensity of the object surface.
  • In addition to the corresponding 3D map points, the 3D point cloud information also includes, but is not limited to: key frames corresponding to the point cloud information, co-visibility information corresponding to the point cloud information, spanning-tree information corresponding to the point cloud information, and the like.
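  • For illustration only, a minimal layout of such 3D point cloud information might look as follows (all field names are assumptions; the actual structure depends on the tracking implementation):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class MapPoint:
    xyz: Tuple[float, float, float]             # 3D coordinates of the point
    rgb: Optional[Tuple[int, int, int]] = None  # optional color information (R, G, B)
    intensity: Optional[float] = None           # optional reflectance intensity

@dataclass
class PointCloudInfo:
    map_points: List[MapPoint] = field(default_factory=list)
    keyframe_ids: List[int] = field(default_factory=list)             # key frames associated with the cloud
    covisibility: Dict[int, List[int]] = field(default_factory=dict)  # keyframe id -> co-visible keyframe ids
    spanning_tree: Dict[int, int] = field(default_factory=dict)       # keyframe id -> parent keyframe id
```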
  • The real-time pose information includes the real-time position and attitude of the camera device in space, through which image positions and spatial positions can be converted into one another.
  • The pose information may include the world coordinates of the camera device relative to the current scene.
  • The pose information may also include the extrinsic parameters of the camera device relative to the world coordinate system of the current scene and the intrinsic parameters relating the camera coordinate system to the image/pixel coordinate system of the camera device, which are not limited here.
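  • As a small, illustrative sketch of how such pose information (extrinsics) and camera intrinsics relate spatial positions to image positions (the function names are assumptions, not part of this application):

```python
import numpy as np

def world_to_pixel(p_world, R, t, K):
    """Project a 3D point in the world coordinate system to pixel coordinates.

    R (3x3), t (3,): extrinsic parameters mapping world coordinates into the camera frame.
    K (3x3): intrinsic matrix relating the camera frame to the pixel coordinate system.
    """
    p_cam = R @ np.asarray(p_world, dtype=float) + t
    if p_cam[2] <= 0:
        return None                      # point is behind the camera, not visible
    uv = K @ (p_cam / p_cam[2])          # perspective division followed by intrinsics
    return uv[:2]

def pixel_to_ray(u, v, K):
    """Back-project a pixel into a unit viewing-ray direction in the camera frame."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return d / np.linalg.norm(d)
```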
  • the first user device may obtain relevant operations of the user in the real-time scene image, such as selection of a target point determined by touch, click, voice, gesture, or head movement.
  • the first user equipment can determine the target image position of the target point based on the user's selection operation in the real-time scene image, and the target image position is used to represent the two-dimensional coordinates of the target point in the pixel/image coordinate system corresponding to the real-time scene image information etc.
  • The first user equipment may also acquire the marker information input by the user, where the marker information includes human-computer interaction content input by the user and is used to be superimposed at the spatial position corresponding to the target image position in the current scene. For example, the user selects a piece of marker information on the first user equipment, such as a 3D arrow, and then clicks a certain position in the real-time scene image displayed on the display screen, so that the 3D arrow is added at the corresponding position in the scene.
  • the tag information includes, but is not limited to: identification information; file information; form information; application call information; real-time sensor information.
  • The marker information may include identification information such as arrows, brush strokes or scribbles drawn on the screen, circles, geometric shapes, and the like.
  • the marking information may also include corresponding multimedia file information, such as pictures, videos, 3D models, PDF files, office documents, and other types of files.
  • the marking information may also include form information, for example, a form is generated at a position corresponding to the target image for the user to view or input content.
  • the tag information may also include application invocation information, related instructions for executing the application, etc., such as opening the application, invoking specific functions of the application, such as making a call, opening a link, and the like.
  • the tag information may also include real-time sensing information, which is used to connect a sensing device (such as a sensor, etc.) and acquire sensing data of a target object.
  • the mark information includes any of the following: mark identification, such as mark icon, name, etc.; mark content, such as PDF file content, mark color, size, etc.; mark type, such as file information, application call information etc.
  • the application calling information is used to call the first target application installed in the current device.
  • The first user equipment currently has multiple applications installed, and the application calling information is used to call one of the applications currently installed on the first user equipment, for example to start the corresponding application and execute a related shortcut instruction, such as starting a phone application and initiating a call to Zhang XX.
  • the application calling information is used to call the application in the first user equipment, and if the tag information is presented on other user equipment (such as the second user equipment, etc.), the application The calling information is used to call the corresponding application in the second user equipment, such as starting the phone application of the second user equipment, and initiating a call to Zhang XX.
  • the application invocation information is used to prompt the user to install the corresponding application and perform invocation after the installation is completed.
  • the application invocation information is used to invoke a second target application installed in a corresponding third user equipment, where there is a communication connection between the third user equipment and the first user equipment.
  • the third user equipment is in the current scene, and the third user equipment is connected to the first user equipment through wired, wireless or network equipment.
  • the first user equipment may send an instruction corresponding to the application invocation information to the third user equipment, so that the third user equipment invokes a relevant application to perform a corresponding operation and the like.
  • For example, the first user equipment is a pair of augmented reality glasses, the third user equipment is an operating device on a workbench in the current scene, and the third user equipment is installed with an operation application for machining workpieces. The first user equipment may add marker information in the real-time scene image based on a user operation, where the marker information is the application invocation information corresponding to the operation application of the third user equipment. If a trigger operation by a relevant user on the application invocation information is subsequently acquired, the third user equipment invokes the corresponding operation application to process the workpiece, and so on.
  • When the user equipment (such as the first user equipment or the second user equipment) acquires the user's trigger operation on the marker information displayed in the scene image, it sends an instruction corresponding to the application calling information to the third user equipment, so that the third user equipment invokes the related application to execute the related operations or instructions.
  • the trigger operation includes but not limited to click, touch, gesture instruction, voice instruction, button, head movement instruction and so on.
  • The corresponding target spatial position is determined according to the real-time pose information and the target image position, and the corresponding target marker information is generated according to the target spatial position and the marker information, wherein the target marker information is used to superimpose and present the marker information at the target spatial position of the 3D point cloud information.
  • The spatial three-dimensional coordinates corresponding to two-dimensional coordinates in the real-time scene image captured by the camera device can be estimated, so that the corresponding target spatial position is determined from the target image position. For example, after SLAM initialization is completed, the 3D point cloud and the pose of the camera device in the environment are calculated in real time based on the real-time scene images.
  • The algorithm uses the 3D point cloud of the current scene to fit a plane in the world coordinate system and obtain the plane equation. Based on the 2D click point, a ray in the camera coordinate system is constructed and then transformed into the world coordinate system; the intersection of the ray and the plane is calculated from the ray equation and the plane equation in the world coordinate system. This intersection is the 3D space point corresponding to the 2D click point in the scene image captured by the camera device, and the coordinate of this 3D space point is determined as the target spatial position corresponding to the marker information. The target spatial position is used to place the corresponding marker information in space, so that the marker information is superimposed and displayed at the corresponding position in the real-time scene image captured by the camera device, i.e. the marker information is rendered onto the real-time scene image.
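  • The following is a hedged sketch of the plane-fitting and ray-intersection procedure described above (a least-squares plane fit followed by a ray-plane intersection); the function names and the use of a camera-to-world pose R_wc, t_wc are assumptions made for illustration:

```python
import numpy as np

def fit_plane(points_world):
    """Least-squares plane fit; returns (unit normal n, offset d) with dot(n, x) + d = 0."""
    centroid = points_world.mean(axis=0)
    _, _, vt = np.linalg.svd(points_world - centroid)
    n = vt[-1]                         # direction of smallest variance = plane normal
    d = -n @ centroid
    return n, d

def click_to_world_point(u, v, K, R_wc, t_wc, points_world):
    """Map a 2D click (u, v) to the 3D point where its viewing ray meets the fitted plane.

    R_wc, t_wc: camera-to-world rotation and translation (camera pose in the world frame).
    """
    n, d = fit_plane(np.asarray(points_world, dtype=float))
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray direction in the camera frame
    ray_world = R_wc @ ray_cam                           # rotate the direction into the world frame
    origin = np.asarray(t_wc, dtype=float)               # camera center in the world frame
    denom = n @ ray_world
    if abs(denom) < 1e-9:
        return None                                      # ray is parallel to the plane
    s = -(n @ origin + d) / denom
    if s <= 0:
        return None                                      # intersection lies behind the camera
    return origin + s * ray_world                        # the target spatial position
```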
  • The first user equipment generates the corresponding target marker information according to the corresponding target spatial position and marker information, for example by generating a corresponding marker configuration file, where the configuration file includes the spatial coordinate information and the marker information. In some embodiments, the corresponding marker configuration file can be stored in the same file as the 3D point cloud information of the corresponding scene, which facilitates management and storage.
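  • Purely as an illustration of such a marker configuration file (the application only states that it contains spatial coordinate information and marker information, so every field name below is an assumption):

```python
import json

# Hypothetical layout of one piece of target marker information
target_marker_info = {
    "marker_id": "arrow_001",                       # marker identification (icon/name)
    "marker_type": "identification",                # e.g. identification / file / form / app_call / sensor
    "marker_content": {"shape": "3d_arrow", "color": "#FF0000", "size": 0.2},
    "target_spatial_position": [1.25, 0.40, 2.10],  # world coordinates in the scene's frame
}

# The configuration could be stored alongside the scene's 3D point cloud file.
with open("scene_markers.json", "w") as f:
    json.dump({"markers": [target_marker_info]}, f, indent=2)
```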
  • In some embodiments, the marker information includes real-time sensing information, where the real-time sensing information is used to indicate the real-time sensing data of a corresponding sensing device, and there is a communication connection between the sensing device and the first user equipment.
  • Real-time sensing data includes sound, light, heat, electricity, mechanics, chemistry, biology, location and other needed information collected in real time, through various devices and technologies such as information sensors, radio-frequency identification, global positioning systems, infrared sensors and laser scanners, from any object or process that needs to be monitored, connected or interacted with.
  • Common sensing devices include temperature sensors, humidity sensors, brightness sensors and the like; the above sensing devices are only examples and are not limiting.
  • A communication connection is established between the corresponding sensing device and the first user equipment through wired, wireless or network equipment, so that based on this communication connection the first user equipment can obtain the real-time sensing information of the sensing device. The sensing device is set on the target object and is used to collect real-time sensing data of the target object (such as another piece of equipment or an object). The real-time sensing information can be updated based on the currently collected real-time image, or updated at a predetermined time interval.
  • For example, the first user equipment may present one or more selectable objects contained in the scene image, where each selectable object has been provided with a corresponding sensing device. The user selects the marker information of a piece of real-time sensing information, such as a temperature sensor, and then clicks the selectable object of the temperature sensor displayed on the display screen; the real-time temperature-sensing marker information is then superimposed and presented on that selectable object displayed in the real-time scene, and the data of the temperature sensor is subsequently updated, for example every 0.5 seconds. As another example, the user determines the sensing device set on a certain selectable object and then selects on the first user equipment the real-time sensing marker information corresponding to that sensing device; the selected real-time sensing marker information is superimposed and displayed at the position of the selectable object in the real-time scene, and the data of the sensor is subsequently updated with each real-time scene image.
  • The method further includes step S104 (not shown). In step S104, the real-time sensing information acquired by the sensing device and returned by the network device is received; wherein generating the corresponding target marker information according to the target spatial position and the marker information includes: generating the corresponding target marker information according to the target spatial position and the real-time sensing information, wherein the target marker information is used to superimpose and present the real-time sensing information at the target spatial position of the 3D point cloud information.
  • The first user equipment determines, based on a user operation, to add corresponding real-time sensing information to the real-time scene image and determines the target spatial position corresponding to the real-time sensing information; a communication connection is established between the first user equipment and the sensing device through network equipment, and the network device can obtain the real-time sensing data of the sensing device in real time and send the real-time sensing data to the first user equipment.
  • The first user equipment receives the corresponding real-time sensing data and superimposes and presents the real-time sensing data at the corresponding spatial position of the real-time scene image (for example, the corresponding real-time image position is calculated in real time according to the target spatial position, the real-time pose information and the like).
  • Determining the corresponding target spatial position according to the real-time pose information and the target image position includes: mapping the target image position into space according to the real-time pose information to determine a corresponding 3D straight line; and determining the target spatial position corresponding to the target point according to the 3D point cloud information and the 3D straight line.
  • The first user equipment first determines the corresponding 3D straight line according to the real-time pose information and the target image position of the target point in the real-time scene image, and then determines the target spatial position corresponding to the target point according to the spatial relationship between the 3D point cloud information and the 3D straight line, for example according to the intersection of the 3D straight line with a certain plane, or the distances between the 3D straight line and the feature points in the 3D point cloud information; the 3D straight line may be a straight line in the camera coordinate system of the camera device, or a straight line in the world coordinate system.
  • For example, the target image position, i.e. the 2D marker point (target point) P2d, is mapped to a straight line L3dC in the camera coordinate system of the camera device, and the 3D point cloud obtained by the SLAM algorithm is mapped into the camera coordinate system to obtain 3D points. In some embodiments, the point P3d'C with the smallest perpendicular distance to the straight line L3dC is found among the 3D point cloud in the camera coordinate system, and the depth value of P3d'C is used to take the point P3dC corresponding to that depth value on the straight line L3dC. In some other embodiments, the distance-weighted average over the 3D point cloud around L3dC in the camera coordinate system is used as the depth value, and the point P3dC corresponding to that depth value is taken on the straight line L3dC. The point P3dC is then mapped to the world coordinate system to obtain P3d, which is the estimate of the 2D point in 3D space, thereby determining the target spatial position.
  • In some embodiments, the 3D point cloud is projected into the image to obtain corresponding 2D points (2Ds points). The 2Ds point with the smallest distance to the marker point P2d is found, and the depth value of the 3D point in the camera coordinate system corresponding to that 2Ds point is used as the intercepted depth value; the point corresponding to that depth value is taken on the straight line L3dC obtained by mapping the marker point P2d into the camera coordinate system, giving the estimated point P3dC in the camera coordinate system, which is then converted into the world coordinate system to obtain the target spatial position. In some embodiments, the 2Ds point with the smallest distance to the marker point P2d is found, the depth value of the 3D point in the world coordinate system corresponding to that 2Ds point is used as the intercepted depth value, and the point corresponding to that depth value is taken on the straight line L3d obtained by mapping the marker point P2d into the world coordinate system, giving the estimated point P3d in the world coordinate system and thus the target spatial position.
  • In some other embodiments, the weights are determined according to the distances between these 2Ds points and the marker point P2d, and the weighted average is (weight of each point × depth value) / number of points. According to the final depth value, the estimated point P3dC is intercepted on the straight line L3dC obtained by mapping the marker point P2d into the camera coordinate system, and then converted into the world coordinate system to obtain the target spatial position.
  • In some other embodiments, the weights are determined according to the distances between these 2Ds points and the marker point P2d, and the final depth value is determined from the weighted average of the depth values of the 3D points in the world coordinate system corresponding to these 2Ds points; according to this final depth value, the estimated point P3d is intercepted on the straight line L3d obtained by mapping the marker point P2d into the world coordinate system, and the target spatial position is obtained.
  • the above-mentioned mapping between the coordinate systems utilizes pose information of the camera device.
  • The 3D point cloud information includes a plurality of feature points, and each feature point includes corresponding depth information; wherein determining the target spatial position corresponding to the target point according to the 3D point cloud information and the 3D straight line includes: determining at least one target feature point from the 3D point cloud information according to the distance between each feature point in the 3D point cloud information and the 3D straight line; and determining the depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point, so as to determine the corresponding target spatial position.
  • the multiple feature points refer to 3D points in the 3D point cloud information.
  • For example, the target image position, i.e. the 2D marker point (target point) P2d, is mapped to a straight line L3d in the world coordinate system using the real-time pose information corresponding to the moment the marker information was input in the real-time scene image of the current scene; the distance between each feature point of the 3D point cloud in the world coordinate system obtained by the SLAM algorithm and the straight line L3d is calculated, and at least one corresponding target feature point is selected according to the distance between each feature point and the straight line.
  • The first user equipment determines the depth information of the target point according to the depth information of the at least one target feature point, for example by using the depth information of one of the target feature points as the depth information of the target point, or by determining the depth information of the target point through a weighted-average calculation; it then determines the corresponding target spatial position according to the depth information of the target point, for example by taking the point corresponding to that depth value on the straight line L3d obtained by mapping the marker point P2d into the world coordinate system, giving the estimated point P3d in the world coordinate system and thus the target spatial position.
  • the at least one target feature point includes a feature point with the smallest distance to the 3D straight line.
  • The distance between each target feature point of the at least one target feature point and the 3D straight line is less than or equal to a distance threshold; wherein determining the depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point includes: determining the weight information of each target feature point according to the distance information between the at least one target feature point and the 3D straight line; and determining the depth information of the target point on the 3D straight line according to the depth information and the weight information of the at least one target feature point.
  • the 2D marker point P2d is mapped to a straight line L3d in the world coordinate system through the corresponding real-time pose information when the marker information is input in the real-time scene image of the current scene.
  • In some embodiments, the point P3d' with the smallest perpendicular distance to the straight line L3d is found among the 3D point cloud in the world coordinate system obtained by the SLAM algorithm; using the depth value of the 3D point P3d', the point corresponding to that depth value is taken on the straight line L3d as the estimated point P3d, giving the estimate in the world coordinate system.
  • In some other embodiments, the distance-weighted average over the 3D point cloud around L3d in the world coordinate system obtained by the SLAM algorithm is used as the depth value; furthermore, the 3D point cloud points participating in the weighted-average calculation are those whose distance to the straight line L3d is less than or equal to the distance threshold. According to this depth value, the point corresponding to it is taken on the straight line L3d as the estimated point P3d, giving the estimate in the world coordinate system.
  • Here, the distance-weighted average over the 3D points around L3d is specifically (z value of each 3D point × weight coefficient) / number of points, where the weight is based on the perpendicular distance from the 3D point to the straight line L3d (the smaller the distance, the smaller the weight) and the sum of the weights is 1.
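  • The following sketch illustrates the two depth-estimation variants described above (single nearest feature point versus a weighted average over feature points within a distance threshold); the exact distance-to-weight mapping is an implementation choice, and inverse-distance weights normalized to sum to 1 are used here only as an assumed example:

```python
import numpy as np

def point_line_distances(points, origin, direction):
    """Perpendicular distances of points to the line origin + s * direction,
    together with each point's signed offset s along the (unit) direction."""
    direction = direction / np.linalg.norm(direction)
    diff = points - origin
    s = diff @ direction
    feet = origin + s[:, None] * direction
    return np.linalg.norm(points - feet, axis=1), s

def depth_along_line(points_world, origin, direction, threshold=None, weighted=False):
    """Estimate a depth value along the line from nearby point-cloud feature points."""
    dists, s = point_line_distances(points_world, origin, direction)
    if not weighted:
        # Variant 1: depth taken from the feature point nearest to the line
        return s[np.argmin(dists)]
    # Variant 2: weighted average over feature points within the distance threshold
    keep = dists <= threshold
    if not np.any(keep):
        return None
    w = 1.0 / (dists[keep] + 1e-6)   # assumed distance-based weighting scheme
    w /= w.sum()                      # weights sum to 1
    return float(np.sum(w * s[keep]))

def marker_point_to_world(u, v, K, R_wc, t_wc, points_world, threshold=0.05, weighted=True):
    """Estimate the target spatial position P3d on the line L3d obtained by
    back-projecting the marker point P2d = (u, v) into the world frame."""
    direction = R_wc @ (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    direction = direction / np.linalg.norm(direction)
    depth = depth_along_line(np.asarray(points_world, dtype=float),
                             np.asarray(t_wc, dtype=float), direction,
                             threshold, weighted)
    if depth is None:
        return None
    return np.asarray(t_wc, dtype=float) + depth * direction   # point on L3d at the estimated depth
```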
  • The method further includes step S105 (not shown). In step S105, target scene information is updated based on the target marker information, wherein the target scene information is stored in a scene database, the scene database includes one or more pieces of target scene information, and each piece of target scene information includes target marker information and a corresponding 3D point cloud.
  • the scene database may be stored locally on the first user equipment, or may be stored on a network device having a communication connection with the first user equipment.
  • The first user equipment generates the target scene information about the current scene based on the target marker information, the 3D point cloud information and the like, and updates the local scene database based on the target scene information, or sends the target scene information to the corresponding network device to update the scene database, etc.
  • The first user equipment may update the target scene information after all target marker information of the current scene has been determined, or may update the target scene information at preset time intervals, or may update the target scene information based on the user's manual operation, such as clicking a save button.
  • the target scene information includes 3D point cloud information corresponding to the scene, and at least one target tag information associated with the 3D point cloud information, and the corresponding target tag information includes corresponding tag information and target spatial position.
  • each piece of target scene information further includes corresponding device parameter information.
  • The device parameter information includes the intrinsic parameters of the device's camera or the identification information of the device, such as the intrinsic parameters of the camera device of the user equipment that generated the 3D point cloud information, or that device's identification information.
  • For example, the first user equipment performs three-dimensional tracking on the scene to determine the 3D point cloud information of the scene and to determine the target marker; the target scene information then also includes the intrinsic parameters of the camera device of the first user equipment. When user equipment other than the first user equipment uses the target scene, it needs to use the intrinsic parameters of the camera device of the first user equipment when initializing from the point cloud, so as to improve the accuracy of the display position of the target marker on the display devices of the other user equipment.
  • The target scene information may directly include the intrinsic parameters of the equipment's camera, or the intrinsic parameters may be determined from the identification information (such as an ID or name) of the equipment.
  • Each piece of target scene information also includes corresponding scene identification information, such as a scene preview and description, for example using a picture of the scene to identify the scene and presenting that picture on the device.
  • The method further includes step S106 (not shown). In step S106, the target scene information is sent to the corresponding network device, wherein the target scene information is sent to the second user equipment via the network device based on a scene call request of the second user equipment, and the target scene information is used to superimpose and present the marker information on the current scene image collected by the second user equipment.
  • The second user holds a second user equipment and sends a scene call request about the current scene to the network device through the second user equipment; the corresponding scene call request includes the scene identification information of the current scene, and the network device determines the matching target scene information according to the scene identification information and returns it to the second user equipment.
  • In some embodiments, the network device sends a plurality of pieces of target scene information to the second user equipment in advance, and the second user equipment can determine the matching target scene information from the multiple pieces of target scene information based on a user selection operation or automatic matching against the current scene image captured by the second user equipment; the marker information added in the target scene information by the corresponding user of the first user equipment is then presented through the display device of the second user equipment.
  • The second user can also edit the marker information presented by the display device of the second user equipment, such as adding, deleting, modifying or replacing marker information, or moving the position of the marker information; these are only examples and are not limiting.
  • The method further includes step S107 (not shown). In step S107, a scene call request about the target scene information sent by the corresponding second user equipment is received; in response to the scene call request, the target scene information is sent to the second user equipment, where the target scene information is used to superimpose and present the marker information on the current scene image collected by the second user equipment.
  • the first user equipment directly establishes a communication connection with the second user equipment, without data storage and transmission through a network device.
  • The second user equipment sends a scene call request about the current scene to the first user equipment; the corresponding scene call request includes the scene identification information of the current scene, and the first user equipment determines the target scene information of the current scene according to the scene identification information and sends the target scene information to the second user equipment.
  • In some embodiments, the first user equipment shares a plurality of pieces of target scene information with the second user equipment in advance, and the second user equipment may determine the matching target scene information from the multiple pieces of target scene information based on a user selection operation or automatic matching against the current scene image captured by the second user equipment; the marker information in the target scene information is then presented by the display device of the second user equipment.
  • the method further includes step S108 (not shown).
  • In step S108, the camera device of the first user equipment continues to capture subsequent scene images of the current scene, and the marker information is superimposed and presented on the subsequent scene images corresponding to the current scene based on the target scene information.
  • The corresponding target scene information can also be used by the first user equipment itself to continue shooting the current scene and to perform subsequent updates. For example, after the first user equipment updates the target scene information based on the target marker information, it continues to add other marker information or edit the existing marker information so as to continue updating the target scene information; or the first user equipment reloads the target scene information at a later time and adds markers or edits existing markers based on the target scene information.
  • The first user equipment continues to capture subsequent scene images of the current scene through the camera device and superimposes and presents the existing marker information in the subsequent scene images; for example, the corresponding real-time image position in the subsequent scene images is calculated in real time, based on the target spatial position of the existing marker and the real-time pose information of the camera device, for presentation.
  • Fig. 2 shows a method for presenting target mark information according to another aspect of the present application, which is applied to a second user equipment, wherein the method includes step S201, step S202 and step S203.
  • In step S201, target scene information matching the current scene is acquired, wherein the target scene information includes corresponding target marker information, and the target marker information includes corresponding marker information and a target spatial position;
  • in step S202, a current scene image of the current scene is captured by the camera device;
  • in step S203, the corresponding target image position is determined according to the current pose information of the camera device and the target spatial position, and the marker information is superimposed and presented at the target image position of the current scene image.
  • The second user holds the second user equipment and may download one or more pieces of target scene information locally by sending a scene call request to the network device or to the first user equipment.
  • The camera device of the second user equipment scans the current scene to obtain the current scene image, and point cloud initialization is performed through three-dimensional tracking (such as SLAM); the current world coordinate system is aligned with the world coordinate system of the 3D point cloud information in the target scene information, so that the target spatial position of the marker information is aligned with the current scene.
  • Using the real-time pose information of the camera device of the second user equipment obtained through three-dimensional tracking, the corresponding target image position of the marker information in the current scene image is calculated in real time, and the marker information is superimposed and displayed at the corresponding position on the display screen of the second user equipment. In this way, the marker information included in the target scene information is reproduced.
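  • A minimal sketch of this projection-and-overlay step (step S203) using standard OpenCV calls follows; drawing the marker as a circle with a text label is purely illustrative, and the function name is an assumption:

```python
import numpy as np
import cv2

def overlay_marker(frame, target_world, rvec, tvec, K, dist_coeffs, label="marker"):
    """Project the stored target spatial position into the current frame and draw it.

    rvec, tvec: the current camera pose (world -> camera) in OpenCV convention.
    """
    img_pts, _ = cv2.projectPoints(
        np.asarray([target_world], dtype=np.float32), rvec, tvec, K, dist_coeffs)
    u, v = img_pts[0, 0]
    h, w = frame.shape[:2]
    # Only draw if the point is in front of the camera and inside the image
    R, _ = cv2.Rodrigues(rvec)
    p_cam = R @ np.asarray(target_world, dtype=float) + np.asarray(tvec, dtype=float).reshape(3)
    if p_cam[2] > 0 and 0 <= u < w and 0 <= v < h:
        cv2.circle(frame, (int(u), int(v)), 8, (0, 0, 255), -1)
        cv2.putText(frame, label, (int(u) + 10, int(v)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    return frame
```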
  • The second user equipment obtains the target scene information that matches the current scene; in some cases this is obtained through manual selection by the second user, and in other cases through automatic matching.
  • For example, a scene image of the current scene is captured, and the matching target scene information is determined according to the scene image, such as through the 2D recognition initialization of three-dimensional tracking (e.g. based on a 2D recognition map in the scene), or through the point cloud initialization of three-dimensional tracking.
  • The target scene information matching the current scene acquired by the second user equipment may also be determined based on relevant information provided by the second user equipment. The relevant information includes, but is not limited to: the location information of the second user equipment, such as GPS location information or Wi-Fi radio-frequency fingerprint information; the identity information of the user of the second user equipment; the permission information of the user of the second user equipment, such as the user's access, viewing and editing permissions; and information about the task currently being performed by the user of the second user equipment.
  • The above methods of acquiring target scene information matching the current scene can be used in combination, for example downloading the corresponding target scene information, or screening the already downloaded target scene information, based on the relevant information provided by the second user equipment, and then determining the matching target scene through manual selection or automatic matching.
  • the point cloud initialization process of SLAM includes: 1) judging the similarity between the scene image of the current scene and the point cloud in one or more target scene information, and selecting the point cloud with the highest similarity (such as using BOW( bag of words) model calculates the number of matching points between the feature points of the current image frame and the feature points of multiple key frames in the map point cloud in one or more groups of target scene information, and selects the group with the largest number of matching feature points) to load, Execute SLAM relocation and complete initialization. Among them, the relocation process is as follows:
  • the SLAM system extracts ORB features from the acquired current image frame, and uses the BOW (bag of words) model to calculate the matching points between the feature points of the current image frame and multiple key frames in the map point cloud; the matching number meets a certain threshold and is considered are candidate keyframes.
  • for each candidate keyframe, the camera pose of the current frame is estimated by RANSAC (random sample consensus) and PnP; the estimated inliers (an inlier is a point that fits the estimation model in the RANSAC algorithm) are then updated as map points, and graph optimization is used to refine the camera pose of the current frame. If only a few inliers remain after optimization, the above process is repeated to perform more matching against the map points of the selected candidate keyframes, and the pose and inliers are optimized once more.
  • when the number of inliers meets a certain threshold, the relocation succeeds, a SLAM coordinate system consistent with the coordinate system of the map point cloud is established (the camera pose is obtained), and the initialization is completed.
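  • A minimal relocation sketch, assuming OpenCV is available and that each candidate keyframe already carries its ORB descriptors together with the 3D map points they were triangulated from; the names `keyframe.descriptors` / `keyframe.map_points` and all thresholds are illustrative, not values from the specification, and BoW-based candidate selection is abstracted away.

```python
import numpy as np
import cv2

def relocalize(current_gray, keyframe, K):
    """One PnP-based relocation attempt against a single candidate keyframe.

    Returns (R, t, n_inliers) in the map's world frame, or None on failure.
    """
    orb = cv2.ORB_create(nfeatures=1000)
    kps, des = orb.detectAndCompute(current_gray, None)
    if des is None:
        return None

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des, keyframe.descriptors)
    if len(matches) < 20:                      # too few matches: reject this candidate
        return None

    pts_2d = np.float32([kps[m.queryIdx].pt for m in matches])
    pts_3d = np.float32([keyframe.map_points[m.trainIdx] for m in matches])

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d, pts_2d, K, None, reprojectionError=4.0, iterationsCount=100)
    if not ok or inliers is None or len(inliers) < 15:
        return None

    R, _ = cv2.Rodrigues(rvec)                 # camera pose w.r.t. the map point cloud
    return R, tvec, len(inliers)
```

  • In a full system the pose returned here would be refined further by graph optimization, and additional map points of the candidate keyframes would be matched whenever too few inliers remain, as described above.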
  • the generation process of the target scene information is the same as or similar to the generation process of the target scene information in FIG. 1 , and will not be repeated here.
  • the method further includes step S204 (not shown).
  • step S204 an editing operation of the corresponding user on the target scene information is obtained, and the target scene information is updated based on the editing operation.
  • editing operations include but are not limited to editing target tag information in target scene information, such as modifying, adding, replacing, and deleting tag information, or moving the location of tag information.
  • the second user equipment presents, through the display device, the tag information already added in the target scene information on the current scene image. After the tag information is reproduced, the second user can perform editing operations in the current scene. Specific examples are as follows: 1) the second user adds tag information to the current scene image on the display device and can adjust its content, position, size, angle, and so on;
  • 2) the second user can delete tag information: the second user selects tag information displayed on the display device, such as tag information already added in the target scene information or tag information newly added by the second user, and a deletion option appears for deletion;
  • 3) the second user can view tag information displayed on the display device, for example clicking a PDF tag to open the PDF for viewing, or viewing the real-time sensing information of a temperature sensor;
  • 4) the second user can perform corresponding operations on tag information added in the current scene, such as calling an application's shortcut function: after the second user clicks the tag of the application calling information, the application is started and the corresponding shortcut function is called; another example is filling in a form: after the second user clicks the tag of the form information, the form can be filled out.
  • the editing operation includes updating the 3D point cloud information in the target scene information, for example, by moving the camera device of the second user equipment, and updating the 3D point cloud information corresponding to the target scene information through a three-dimensional tracking algorithm.
  • the editing operation of the target scene by the second user is synchronized to the cloud server, and the corresponding scene database is updated.
  • for example, the second user's editing of the target scene overwrites the previous target scene information.
  • alternatively, the edits to the target scene are saved as a new item of target scene information.
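  • A minimal sketch of applying such editing operations to a stored target scene; the operation schema and field names are assumptions for the example, and the `save_as_new` flag mirrors the choice between overwriting the previous target scene information and saving the edits as a new entry.

```python
import copy
import time

def apply_edit(scene, op, save_as_new=False):
    """Apply one edit operation to a target scene's tag list.

    `op` is a plain dict such as {"action": "add", "tag": {...}},
    {"action": "delete", "tag_id": "t42"} or
    {"action": "modify", "tag_id": "t42", "changes": {...}}.
    """
    target = copy.deepcopy(scene) if save_as_new else scene

    if op["action"] == "add":
        target["tags"].append(op["tag"])
    elif op["action"] == "delete":
        target["tags"] = [t for t in target["tags"] if t["id"] != op["tag_id"]]
    elif op["action"] in ("modify", "move"):
        for t in target["tags"]:
            if t["id"] == op["tag_id"]:
                t.update(op["changes"])        # e.g. new content, position, size, angle

    target["updated_at"] = time.time()         # synchronized to the scene database later
    return target
```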
  • the method further includes step S205 (not shown).
  • step S205 the updated target scene information is sent to the network device to update the corresponding scene database.
  • the tag information includes real-time sensing information, which indicates the real-time sensing data of a corresponding sensing device, and a communication connection exists between the sensing device and the second user equipment; the method also includes step S207 (not shown), in which the real-time sensing data corresponding to the sensing device is obtained; superimposing and presenting the tag information at the target image position of the current scene image then includes superimposing and presenting the real-time sensing data at that target image position.
  • real-time sensing data covers any required sound, light, heat, electrical, mechanical, chemical, biological, location or other information collected in real time, through devices and technologies such as information sensors, radio-frequency identification, global positioning systems, infrared sensors and laser scanners, from any object or process that needs to be monitored, connected or interacted with.
  • a communication connection is established between the corresponding sensing device and the second user equipment through wired, wireless or network equipment, so that based on this connection the second user equipment can obtain the real-time sensing information of the sensing device; the sensing device is installed on a target object (such as another piece of equipment or an object) to collect its real-time sensing data, and the real-time sensing information can be updated per captured real-time image or at a predetermined time interval.
  • the second user device may present, in the scene image, the tag information of one or more items of real-time sensing information that have been added to a given target object; as the real-time sensing information is updated, the second user can view the target object's real-time sensing readings to understand its device status.
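  • A small sketch of keeping a real-time sensing tag up to date, assuming the communication connection to the sensing device is wrapped in a `read_sensor` callable; the 0.5 s period is only an example of the predetermined update interval mentioned above.

```python
import threading
import time

class SensorOverlay:
    """Keeps the displayed value of a real-time sensing tag in sync with its sensor."""

    def __init__(self, read_sensor, interval=0.5):
        self.read_sensor = read_sensor      # callable returning the latest reading
        self.interval = interval            # refresh period in seconds
        self.latest = None
        self._stop = threading.Event()

    def _poll(self):
        while not self._stop.is_set():
            self.latest = self.read_sensor()    # e.g. a temperature reading
            time.sleep(self.interval)

    def start(self):
        threading.Thread(target=self._poll, daemon=True).start()

    def stop(self):
        self._stop.set()
```

  • The rendering loop then draws `overlay.latest` at the tag's projected image position on every frame, so either per-image or fixed-interval update policies map onto the `interval` parameter.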
  • the tag information includes application calling information, which is used to call a target application on the current device; the method further includes step S206 (not shown), in which a trigger operation by the corresponding user on the application calling information in the current scene image is acquired, and the target application in the second user equipment is invoked based on that trigger operation.
  • multiple applications are currently installed on the second user equipment, and the application calling information is used to call one of them, for example starting the corresponding application and executing a related shortcut instruction, such as launching the phone application and initiating a call to Zhang XX.
  • in some cases, if the application corresponding to the application calling information is not installed on the second user equipment, the application calling information is used to prompt the user to install the corresponding application and to invoke it after the installation is completed.
  • the third user equipment is in the current scene, and the third user equipment is connected to the second user equipment through wired, wireless or network equipment. Based on the corresponding communication connection, the second user equipment may send an instruction corresponding to the application invocation information to the third user equipment, so that the third user equipment invokes a relevant application to perform a corresponding operation and the like.
  • for example, the second user device is a pair of augmented reality glasses and the third user device is an operating device on a workbench in the current scene, on which an operation application for machining workpieces is installed; the second user device can view the identification information of application calling information already added in the current real-time scene, for instance the application calling information corresponding to the operation application of the third user device presented in the real-time scene image.
  • when the second user's trigger operation on that application calling information is obtained, the third user equipment invokes the corresponding operating device to process the workpiece.
  • the second user may also add, to the real-time scene information, identification information for application calling information directed at the operation application of the third user equipment.
  • the trigger operation includes but is not limited to clicks, touches, gesture instructions, voice instructions, button presses, head-movement instructions, and so on.
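  • An illustrative dispatch sketch for a triggered application-calling tag; the tag layout and the plain TCP/JSON forwarding to the third user equipment are assumptions for the example, since the specification does not prescribe a transport or message format.

```python
import json
import socket
import subprocess

def trigger_app_call(tag, third_device_addr=None):
    """Handle a trigger operation on an application-calling tag.

    `tag` is a plain dict such as
    {"target": "local", "command": ["dialer", "--call", "Zhang XX"]} or
    {"target": "remote", "app_id": "workpiece_op", "args": ["start"]}.
    """
    if tag.get("target") == "local":
        # start the installed application with its shortcut arguments
        subprocess.Popen(tag["command"])
    elif tag.get("target") == "remote" and third_device_addr:
        # forward the instruction over the existing connection to the third device
        message = {"invoke": tag["app_id"], "args": tag.get("args", [])}
        with socket.create_connection(third_device_addr, timeout=3) as conn:
            conn.sendall(json.dumps(message).encode("utf-8"))
```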
  • FIG. 3 shows a method for determining and presenting target marker information according to an aspect of the present application, wherein the method includes:
  • the first user equipment captures an initial scene image about the current scene by a camera device, and performs three-dimensional tracking initialization on the current scene according to the initial scene image;
  • the first user equipment captures real-time scene images of the current scene through the camera device, obtains the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene through three-dimensional tracking, and obtains the tag information input by the user in the real-time scene image of the current scene and the target image position corresponding to the tag information;
  • the first user equipment determines a corresponding target spatial position according to the real-time pose information and the target image position, and generates corresponding target marker information according to the target spatial position and the marker information, wherein the target marker The information is used to superimpose and present the marker information at the target space position of the 3D point cloud information;
  • the second user equipment acquires the target scene information that matches the current scene, where the target scene information includes corresponding target tag information, and the target tag information includes corresponding tag information and a target spatial position;
  • the second user equipment captures a current scene image related to the current scene by the camera device
  • the second user equipment determines a corresponding target image position according to the current pose information of the camera device and the target spatial position, and superimposes and presents the marker information on the target image position of the current scene image.
  • the process of determining and presenting the target marker information is the same as or similar to the process of determining the target marker information in FIG. 1 and presenting the target marker information in FIG. 2 , and will not be repeated here.
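  • The presentation step of this flow reduces to projecting the stored target spatial position into the current image with the tracked camera pose; a minimal sketch, assuming a pinhole model with world-to-camera pose (R, t) and intrinsic matrix K.

```python
import numpy as np

def project_marker(p_world, R, t, K):
    """Project a marker's target spatial position into the current camera image.

    Returns the target image position (u, v) in pixels, or None when the point
    lies behind the camera and the marker should not be drawn this frame.
    """
    t = np.asarray(t, dtype=float).reshape(3)
    p_cam = R @ np.asarray(p_world, dtype=float) + t
    if p_cam[2] <= 0:
        return None
    uv = K @ (p_cam / p_cam[2])
    return float(uv[0]), float(uv[1])
```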
  • the above mainly introduces various embodiments of a method for determining and presenting target marker information of the present application.
  • the present application also provides specific devices capable of implementing the above embodiments, which will be introduced below with reference to FIGS. 4 and 5 .
  • FIG. 4 shows a first user equipment 100 for determining target tag information according to an aspect of the present application, wherein the equipment includes a one-one module 101, a one-two module 102 and a one-three module 103.
  • the one-one module 101 is used to capture an initial scene image of the current scene through the camera device and to perform three-dimensional tracking initialization on the current scene according to the initial scene image;
  • the one-two module 102 is used to capture real-time scene images of the current scene through the camera device, to obtain the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene through three-dimensional tracking, and to obtain the tag information input by the user in the real-time scene image of the current scene and the target image position corresponding to the tag information;
  • the one-three module 103 is configured to determine the corresponding target spatial position according to the real-time pose information and the target image position, and to generate the corresponding target marker information according to the target spatial position and the tag information, wherein the target marker information is used to superimpose and present the tag information at the target spatial position of the 3D point cloud information.
  • the tag information includes, but is not limited to: identification information; file information; form information; application call information; real-time sensor information.
  • the application calling information is used to call the first target application installed in the current device.
  • the application invocation information is used to invoke a second target application installed in a corresponding third user equipment, where there is a communication connection between the third user equipment and the first user equipment.
  • the tag information includes real-time sensing information, the real-time sensing information is used to indicate the real-time sensing data of the corresponding sensing device, and there is a communication connection.
  • the specific implementations of the one-one module 101, the one-two module 102 and the one-three module 103 shown in FIG. 4 are the same as or similar to the embodiments of step S101, step S102 and step S103 shown in FIG. 1, so they are not repeated here and are incorporated by reference.
  • the device further includes a one-four module (not shown), configured to receive the real-time sensing information acquired by the sensing device and returned by the network device; generating the corresponding target marker information according to the target spatial position and the marker information then includes: generating the corresponding target marker information according to the target spatial position and the real-time sensing information, wherein the target marker information is used to superimpose and present the real-time sensing information at the target spatial position of the 3D point cloud information.
  • determining the corresponding target spatial position according to the real-time pose information and the target image position includes: mapping the target image position into space according to the real-time pose information to determine a corresponding 3D straight line; and determining the target spatial position corresponding to the target point according to the 3D point cloud information and the 3D straight line.
  • the 3D point cloud information includes a plurality of feature points, each containing corresponding depth information; determining the target spatial position according to the 3D point cloud information and the 3D straight line includes: determining at least one target feature point from the 3D point cloud information according to the distance between each feature point in the 3D point cloud information and the 3D straight line; and determining the depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point, thereby determining the corresponding target spatial position.
  • the at least one target feature point includes a feature point with the smallest distance to the 3D straight line.
  • the distance between each of the at least one target feature point and the 3D straight line is less than or equal to a distance threshold; determining the depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point includes: determining the weight information of each target feature point according to its distance to the 3D straight line, and determining the depth information of the target point on the 3D straight line based on the depth information and weight information of each target feature point.
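  • A compact sketch of the click-to-anchor estimation described above: the clicked pixel is back-projected to a viewing ray, nearby map points supply a depth, and the point at that depth along the ray becomes the target spatial position. The inverse-distance weighting and the 5 cm neighbourhood radius are illustrative choices; the text above only requires that feature points near the 3D straight line (or the single closest one) determine the depth.

```python
import numpy as np

def estimate_marker_position(u, v, K, R, t, cloud_world, radius=0.05):
    """Estimate the 3D anchor for a 2D click from the tracked point cloud.

    (R, t) is the world-to-camera pose at the moment of the click, K the
    intrinsic matrix and cloud_world an Nx3 array of map points.
    """
    t = np.asarray(t, dtype=float).reshape(3)

    # viewing ray: direction in camera coordinates, rotated into the world frame
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    d_world = R.T @ d_cam
    d_world /= np.linalg.norm(d_world)
    origin = -R.T @ t                                # camera centre in the world frame

    # perpendicular distance of every map point to the ray
    rel = cloud_world - origin
    along = rel @ d_world                            # depth of each point along the ray
    perp = np.linalg.norm(rel - np.outer(along, d_world), axis=1)

    near = perp <= radius
    if not near.any():                               # fall back to the single closest point
        near = perp == perp.min()

    w = 1.0 / (perp[near] + 1e-6)                    # closer points weigh more
    depth = float(np.sum(w * along[near]) / np.sum(w))
    return origin + depth * d_world
```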
  • the device further includes a one-five module (not shown), configured to update target scene information based on the target tag information, wherein the target scene information is stored in a scene database; the scene database includes one or more items of target scene information, each including target marker information and the corresponding 3D point cloud. In some implementations, each item of target scene information also includes corresponding device parameter information.
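  • A possible in-memory layout for one item of target scene information as just described (target marker information, the corresponding 3D point cloud, and optional device parameter information); the field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import numpy as np

@dataclass
class TargetMarker:
    marker_id: str
    kind: str                      # "identification", "file", "form", "app_call", "sensor", ...
    content: dict                  # payload: file reference, form schema, call command, ...
    position_world: np.ndarray     # the marker's target spatial position, shape (3,)

@dataclass
class TargetSceneInfo:
    scene_id: str
    point_cloud: np.ndarray                           # Nx3 map points in the scene's world frame
    markers: List[TargetMarker] = field(default_factory=list)
    camera_intrinsics: Optional[np.ndarray] = None    # device parameter information (3x3 K)
    preview_image: Optional[str] = None               # scene identification / preview picture
```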
  • the device further includes a one-six module (not shown), configured to send the target scene information to a corresponding network device, wherein the target scene information is sent to the second user equipment via the network device in response to a scene call request from the second user equipment, and the target scene information is used to superimpose the marker information on the current scene image collected by the second user equipment.
  • the device further includes a one-seven module (not shown), configured to receive a scene call request about the target scene information sent by the corresponding second user equipment and, in response to the scene call request, to send the target scene information to the second user equipment, where the target scene information is used to superimpose the marker information on the current scene image collected by the second user equipment.
  • the device further includes a one-eight module (not shown), configured to continue capturing subsequent scene images of the current scene through the camera device of the first user equipment and, based on the target scene information, to superimpose and present the marker information on the subsequent scene images corresponding to the current scene.
  • the specific implementations corresponding to the one-four module through the one-eight module are the same as or similar to the above-mentioned embodiments of step S104 through step S108, so they are not repeated here and are incorporated by reference.
  • FIG. 5 shows a second user equipment 200 for presenting target marking information according to another aspect of the present application, wherein the equipment includes a two-one module 201 , a two-two module 202 and a two-three module 203 .
  • the two-one module 201 is configured to acquire target scene information matching the current scene, wherein the target scene information includes corresponding target tag information, and the target tag information includes corresponding tag information and target spatial position;
  • the two-two module 202 is used to capture the current scene image of the current scene through the camera device;
  • the two-three module 203 is used to determine the corresponding target image position according to the current pose information of the camera device and the target spatial position, and to superimpose and present the tag information at the target image position of the current scene image.
  • step S201, step S202, and step S203 shown in FIG. 5 are the same or similar to the embodiments of step S201, step S202, and step S203 shown in FIG. 2 , so no further details are included here by reference.
  • the device further includes a two-four module (not shown), configured to acquire an editing operation by the corresponding user on the target scene information and to update the target scene information based on the editing operation.
  • the device further includes a two-five module (not shown), configured to send the updated target scene information to the network device to update the corresponding scene database.
  • the tag information includes real-time sensing information, which indicates the real-time sensing data of a corresponding sensing device, and a communication connection exists between the sensing device and the second user equipment.
  • the device further includes a two-seven module (not shown), used to obtain the real-time sensing data corresponding to the sensing device; superimposing and presenting the tag information at the target image position of the current scene image then includes superimposing and presenting the real-time sensing data at that target image position.
  • the tag information includes application calling information, which is used to call a target application on the current device; the device further includes a two-six module (not shown) for acquiring the corresponding user's trigger operation on the application calling information in the current scene image and invoking the target application in the second user equipment based on the trigger operation.
  • the present application also provides a computer-readable storage medium that stores computer code; when the computer code is executed, the method described in any one of the preceding items is carried out.
  • the present application also provides a computer program product, when the computer program product is executed by a computer device, the method described in any one of the preceding items is executed.
  • the present application also provides a computer device, where the computer device comprises:
  • one or more processors;
  • a memory for storing one or more computer programs;
  • when the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the method described in any one of the preceding items.
  • FIG. 6 illustrates an exemplary system that may be used to implement various embodiments described in this application
  • system 300 can be used as any one of the above-mentioned devices in each of the above-mentioned embodiments.
  • system 300 may include one or more computer-readable media (e.g., system memory or NVM/storage 320) having instructions, and one or more processors (e.g., processor(s) 305) coupled to the one or more computer-readable media and configured to execute the instructions to implement modules that perform the actions described in this application.
  • the system control module 310 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 305 and/or to any suitable device or component in communication with the system control module 310.
  • the system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315.
  • the memory controller module 330 may be a hardware module, a software module and/or a firmware module.
  • System memory 315 may be used, for example, to load and store data and/or instructions for system 300 .
  • system memory 315 may include any suitable volatile memory, such as suitable DRAM.
  • system memory 315 may include Double Data Rate Type Quad Synchronous Dynamic Random Access Memory (DDR4 SDRAM).
  • system control module 310 may include one or more input/output (I/O) controllers to provide interfaces to NVM/storage devices 320 and communication interface(s) 325 .
  • NVM/storage 320 may be used to store data and/or instructions.
  • NVM/storage 320 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard drives (HDD), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives).
  • NVM/storage device 320 may include a storage resource that is physically part of the device on which system 300 is installed, or it may be accessible by the device without necessarily being part of the device. For example, NVM/storage 320 may be accessed over a network via communication interface(s) 325 .
  • Communication interface(s) 325 may provide an interface for system 300 to communicate over one or more networks and/or with any other suitable device.
  • System 300 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols.
  • processor(s) 305 may be packaged with logic of one or more controllers of system control module 310 (eg, memory controller module 330 ).
  • processor(s) 305 may be packaged with the logic of one or more controllers of the system control module 310 to form a system-in-package (SiP).
  • at least one of the processor(s) 305 may be integrated on the same die as the logic of the one or more controllers of the system control module 310 .
  • at least one of the processor(s) 305 may be integrated on the same die with the logic of the one or more controllers of the system control module 310 to form a system on chip (SoC).
  • system 300 may be, but is not limited to, a server, workstation, desktop computing device, or mobile computing device (eg, laptop computing device, handheld computing device, tablet computer, netbook, etc.). In various embodiments, system 300 may have more or fewer components and/or a different architecture. For example, in some embodiments, system 300 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including a touchscreen display), non-volatile memory ports, multiple antennas, graphics chips, application-specific integrated circuits ( ASIC) and speakers.
  • the present application can be implemented in software and/or a combination of software and hardware, for example, it can be implemented by using an application specific integrated circuit (ASIC), a general-purpose computer or any other similar hardware devices.
  • the software program of the present application can be executed by a processor to realize the steps or functions described above.
  • the software program (including associated data structures) of the present application can be stored in a computer-readable recording medium such as RAM memory, magnetic or optical drive or floppy disk and the like.
  • some steps or functions of the present application may be implemented by hardware, for example, as a circuit that cooperates with a processor to execute each step or function.
  • a part of the present application can be applied as a computer program product, such as a computer program instruction.
  • the method and/or technical solution according to the present application can be invoked or provided through the operation of the computer.
  • computer program instructions exist in computer-readable media in forms including but not limited to source files, executable files, installation package files, etc.; correspondingly, the ways in which the instructions are executed by a computer include but are not limited to: the computer directly executes the instructions, or the computer compiles the instructions and then executes the corresponding compiled program, or the computer reads and executes the instructions, or the computer reads and installs the instructions and then executes the corresponding installed program.
  • a computer readable medium may be any available computer readable storage medium or communication medium that can be accessed by a computer.
  • Communication media includes the media whereby communication signals embodying, for example, computer readable instructions, data structures, program modules or other data are transmitted from one system to another.
  • Communication media can include guided transmission media such as cables and wires (e.g., fiber optics, coaxial, etc.) and wireless (unguided transmission) media capable of propagating waves of energy, such as acoustic, electromagnetic, RF, microwave, and infrared .
  • Computer readable instructions, data structures, program modules or other data may be embodied, for example, as a modulated data signal in a wireless medium such as a carrier wave or similar mechanism such as embodied as part of spread spectrum technology.
  • modulated data signal means a signal that has one or more of its characteristics changed or set in such a manner as to encode information in the signal. Modulation can be analog, digital or mixed modulation techniques.
  • computer-readable storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules or other data.
  • computer-readable storage media include, but are not limited to, volatile memory such as random access memory (RAM, DRAM, SRAM); non-volatile memory such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM); magnetic and optical storage devices (hard disks, tapes, CDs, DVDs); and other media, known now or developed in the future, capable of storing computer-readable information/data for use by computer systems.
  • an embodiment according to the present application includes an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to operate based on the foregoing methods and/or technical solutions according to the multiple embodiments of the present application.


Abstract

A method and device for determining and presenting target marker information. The method includes: capturing an initial scene image of the current scene with a camera device and performing three-dimensional tracking initialization on the current scene according to the initial scene image; capturing real-time scene images of the current scene with the camera device, obtaining the real-time pose information of the camera device and the 3D point cloud information corresponding to the current scene through three-dimensional tracking, and obtaining the tag information input by the user in the real-time scene image of the current scene and the target image position corresponding to the tag information; and determining the corresponding target spatial position according to the real-time pose information and the target image position, and generating the corresponding target marker information according to the target spatial position and the tag information. This makes it convenient for the user to edit the scene and thereby provide related interaction information, facilitates scene collaboration among multiple users, and improves the user experience.

Description

一种确定和呈现目标标记信息的方法与设备
本申请是以CN申请号为202111056350.6,申请日为2021.09.09的申请为基础,并主张其优先权,该CN申请的公开内容在此作为整体引入本申请中。
技术领域
本申请涉及通信领域,尤其涉及一种用于确定和呈现目标标记信息的技术。
背景技术
增强现实(Augmented Reality,AR)技术是一种将虚拟信息与真实世界巧妙融合的技术,广泛运用了多媒体、三维建模、实时跟踪及注册、智能交互、传感等多种技术手段,将计算机生成的文字、图像、三维模型、音乐、视频等虚拟信息模拟仿真后,应用到真实世界中,两种信息互为补充,从而实现对真实世界的“增强”。AR增强现实技术是促使真实世界信息和虚拟世界信息之间综合在一起的较新的技术内容,其将原本在现实世界的空间范围中比较难以进行体验的实体信息在电脑等科学技术的基础上,实施模拟仿真处理,将虚拟信息内容叠加在真实世界中加以有效应用,并且在这一过程中能够被人类感官所感知,从而实现超越现实的感官体验。在增强现实过程中,通常需要对特定位置或者特定目标等进行标记从而使用户更容易获取交互内容等。
发明内容
本申请的一个目的是提供一种确定并呈现目标标记信息的方法与设备。
根据本申请的一个方面,提供了一种确定目标标记信息的方法,应用于第一用户设备,该方法包括:
通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;
通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;
根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。
根据本申请的另一个方面,提供了一种呈现目标标记信息的方法,应用于第二用户设备该方法包括:
获取与当前场景相匹配的目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;
通过所述摄像装置拍摄关于所述当前场景的当前场景图像;
根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。
根据本申请的一个方面,提供了一种确定并呈现目标标记信息的方法,该方法包括:
第一用户设备通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;
所述第一用户设备通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;
所述第一用户设备根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息;
所述第二用户设备获取与所述当前场景相匹配的所述目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;
所述第二用户设备通过所述摄像装置拍摄关于所述当前场景的当前场景图像;
所述第二用户设备根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。
根据本申请的一个方面,提供了一种确定目标标记信息的第一用户设备,其中,该设备包括:
一一模块,用于通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场 景图像对所述当前场景进行三维跟踪初始化;
一二模块,用于通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;
一三模块,用于根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。
根据本申请的另一个方面,提供了一种呈现目标标记信息的第二用户设备,其中,该设备包括:
二一模块,用于获取与当前场景相匹配的目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;
二二模块,用于通过所述摄像装置拍摄关于所述当前场景的当前场景图像;
二三模块,用于根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。
根据本申请的一个方面,提供了一种计算机设备,其中,该设备包括:
处理器;以及
被安排成存储计算机可执行指令的存储器,所述可执行指令在被执行时使所述处理器执行如上任一所述方法的步骤。
根据本申请的一个方面,提供了一种计算机可读存储介质,其上存储有计算机程序/指令,其特征在于,该计算机程序/指令在被执行时使得系统进行执行如上任一所述方法的步骤。
根据本申请的一个方面,提供了一种计算机程序产品,包括计算机程序/指令,其特征在于,该计算机程序/指令被处理器执行时实现如上任一所述方法的步骤。
与现有技术相比,本申请通过对当前场景进行三维跟踪获取当前场景对应的3D点云信息和摄像装置的实时位姿,用于确定用户在当前场景的实时场景图像中输入的标记信息对应的空间位置,能够方便用户对场景中目标对象进行编辑从而提供相关交互信息,另外,还有利于多个用户的场景协作等,提升了用户的使用体验。
附图说明
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本申请的其它特征、目的和优点将会变得更明显:
图1示出根据本申请一个实施例的一种确定目标标记信息的方法流程图;
图2示出根据本申请另一个实施例的一种呈现目标标记信息的方法流程图;
图3示出根据本申请一个实施例的一种确定并呈现目标标记信息的系统方法流程图;
图4示出根据本申请一个实施例的第一用户设备的功能模块;
图5示出根据本申请一个实施例的第二用户设备的功能模块;
图6示出可被用于实施本申请中所述的各个实施例的示例性系统。
附图中相同或相似的附图标记代表相同或相似的部件。
具体实施方式
下面结合附图对本申请作进一步详细描述。
在本申请一个典型的配置中,终端、服务网络的设备和可信方均包括一个或多个处理器(例如,中央处理器(Central Processing Unit,CPU))、输入/输出接口、网络接口和内存。
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RandomAccess Memory,RAM)和/或非易失性内存等形式,如只读存储器(Read Only Memory,ROM)或闪存(Flash Memory)。内存是计算机可读介质的示例。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(Phase-Change Memory,PCM)、可编程随机存取存储器(Programmable Random Access Memory,PRAM)、静态随机存取存储器(Static Random-Access Memory,SRAM)、动态随机存取存储器(Dynamic Random AccessMemory,DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(Electrically-Erasable Programmable Read-Only Memory,EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(Compact Disc Read-Only Memory,CD-ROM)、数字多功能光盘(Digital  Versatile Disc,DVD)或其他光学存储、磁盒式磁带,磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。
本申请所指设备包括但不限于用户设备、网络设备、或用户设备与网络设备通过网络相集成所构成的设备。所述用户设备包括但不限于任何一种可与用户进行人机交互(例如通过触摸板进行人机交互)的移动电子产品,例如智能手机、平板电脑、智能眼镜等,所述移动电子产品可以采用任意操作系统,如Android操作系统、iOS操作系统等。其中,所述网络设备包括一种能够按照事先设定或存储的指令,自动进行数值计算和信息处理的电子设备,其硬件包括但不限于微处理器、专用集成电路(Application SpecificIntegrated Circuit,ASIC)、可编程逻辑器件(Programmable Logic Device,PLD)、现场可编程门阵列(Field Programmable Gate Array,FPGA)、数字信号处理器(Digital Signal Processor,DSP)、嵌入式设备等。所述网络设备包括但不限于计算机、网络主机、单个网络服务器、多个网络服务器集或多个服务器构成的云;在此,云由基于云计算(CloudComputing)的大量计算机或网络服务器构成,其中,云计算是分布式计算的一种,由一群松散耦合的计算机集组成的一个虚拟超级计算机。所述网络包括但不限于互联网、广域网、城域网、局域网、VPN网络、无线自组织网络(Ad Hoc网络)等。优选地,所述设备还可以是运行于所述用户设备、网络设备、或用户设备与网络设备、网络设备、触摸终端或网络设备与触摸终端通过网络相集成所构成的设备上的程序。
当然,本领域技术人员应能理解上述设备仅为举例,其他现有的或今后可能出现的设备如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。
在本申请的描述中,“多个”的含义是两个或者更多,除非另有明确具体的限定。
图1示出了根据本申请一个方面的一种确定目标标记信息的方法,应用于第一用户设备,其中,该方法包括步骤S101、步骤S102以及步骤S103。在步骤S101中,通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;在步骤S102中,通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入 的标记信息及所述标记信息对应的目标图像位置;在步骤S103中,根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。在此,所述第一用户设备包括但不限于带有摄像装置的人机交互设备,具体如个人电脑、智能手机、平板电脑、投影仪、智能眼镜或者智能头盔等,其中,第一用户设备可以是本身含有摄像装置,也可以是外接摄像装置。在一些情形下,第一用户设备还包括显示装置(如显示屏、投影仪等),用于在呈现的实时场景图像中叠加显示对应标记信息等。
具体而言,在步骤S101中,通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化。例如,用户持有第一用户设备,通过第一用户设备的摄像装置采集关于当前场景的初始场景图像,该初始场景图像的数量可以是一个或多个。第一用户设备可以基于该一个或多个初始场景图像对当前场景进行三维跟踪初始化,如SLAM(simultaneous localization and mapping,同步定位与建图)初始化等。具体地初始化方法包括但不限于双帧初始化、单帧初始化、2D识别初始化、3D模型初始化等。其中,第一用户设备可以在本地完成三维跟踪初始化,也可以是将采集的初始场景图像上传到其他设备(如网络服务器端)由其他设备进行三维跟踪初始化,在此不做限定。通过进行三维跟踪初始化能够获取摄像装置的初始位姿信息,在一些情形下,除获取初始位姿外,还可以获取当前场景对应的初始地图点信息。在一些实施例中,在完成SLAM初始化后,第一用户设备可以向用户呈现初始化完成的提示信息,以提醒用户添加标记信息(如AR标记信息)等。所述初始化的方法包括但不限于:通过识别所述当前场景中预设的2D标识物完成初始化;单帧初始化;双帧初始化;3D模型初始化等。具体示例如下:
1)通过2D识别SLAM初始化;
在当前场景的对应位置上放置2D识别图,将2D识别图提取2D特征并保存为文件,形成2D特征库。第一用户设备通过摄像装置获取放置有2D识别图的图像后,提取图像特征并与存储的特征库进行匹配识别,确定摄像装置相对于当前场景的位姿,将识别算法得到的信息(摄像装置位姿)发送给SLAM算法,完成跟踪的初始化。
2)单帧初始化
单帧初始化在图像传感器(即摄像装置)获取到一个近似平面场景的条件下,利用单应性矩阵得到对应的旋转和平移矩阵,从而初始化地图点和摄像装置的姿态。
3)双帧初始化
选取两个特征点数目大于一定阈值的两个连续帧进行匹配,匹配点数大于一定阈值视为匹配成功,然后计算两帧之间的单应性矩阵homography matrix和基本矩阵(fundamental matrix),根据情况选择使用单应性矩阵或者基本矩阵恢复位姿RT(两个图像帧的相对位姿),将初始化成功的第一帧(拍摄的第一帧)作为世界坐标系,得到第一帧到第二帧的相对位姿,然后通过三角化得到深度,计算得到地图点。更新关键帧之间的关系进而执行BA(Bundle Adjustment)优化,以优化地图点。最后将地图点的深度归一化后,添加初始化关键帧和当前关键帧到地图中,完成初始化。
4)3D模型初始化
首先需要获取跟踪目标的3D模型,利用3D模型来获取3D边缘特征,然后获取跟踪目标的初始位姿。在应用界面渲染初始位姿下的3D边缘特征,拍摄含跟踪目标的视频,读取视频的图像帧,用户通过3D模型边缘特征对准场景中的目标,进行跟踪匹配,跟踪到目标后,获得当前的位姿和特征点信息,传给SLAM算法进行初始化。
当然,本领域技术人员应能理解上述三维跟踪初始化方法仅为举例,其他现有的或今后可能出现的三维跟踪初始化方法如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。
在步骤S102中,通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置。例如,第一用户设备在进行三维跟踪初始化后,继续通过摄像装置扫描当前场景得到对应的实时场景图像,通过三维跟踪算法的跟踪线程、局部建图线程等对该实时场景图像进行处理,获取摄像装置的实时位姿信息及当前场景对应的3D点云信息,其中,该3D点云信息可以根据后续的实时场景图像信息进行更新。其中,3D点云信息包括3D点,例如,3D点云信息包括通过图像匹配和深度信息获取后确定的3D地图点,又例如,所述3D点云信息包括透过3D扫瞄仪所取 得之资料型式,该扫描资料以点的型式记录,每一个点包含有三维坐标,有些可能含有色彩资讯(R,G,B)或物体反射面强度。在一些实施方式中,所述3D点云信息除了包含对应的3D地图点,还包括但不限于:点云信息对应的关键帧、点云信息对应的共视图信息、点云信息对应的生长树信息等。其中,实时位姿信息包括摄像装置在空间中的实时位置和姿态,通过实时位姿信息可以进行图像位置与空间位置的转换等,如,位姿信息包括该摄像装置相对于当前场景的世界坐标系的外参,又如,位姿信息包括该摄像装置相对于当前场景的世界坐标系的外参以及摄像装置的相机坐标系与图像/像素坐标系的内参,在此不做限定。第一用户设备可以获取用户在实时场景图像中的相关操作,如触摸、点击、语音、手势或者头部动作等确定的目标点位的选中操作。第一用户设备基于用户在实时场景图像中的选中操作可以确定目标点位的目标图像位置,该目标图像位置用于表征目标点位在实时场景图像对应的像素/图像坐标系中的二维坐标信息等。第一用户设备还可以获取用户输入的标记信息,该标记信息包括用户输入的人机交互内容,用于叠加至目标图像位置在当前场景对应的空间位置等。例如,用户在第一用户设备上选择一标记信息,如3D箭头,然后点击显示屏幕中显示的实时场景图像中的某一位置,该点击位置为目标图像位置,然后该3D箭头叠加显示在实时场景中。
在一些实施方式中,所述标记信息包括但不限于:标识信息;文件信息;表单信息;应用调用信息;实时传感信息。例如,标记信息可以包括箭头、画笔-在屏幕上随意涂鸦、圆圈、几何形状等标识信息。还如,标记信息还可以包括对应多媒体文件信息,如图片、视频、3D模型、PDF文件、office文档等各类文件等。又如,标记信息还可以包括表单信息,例如在对应目标图像位置生成表单,供用户查看或输入内容等。还如,标记信息还可以包括应用调用信息,用于执行应用的相关指令等,如打开应用、调用应用具体的功能-如拨打电话、打开链接等。还如,标记信息还可以包括实时传感信息,用于连接传感装置(如传感器等)并获取目标对象的传感数据。在一些实施例中,标记信息包括以下任一项:标记标识,如标记的图标、名称等;标记内容,如PDF文件的内容、标记的颜色、尺寸等;标记类型,如文件信息、应用调用信息等。当然,本领域技术人员应能理解上述标记信息仅为举例,其他现有的或今后可能出现的标记信息如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。
在一些实施方式中,所述应用调用信息用于调用当前设备中已安装的第一目标应用。例如,第一用户设备当前已安装多个应用,该应用调用信息用于调用第一用户设备当前已安装的应用之一,例如启动对应应用后并执行相关快捷指令,如启动电话应用,并发起向张XX的通话等。当然,该标记信息当前呈现于第一用户设备,则该应用调用信息用于调用第一用户设备中的应用,若该标记信息呈现于其他用户设备(如第二用户设备等),则该应用调用信息用于调用对应第二用户设备中的应用,如启动第二用户设备的电话应用,并发起向张XX的通话。在一些情形下,对应第二用户设备中若未安装该应用调用信息对应的应用,则该应用调用信息用于提示用户安装对应应用并在安装完成后进行调用等。
在一些实施方式中,所述应用调用信息用于调用对应第三用户设备中已安装的第二目标应用,其中,所述第三用户设备与所述第一用户设备之间存在通信连接。例如,第三用户设备处于当前场景中,且第三用户设备与第一用户设备之间通过有线、无线或者网络设备等进行通信连接。基于对应通信连接,第一用户设备可以向第三用户设备发送应用调用信息对应的指令,从而第三用户设备调用相关应用执行对应操作等。具体地,如第一用户设备为增强现实眼镜,第三用户设备为当前场景中工作台上的操作设备,第三用户设备上安装有对工件进行操作的操作应用,第一用户设备可以基于用户的操作在实时场景图像中添加标记信息,该标记信息是关于第三用户设备的操作应用对应的应用调用信息,后续若获取到相关用户关于该应用调用信息的触发操作,则第三用户设备调用对应操作应用对工件进行加工等。在一些实施例中,用户设备(如第一用户设备或者第二用户设备等)获取用户在场景图像中关于呈现的标记信息的触发操作时,将对应应用调用信息对应的指令发送至第三用户设备,从而第三用户设备调用相关的应用执行相关操作或者指令等。其中,所述触发操作包括但不限于点击、触摸、手势指令、语音指令、按键、头部运动指令等。
在步骤S103中,根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。在一些实施例中,根据摄像装置获取的实时位姿信息,可以估计出摄像装置拍摄的实时场景图像上的二维坐标对应的空间三维坐标,从而基于目标图像位置确定对应 的目标空间位置等。例如,SLAM初始化完成后根据实时场景图像实时计算环境中的3D点云和摄像装置位姿,用户通过点击实时场景图像添加标记信息时,算法使用当前场景中的3D点云拟合出一个世界坐标系下的平面,得到平面表达式。同时通过摄像装置光心和用户点击点在像平面的坐标构建基于通过相机坐标系的射线,然后将此射线转换到世界坐标系下,在世界坐标系下由射线表达式和平面表达式计算得到射线与平面的交点,此交点即为摄像装置拍摄的场景图像中2D点击点所对应的3D空间点,将该3D空间点对应的坐标位置确定为标记信息对应的目标空间位置,该目标空间位置用于在空间中放置对应标记信息,使得该标记信息叠加显示在摄像装置拍摄的实时场景图像中的对应位置处,在实时场景图像上渲染该标记信息。第一用户设备根据对应目标空间位置和标记信息生成对应的目标标记信息,如生成对应标签配置文件,该配置文件包含空间坐标信息及标记信息等,在一些实施例中,对应标签配置文件可以与对应的场景3D点云信息存储于同一文件,从而方便管理和存储等。当然,本领域技术人员应能理解上述根据实时位姿信息及目标图像位置确定对应的目标空间位置的方法仅为举例,其他现有的或今后可能出现的确定方法如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。在一些实施方式中,所述标记信息包括实时传感信息,所述实时传感信息用于指示对应传感装置的实时传感数据,所述传感装置与所述第一用户设备之间存在通信连接。例如,实时传感数据包括通过信息传感器、射频识别技术、全球定位系统、红外线感应器、激光扫描器等各种装置与技术,实时采集任何需要监控、连接、互动的物体或过程,所包含的声、光、热、电、力学、化学、生物、位置等各种需要的信息。其中,较为常见的传感装置包括温度传感器、湿度传感器、亮度传感器等,上述传感装置仅为举例,不作限定。对应传感装置与第一用户设备之间通过有线、无线或者网络设备等建立通信连接,从而基于该通信连接,第一用户设备可以获取传感装置的实时传感信息等,该传感装置设置于目标对象用于采集目标对象(如其他设备或者物体等)的实时传感数据,该实时传感信息可以基于当前采集的实时图像进行更新,或者,该实时传感信息可以基于预定时间间隔进行更新。例如,在实时场景图像中,第一用户设备可以呈现该场景图像中包含的一个或多个可选对象,其中,每个可选择对象已设置相应的传感装置,用户在第一用户设备上选择一实时传感信息的标记信息,如温度传感器,然后点击显示屏幕中显示的已设置温度传感 器的某一可选择对象,该实时温度传感标记信息叠加显示在实时场景中的该可选择对象的位置处,后续该温度传感器的数据每隔0.5秒进行更新;又如,用户确定某一可选择对象设置的传感装置,然后用户在第一用户设备上选择该传感装置对应的实时传感信息的标记信息,点击显示屏幕中显示的该可选择对象,该选择的实时传感标记信息叠加显示在实时场景中的该可选择对象的位置处,后续该传感器的数据根据每张实时场景图像进行更新。
在一些实施方式中,所述方法还包括步骤S104(未示出),在步骤S104中,接收所述网络设备返回的、所述传感装置获取的实时传感信息;其中,所述根据所述目标空间位置及所述标记信息生成对应的目标标记信息,包括:根据所述目标空间位置及所述实时传感信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述实时传感信息。例如,第一用户设备基于用户的操作确定在实时场景图像中添加对应实时传感信息,并确定该实时传感信息对应的目标空间位置,该第一用户设备与传感装置通过网络设备建立通信连接,网络设备可以实时获取该传感装置的实时传感数据,并将该实时传感数据发送至第一用户设备。第一用户设备接收对应的实时传感数据,并在实时场景图像对应空间位置(如根据目标空间位置、实时位姿信息等实时计算对应的实时图像位置)叠加呈现该实时传感数据。
在一些实施方式中,所述根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,包括:根据所述实时位姿信息将所述目标图像位置映射至空间中确定对应的3D直线;根据所述3D点云信息及所述3D直线确定目标点位对应的目标空间位置。例如,第一用户设备根据实时位姿信息及目标点位在实时场景图像中的目标图像位置先确定对应3D直线,再根据3D点云信息与该3D直线的空间位置关系,从而确定对应目标点位的目标空间位置,如根据3D直线与某平面的交点、3D直线与3D点云信息中特征点的距离等,其中,该3D直线可以是处于摄像装置相机坐标系中的直线,还可以是处于世界坐标系中的直线。例如,将目标图像位置即2D标记点(目标点位)P2d映射到摄像装置相机坐标系下的一条直线L3dC,将SLAM算法得到的3D点云映射到相机坐标系下,得到相机坐标系下的3D点,在一些实施例中,在相机坐标系下的3D点云中找到距离该直线L3dC垂直距离最近的点P3d’C,用P3d’C的深度值在直线L3dC上取该深度值对应的点P3dC;在另一些实施例中, 根据相机坐标系下的3D点云与L3dC的距离加权平均来作为深度值,根据这个深度值,在直线L3dC上取该深度值对应的点P3dC;将点P3dC映射到世界坐标系下得到P3d,则P3d即为该2D点在3D空间中的估计,从而确定目标图像位置对应的目标空间位置。还如,将世界坐标系中SLAM算法获得的3D点云映射到像素坐标系下,得到多个2D点,并记录2D点与3D点云的映射关系。获取与2D标记点P2d在一定距离阈值范围内的多个2D点,记为2Ds。在一些实施例中,在这些2Ds点中找到与标记点P2d距离最小的点,这个2Ds点对应的相机坐标系下的3D点的深度值就作为截取的深度值,在标记点P2d映射到相机坐标系下的一条直线L3dC上取该深度值对应的点,得到相机坐标系下的估计点P3dC,再转换到世界坐标系,得到目标空间位置;在一些实施例中,在这些2Ds点中找到与标记点P2d距离最小的点,这个2Ds点对应的世界坐标系下的3D点的深度值就作为截取的深度值,在标记点P2d映射到世界坐标系下的一条直线L3d上取该深度值对应的点,得到世界坐标系下的估计点P3d,得到目标空间位置;在另一些实施例中,根据这些2Ds点与标记点P2d之间的距离确定权重,根据这些2Ds点对应的相机坐标系下的3D点的深度值加权平均来确定最终深度值,加权平均就是(各点的权重*深度值)/点的个数,根据最终深度值,在标记点P2d映射到相机坐标系下的一条直线L3dC上截取估计点P3dC,再转换到世界坐标系,得到目标空间位置。在另一些实施例中,根据这些2Ds点与标记点P2d之间的距离确定权重,根据这些2Ds点对应的世界坐标系下的3D点的深度值加权平均来确定最终深度值,根据最终深度值,在标记点P2d映射到世界坐标系下的一条直线L3d上截取估计点P3d,得到目标空间位置。其中,上述坐标系之间的映射利用摄像装置的位姿信息。
在一些实施方式中,所述3D点云信息包括多个特征点,每个特征点包含对应的深度信息;其中,所述根据所述3D点云信息及所述3D直线确定目标点位对应的目标空间位置,包括:根据所述3D点云信息中各个特征点与所述3D直线的距离,从所述3D点云信息中确定至少一个目标特征点;基于所述至少一个目标特征点的深度信息确定所述3D直线上的目标点位的深度信息,从而确定对应的目标空间位置。其中,多个特征点是指3D点云信息中的3D点。如在一些情形下,通过在当前场景的实时场景图像中输入标记信息时对应的实时位姿信息将目标图像位置即2D标记点(目标点位)P2d映射到世界坐标系下的一条直线L3d,从SLAM算法获得的世 界坐标系下的3D点云中计算各特征点距离该直线L3d的距离,根据各特征点与该直线的距离筛选出对应的至少一个目标特征点。第一用户设备根据至少一个目标特征点的深度信息确定该目标点位的深度信息,如将至少一个目标特征点中某个目标特征点的深度信息作为目标点位的深度信息,或者通过加权平均的计算方式确定目标点位的深度信息等,根据该目标点位的深度信息确定对应的目标空间位置,如根据该目标点位的深度信息在标记点P2d映射到世界坐标系下的一条直线L3d上取该深度值对应的点,得到世界坐标系下的估计点P3d,从而得到目标空间位置。如在一些实施方式中,所述至少一个目标特征点包括与所述3D直线的距离最小的特征点。还如在一些实施方式中,所述至少一个目标特征点中每个目标特征点与所述3D直线的距离小于或等于距离阈值;其中,所述基于所述至少一个目标特征点的深度信息确定所述3D直线上的目标点位的深度信息,包括:根据所述至少一个目标特征点与所述3D直线的距离信息确定每个目标特征点的权重信息;基于所述每个目标特征点的深度信息、权重信息确定所述3D直线上的目标点位的深度信息。
例如,通过在当前场景的实时场景图像中输入标记信息时对应的实时位姿信息将2D标记点P2d映射到世界坐标系下的一条直线L3d,在一些实施例中,从SLAM算法获得的世界坐标系下的3D点云中找到距离该直线L3d垂直距离最近的点P3d’,根据该3D点P3d’的深度值在直线L3d上取该深度值对应的点作为估计点P3d,得到世界坐标系下的估计;在另一些实施例中,根据SLAM算法获得的世界坐标系下的3D点云与L3d的距离加权平均来作为深度值,进一步地,参与加权平均计算的3D点云,其与直线L3d的距离小于或等于距离阈值,根据这个深度值,在直线L3d上取该深度值对应的点作为估计点P3d,得到世界坐标系下的估计。其中,3D点与L3d的距离加权平均具体是(各个3D点的z值*权重系数)/点的个数,权重的大小是根据3D点到直线L3d的垂直距离,距离越小,权重越小,权重之和为1。
在一些实施方式中,所述方法还包括步骤S105(未示出),在步骤S105中,基于所述目标标记信息更新目标场景信息,其中,所述目标场景信息存储于场景数据库,所述场景数据库包括一个或多个目标场景信息,每个目标场景信息包括目标标记信息及对应的3D点云。例如,所述场景数据库可以保存在第一用户设备本地,还可以保存在与第一用户设备存在通信连接的网络设备。在一些情形下,第一用户设备根据目标标记信息、3D点云信息等生成关于当前场景的目标场景信息,并基于 该目标场景信息更新本地的场景数据库,或者,将该目标场景信息发送至对应网络设备以更新场景数据库等。在一些情形下,第一用户设备可以在确定完当前场景的全部目标标记信息后更新目标场景信息,或者,第一用户设备以预设时间间隔更新目标场景信息,又或者,第一用户设备基于用户的手动操作,如点击保存按钮,更新目标场景信息等。其中,目标场景信息包括此场景对应的3D点云信息,以及该3D点云信息相关联的至少一个目标标记信息,对应目标标记信息包含对应标记信息以及目标空间位置等。在一些实施方式中,所述每个目标场景信息还包括对应的设备参数信息。其中,设备参数信息包括设备摄像装置的内参或设备的标识信息,如生成3D点云信息的用户设备摄像装置的内参或该设备的标识信息。例如,通过第一用户设备对场景进行三维跟踪确定场景的3D点云信息并确定目标标记,目标场景信息除3D点云信息和目标标记信息外,还包括第一用户设备摄像装置的内参,当除第一用户设备外的其他用户设备使用该目标场景后,进行点云初始化时需要使用第一用户设备摄像装置的内参,从而增加目标标记在其他用户设备显示装置上显示位置的准确度。在一些情形下,目标场景信息可以直接包括设备摄像装置的内参,也可以通过设备的标识信息(如ID、名称等)确定设备摄像装置的内参,如设备摄像装置的内参与设备标识有对应关系。当然,在一些情形下,每个目标场景信息还包括对应的场景标识信息,如场景预览图及描述等,如用该场景的一张图片标识该场景,并在设备上呈现该场景的图片等。
第一用户设备存储对应目标场景信息后,可以在后续供用户进行调用或者继续采集等,如在一些实施方式中,所述方法还包括步骤S106(未示出),在步骤S106中,将所述目标场景信息发送至对应网络设备,其中,所述目标场景信息基于第二用户设备的场景调用请求经由所述网络设备发送至所述第二用户设备,所述目标场景信息用于在所述第二用户设备采集的当前场景图像中叠加所述标记信息。例如,第二用户持有第二用户设备,通过该第二用户设备向网络设备发送至关于当前场景的场景调用请求,对应场景调用请求包括当前场景的场景标识信息,网络设备根据场景标识信息匹配确定当前场景的目标场景信息,并将该目标场景信息发送至第二用户设备。或者网络设备预先向第二用户设备下发多个目标场景信息,第二用户设备可以基于用户选择操作或者第二用户设备拍摄的当前场景图像等从多个目标场景信息中确定用户选中或自动匹配的目标场景信息,然后该目标场景信息中的第一 用户设备对应用户添加的标记信息通过第二用户设备的显示装置呈现,当然,若存在其他用户在该目标场景信息中添加的其他标记信息,则同时呈现对应其他标记信息;换言之,该目标场景信息可能存在多个标记信息,每个标记信息的源用户可能不同,该多个标记信息根据最新更新结果,呈现于第二用户设备端。在一些实施例中,第二用户除了可以查看目标场景信息中包括的标记信息外,还可以对第二用户设备显示装置呈现的标记信息进行编辑操作,如新增、删除、修改、替换标记信息或者移动标记信息的位置等,在此仅为举例,不作限定。
在一些实施方式中,所述方法还包括步骤S107(未示出),在步骤S107中,接收对应第二用户设备发送的关于所述目标场景信息的场景调用请求;响应于所述场景调用请求,将所述目标场景信息发送至所述第二用户设备,其中,所述目标场景信息用于在所述第二用户设备采集的当前场景图像中叠加所述标记信息。例如,第一用户设备与第二用户设备直接建立了通信连接,无需通过网络设备进行数据存储和传输等。第二用户设备向第一用户设备发送至关于当前场景的场景调用请求,对应场景调用请求包括当前场景的场景标识信息,第一用户设备根据场景标识信息匹配确定当前场景的目标场景信息,并将该目标场景信息发送至第二用户设备。或者第一用户设备预先向第二用户设备共享多个目标场景信息,第二用户设备可以基于用户选择操作或者第二用户设备拍摄的当前场景图像等从多个目标场景信息中确定用户选中或自动匹配的目标场景信息,该目标场景信息中的标记信息通过第二用户设备的显示装置呈现。
在一些实施方式中,所述方法还包括步骤S108(未示出),在步骤S108中,通过所述第一用户设备的摄像装置继续拍摄关于所述当前场景的后续场景图像,基于所述目标场景信息将所述标记信息叠加呈现于所述当前场景对应的后续场景图像。例如,对应目标场景信息除了供其他设备进行调用之外,还可以供第一用户设备继续拍摄当前场景并进行后续更新等,如第一用户设备在基于目标标记信息更新目标场景信息之后继续添加其他标记信息或编辑已有标记信息从而继续更新目标场景信息,或者第一用户设备在后续时间段重新加载目标场景信息,并基于该目标场景信息添加标记或者对已有标记进行编辑等。第一用户设备通过摄像装置继续拍摄关于当前场景的后续场景图像,并在该后续场景图像中叠加呈现已有标记信息,如根据已有标记的目标空间位置、摄像装置的实时位姿信息等实时计算后续场景图像中 对应的实时图像位置进行呈现。
图2示出根据本申请另一个方面的一种呈现目标标记信息的方法,应用于第二用户设备,其中,该方法包括步骤S201、步骤S202以及步骤S203。在步骤S201中,获取与当前场景相匹配的目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;在步骤S202中,通过所述摄像装置拍摄关于所述当前场景的当前场景图像;在步骤S203中,根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。例如,第二用户持有第二用户设备,第二用户设备可以通过向网络设备或者第一用户设备发送场景调用请求,从而通过第二用户设备下载一个或多个目标场景信息到本地。在一些情形下,第二用户设备上获取到目标场景信息后,通过第二用户设备的摄像装置扫描当前场景获取当前场景图像,并通过三维跟踪(如SLAM)进行点云初始化,将当前世界坐标系与目标场景信息中的3D点云信息的世界坐标系对齐,从而实现标记信息的目标空间位置与当前场景对齐,根据目标场景信息中标记信息的目标空间位置、通过三维跟踪获得的第二用户设备摄像装置的实时位姿信息等实时计算该标记信息在当前场景图像中对应的目标图像位置,该标记信息叠加显示在第二用户设备显示屏幕中的对应位置处,目标场景信息中包括的标记信息复现。第二用户设备获取与当前场景相匹配的目标场景信息,在一些情形下,可以通过第二用户手动选择获取,在另一些情形下,通过自动匹配获取,如第二用户设备通过摄像装置拍摄当前场景图像,根据该场景图像确定相匹配的目标场景信息,如通过三维跟踪的2D识别初始化(如场景中有2D识别图)确定相匹配的目标场景信息;又如通过三维跟踪的点云初始化,确定相匹配的目标场景信息。在一些情形下,上述第二用户设备获取与当前场景相匹配的目标场景信息,还可以基于第二用户设备提供的相关信息来确定,该相关信息包括但不限于:第二用户设备的位置信息,如GPS位置信息、wifi射频指纹信息等、第二用户设备用户的身份信息、第二用户设备用户的权限信息,如用户的访问权限、查看编辑权限等、第二用户设备用户正在执行的任务信息等。在一些实施例中,上述获取与当前场景相匹配的目标场景信息的方法可以组合使用,如先通过第二用户设备提供的相关信息下载对应的目标场景信息或者从已下载的目标场景信息中筛选符合的目标场景信息,然后再通过手动选择或自动匹配确定相匹 配的目标场景等。本领域技术人员应能理解,上述第二用户设备获取与当前场景相匹配的目标场景信息的方法仅为举例,其他现有的或今后可能出现的确定方法如可适用于本申请,也应包含在本申请保护范围以内,并在此以引用方式包含于此。在一些实施例中,SLAM的点云初始化流程包括:1)判断当前场景的场景图像与一个或多个目标场景信息中的点云相似度,选择相似度最高的点云(如分别采用BOW(bag of words)模型计算当前图像帧特征点与一个或多组目标场景信息中地图点云中的多个关键帧的特征点之间的匹配点数,选择匹配特征点数最多的一组)进行加载,执行SLAM重定位,完成初始化。其中,重定位流程如下:
SLAM系统对获取到的当前图像帧提取ORB特征,采用BOW(bag of words)模型计算当前图像帧特征点与地图点云中的多个关键帧之间的匹配点;匹配数量满足一定阈值被视为候选关键帧。对于每个候选关键帧通过ransac(随机采样一致性算法)和PNP估计当前帧相机位姿,然后更新估计得到的内点(内点是ransac算法中适用于估计模型的点)为地图点,进而利用图优化理论优化当前帧相机位姿,如果优化后的内点较少,则重复上述过程,对选择的多个候选关键帧的地图点进行更多的匹配,最后再次优化位姿,内点满足一定阈值时,重定位成功,从而建立与地图点云中的坐标系一致的SLAM坐标系(获得相机位姿),完成初始化。
在此,所述目标场景信息的生成过程与前述图1中目标场景信息的生成过程相同或相似,不再赘述。
在一些实施方式中,所述方法还包括步骤S204(未示出),在步骤S204中,获取对应用户关于所述目标场景信息的编辑操作,基于所述编辑操作更新所述目标场景信息。例如,编辑操作包括但不限于对目标场景信息中目标标记信息的编辑,如对标记信息的修改、添加替换和删除等,或者对于标记信息的位置的移动等。具体地,第二用户设备通过显示装置在当前场景图像上呈现目标场景信息中已添加的标记信息,标记信息复现后,第二用户可以在当前场景中进行编辑操作,具体示例如下:1)第二用户在显示装置上的当前场景图像中添加标记信息,可以调整标记信息的内容、位置、大小和角度等;
2)第二用户可以删除标记信息:
第二用户可以选中显示装置上显示的标记信息,如目标场景信息中已添加的标记信息或者第二用户新添加的标记信息,出现删除选项,进行删除。
3)、第二用户查看标记信息:
第二用户可以查看显示装置上显示的标记信息,如点击PDF标记,打开该PDF进行查看;又如查看温度传感器的实时传感信息等;
4)、第二用户操作标记信息
第二用户可以对当前场景中添加的标记信息执行相应的操作,如调用应用的快捷功能,第二用户点击该应用调用信息的标记后,可以启动该应用并调用相应的快捷功能;又如填写表单,第二用户点击该表单信息的标记后,可以填写该表单。
又例如,编辑操作包括对目标场景信息中的3D点云信息的更新,如通过移动第二用户设备的摄像装置,通过三维跟踪算法更新所述目标场景信息对应的3D点云信息。在一些实施例中,第二用户对目标场景的编辑操作同步到云端服务器,更新对应的场景数据库,如第二用户对目标场景的编辑操作覆盖之前的目标场景信息,又如,第二用户对目标场景的编辑操作另外保存为一个新的目标场景信息等。
如在一些实施方式中,所述方法还包括步骤S205(未示出),在步骤S205中,将更新后的目标场景信息发送至网络设备以更新对应场景数据库。在一些实施方式中,所述标记信息包括实时传感信息,所述实时传感信息用于指示对应传感装置的实时传感数据,所述传感装置与所述第二用户设备之间存在通信连接;其中,所述方法还包括步骤S207(未示出),在步骤S207中,获取所述传感装置对应的实时传感数据;其中,所述在所述当前场景图像的目标图像位置叠加呈现所述标记信息,包括:在所述当前场景图像的目标图像位置叠加呈现所述实时传感数据。例如,实时传感数据包括通过信息传感器、射频识别技术、全球定位系统、红外线感应器、激光扫描器等各种装置与技术,实时采集任何需要监控、连接、互动的物体或过程,所包含的声、光、热、电、力学、化学、生物、位置等各种需要的信息。对应传感装置与第二用户设备之间通过有线、无线或者网络设备等建立通信连接,从而基于该通信连接,第二用户设备可以获取传感装置的实时传感信息等,该传感装置设置于目标对象用于采集目标对象(如其他设备或者物体等)的实时传感数据,该实时传感信息可以基于当前采集的实时图像进行更新,或者,该实时传感信息可以基于预定时间间隔进行更新。例如,在实时场景图像中,第二用户设备可以呈现该场景图像中的某个目标对象已添加的一个或多个实时传感信息的标记信息,通过实时传感信息的更新,第二用户可以查看该目标对象的实时传感信息,以了解该目标对象的 设备状态。
在一些实施方式中,所述标记信息包括应用调用信息,所述应用调用信息用于调用当前设备中的目标应用;其中,所述方法还包括步骤S206(未示出),在步骤S206中,获取对应用户关于所述当前场景图像中应用调用信息的触发操作,基于所述触发操作调用所述第二用户设备中的目标应用。例如,第二用户设备当前已安装多个应用,该应用调用信息用于调用第二用户设备当前已安装的应用之一,例如启动对应应用后并执行相关快捷指令,如启动电话应用,并发起向张XX的通话等。在一些情形下,对应第二用户设备中若未安装该应用调用信息对应的应用,则该应用调用信息用于提示用户安装对应应用并在安装完成后进行调用等。第三用户设备处于当前场景中,且第三用户设备与第二用户设备之间通过有线、无线或者网络设备等进行通信连接。基于对应通信连接,第二用户设备可以向第三用户设备发送应用调用信息对应的指令,从而第三用户设备调用相关应用执行对应操作等。具体地,如第二用户设备为增强现实眼镜,第三用户设备为当前场景中工作台上的操作设备,第三用户设备上安装有对工件进行操作的操作应用,第二用户设备可以查看当前实时场景中已添加的应用调用信息的标识信息,如当前在实时场景图像中呈现了关于第三用户设备的操作应用对应的应用调用信息,若获取到第二用户关于该应用调用信息的触发操作,则第三用户设备调用对应操作设备对工件进行加工等。在一些情形下,第二用户也可以在实时场景信息中添加关于第三用户设备的操作应用对用的应用调用信息的标识信息。其中,所述触发操作包括但不限于点击、触摸、手势指令、语音指令、按键、头部运动指令等。
图3示出根据本申请一个方面的一种确定并呈现目标标记信息的方法,其中,所述方法包括:
第一用户设备通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;
所述第一用户设备通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;
所述第一用户设备根据所述实时位姿信息及所述目标图像位置确定对应的目 标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息;
所述第二用户设备获取与所述当前场景相匹配的所述目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;
所述第二用户设备通过所述摄像装置拍摄关于所述当前场景的当前场景图像;
所述第二用户设备根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。
在此,上述确定并呈现目标标记信息的过程与前述图1中确定目标标记信息和图2中呈现目标标记信息的过程相同或相似,不再赘述。
上文主要对本申请的一种确定并呈现目标标记信息的方法的各实施例进行介绍,此外,本申请还提供了能够实施上述各实施例的具体设备,下面我们结合图4、5进行介绍。
图4示出了根据本申请一个方面的一种确定目标标记信息的第一用户设备100,其中,该设备包括一一模块101、一二模块102以及一三模块103。一一模块101,用于通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;一二模块102,用于通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;一三模块103,用于根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。
在一些实施方式中,所述标记信息包括但不限于:标识信息;文件信息;表单信息;应用调用信息;实时传感信息。在一些实施方式中,所述应用调用信息用于调用当前设备中已安装的第一目标应用。在一些实施方式中,所述应用调用信息用于调用对应第三用户设备中已安装的第二目标应用,其中,所述第三用户设备与所 述第一用户设备之间存在通信连接。在一些实施方式中,所述标记信息包括实时传感信息,所述实时传感信息用于指示对应传感装置的实时传感数据,所述传感装置与所述第一用户设备之间存在通信连接。
在此,所述图4示出的一一模块101、一二模块102以及一三模块103对应的具体实施方式与前述图1示出的步骤S101、步骤S102以及步骤S103的实施例相同或相似,因而不再赘述,以引用的方式包含于此。
在一些实施方式中,所述设备还包括一四模块(未示出),用于接收所述网络设备返回的、所述传感装置获取的实时传感信息;其中,所述根据所述目标空间位置及所述标记信息生成对应的目标标记信息,包括:根据所述目标空间位置及所述实时传感信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述实时传感信息。在一些实施方式中,所述根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,包括:根据所述实时位姿信息将所述目标图像位置映射至空间中确定对应的3D直线;根据所述3D点云信息及所述3D直线确定目标点位对应的目标空间位置。在一些实施方式中,所述3D点云信息包括多个特征点,每个特征点包含对应的深度信息;其中,所述根据所述3D点云信息及所述3D直线确定目标点位对应的目标空间位置,包括:根据所述3D点云信息中各个特征点与所述3D直线的距离,从所述3D点云信息中确定至少一个目标特征点;基于所述至少一个目标特征点的深度信息确定所述3D直线上的目标点位的深度信息,从而确定对应的目标空间位置。如在一些实施方式中,所述至少一个目标特征点包括与所述3D直线的距离最小的特征点。还如在一些实施方式中,所述至少一个目标特征点中每个目标特征点与所述3D直线的距离小于或等于距离阈值;其中,所述基于所述至少一个目标特征点的深度信息确定所述3D直线上的目标点位的深度信息,包括:根据所述至少一个目标特征点与所述3D直线的距离信息确定每个目标特征点的权重信息;基于所述每个目标特征点的深度信息、权重信息确定所述3D直线上的目标点位的深度信息。
在一些实施方式中,所述设备还包括一五模块(未示出),用于基于所述目标标记信息更新目标场景信息,其中,所述目标场景信息存储于场景数据库,所述场景数据库包括一个或多个目标场景信息,每个目标场景信息包括目标标记信息及对应的3D点云。在一些实施方式中,所述每个目标场景信息还包括对应的设备参数信 息。
在一些实施方式中,所述设备还包括一六模块(未示出),用于将所述目标场景信息发送至对应网络设备,其中,所述目标场景信息基于第二用户设备的场景调用请求经由所述网络设备发送至所述第二用户设备,所述目标场景信息用于在所述第二用户设备采集的当前场景图像中叠加所述标记信息。在一些实施方式中,所述设备还包括一七模块(未示出),用于接收对应第二用户设备发送的关于所述目标场景信息的场景调用请求;响应于所述场景调用请求,将所述目标场景信息发送至所述第二用户设备,其中,所述目标场景信息用于在所述第二用户设备采集的当前场景图像中叠加所述标记信息。在一些实施方式中,所述设备还包括一八模块(未示出),用于通过所述第一用户设备的摄像装置继续拍摄关于所述当前场景的后续场景图像,基于所述目标场景信息将所述标记信息叠加呈现于所述当前场景对应的后续场景图像。
在此,所述一四模块至一八模块对应的具体实施方式与前述步骤S104至步骤S108的实施例相同或相似,因而不再赘述,以引用的方式包含于此。
图5示出根据本申请另一个方面的一种呈现目标标记信息的第二用户设备200,其中,该设备包括二一模块201、二二模块202以及二三模块203。二一模块201,用于获取与当前场景相匹配的目标场景信息,其中,所述目标场景信息包括对应的目标标记信息,所述目标标记信息包括对应标记信息及目标空间位置;二二模块202,用于通过所述摄像装置拍摄关于所述当前场景的当前场景图像;二三模块203,用于根据所述摄像装置的当前位姿信息及所述目标空间位置确定对应的目标图像位置,在所述当前场景图像的目标图像位置叠加呈现所述标记信息。
在此,所述图5示出的二一模块201、二二模块202以及二三模块203对应的具体实施方式与前述图2示出的步骤S201、步骤S202以及步骤S203的实施例相同或相似,因而不再赘述,以引用的方式包含于此。
在一些实施方式中,所述设备还包括二四模块(未示出),用于获取对应用户关于所述目标场景信息的编辑操作,基于所述编辑操作更新所述目标场景信息。在一些实施方式中,所述设备还包括二五模块(未示出),用于将更新后的目标场景信息发送至网络设备以更新对应场景数据库。在一些实施方式中,所述标记信息包括实时传感信息,所述实时传感信息用于指示对应传感装置的实时传感数据,所述传感 装置与所述第二用户设备之间存在通信连接;其中,所述设备还包括二七模块(未示出),用于获取所述传感装置对应的实时传感数据;其中,所述在所述当前场景图像的目标图像位置叠加呈现所述标记信息,包括:在所述当前场景图像的目标图像位置叠加呈现所述实时传感数据。在一些实施方式中,所述标记信息包括应用调用信息,所述应用调用信息用于调用当前设备中的目标应用;其中,所述设备还包括二六模块(未示出),用于获取对应用户关于所述当前场景图像中应用调用信息的触发操作,基于所述触发操作调用所述第二用户设备中的目标应用。
在此,所述二四模块至二七模块对应的具体实施方式与前述步骤S204至步骤S207的实施例相同或相似,因而不再赘述,以引用的方式包含于此。
除上述各实施例介绍的方法和设备外,本申请还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机代码,当所述计算机代码被执行时,如前任一项所述的方法被执行。
本申请还提供了一种计算机程序产品,当所述计算机程序产品被计算机设备执行时,如前任一项所述的方法被执行。
本申请还提供了一种计算机设备,所述计算机设备包括:
一个或多个处理器;
存储器,用于存储一个或多个计算机程序;
当所述一个或多个计算机程序被所述一个或多个处理器执行时,使得所述一个或多个处理器实现如前任一项所述的方法。
图6示出了可被用于实施本申请中所述的各个实施例的示例性系统;
如图6所示在一些实施例中,系统300能够作为各所述实施例中的任意一个上述设备。在一些实施例中,系统300可包括具有指令的一个或多个计算机可读介质(例如,系统存储器或NVM/存储设备320)以及与该一个或多个计算机可读介质耦合并被配置为执行指令以实现模块从而执行本申请中所述的动作的一个或多个处理器(例如,(一个或多个)处理器305)。
对于一个实施例,系统控制模块310可包括任意适当的接口控制器,以向(一个或多个)处理器305中的至少一个和/或与系统控制模块310通信的任意适当的设备或组件提供任意适当的接口。
系统控制模块310可包括存储器控制器模块330,以向系统存储器315提供接 口。存储器控制器模块330可以是硬件模块、软件模块和/或固件模块。
系统存储器315可被用于例如为系统300加载和存储数据和/或指令。对于一个实施例,系统存储器315可包括任意适当的易失性存储器,例如,适当的DRAM。在一些实施例中,系统存储器315可包括双倍数据速率类型四同步动态随机存取存储器(DDR4SDRAM)。
对于一个实施例,系统控制模块310可包括一个或多个输入/输出(I/O)控制器,以向NVM/存储设备320及(一个或多个)通信接口325提供接口。
例如,NVM/存储设备320可被用于存储数据和/或指令。NVM/存储设备320可包括任意适当的非易失性存储器(例如,闪存)和/或可包括任意适当的(一个或多个)非易失性存储设备(例如,一个或多个硬盘驱动器(HDD)、一个或多个光盘(CD)驱动器和/或一个或多个数字通用光盘(DVD)驱动器)。
NVM/存储设备320可包括在物理上作为系统300被安装在其上的设备的一部分的存储资源,或者其可被该设备访问而不必作为该设备的一部分。例如,NVM/存储设备320可通过网络经由(一个或多个)通信接口325进行访问。
(一个或多个)通信接口325可为系统300提供接口以通过一个或多个网络和/或与任意其他适当的设备通信。系统300可根据一个或多个无线网络标准和/或协议中的任意标准和/或协议来与无线网络的一个或多个组件进行无线通信。
对于一个实施例,(一个或多个)处理器305中的至少一个可与系统控制模块310的一个或多个控制器(例如,存储器控制器模块330)的逻辑封装在一起。对于一个实施例,(一个或多个)处理器305中的至少一个可与系统控制模块310的一个或多个控制器的逻辑封装在一起以形成系统级封装(SiP)。对于一个实施例,(一个或多个)处理器305中的至少一个可与系统控制模块310的一个或多个控制器的逻辑集成在同一模具上。对于一个实施例,(一个或多个)处理器305中的至少一个可与系统控制模块310的一个或多个控制器的逻辑集成在同一模具上以形成片上系统(SoC)。
在各个实施例中,系统300可以但不限于是:服务器、工作站、台式计算设备或移动计算设备(例如,膝上型计算设备、手持计算设备、平板电脑、上网本等)。在各个实施例中,系统300可具有更多或更少的组件和/或不同的架构。例如,在一些实施例中,系统300包括一个或多个摄像机、键盘、液晶显示器(LCD)屏幕(包 括触屏显示器)、非易失性存储器端口、多个天线、图形芯片、专用集成电路(ASIC)和扬声器。
需要注意的是,本申请可在软件和/或软件与硬件的组合体中被实施,例如,可采用专用集成电路(ASIC)、通用目的计算机或任何其他类似硬件设备来实现。在一个实施例中,本申请的软件程序可以通过处理器执行以实现上文所述步骤或功能。同样地,本申请的软件程序(包括相关的数据结构)可以被存储到计算机可读记录介质中,例如,RAM存储器,磁或光驱动器或软磁盘及类似设备。另外,本申请的一些步骤或功能可采用硬件来实现,例如,作为与处理器配合从而执行各个步骤或功能的电路。
另外,本申请的一部分可被应用为计算机程序产品,例如计算机程序指令,当其被计算机执行时,通过该计算机的操作,可以调用或提供根据本申请的方法和/或技术方案。本领域技术人员应能理解,计算机程序指令在计算机可读介质中的存在形式包括但不限于源文件、可执行文件、安装包文件等,相应地,计算机程序指令被计算机执行的方式包括但不限于:该计算机直接执行该指令,或者该计算机编译该指令后再执行对应的编译后程序,或者该计算机读取并执行该指令,或者该计算机读取并安装该指令后再执行对应的安装后程序。在此,计算机可读介质可以是可供计算机访问的任意可用的计算机可读存储介质或通信介质。
通信介质包括藉此包含例如计算机可读指令、数据结构、程序模块或其他数据的通信信号被从一个系统传送到另一系统的介质。通信介质可包括有导的传输介质(诸如电缆和线(例如,光纤、同轴等))和能传播能量波的无线(未有导的传输)介质,诸如声音、电磁、RF、微波和红外。计算机可读指令、数据结构、程序模块或其他数据可被体现为例如无线介质(诸如载波或诸如被体现为扩展频谱技术的一部分的类似机制)中的已调制数据信号。术语“已调制数据信号”指的是其一个或多个特征以在信号中编码信息的方式被更改或设定的信号。调制可以是模拟的、数字的或混合调制技术。
作为示例而非限制,计算机可读存储介质可包括以用于存储诸如计算机可读指令、数据结构、程序模块或其它数据的信息的任何方法或技术实现的易失性和非易失性、可移动和不可移动的介质。例如,计算机可读存储介质包括,但不限于,易失性存储器,诸如随机存储器(RAM,DRAM,SRAM);以及非易失性存储器,诸如闪存、 各种只读存储器(ROM,PROM,EPROM,EEPROM)、磁性和铁磁/铁电存储器(MRAM,FeRAM);以及磁性和光学存储设备(硬盘、磁带、CD、DVD);或其它现在已知的介质或今后开发的能够存储供计算机系统使用的计算机可读信息/数据。
在此,根据本申请的一个实施例包括一个装置,该装置包括用于存储计算机程序指令的存储器和用于执行程序指令的处理器,其中,当该计算机程序指令被该处理器执行时,触发该装置运行基于前述根据本申请的多个实施例的方法和/或技术方案。
对于本领域技术人员而言,显然本申请不限于上述示范性实施例的细节,而且在不背离本申请的精神或基本特征的情况下,能够以其他的具体形式实现本申请。因此,无论从哪一点来看,均应将实施例看作是示范性的,而且是非限制性的,本申请的范围由所附权利要求而不是上述说明限定,因此旨在将落在权利要求的等同要件的含义和范围内的所有变化涵括在本申请内。不应将权利要求中的任何附图标记视为限制所涉及的权利要求。此外,显然“包括”一词不排除其他单元或步骤,单数不排除复数。装置权利要求中陈述的多个单元或装置也可以由一个单元或装置通过软件或者硬件来实现。第一,第二等词语用来表示名称,而并不表示任何特定的顺序。

Claims (26)

  1. 一种确定目标标记信息的方法,应用于第一用户设备,其中,该方法包括:
    通过摄像装置拍摄关于当前场景的初始场景图像,根据所述初始场景图像对所述当前场景进行三维跟踪初始化;
    通过所述摄像装置拍摄关于所述当前场景的实时场景图像,通过三维跟踪获取所述摄像装置的实时位姿信息及所述当前场景对应的3D点云信息,获取用户在所述当前场景的实时场景图像中输入的标记信息及所述标记信息对应的目标图像位置;
    根据所述实时位姿信息及所述目标图像位置确定对应的目标空间位置,根据所述目标空间位置及所述标记信息生成对应的目标标记信息,其中,所述目标标记信息用于在所述3D点云信息的目标空间位置叠加呈现所述标记信息。
  2. The method according to claim 1, wherein the mark information comprises at least any one of the following:
    identification information;
    file information;
    form information;
    application invocation information;
    real-time sensing information.
  3. The method according to claim 2, wherein the application invocation information is used to invoke a first target application installed on the current device.
  4. The method according to claim 2, wherein the application invocation information is used to invoke a second target application installed on a corresponding third user equipment, wherein a communication connection exists between the third user equipment and the first user equipment.
  5. The method according to claim 2, wherein the mark information comprises real-time sensing information, the real-time sensing information is used to indicate real-time sensing data of a corresponding sensing apparatus, and a communication connection exists between the sensing apparatus and the first user equipment.
  6. The method according to claim 5, wherein the method further comprises:
    sending a data acquisition request about the sensing apparatus to a corresponding network device;
    receiving the real-time sensing information acquired by the sensing apparatus and returned by the network device;
    wherein the generating corresponding target mark information according to the target spatial position and the mark information comprises:
    generating corresponding target mark information according to the target spatial position and the real-time sensing information, wherein the target mark information is used to overlay and present the real-time sensing information at the target spatial position of the 3D point cloud information.
  7. The method according to claim 1, wherein the determining a corresponding target spatial position according to the real-time pose information and the target image position comprises:
    mapping the target image position into space according to the real-time pose information to determine a corresponding 3D straight line;
    determining, according to the 3D point cloud information and the 3D straight line, a target spatial position corresponding to a target point.
  8. The method according to claim 7, wherein the 3D point cloud information comprises a plurality of feature points, each feature point containing corresponding depth information; wherein the determining, according to the 3D point cloud information and the 3D straight line, a target spatial position corresponding to a target point comprises:
    determining at least one target feature point from the 3D point cloud information according to the distance between each feature point in the 3D point cloud information and the 3D straight line;
    determining depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point, thereby determining the corresponding target spatial position.
  9. The method according to claim 8, wherein the at least one target feature point comprises the feature point whose distance to the 3D straight line is the smallest.
  10. The method according to claim 8, wherein the distance between each of the at least one target feature point and the 3D straight line is less than or equal to a distance threshold; wherein the determining depth information of the target point on the 3D straight line based on the depth information of the at least one target feature point comprises:
    determining weight information of each target feature point according to the distance information between the at least one target feature point and the 3D straight line;
    determining the depth information of the target point on the 3D straight line based on the depth information and the weight information of each target feature point.
  11. The method according to claim 1, wherein the method further comprises:
    updating target scene information based on the target mark information, wherein the target scene information is stored in a scene database, the scene database comprises one or more pieces of target scene information, and each piece of target scene information comprises target mark information and corresponding 3D point cloud information.
  12. The method according to claim 11, wherein each piece of target scene information further comprises corresponding device parameter information.
  13. The method according to claim 11 or 12, wherein the method further comprises:
    sending the target scene information to a corresponding network device, wherein the target scene information is sent to a second user equipment via the network device based on a scene invocation request of the second user equipment, and the target scene information is used to overlay the mark information in a current scene image captured by the second user equipment.
  14. The method according to claim 11 or 12, wherein the method further comprises:
    receiving a scene invocation request about the target scene information sent by a corresponding second user equipment;
    in response to the scene invocation request, sending the target scene information to the second user equipment, wherein the target scene information is used to overlay the mark information in a current scene image captured by the second user equipment.
  15. The method according to claim 11 or 12, wherein the method further comprises:
    continuing to capture, by the camera apparatus of the first user equipment, subsequent scene images of the current scene, and overlaying and presenting, based on the target scene information, the mark information in the subsequent scene images corresponding to the current scene.
  16. A method for presenting target mark information, applied to a second user equipment, wherein the method comprises:
    obtaining target scene information matching a current scene, wherein the target scene information comprises corresponding target mark information, and the target mark information comprises corresponding mark information and a target spatial position;
    capturing, by the camera apparatus, a current scene image of the current scene;
    determining a corresponding target image position according to current pose information of the camera apparatus and the target spatial position, and overlaying and presenting the mark information at the target image position of the current scene image.
  17. The method according to claim 16, wherein the method further comprises:
    obtaining an editing operation of the corresponding user on the target scene information, and updating the target scene information based on the editing operation.
  18. The method according to claim 17, wherein the method further comprises:
    sending the updated target scene information to a network device to update the corresponding scene database.
  19. The method according to claim 16, wherein the mark information comprises application invocation information, and the application invocation information is used to invoke a target application on the current device; wherein the method further comprises:
    obtaining a trigger operation of the corresponding user on the application invocation information in the current scene image, and invoking the target application in the second user equipment based on the trigger operation.
  20. The method according to claim 16, wherein the mark information comprises real-time sensing information, the real-time sensing information is used to indicate real-time sensing data of a corresponding sensing apparatus, and a communication connection exists between the sensing apparatus and the second user equipment; wherein the method further comprises:
    obtaining the real-time sensing data corresponding to the sensing apparatus;
    wherein the overlaying and presenting the mark information at the target image position of the current scene image comprises:
    overlaying and presenting the real-time sensing data at the target image position of the current scene image.
  21. A method for determining and presenting target mark information, wherein the method comprises:
    capturing, by a first user equipment through a camera apparatus, an initial scene image of a current scene, and performing 3D tracking initialization on the current scene according to the initial scene image;
    capturing, by the first user equipment through the camera apparatus, real-time scene images of the current scene, obtaining, through 3D tracking, real-time pose information of the camera apparatus and 3D point cloud information corresponding to the current scene, and obtaining mark information input by a user in a real-time scene image of the current scene and a target image position corresponding to the mark information;
    determining, by the first user equipment, a corresponding target spatial position according to the real-time pose information and the target image position, and generating corresponding target mark information according to the target spatial position and the mark information, wherein the target mark information is used to overlay and present the mark information at the target spatial position of the 3D point cloud information;
    obtaining, by a second user equipment, the target scene information matching the current scene, wherein the target scene information comprises the corresponding target mark information, and the target mark information comprises corresponding mark information and a target spatial position;
    capturing, by the second user equipment through the camera apparatus, a current scene image of the current scene;
    determining, by the second user equipment, a corresponding target image position according to current pose information of the camera apparatus and the target spatial position, and overlaying and presenting the mark information at the target image position of the current scene image.
  22. A first user equipment for determining target mark information, wherein the device comprises:
    a module 1-1 for capturing, by a camera apparatus, an initial scene image of a current scene and performing 3D tracking initialization on the current scene according to the initial scene image;
    a module 1-2 for capturing, by the camera apparatus, real-time scene images of the current scene, obtaining, through 3D tracking, real-time pose information of the camera apparatus and 3D point cloud information corresponding to the current scene, and obtaining mark information input by a user in a real-time scene image of the current scene and a target image position corresponding to the mark information;
    a module 1-3 for determining a corresponding target spatial position according to the real-time pose information and the target image position, and generating corresponding target mark information according to the target spatial position and the mark information, wherein the target mark information is used to overlay and present the mark information at the target spatial position of the 3D point cloud information.
  23. A second user equipment for presenting target mark information, wherein the device comprises:
    a module 2-1 for obtaining target scene information matching a current scene, wherein the target scene information comprises corresponding target mark information, and the target mark information comprises corresponding mark information and a target spatial position;
    a module 2-2 for capturing, by the camera apparatus, a current scene image of the current scene;
    a module 2-3 for determining a corresponding target image position according to current pose information of the camera apparatus and the target spatial position, and overlaying and presenting the mark information at the target image position of the current scene image.
  24. A computer device, wherein the device comprises:
    a processor; and
    a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the steps of the method according to any one of claims 1 to 20.
  25. A computer-readable storage medium having a computer program/instructions stored thereon, characterized in that the computer program/instructions, when executed, cause a system to perform the steps of the method according to any one of claims 1 to 20.
  26. A computer program product comprising a computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 20.
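For orientation only, the geometric procedure recited in claims 7 to 10 can be pictured with the following non-normative sketch (it is not part of the claims or the description): the target image position is back-projected into a 3D straight line using the real-time pose information, feature points of the 3D point cloud whose distance to that line is within a threshold are selected, and their depths along the line are combined with distance-derived weights to fix the target spatial position. NumPy is assumed, and the inverse-distance weighting is one plausible reading of the "weight information" of claim 10 rather than a definition taken from the application.

```python
import numpy as np

def target_space_position(uv, K, R_wc, cam_center, cloud, dist_thresh=0.05, eps=1e-6):
    """Back-project an image position to a 3D ray and estimate its depth from the point cloud.

    uv: (u, v) target image position picked by the user.
    K: 3x3 intrinsics; R_wc: camera-to-world rotation; cam_center: camera centre in world frame.
    cloud: (N, 3) array of 3D feature points obtained from tracking.
    Returns the estimated target spatial position on the ray, or None if no point is close enough.
    """
    # 1. image position -> 3D ray in world coordinates (claim 7)
    d_cam = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    d = R_wc @ d_cam
    d /= np.linalg.norm(d)

    # 2. point-to-ray distance for every feature point (claim 8)
    rel = cloud - cam_center                 # (N, 3) offsets from the camera centre
    along = rel @ d                          # signed depth of each point along the ray
    perp = rel - np.outer(along, d)          # perpendicular component
    dist = np.linalg.norm(perp, axis=1)

    # 3. keep feature points within the distance threshold (claim 10)
    mask = dist <= dist_thresh
    if not np.any(mask):
        return None

    # 4. inverse-distance weights, then a weighted depth along the ray
    w = 1.0 / (dist[mask] + eps)
    w /= w.sum()
    depth = float(w @ along[mask])

    return cam_center + depth * d
```

The variant of claim 9 would correspond to keeping only the single feature point with the smallest distance to the line instead of applying the threshold and the weighted combination.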
PCT/CN2022/110472 2021-09-09 2022-08-05 Method and device for determining and presenting target mark information WO2023035829A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111056350.6 2021-09-09
CN202111056350.6A CN113741698B (zh) 2021-09-09 2021-09-09 一种确定和呈现目标标记信息的方法与设备

Publications (1)

Publication Number Publication Date
WO2023035829A1 true WO2023035829A1 (zh) 2023-03-16

Family

ID=78737547

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/110472 WO2023035829A1 (zh) 2021-09-09 2022-08-05 一种确定和呈现目标标记信息的方法与设备

Country Status (2)

Country Link
CN (1) CN113741698B (zh)
WO (1) WO2023035829A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363331A (zh) * 2023-04-03 2023-06-30 北京百度网讯科技有限公司 Image generation method, apparatus, device and storage medium

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113741698B (zh) * 2021-09-09 2023-12-15 亮风台(上海)信息科技有限公司 Method and device for determining and presenting target mark information
CN114332417B (zh) * 2021-12-13 2023-07-14 亮风台(上海)信息科技有限公司 Method, device, storage medium and program product for multi-user scene interaction
CN114089836B (zh) * 2022-01-20 2023-02-28 中兴通讯股份有限公司 Labeling method, terminal, server and storage medium
WO2023168836A1 (zh) * 2022-03-11 2023-09-14 亮风台(上海)信息科技有限公司 Projection interaction method, device, medium and program product
CN115460539B (zh) * 2022-06-30 2023-12-15 亮风台(上海)信息科技有限公司 Method, device, medium and program product for obtaining an electronic fence
CN115439635B (zh) * 2022-06-30 2024-04-26 亮风台(上海)信息科技有限公司 Method and device for presenting mark information of a target object
CN117768627A (zh) * 2022-09-16 2024-03-26 华为技术有限公司 Augmented reality method and computing apparatus
CN115268658A (zh) * 2022-09-30 2022-11-01 苏芯物联技术(南京)有限公司 Augmented-reality-based method for multi-party remote spatial circling and marking
CN117369633A (zh) * 2023-10-07 2024-01-09 上海铱奇科技有限公司 AR-based information interaction method and system
CN117745988A (zh) * 2023-12-20 2024-03-22 亮风台(上海)信息科技有限公司 Method and device for presenting AR tag information

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103529959B (zh) * 2013-01-21 2016-06-29 Tcl集团股份有限公司 Box-selection method, system and electronic device based on key-point ray collision detection
US9911235B2 (en) * 2014-11-14 2018-03-06 Qualcomm Incorporated Spatial interaction in augmented reality
CN111199583B (zh) * 2018-11-16 2023-05-16 广东虚拟现实科技有限公司 Virtual content display method, apparatus, terminal device and storage medium
CN110197148B (zh) * 2019-05-23 2020-12-01 北京三快在线科技有限公司 Method, apparatus, electronic device and storage medium for labeling a target object
CN111415388B (zh) * 2020-03-17 2023-10-24 Oppo广东移动通信有限公司 Visual positioning method and terminal
CN111311684B (zh) * 2020-04-01 2021-02-05 亮风台(上海)信息科技有限公司 Method and device for SLAM initialization
CN111950521A (zh) * 2020-08-27 2020-11-17 深圳市慧鲤科技有限公司 Augmented reality interaction method, apparatus, electronic device and storage medium
CN113048980B (zh) * 2021-03-11 2023-03-14 浙江商汤科技开发有限公司 Pose optimization method, apparatus, electronic device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830894A (zh) * 2018-06-19 2018-11-16 亮风台(上海)信息科技有限公司 Augmented-reality-based remote guidance method, apparatus, terminal and storage medium
CN109669541A (zh) * 2018-09-04 2019-04-23 亮风台(上海)信息科技有限公司 Method and device for configuring augmented reality content
CN111709973A (zh) * 2020-06-16 2020-09-25 北京百度网讯科技有限公司 Target tracking method, apparatus, device and storage medium
CN112907671A (zh) * 2021-03-31 2021-06-04 深圳市慧鲤科技有限公司 Point cloud data generation method, apparatus, electronic device and storage medium
CN113741698A (zh) * 2021-09-09 2021-12-03 亮风台(上海)信息科技有限公司 Method and device for determining and presenting target mark information

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363331A (zh) * 2023-04-03 2023-06-30 北京百度网讯科技有限公司 Image generation method, apparatus, device and storage medium
CN116363331B (zh) * 2023-04-03 2024-02-23 北京百度网讯科技有限公司 Image generation method, apparatus, device and storage medium

Also Published As

Publication number Publication date
CN113741698A (zh) 2021-12-03
CN113741698B (zh) 2023-12-15

Similar Documents

Publication Publication Date Title
WO2023035829A1 (zh) Method and device for determining and presenting target mark information
CN111311684B (zh) Method and device for SLAM initialization
CN109887003B (zh) Method and device for 3D tracking initialization
CN108304075B (zh) Method and device for human-computer interaction on an augmented reality device
EP2583254B1 (en) Mobile device based content mapping for augmented reality environment
KR20220009393A (ko) Image-based localization
CN111161347B (zh) Method and device for SLAM initialization
WO2023109153A1 (zh) Method, device, storage medium and program product for multi-user scene interaction
CN108681389B (zh) Method and device for reading via a reading device
CN109584377B (zh) Method and device for presenting augmented reality content
CN109656363B (zh) Method and device for setting augmented interactive content
WO2017181699A1 (zh) Method and apparatus for three-dimensional display of surveillance video
TWI783472B (zh) Generation method and display method of AR scene content, electronic device, and computer-readable storage medium
US11682206B2 (en) Methods and apparatus for projecting augmented reality enhancements to real objects in response to user gestures detected in a real environment
WO2022077977A1 (zh) Video conversion method and video conversion apparatus
JPWO2021096931A5 (zh)
CN112819956A (zh) Three-dimensional map construction method, system and server
WO2017132011A1 (en) Displaying geographic data on an image taken at an oblique angle
CN109636922B (zh) Method and device for presenting augmented reality content
CN115630191B (zh) Spatio-temporal dataset retrieval method, apparatus and storage medium based on full-motion video
CN109669541B (zh) Method and device for configuring augmented reality content
CN114170366B (zh) Three-dimensional reconstruction method based on point-line feature fusion and electronic device
US20220198764A1 (en) Spatially Aware Environment Relocalization
CN111796754B (zh) Method and device for providing an electronic book
CN112102145B (zh) Image processing method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22866303

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022866303

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022866303

Country of ref document: EP

Effective date: 20240329