WO2022237048A1 - Pose acquisition method, apparatus, electronic device, storage medium and program - Google Patents

Pose acquisition method, apparatus, electronic device, storage medium and program

Info

Publication number
WO2022237048A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
pose
pose information
matching
Prior art date
Application number
PCT/CN2021/121034
Other languages
English (en)
French (fr)
Inventor
夏睿
谢卫健
王楠
张也
Original Assignee
浙江商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江商汤科技开发有限公司
Priority to KR1020227017413A (KR102464271B1)
Publication of WO2022237048A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G06T 19/006: Mixed reality
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06F 2203/00: Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F 2203/01: Indexing scheme relating to G06F3/01
    • G06F 2203/012: Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Definitions

  • The present application relates to the technical field of object recognition, and in particular to a pose acquisition method, apparatus, electronic device, storage medium and program.
  • Augmented Reality (AR) three-dimensional object recognition can present an augmented reality rendering effect based on the recognition results; however, in related technologies, using augmented reality technology to recognize three-dimensional objects suffers from low efficiency and poor accuracy.
  • the present application provides a pose acquisition method, device, electronic equipment, storage medium and program.
  • a pose acquisition method including:
  • acquiring a first image and a space model of an object to be scanned, wherein the first image is an image obtained by an electronic device scanning the object to be scanned; in response to the absence or invalidity of first pose information, acquiring a second image, and determining the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is the pose information of the electronic device and/or the object to be scanned; determining second pose information according to the first image, the space model and the first pose information, wherein the second pose information is the pose information of the electronic device and/or the object to be scanned; and outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
  • the method further includes: in response to the fact that the second pose information and the first pose information do not meet a preset first condition, determining that the first pose information is invalid. In this way, the efficiency and accuracy of pose information acquisition can be improved, which is also conducive to improving the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
  • determining the first pose information according to the second image and the space model includes: acquiring at least one image frame corresponding to the second image in the space model, and determining first matching information between the feature points of the second image and the feature points of the at least one image frame; acquiring the point cloud corresponding to the at least one image frame in the space model, and determining, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determining the first pose information according to the first matching information and the second matching information.
  • the acquiring at least one image frame corresponding to the second image in the space model includes: determining the similarity between each image frame in the space model and the second image; and determining an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image. In this way, the image frames corresponding to the second image can be selected more accurately.
  • the determining the first matching information between the feature points of the second image and the feature points of the at least one image frame includes: acquiring the feature points and descriptors of the second image and the feature points and descriptors of the image frame; determining initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determining a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filtering the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  • In this way, the initial matching information is filtered using the fundamental matrix and/or the essential matrix, so that the inliers in the initial matching information are retained in the first matching information.
  • the determining the second matching information between the feature points of the second image and the three-dimensional points of the point cloud according to the first matching information includes: matching the feature points of the second image that match feature points of the image frame with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, to obtain the second matching information. In this way, using the feature points of the image frame as an intermediary, the feature points of the second image are matched with the three-dimensional points of the point cloud.
  • the determining the first pose information according to the first matching information and the second matching information includes: acquiring the gravitational acceleration of the electronic device; and determining the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
  • the obtained first pose information is relatively accurate, and furthermore, the second pose information obtained based on the first pose information can be relatively accurate.
  • the determining the second pose information according to the first image, the space model and the first pose information includes: determining third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is the pose information of the electronic device relative to the object to be scanned; determining third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model according to the third pose information; in response to the third matching information meeting a preset second condition, determining fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information; and determining the second pose information according to the third matching information and the fourth matching information.
  • the second pose information can be further accurately determined.
  • the first pose information includes fourth pose information, wherein the fourth pose information is the pose information of the object to be scanned in the world coordinate system; the determining the third pose information corresponding to the first image according to the first pose information and the first image includes: acquiring fifth pose information from a positioning module according to the first image, wherein the fifth pose information is the pose information of the electronic device in the world coordinate system; and determining the third pose information according to the fourth pose information and the fifth pose information. In this way, given the absolute poses of the object to be scanned and the electronic device in a unified coordinate system, the relative pose between the two can be determined quickly and accurately.
  • the determining the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model according to the third pose information includes: projecting the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extracting a descriptor of each projection point; extracting the feature points and descriptors of the first image; and determining the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors of the feature points and the descriptors of the projection points.
  • the camera model can be used to project the point cloud onto the first image.
  • the determining fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information includes: determining at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquiring the feature points and descriptors of the first image and the feature points and descriptors of the image frames matching the third pose information; and determining the fourth matching information between the feature points of the first image and the feature points of the image frames according to the descriptors of the first image and the descriptors of the image frames.
  • When the pose information of an image frame is the same as or similar to that of the first image (for example, the angle difference is within a preset range), the image frame is determined to match the third pose information.
  • the determining the second pose information according to the third matching information and the fourth matching information includes: acquiring the acceleration of gravity of the electronic device; according to the third matching information , the fourth matching information and the acceleration of gravity to determine the second pose information. In this way, by introducing the acceleration of gravity, the second pose information can be determined more accurately.
  • the second pose information and the first pose information meeting a preset first condition includes: the error between the second pose information and the first pose information being smaller than a preset error threshold; and/or, the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination includes a feature point and a three-dimensional point that match each other.
  • the number of matching combinations between the first image and the point cloud of the space model is used to set the second condition, so that the matching degree of the third matching information can be judged more reasonably.
  • the obtaining the space model of the object to be scanned includes: obtaining multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously obtaining sixth pose information corresponding to each frame of modeling image; matching the feature points of the multiple frames of modeling images, and triangulating the feature points according to the matching results to form a point cloud; determining at least one image frame from the multiple frames of modeling images, and determining the point cloud corresponding to each image frame; and constructing the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model. In this way, the constructed space model contains richer detailed information.
  • a pose acquisition device including:
  • An acquisition module configured to acquire a first image and a space model of the object to be scanned, wherein the first image is an image obtained by the electronic device scanning the object to be scanned;
  • the first pose module is configured to acquire a second image in response to missing or invalid first pose information, and determine the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is the pose information of the electronic device and/or the object to be scanned;
  • the second pose module is configured to determine second pose information according to the first image, the space model, and the first pose information, wherein the second pose information is the pose information of the electronic device and/or the object to be scanned;
  • An output module configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
  • the output module is further configured to: determine that the first pose information is invalid in response to the second pose information and the first pose information not meeting a preset first condition.
  • the first pose module is further configured to: acquire at least one image frame corresponding to the second image in the space model, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire the point cloud corresponding to the at least one image frame in the space model, and determine, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
  • when the first pose module is configured to acquire at least one image frame corresponding to the second image in the space model, it is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
  • when the first pose module is configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, it is further configured to: acquire the feature points and descriptors of the second image and the feature points and descriptors of the image frame; determine initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determine a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  • when the first pose module is configured to determine the second matching information between the feature points of the second image and the three-dimensional points of the point cloud according to the first matching information, it is further configured to: match the feature points of the second image that match feature points of the image frame with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, to obtain the second matching information.
  • when the first pose module is configured to determine the first pose information according to the first matching information and the second matching information, it is further configured to: acquire the gravitational acceleration of the electronic device, and determine the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
  • the second pose module is further configured to: determine third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is the pose information of the electronic device relative to the object to be scanned; determine third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model according to the third pose information; in response to the third matching information meeting a preset second condition, determine fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information; and determine the second pose information according to the third matching information and the fourth matching information.
  • the first pose information includes fourth pose information, wherein the fourth pose information is pose information of the object to be scanned in a world coordinate system;
  • when the second pose module is configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, it is further configured to: acquire fifth pose information from a positioning module according to the first image, wherein the fifth pose information is the pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
  • when the second pose module is configured to determine, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model, it is further configured to: project the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors of the feature points and the descriptors of the projection points.
  • when the second pose module is configured to determine fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information, it is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image and of the matched image frames; and determine the fourth matching information between the feature points of the first image and the feature points of the image frames according to the descriptors of the first image and the descriptors of the image frames.
  • when the second pose module is configured to determine the second pose information according to the third matching information and the fourth matching information, it is further configured to: acquire the gravitational acceleration of the electronic device, and determine the second pose information according to the third matching information, the fourth matching information and the gravitational acceleration.
  • the second pose information and the first pose information meet a preset first condition, including:
  • the error between the second pose information and the first pose information is smaller than a preset error threshold; and/or,
  • the third matching information meets the preset second condition, including:
  • the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold, wherein the matching combination includes a pair of feature points and three-dimensional points that match each other.
  • when the acquisition module is configured to acquire the space model of the object to be scanned, it is further configured to: obtain multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously obtain the sixth pose information corresponding to each frame of modeling image; match the feature points of the multiple frames of modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multiple frames of modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
  • An electronic device, including a memory and a processor, wherein the memory is configured to store computer instructions executable on the processor, and the processor is configured to implement the method described in the first aspect when executing the computer instructions.
  • a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method described in the first aspect is implemented.
  • A computer program, including computer-readable code, wherein when the computer-readable code runs on an electronic device, a processor of the electronic device executes instructions configured to implement the method described in the first aspect.
  • In the embodiments of the present application, in response to the absence or invalidity of the first pose information, the second image is acquired and the first pose information is determined according to the second image and the space model; the second pose information is then determined according to the first image, the space model and the first pose information; finally, in response to the second pose information and the first pose information meeting a preset first condition, the second pose information is output.
  • In this way, the first pose information is determined according to the space model and the second image obtained by the electronic device scanning the object to be scanned; once determined, it can be reused to determine the second pose information corresponding to each first image, and it is not updated until the second pose information and the first pose information fail to meet the first condition. This improves the efficiency and accuracy of pose acquisition, and hence the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
  • FIG. 1A is a flowchart of a method for acquiring pose information shown in an embodiment of the present application
  • FIG. 1B shows a schematic diagram of a system architecture to which the method for obtaining pose information according to an embodiment of the present disclosure can be applied;
  • FIG. 2 is a schematic diagram of an image collected by an electronic device shown in an embodiment of the present application
  • FIG. 3 is a schematic diagram of the acquisition process of the space model shown in an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a pose information acquisition device shown in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
  • Although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
  • In related technologies, when using augmented reality technology to recognize a three-dimensional object, the electronic device displays the space model while presenting a preview image of the object being scanned. The user must manually adjust the viewing angle so that the outline of the object to be scanned on the electronic device matches the outline of the space model; on this basis the object can be tracked by scanning. Once tracking fails, the user has to return to the previously found suitable viewing angle and re-align the space model with the preview image of the object to be scanned. The efficiency and accuracy of tracking the object to be scanned are therefore low, the user operation is difficult, and the user experience is poor.
  • At least one embodiment of the present application provides a pose acquisition method. Please refer to FIG. 1A , which shows the flow of the method, including steps S101 to S103.
  • the method may be performed by electronic equipment such as a terminal device or a server
  • The terminal device may be a user equipment (UE), mobile device, user terminal, terminal, cellular phone, cordless phone, personal digital assistant (PDA), handheld device, computing device, vehicle-mounted device, wearable device, etc.
  • the method can be implemented by calling the computer-readable instructions stored in the memory by the processor.
  • the method may be executed by a server, and the server may be a local server, a cloud server, or the like.
  • In step S101, a first image and a space model of the object to be scanned are acquired, wherein the first image is an image obtained by an electronic device scanning the object to be scanned.
  • the electronic device may be a terminal device such as a mobile phone or a tablet computer, or may be an image acquisition device such as a camera or a scanning device.
  • the acquisition of the first image in this step, the determination and output of the second pose information in the subsequent steps, and the determination and update of the first pose information may also be performed by the terminal device.
  • the object to be scanned may be a three-dimensional object targeted by augmented reality technology.
  • When the electronic device scans the object to be scanned, it can continuously obtain multiple frames of the first image, that is, an image sequence; the first image is any frame in this sequence, meaning the pose acquisition method provided by the embodiments of the present application can be performed for any frame of the sequence. In some possible implementations, the method can be performed for each frame of the first image obtained while the electronic device scans the object to be scanned, that is, the second pose information corresponding to each frame of the first image is obtained.
  • The object to be scanned may be stationary while the electronic device moves around it. For example, the example shown in FIG. 2 illustrates the acquisition of three image frames as the electronic device moves around the object to be scanned 21: the electronic device acquires an image frame at the position of the previous image frame 22, then moves to the position of the previous image frame 23 to acquire an image frame, and then moves to the position of the current image frame 24 to capture an image frame.
  • the space model includes a point cloud of the object to be scanned, at least one image frame, and pose information corresponding to each image frame (such as the sixth pose information mentioned below).
  • the image frame can be understood as an image captured by the electronic device under the corresponding sixth pose information of the object to be scanned.
  • Each image frame corresponds to a part of the point cloud, and the corresponding relationship can be determined by the triangulation relationship of the image feature points during the modeling process, and can also be determined by the pose information.
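  • As a concrete illustration of this structure (not code from the application), the sketch below represents the space model as a small Python data class; all field names, array shapes and the binary-descriptor choice are our own assumptions.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np


@dataclass
class ImageFrame:
    """One image frame kept in the space model."""
    keypoints: np.ndarray    # (N, 2) pixel coordinates of the feature points
    descriptors: np.ndarray  # (N, 32) e.g. binary ORB descriptors
    pose: np.ndarray         # (4, 4) sixth pose information of this frame
    point_ids: np.ndarray    # (N,) index into the point cloud, -1 if unmatched


@dataclass
class SpaceModel:
    """Point cloud plus the image frames (with poses) that observe it."""
    points: np.ndarray                             # (M, 3) 3D points
    frames: List[ImageFrame] = field(default_factory=list)

    def cloud_for_frame(self, i: int) -> np.ndarray:
        """The part of the point cloud corresponding to image frame i."""
        ids = self.frames[i].point_ids
        return self.points[ids[ids >= 0]]
```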
  • In step S102, in response to the absence or invalidity of the first pose information, a second image is obtained, and the first pose information is determined according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is the pose information of the electronic device and/or the object to be scanned.
  • When the electronic device begins scanning the object, the first pose information is missing and therefore needs to be determined; when the existing first pose information becomes invalid, it needs to be re-determined, that is, the first pose information is updated.
  • the pose information of the electronic device may be pose information (Tcw) of the electronic device in the world coordinate system, that is, pose information of the electronic device relative to the origin of the world coordinate system.
  • the pose information of the object to be scanned may be the pose information (Tow) of the object to be scanned in the world coordinate system, that is, the pose information of the object to be scanned relative to the origin of the world coordinate system.
  • the pose information of the electronic device and the object to be scanned may be pose information (Tco) of the electronic device relative to the object to be scanned.
  • In step S103, the second pose information is determined according to the first image, the space model and the first pose information, wherein the second pose information is the pose information of the electronic device and/or the object to be scanned.
  • For each frame of the first image, the first pose information is used when determining the corresponding second pose information, and it can be reused until it is updated. Because the first pose information is used, the user is spared the operation of manually aligning the model with the object to be scanned, which improves the efficiency and accuracy of obtaining the second pose information and thereby the efficiency and accuracy of tracking the object to be scanned.
  • The first pose information can be determined by a detector or detection module, which obtains an image scanned by the electronic device as the second image and determines the first pose information according to the second image and the space model; that is, the detector or detection module obtains the tracking starting point and guides the tracker to track the object to be scanned. The second pose information can be determined by a tracker or tracking module, which obtains an image scanned by the electronic device as the first image and determines the second pose information using the first image, the space model and the first pose information; that is, the tracker or tracking module tracks the object to be scanned.
  • When determining the first pose information, only the second image and the space model are available, with no other guidance information; when determining the second pose information, the first pose information is used in addition to the first image and the space model. The first pose information is therefore slower, that is, less efficient, to determine than the second pose information. Determining the first pose information improves the accuracy of the second pose information, while reusing the first pose information improves efficiency.
  • A frame of image scanned by the electronic device can be used as the first image, as the second image, or as both at the same time. When the first pose information is missing or invalid, the scanned image can be used as the second image to determine the first pose information; when the first pose information exists and is valid, so that it does not need to be determined or updated, the scanned image can be used as the first image. When a frame has been used as the second image to determine the first pose information and the electronic device has not yet scanned the next frame (for example, it has not moved relative to the object to be scanned, or the next capture period has not yet arrived), the same frame can continue to be used as the first image for determining the second pose information.
  • In step S104, the second pose information is output in response to the second pose information and the first pose information meeting the preset first condition.
  • an error threshold may be preset, and a first condition may be preset that an error between the second pose information and the first pose information is smaller than the above error threshold.
  • When comparing, poses of the same type are compared: the pose information of the electronic device in the world coordinate system in the first pose information can be compared with that in the second pose information, and the pose information of the object to be scanned in the world coordinate system in the first pose information can be compared with that in the second pose information.
  • The second pose information and the first pose information meeting the first condition means the two are consistent and both are valid poses, so the second pose information is output, that is, the second pose information of this frame of the first image is output, and the first pose information can continue to be used to determine the second pose information of the next frame of the first image.
  • The second pose information is more comprehensive than the first pose information, is specific to each frame of the first image, and is efficient to determine, so outputting the second pose information is more convenient for tracking the object to be scanned.
  • The second pose information and the first pose information not meeting the first condition indicates that the two are inconsistent, so at least one of the two poses is invalid. The second pose information therefore cannot be output as a valid pose, that is, no valid pose is obtained for this frame of the first image, and the first pose information cannot continue to be used to determine the second pose information of the next frame; it needs to be updated, and at this point the first pose information can be determined to be invalid. Updating the first pose information means reacquiring the second image, re-determining the first pose information from the reacquired second image, and deleting the original first pose information.
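  • A minimal sketch of one such first-condition check, assuming both poses are 4x4 homogeneous matrices of the same type (e.g. both Tcw); the application does not fix the error metric, so rotation angle plus translation distance, with self-chosen thresholds, is used here as one plausible choice.

```python
import numpy as np


def poses_consistent(T_a, T_b, max_rot_deg=5.0, max_trans=0.05):
    """First-condition check: is the error between two poses below thresholds?

    T_a, T_b: (4, 4) homogeneous poses of the same type (e.g. both Tcw).
    """
    # Relative transform between the two poses.
    T_rel = np.linalg.inv(T_a) @ T_b
    # Rotation error: geodesic angle of the relative rotation.
    cos_angle = (np.trace(T_rel[:3, :3]) - 1.0) / 2.0
    rot_err = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    # Translation error: Euclidean distance between the two positions.
    trans_err = np.linalg.norm(T_rel[:3, 3])
    return rot_err < max_rot_deg and trans_err < max_trans
```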
  • a corresponding augmented reality rendering effect may be presented according to the second pose information.
  • In the embodiments of the present application, in response to the absence or invalidity of the first pose information, the second image is acquired and the first pose information is determined according to the second image and the space model; the second pose information is then determined according to the first image, the space model and the first pose information; finally, in response to the second pose information and the first pose information meeting the preset first condition, the second pose information is output; otherwise, the first pose information is determined to be invalid. The first pose information is determined according to the space model and the second image obtained by the electronic device scanning the object to be scanned, and once determined it can be reused to determine the second pose information corresponding to each first image; it is not updated until the second pose information and the first pose information fail to meet the first condition. The efficiency and accuracy of pose acquisition are thereby improved, which benefits the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
  • In some embodiments, the first pose information may be determined according to the second image and the space model as follows. First, at least one image frame corresponding to the second image in the space model is acquired, and first matching information between the feature points of the second image and the feature points of the at least one image frame is determined (since the feature points of the second image and of the image frame are both two-dimensional points, the first matching information is a two-dimensional to two-dimensional (2D-2D) match). Next, the point cloud corresponding to the at least one image frame in the space model is acquired, and second matching information between the feature points of the second image and the three-dimensional points of the point cloud is determined according to the first matching information (since the feature points of the second image are two-dimensional points, the second matching information is a two-dimensional to three-dimensional (2D-3D) match). Finally, the first pose information is determined according to the first matching information and the second matching information.
  • FIG. 1B shows a schematic diagram of a system architecture to which the pose acquisition method of the embodiment of the present disclosure can be applied; as shown in FIG. 1B , the system architecture includes: a pose acquisition terminal 201 , a network 202 and an electronic device 203 .
  • The pose acquisition terminal 201 and the electronic device 203 establish a communication connection through the network 202; the electronic device 203 reports the images scanned for the object to be scanned to the pose acquisition terminal 201 through the network 202, and the pose acquisition terminal 201 acquires the first image and the space model of the object to be scanned.
  • the pose acquisition terminal 201 uploads the output second pose information to the network 202 .
  • the electronic device 203 may include an image acquisition device or an image scanning device, and the pose acquisition terminal 201 may include a vision processing device capable of processing visual information or a remote server.
  • the network 202 may be connected in a wired or wireless manner.
  • the electronic device 203 can communicate with the visual processing device through a wired connection, such as performing data communication through a bus;
  • the electronic device 203 can perform data interaction with a remote server through a wireless network.
  • the electronic device 203 may be a vision processing device with a video capture module, or a host with a camera.
  • the pose acquisition method of the embodiment of the present disclosure may be executed by the electronic device 203, and the above-mentioned system architecture may not include the network 202 and the server.
  • When acquiring at least one image frame corresponding to the second image in the space model, the similarity between each image frame in the space model and the second image can be determined first, and an image frame whose similarity with the second image is higher than a preset similarity threshold is then determined as an image frame corresponding to the second image.
  • The similarity threshold is preset: the higher the threshold, the fewer image frames corresponding to the second image are screened out; the lower the threshold, the more image frames are screened out.
  • the pose information of the image frame corresponding to the second image is the same as or similar to the pose information of the second image.
  • the Euclidean distance between the feature points of the image frame and the feature points of the second image can be calculated, and then the similarity can be obtained according to the Euclidean distance.
  • In some embodiments, the image frames in the space model can be converted into image retrieval information, enough feature points of the second image can be extracted, and image retrieval can then be used to find the image frames that meet the similarity threshold.
  • Descriptors of all image frames can be clustered layer by layer through a clustering algorithm (such as a K-means clustering (k-means) algorithm), so as to obtain image retrieval information composed of words representing these descriptors.
  • The image retrieval method consists of setting the condition that the similarity with the feature points of the second image exceeds the similarity threshold, traversing each entry in the image retrieval information with this condition, filtering out the entries that meet it, and taking the image frames corresponding to the filtered entries as the image frames whose similarity with the second image is higher than the similarity threshold.
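  • A toy version of this descriptor clustering and retrieval, assuming ORB descriptors and a flat k-means vocabulary; the application clusters layer by layer into a vocabulary tree, which is simplified away here, and the word count and threshold are arbitrary assumptions.

```python
import cv2
import numpy as np


def build_vocabulary(frame_descriptors, k=64):
    """Cluster all keyframe descriptors into k visual 'words' (flat k-means)."""
    data = np.vstack(frame_descriptors).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1e-3)
    _, _, words = cv2.kmeans(data, k, None, criteria, 3, cv2.KMEANS_PP_CENTERS)
    return words  # (k, 32) cluster centres


def bow_histogram(desc, words):
    """Represent an image as a normalised histogram over its nearest words."""
    d = desc.astype(np.float32)
    dists = ((d[:, None, :] - words[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(dists.argmin(1), minlength=len(words)).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-9)


def retrieve_frames(query_desc, frame_descriptors, words, sim_threshold=0.6):
    """Image frames whose similarity with the second image exceeds the threshold."""
    q = bow_histogram(query_desc, words)
    sims = [float(q @ bow_histogram(d, words)) for d in frame_descriptors]
    return [i for i, s in enumerate(sims) if s > sim_threshold]
```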
  • When determining the first matching information between the feature points of the second image and the feature points of the at least one image frame: first, the feature points and descriptors of the second image and the feature points and descriptors of the image frame are obtained; then the initial matching information between the feature points of the second image and the feature points of the image frame is determined according to the descriptors of the second image and the descriptors of the image frame; then the fundamental matrix and/or essential matrix of the second image and the image frame is determined according to the initial matching information; finally, the initial matching information is filtered according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  • For each descriptor in the second image, the descriptor with the closest Hamming distance can be found in the image frame, and conversely, for each descriptor in the image frame, the descriptor with the closest Hamming distance is found in the second image. If a descriptor in the second image and a descriptor in the image frame are each other's closest descriptor in Hamming distance, the two descriptors are considered to match, and the two feature points corresponding to them are determined to match; all matching feature points constitute the initial matching information.
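  • A sketch of this mutual nearest-neighbour test using OpenCV's brute-force Hamming matcher, whose crossCheck flag implements exactly the "closest to each other" criterion; ORB features stand in here as an assumed descriptor, since the application does not name one.

```python
import cv2


def initial_matching(img_a, img_b):
    """Mutual nearest-neighbour matching of binary descriptors (Hamming)."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, desc_a = orb.detectAndCompute(img_a, None)
    kp_b, desc_b = orb.detectAndCompute(img_b, None)
    # crossCheck=True keeps a match only if the two descriptors are
    # each other's closest descriptor in Hamming distance.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(desc_a, desc_b)
    return kp_a, kp_b, sorted(matches, key=lambda m: m.distance)
```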
  • When determining the fundamental matrix and/or the essential matrix, the Random Sample Consensus (RANSAC) algorithm may be used for the calculation.
  • Multiple fundamental matrices and/or essential matrices can also be calculated by RANSAC and the 5-point algorithm, the inliers of each matrix determined, and the fundamental matrix and/or essential matrix with the largest number of inliers taken as the final result. If two matching feature points conform to the fundamental matrix and/or essential matrix, the two feature points are inliers; conversely, if they do not conform, they are outliers. When the fundamental matrix and/or essential matrix is used to filter the initial matching information, the inliers in the initial matching information are retained, that is, the outliers are deleted.
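  • The sketch below shows how a RANSAC-estimated essential matrix can filter the initial matches down to inliers; OpenCV's findEssentialMat runs the 5-point algorithm inside RANSAC and returns an inlier mask. The camera intrinsics K are assumed to be known.

```python
import cv2
import numpy as np


def filter_matches_with_essential(kp_a, kp_b, matches, K):
    """Keep only the inliers of a RANSAC-estimated essential matrix."""
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])
    # 5-point algorithm inside RANSAC; mask flags the inlier matches.
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    inliers = [m for m, keep in zip(matches, mask.ravel()) if keep]
    # The surviving inliers play the role of the first matching information.
    return E, inliers
```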
  • When determining the second matching information according to the first matching information, the feature points of the second image that match feature points of the image frame can be matched with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, so as to obtain the second matching information. That is to say, the feature points of the second image are matched with the 3D points of the point cloud by using the feature points of the image frame as an intermediary.
  • When determining the first pose information according to the first matching information and the second matching information, the gravitational acceleration of the electronic device may be obtained first; the first pose information is then determined according to the first matching information, the second matching information and the gravitational acceleration.
  • the electronic device may have an acceleration sensor and/or a gyroscope, and thus may obtain the acceleration of gravity from the acceleration sensor and/or the gyroscope.
  • The first pose information can be solved from the second matching information (2D-3D) using the PnP (perspective-n-point) algorithm, and from the first matching information (2D-2D) by decomposing the fundamental matrix and/or essential matrix.
  • the constraint condition of the acceleration of gravity can be added, that is, the acceleration of gravity is used to constrain the rotation angle (such as roll angle and pitch angle) in the pose of the electronic device.
  • The above two solving processes can be combined in hybrid form to solve the first pose information, that is, the first pose information is solved by jointly using the first matching information, the second matching information and the gravitational acceleration.
  • Each correspondence in the first matching information can provide a constraint of 1 degree of freedom, each correspondence in the second matching information can provide constraints of 2 degrees of freedom, and the gravitational acceleration provides 1 degree of freedom; a certain number of first matching correspondences and second matching correspondences can be randomly selected and combined with the gravitational acceleration to form six degrees of freedom, from which the first pose information is solved.
  • An equation can be constructed from the first matching information through the Plücker coordinate relationship, and from the second matching information through the camera projection matrix model, and the simultaneous equations can then be solved with a solver (such as a Gröbner basis solver). Alternatively, the above two solving processes can each be used independently in a robust manner through RANSAC to solve the first pose information, that is, according to different frequency ratios, the first matching information with the gravitational acceleration and the second matching information with the gravitational acceleration are alternately selected to solve the first pose information, and the error between the obtained first pose information and all the matching information is calculated. When the number of inliers is large enough (for example, exceeds a certain threshold), the first pose information at this time is determined to be accurate, and the solution ends.
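  • The application's hybrid gravity-constrained solver is not spelled out, so the sketch below takes a simpler route under stated assumptions: solve the 2D-3D matches with RANSAC PnP and use the IMU-measured gravity only to validate the recovered roll and pitch. The world "down" axis convention and all thresholds are assumptions.

```python
import cv2
import numpy as np


def solve_pose_with_gravity(pts_3d, pts_2d, K, gravity_dir, max_grav_deg=3.0):
    """Solve a pose from 2D-3D matches, then validate it against gravity.

    pts_3d: (N, 3) model points; pts_2d: (N, 2) image points; K: intrinsics;
    gravity_dir: unit gravity vector measured in the camera frame (from IMU).
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float32), pts_2d.astype(np.float32), K, None,
        iterationsCount=200, reprojectionError=3.0)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    # World 'down' axis (assumed to be -Z) expressed in the camera frame.
    predicted = R @ np.array([0.0, 0.0, -1.0])
    angle = np.degrees(np.arccos(np.clip(predicted @ gravity_dir, -1.0, 1.0)))
    # Accept only if roll/pitch agree with the measured gravity direction.
    return (R, tvec, inliers) if angle < max_grav_deg else None
```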
  • In this way, by introducing the gravitational acceleration, the obtained first pose information is more accurate, which in turn makes the second pose information obtained based on the first pose information more accurate.
  • the first pose information may be determined by the detector or the detection module for use by the tracker or the tracking module.
  • In some embodiments, the second pose information may be determined according to the first image, the space model and the first pose information as follows. First, third pose information corresponding to the first image is determined according to the first pose information and the first image, wherein the third pose information is the pose information of the electronic device relative to the object to be scanned. Next, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model is determined according to the third pose information (since the feature points of the first image are two-dimensional points, the third matching information is a 2D-3D match). Next, in response to the third matching information meeting the preset second condition, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model is determined according to the third pose information (since the feature points of the first image and of the image frame are two-dimensional points, the fourth matching information is a 2D-2D match). Finally, the second pose information is determined according to the third matching information and the fourth matching information.
  • The first pose information may include fourth pose information, where the fourth pose information is the pose information (Tow) of the object to be scanned in the world coordinate system. When the object to be scanned remains stationary, the fourth pose information remains unchanged.
  • When determining the third pose information, the fifth pose information can first be obtained from the positioning module according to the first image, wherein the fifth pose information is the pose information (Tcw) of the electronic device in the world coordinate system; the third pose information is then determined according to the fourth pose information and the fifth pose information.
  • The positioning module can be a Visual-Inertial Simultaneous Localization and Mapping (VISLAM) module, which can output the pose information of the electronic device in the world coordinate system in real time during operation.
  • The pose information of the object to be scanned in the world coordinate system is the absolute pose of the object, and the pose information of the electronic device in the world coordinate system is the absolute pose of the device. From these absolute poses in a unified coordinate system, the relative pose of the two can be determined, that is, the pose information (Tco) of the electronic device relative to the object to be scanned, or the pose information (Toc) of the object to be scanned relative to the electronic device. Here the pose information (Tco) of the electronic device relative to the object to be scanned is selected as the third pose information, although the pose information (Toc) of the object to be scanned relative to the electronic device could also be selected.
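  • Written as homogeneous 4x4 transforms, this composition is a single matrix product. The convention below, where Tcw maps world coordinates to camera coordinates and Tow maps object coordinates to world coordinates, is an assumption, since the application only names the matrices.

```python
import numpy as np


def third_pose(T_cw: np.ndarray, T_ow: np.ndarray) -> np.ndarray:
    """Tco: pose of the electronic device relative to the object to be scanned.

    T_cw: fifth pose information (device in world), maps world -> camera.
    T_ow: fourth pose information (object in world), maps object -> world.
    """
    T_co = T_cw @ T_ow  # object -> world -> camera
    return T_co         # Toc would be np.linalg.inv(T_co)
```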
  • When determining the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model: first, the point cloud of the space model is projected onto the first image according to the third pose information to form a plurality of projection points, and a descriptor of each projection point is extracted; then the feature points and descriptors of the first image are extracted; finally, the third matching information between the feature points and the three-dimensional points of the point cloud is determined according to the descriptors of the feature points and the descriptors of the projection points.
  • The third pose information represents the relative pose of the electronic device that captured the first image and the object to be scanned, that is, the direction and angle between the electronic device and the object, so the camera model can be used to project the point cloud onto the first image.
  • During modeling, each 3D point of the point cloud corresponds to at least one feature point of an image frame; the descriptors of all the feature points corresponding to a 3D point are extracted and fused to obtain the descriptor of that 3D point's projection point.
  • When determining the third matching information, for the descriptor of each feature point, the projection-point descriptor with the closest Hamming distance can be found, and conversely, for the descriptor of each projection point, the feature-point descriptor with the closest Hamming distance is found. If the descriptor of a feature point and the descriptor of a projection point are each other's closest descriptor in Hamming distance, the two descriptors are considered to match; the feature point and the 3D point corresponding to the two descriptors are then matched, and all matched feature points and 3D points constitute the third matching information, as sketched below.
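  • A sketch of the projection step with cv2.projectPoints followed by the same mutual Hamming matching, now between the first image's descriptors and the fused projection-point descriptors (assumed given); the pixel-radius gate on the projections is our own addition, not something the application specifies.

```python
import cv2
import numpy as np


def match_image_to_cloud(T_co, K, cloud_pts, cloud_desc, feat_pts, feat_desc):
    """Third matching information: first-image feature points vs. 3D points.

    T_co: (4, 4) third pose information; cloud_desc: fused binary descriptor
    per 3D point; feat_pts/feat_desc: feature points/descriptors of the image.
    """
    rvec, _ = cv2.Rodrigues(T_co[:3, :3])
    tvec = T_co[:3, 3]
    # Project the point cloud onto the first image to form projection points.
    proj, _ = cv2.projectPoints(cloud_pts.astype(np.float32), rvec, tvec, K, None)
    proj = proj.reshape(-1, 2)
    # Mutual nearest-neighbour Hamming matching, as in the 2D-2D case.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(feat_desc, cloud_desc)
    # Keep a match only if the 3D point projects near the feature point.
    return [(m.queryIdx, m.trainIdx) for m in matches
            if np.linalg.norm(proj[m.trainIdx] - feat_pts[m.queryIdx]) < 8.0]
```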
  • the second condition may be that the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold.
  • the matching combination includes a pair of feature points and three-dimensional points that match each other.
  • The number of matching combinations reflects, to a certain extent, the validity of the first pose information: if the first pose information is invalid, the number of matching combinations inevitably decreases or vanishes; if the first pose information is valid, the number of matching combinations is necessarily larger.
  • The judgment of the second condition is a pre-judgment step before the validity of the first pose information is judged in step S104. If the third matching information does not meet the second condition, that is, the number of matching combinations is less than or equal to the preset number threshold, then the first pose information and the second pose information cannot meet the first condition; there is no need to perform the subsequent steps of solving the second pose information, and the first pose information can be directly determined to be invalid. If the third matching information meets the second condition, that is, the number of matching combinations is greater than the preset number threshold, it is not yet possible to decide whether the first pose information is valid, so the second pose information is solved, and the validity of the first pose information is judged by whether the first pose information and the second pose information meet the first condition.
  • When determining the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information: first, at least one image frame matching the third pose information is determined according to the third pose information and the pose information of each image frame of the space model; then the feature points and descriptors of the first image and the feature points and descriptors of each matched image frame are acquired; finally, the fourth matching information between the feature points of the first image and the feature points of the image frames is determined according to the descriptors of the first image and the descriptors of the image frames.
  • Each image frame has pose information (such as the sixth pose information below) representing the relative pose of the electronic device that acquired the image frame and the object to be scanned, that is, the image frame can be obtained when the electronic device is in that relative pose; the third pose information represents the relative pose of the electronic device that obtained the first image and the object to be scanned, that is, the first image can be obtained when the electronic device is in that relative pose. If the pose information of an image frame is the same as or similar to that of the first image (for example, the angle difference is within a preset range), the image frame can be determined to match the first image, as sketched below.
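  • One way to realise "same or similar pose, angle difference within a preset range" is the geodesic rotation distance between each frame's sixth pose and the third pose; the angle threshold below is an assumption.

```python
import numpy as np


def matching_frames(T_co, frame_poses, max_angle_deg=20.0):
    """Select image frames whose pose is similar to the third pose information.

    frame_poses: list of (4, 4) sixth-pose matrices, one per image frame.
    """
    selected = []
    for i, T in enumerate(frame_poses):
        # Geodesic angle between the frame's rotation and the third pose's.
        R_rel = T[:3, :3].T @ T_co[:3, :3]
        cos_a = (np.trace(R_rel) - 1.0) / 2.0
        angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        if angle < max_angle_deg:  # angle difference within the preset range
            selected.append(i)
    return selected
```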
  • For each descriptor in the first image, the descriptor with the closest Hamming distance can be found in the image frame, and conversely, for each descriptor in the image frame, the descriptor with the closest Hamming distance is found in the first image. If a descriptor in the first image and a descriptor in the image frame are each other's closest descriptor in Hamming distance, the two descriptors are considered to match, and the two feature points corresponding to them are determined to match; all the matched feature points form the fourth matching information.
  • When determining the second pose information according to the third matching information and the fourth matching information, the gravitational acceleration of the electronic device may be obtained first; the second pose information is then determined according to the third matching information, the fourth matching information and the gravitational acceleration.
  • the electronic device may have an acceleration sensor and/or a gyroscope, and thus may obtain the acceleration of gravity from the acceleration sensor and/or the gyroscope.
  • The second pose information can be solved from the third matching information (2D-3D) using the PnP algorithm, and from the fourth matching information (2D-2D) by decomposing the fundamental matrix and/or essential matrix.
  • the constraint condition of the acceleration of gravity can be added, that is, the acceleration of gravity is used to constrain the rotation angle (such as roll angle and pitch angle) in the pose of the electronic device.
  • The above two solving processes can be combined in hybrid form to solve the second pose information, that is, the second pose information is solved by jointly using the third matching information, the fourth matching information and the gravitational acceleration.
  • Each correspondence in the fourth matching information can provide a constraint of 1 degree of freedom, each correspondence in the third matching information can provide constraints of 2 degrees of freedom, and the gravitational acceleration provides 1 degree of freedom; a certain number of third matching correspondences and fourth matching correspondences can be randomly selected and combined with the gravitational acceleration to form six degrees of freedom, from which the second pose information is solved.
  • the fourth matching information can be used to construct an equation through the relationship of the Pluck coordinate system, and the The third matching information constructs an equation through the camera projection matrix model, and then solves multiple simultaneous equations through a solver (such as Grobner Basis Solution); or uses the above two solving processes independently through RANSAC to solve the problem in a robust manner.
  • the second pose information that is, according to different frequency ratios, alternately select the third matching information and the acceleration of gravity to solve the second pose information, and the fourth matching information and the acceleration of gravity to solve the second pose information, and the obtained Error calculation is performed between the second pose information and all matching information.
  • the number of interior points is large enough (for example, exceeds a certain threshold)
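The alternating RANSAC scheme described in the preceding bullets might be sketched as follows. This is a simplified reading, not the patent's implementation: the gravity constraint and the Plücker/Gröbner-basis formulation are omitted, hypotheses come from OpenCV's P3P and essential-matrix solvers, the 2:1 alternation ratio is arbitrary, and candidates are scored by reprojection inliers over the 2D-3D matches only.

```python
import cv2
import numpy as np

def alternating_pose(pts3d, pts2d_3d, pts2d_a, pts2d_b, K,
                     thresh_px=3.0, min_inliers=50):
    """pts3d/pts2d_3d: float32 2D-3D matches; pts2d_a/pts2d_b: 2D-2D matches."""
    rng = np.random.default_rng(0)
    for it in range(100):
        if it % 3 != 2:  # two PnP hypotheses for every epipolar hypothesis
            idx = rng.choice(len(pts3d), 4, replace=False)  # P3P needs 4 points
            ok, rvec, tvec = cv2.solvePnP(pts3d[idx], pts2d_3d[idx], K, None,
                                          flags=cv2.SOLVEPNP_P3P)
            if not ok:
                continue
        else:
            E, _ = cv2.findEssentialMat(pts2d_a, pts2d_b, K, cv2.RANSAC)
            if E is None:
                continue
            _, R, t, _ = cv2.recoverPose(E, pts2d_a, pts2d_b, K)
            rvec, tvec = cv2.Rodrigues(R)[0], t  # translation here is scale-free
        # Score the hypothesis against all 2D-3D matches by reprojection error.
        proj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)
        err = np.linalg.norm(proj.reshape(-1, 2) - pts2d_3d, axis=1)
        if (err < thresh_px).sum() >= min_inliers:  # "inlier count large enough"
            return rvec, tvec
    return None
```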
  • The second pose information may be determined by a tracker or tracking module, and the first pose information obtained by a detector or detection module is used in that determination. Since the detector determines the first pose information with higher accuracy but lower efficiency than the tracker, the detector is used to determine the (reusable) first pose information while the tracker frequently outputs the second pose information. The detector thus supplies the tracker's tracking starting point, which improves the accuracy of pose acquisition, avoids the cumbersome operation and inaccurate tracking caused by manually aligning the space model with the object to be scanned, and preserves the efficiency of pose acquisition.
  • The spatial model of the object to be scanned can be obtained as follows: first, obtain multiple frames of modeling images scanned by the electronic device for the object to be scanned, and synchronously obtain the sixth pose information corresponding to each frame; next, match the feature points of the multi-frame modeling images and triangulate the feature points according to the matching results to form a point cloud; next, determine at least one image frame from the multi-frame modeling images and determine the point cloud corresponding to each image frame; finally, construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model.
  • For feature matching, inter-frame descriptor matching or optical-flow tracking can be used.
  • During triangulation, the position of a landmark in three-dimensional space is tracked across consecutive frames through frame-to-frame matching. From the matching relations between these consecutive frames and the pose information of each frame, a system of equations can be constructed, and solving it yields the depth of the landmark position.
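For the triangulation step, a minimal two-view sketch could look like the following, assuming a calibrated pinhole model with P1 and P2 the 3x4 projection matrices built from each frame's pose; the multi-frame equation system described above generalizes the same idea to more than two observations.

```python
import cv2
import numpy as np

def triangulate_landmark(P1, P2, x1, x2):
    """x1, x2: pixel observations (u, v) of one landmark in two frames."""
    X = cv2.triangulatePoints(P1, P2,
                              np.float32(x1).reshape(2, 1),
                              np.float32(x2).reshape(2, 1))
    # Homogeneous -> Euclidean: one 3D point of the model point cloud.
    return (X[:3] / X[3]).ravel()
```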
  • The electronic device scans modeling images at a high frequency (for example, 30 hertz (Hz)), and only part of the modeling images need be selected as image frames, so that the file size of the whole model does not grow too large; this benefits subsequent file sharing and reduces the memory consumption of the model when it runs on a mobile phone.
  • In one example, the acquisition process of the spatial model is shown in FIG. 3.
  • During actual scanning, the user obtains a three-dimensional bounding box surrounding the object through the application's interactive interface, which guides the user to model around the selected three-dimensional object 301.
  • As the user moves, the system builds point clouds and image key-frame information of the model at various angles (for example, model image frames 31, 32 through 38 shown in FIG. 3).
  • Finally, all the point cloud information inside the 3D bounding box is saved as the 3D point cloud model of the object.
  • The space model includes the point cloud inside the three-dimensional box and the modeling image frames, and each image frame is annotated with sixth pose information.
  • The sixth pose information can be the pose information of the electronic device relative to the object to be scanned.
  • To obtain it, the pose information of the electronic device in the world coordinate system can first be obtained from a positioning module in the electronic device, such as a VISLAM module, and then combined with the pre-acquired pose information of the object to be scanned in the world coordinate system to obtain the sixth pose information.
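The composition of the sixth pose information from the two world-frame poses admits a compact reading. Assuming the convention that T_xy is the 4x4 rigid transform taking y-frame coordinates into x-frame coordinates (so T_cw is the VISLAM output and T_ow the pre-acquired object pose — conventions the patent does not fix), one consistent formula is:

```python
import numpy as np

def sixth_pose(T_cw, T_ow):
    # p_c = T_cw @ p_w and p_o = T_ow @ p_w imply p_c = (T_cw @ inv(T_ow)) @ p_o,
    # i.e. the camera-relative-to-object pose T_co used as sixth pose information.
    return T_cw @ np.linalg.inv(T_ow)
```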
  • In some embodiments, a terminal device can scan a product using the pose acquisition method provided in this application.
  • The product carries a product description and an effect demonstration. The terminal device starts a scanning program that runs the pose acquisition method provided by this application, so that the first pose information is obtained and the second pose information is output while the terminal device scans the product; when the second pose information is output, the program can use augmented reality technology to present the corresponding product description and/or effect demonstration on the terminal's display screen, according to the mapping between the second pose information and that description and/or demonstration.
  • For example, when the product is a refrigerator and the second pose information indicates that the terminal device is facing the refrigerator's human-machine interaction interface, augmented display technology may present an explanation and/or demonstration of the interaction process.
  • FIG. 4 shows a schematic structural diagram of the pose acquisition device 400, which includes:
  • an acquisition module 401, configured to acquire a first image and a spatial model of an object to be scanned, where the first image is an image obtained by an electronic device scanning the object to be scanned;
  • a first pose module 402, configured to acquire a second image in response to first pose information being missing or invalid, and to determine the first pose information according to the second image and the space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
  • a second pose module 403, configured to determine second pose information according to the first image, the space model, and the first pose information, where the second pose information is pose information of the electronic device and/or the object to be scanned;
  • an output module 404, configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition, and otherwise to determine that the first pose information is invalid.
  • In some embodiments, the first pose module is further configured to: acquire at least one image frame in the space model corresponding to the second image, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire the point cloud in the space model corresponding to the at least one image frame and, according to the first matching information, determine second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
  • When configured to acquire at least one image frame corresponding to the second image in the space model, the first pose module is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
  • When configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, the first pose module is further configured to: acquire the feature points and descriptors of the second image and of the image frame; determine initial matching information between their feature points according to the two sets of descriptors; determine the fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  • When configured to determine the second matching information according to the first matching information, the first pose module is further configured to: match the feature points of the second image that match the feature points of the image frame against the three-dimensional points of the point cloud corresponding to those feature points, to obtain the second matching information.
  • When configured to determine the first pose information according to the first matching information and the second matching information, the first pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
  • In some embodiments, the second pose module is further configured to: determine, according to the first pose information and the first image, third pose information corresponding to the first image, where the third pose information is the pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
  • In some embodiments, the first pose information includes fourth pose information, where the fourth pose information is the pose information of the object to be scanned in the world coordinate system. When configured to determine the third pose information corresponding to the first image, the second pose module is further configured to: obtain, according to the first image, fifth pose information from a positioning module, where the fifth pose information is the pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
  • When configured to determine the third matching information according to the third pose information, the second pose module is further configured to: project the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
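Projecting the model point cloud into the first image under the third pose information is standard pinhole projection. A hedged sketch follows, where the rvec/tvec Rodrigues form of the third pose and the intrinsics K are assumptions, since the patent does not fix a parameterization:

```python
import cv2
import numpy as np

def project_point_cloud(points3d, rvec, tvec, K, width, height):
    proj, _ = cv2.projectPoints(points3d, rvec, tvec, K, None)
    proj = proj.reshape(-1, 2)
    inside = ((proj[:, 0] >= 0) & (proj[:, 0] < width) &
              (proj[:, 1] >= 0) & (proj[:, 1] < height))
    # Pixel locations of the projection points, plus the indices of the 3D
    # points they came from, for the subsequent descriptor matching.
    return proj[inside], np.flatnonzero(inside)
```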
  • When configured to determine the fourth matching information according to the third pose information, the second pose module is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image and of each matched image frame; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the two sets of descriptors.
  • When configured to determine the second pose information according to the third matching information and the fourth matching information, the second pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
  • The second pose information and the first pose information meeting the preset first condition includes: the error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, where a matching combination includes a mutually matched pair of a feature point and a three-dimensional point.
  • When configured to acquire the spatial model of the object to be scanned, the acquisition module is further configured to: obtain multiple frames of modeling images scanned by the electronic device for the object to be scanned, and synchronously obtain the sixth pose information corresponding to each frame; match the feature points of the multi-frame modeling images and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the modeling images and the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model.
  • At least one embodiment of the present application provides an electronic device; please refer to FIG. 5, which shows its structure. The electronic device 500 includes a memory 501 and a processor 502: the memory stores computer instructions executable on the processor, and the processor is configured, when executing those instructions, to acquire pose information based on the method described in any item of the first aspect.
  • At least one embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method described in any item of the first aspect. The computer-readable storage medium may be a volatile or non-volatile computer-readable storage medium.
  • At least one embodiment of the present application provides a computer program product including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the method described in any item of the first aspect.
  • The present application relates to a pose acquisition method and device, electronic equipment, and a storage medium. The method includes: acquiring a first image, where the first image is an image obtained by an electronic device scanning an object to be scanned; in response to first pose information being missing or invalid, acquiring a second image and determining the first pose information according to the second image and a space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, the first pose information is pose information of the electronic device and/or the object to be scanned, and the second pose information is pose information of the electronic device and/or the object to be scanned; determining the second pose information according to the first image, the space model, and the first pose information; and outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Studio Devices (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The embodiments of the present application relate to a pose acquisition method and device, an electronic device, a storage medium, and a program. The method includes: acquiring a first image, where the first image is an image obtained by an electronic device scanning an object to be scanned; in response to first pose information being missing or invalid, acquiring a second image and determining the first pose information according to the second image and a space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, the first pose information is pose information of the electronic device and/or the object to be scanned, and the second pose information is pose information of the electronic device and/or the object to be scanned; determining the second pose information according to the first image, the space model, and the first pose information; and outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.

Description

Pose acquisition method and device, electronic equipment, storage medium, and program
CROSS-REFERENCE TO RELATED APPLICATION
This patent application claims priority to Chinese patent application No. 202110510890.0, filed on 11 May 2021 by applicant Zhejiang SenseTime Technology Development Co., Ltd. under the title "Pose acquisition method, device, electronic device and storage medium", the entirety of which is incorporated into the present application by reference.
TECHNICAL FIELD
The present application relates to the technical field of object recognition, and in particular to a pose acquisition method and device, an electronic device, a storage medium, and a program.
BACKGROUND
With the development of artificial intelligence technology, augmented reality (AR) technology has gradually been applied in many fields of production and daily life. Using augmented reality technology for three-dimensional object recognition can present an augmented reality rendering effect based on the recognition results, but in the related art the efficiency of recognizing three-dimensional objects with augmented reality technology is low and the accuracy is poor.
SUMMARY
The present application provides a pose acquisition method and device, an electronic device, a storage medium, and a program.
According to a first aspect of the embodiments of the present application, a pose acquisition method is provided, including:
acquiring a first image and a spatial model of an object to be scanned, where the first image is an image obtained by an electronic device scanning the object to be scanned;
in response to first pose information being missing or invalid, acquiring a second image, and determining the first pose information according to the second image and the space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
determining second pose information according to the first image, the space model, and the first pose information, where the second pose information is pose information of the electronic device and/or the object to be scanned;
outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
In some embodiments, the method further includes: determining that the first pose information is invalid in response to the second pose information and the first pose information not meeting the preset first condition. In this way, the efficiency and accuracy of pose information acquisition can be improved, which in turn benefits the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
In some embodiments, determining the first pose information according to the second image and the space model includes: acquiring at least one image frame in the space model corresponding to the second image, and determining first matching information between the feature points of the second image and the feature points of the at least one image frame; acquiring the point cloud in the space model corresponding to the at least one image frame, and determining, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determining the first pose information according to the first matching information and the second matching information.
In some embodiments, acquiring at least one image frame in the space model corresponding to the second image includes: determining the similarity between each image frame in the space model and the second image; and determining an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image. In this way, the image frames corresponding to the second image can be screened out more accurately.
In some embodiments, determining the first matching information between the feature points of the second image and the feature points of the at least one image frame includes: acquiring the feature points and descriptors of the second image, and the feature points and descriptors of the image frame; determining initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determining the fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filtering the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information. Filtering the initial matching information with the fundamental matrix and/or essential matrix allows the inliers of the initial matching information to be fully retained in the first matching information.
In some embodiments, determining, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud includes: matching the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud corresponding to the feature points of the image frame, to obtain the second matching information. In this way, with the feature points of the image frame as an intermediary, the feature points of the second image are matched with the three-dimensional points of the point cloud.
In some embodiments, determining the first pose information according to the first matching information and the second matching information includes: acquiring the gravitational acceleration of the electronic device; and determining the first pose information according to the first matching information, the second matching information, and the gravitational acceleration. This makes the obtained first pose information more accurate, and in turn the second pose information derived from it.
In some embodiments, determining the second pose information according to the first image, the space model, and the first pose information includes: determining, according to the first pose information and the first image, third pose information corresponding to the first image, where the third pose information is the pose information of the electronic device relative to the object to be scanned; determining, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determining, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determining the second pose information according to the third matching information and the fourth matching information. Introducing the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model allows the second pose information to be determined with further precision.
In some embodiments, the first pose information includes fourth pose information, where the fourth pose information is the pose information of the object to be scanned in the world coordinate system; determining the third pose information corresponding to the first image according to the first pose information and the first image includes: obtaining, according to the first image, fifth pose information from a positioning module, where the fifth pose information is the pose information of the electronic device in the world coordinate system; and determining the third pose information according to the fourth pose information and the fifth pose information. From the absolute poses of the object to be scanned and the electronic device in a common coordinate system, their relative pose can be determined quickly and accurately.
In some embodiments, determining the third matching information according to the third pose information includes: projecting the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extracting a descriptor of each projection point; extracting the feature points and descriptors of the first image; and determining the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points. In this way, the camera model can be used to project the point cloud onto the first image.
In some embodiments, determining the fourth matching information according to the third pose information includes: determining at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquiring the feature points and descriptors of the first image, and the feature points and descriptors of the image frames matching the third pose information; and determining the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame. In this way, when the pose information of an image frame is the same as or close to that of a first image (for example, the angle difference is within a preset range), the image frame can be determined to match the first image.
In some embodiments, determining the second pose information according to the third matching information and the fourth matching information includes: acquiring the gravitational acceleration of the electronic device; and determining the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration. Introducing the gravitational acceleration allows the second pose information to be determined more accurately.
In some embodiments, the second pose information and the first pose information meeting the preset first condition includes: the error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, where a matching combination includes a mutually matched pair of a feature point and a three-dimensional point. Setting the second condition on the number of matching combinations between the first image and the point cloud of the space model allows the degree of matching of the third matching information to be judged more reasonably.
In some embodiments, acquiring the spatial model of the object to be scanned includes: acquiring multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquiring the sixth pose information corresponding to each frame of modeling image; matching the feature points of the multi-frame modeling images, and triangulating the feature points according to the matching results to form a point cloud; determining at least one image frame from the multi-frame modeling images, and determining the point cloud corresponding to each image frame; and constructing the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model. In this way, the constructed space model carries more detailed information.
According to a second aspect of the embodiments of the present application, a pose acquisition device is provided, including:
an acquisition module, configured to acquire a first image and a spatial model of an object to be scanned, where the first image is an image obtained by an electronic device scanning the object to be scanned;
a first pose module, configured to acquire a second image in response to first pose information being missing or invalid, and to determine the first pose information according to the second image and the space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
a second pose module, configured to determine second pose information according to the first image, the space model, and the first pose information, where the second pose information is pose information of the electronic device and/or the object to be scanned;
an output module, configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
In some embodiments, the output module is further configured to determine that the first pose information is invalid in response to the second pose information and the first pose information not meeting the preset first condition.
In some embodiments, the first pose module is further configured to: acquire at least one image frame in the space model corresponding to the second image, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire the point cloud in the space model corresponding to the at least one image frame, and determine, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
In some embodiments, when configured to acquire at least one image frame in the space model corresponding to the second image, the first pose module is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
In some embodiments, when configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, the first pose module is further configured to: acquire the feature points and descriptors of the second image and of the image frame; determine initial matching information between their feature points according to the descriptors of the second image and the descriptors of the image frame; determine the fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
In some embodiments, when configured to determine the second matching information according to the first matching information, the first pose module is further configured to: match the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud corresponding to the feature points of the image frame, to obtain the second matching information.
In some embodiments, when configured to determine the first pose information according to the first matching information and the second matching information, the first pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
In some embodiments, the second pose module is further configured to: determine, according to the first pose information and the first image, third pose information corresponding to the first image, where the third pose information is the pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
In some embodiments, the first pose information includes fourth pose information, where the fourth pose information is the pose information of the object to be scanned in the world coordinate system; when configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, the second pose module is further configured to: obtain, according to the first image, fifth pose information from a positioning module, where the fifth pose information is the pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
In some embodiments, when configured to determine the third matching information according to the third pose information, the second pose module is further configured to: project the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extract the descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
In some embodiments, when configured to determine the fourth matching information according to the third pose information, the second pose module is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image, and the feature points and descriptors of the image frames matching the third pose information; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
In some embodiments, when configured to determine the second pose information according to the third matching information and the fourth matching information, the second pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
In some embodiments, the second pose information and the first pose information meeting the preset first condition includes: the error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, where a matching combination includes a mutually matched pair of a feature point and a three-dimensional point.
In some embodiments, when configured to acquire the spatial model of the object to be scanned, the acquisition module is further configured to: acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire the sixth pose information corresponding to each frame of modeling image; match the feature points of the multi-frame modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multi-frame modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model.
According to a third aspect of the embodiments of the present application, an electronic device is provided, including a memory and a processor; the memory is configured to store computer instructions executable on the processor, and the processor is configured to implement the method of the first aspect when executing the computer instructions.
According to a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, the method of the first aspect is implemented.
According to a fifth aspect of the embodiments of the present application, a computer program is provided, including computer-readable code; when the computer-readable code runs in an electronic device, a processor of the electronic device executes instructions configured to implement the method of the first aspect.
According to the above embodiments, a first image obtained by the electronic device scanning the object to be scanned and the spatial model of the object are acquired; in response to first pose information being missing or invalid, a second image is acquired and the first pose information is determined according to the second image and the space model; second pose information is then determined according to the first image, the space model, and the first pose information; finally, the second pose information is output in response to the second pose information and the first pose information meeting a preset first condition. Since the first pose information is determined from the second image obtained by the electronic device scanning the object to be scanned and from the space model, and once determined can be used continuously to determine the second pose information corresponding to multiple frames of first images, being updated only when the second pose information no longer meets the first condition with it, the efficiency and accuracy of pose information acquisition can be improved, i.e., the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings here are incorporated into and constitute part of this specification; they show embodiments consistent with the present application and, together with the specification, serve to explain the principles of the present application.
FIG. 1A is a flowchart of a pose information acquisition method shown in an embodiment of the present application;
FIG. 1B is a schematic diagram of a system architecture to which the pose information acquisition method of an embodiment of the present disclosure can be applied;
FIG. 2 is a schematic diagram of an electronic device capturing images, shown in an embodiment of the present application;
FIG. 3 is a schematic diagram of the acquisition process of a space model, shown in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a pose information acquisition device shown in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
DETAILED DESCRIPTION
Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description involves the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of devices and methods consistent with some aspects of the present application, as detailed in the appended claims.
The terms used in the present application are for the purpose of describing particular embodiments only and are not intended to limit the present application. The singular forms "a", "said", and "the" used in the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the present application to describe various pieces of information, such information should not be limited to these terms; these terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be called second information, and similarly second information may also be called first information. Depending on the context, the word "if" as used here may be interpreted as "when" or "upon" or "in response to determining".
In the related art, when recognizing a three-dimensional object with augmented reality technology, the electronic device displays the space model while presenting a preview image scanned for the object to be scanned, and the user must manually align the space model with the preview image of the object: a suitable viewing angle must be found at which the outline of the object as shown on the device matches the outline of the space model, before the object can be tracked by scanning on that basis. Moreover, once tracking fails, the user must return to the originally found viewing angle and re-align the space model with the preview image. The efficiency and accuracy of tracking the object are therefore both low, the operation is difficult for the user, and the experience is poor.
In a first aspect, at least one embodiment of the present application provides a pose acquisition method; please refer to FIG. 1A, which shows the flow of the method, including steps S101 to S104.
The method may be executed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA) handheld device, a computing device, a vehicle-mounted device, a wearable device, etc., and the method may be implemented by a processor invoking computer-readable instructions stored in a memory. Alternatively, the method may be executed by a server, which may be a local server, a cloud server, etc.
In step S101, a first image and a spatial model of an object to be scanned are acquired, where the first image is an image obtained by an electronic device scanning the object to be scanned.
The electronic device may be a terminal device such as a mobile phone or a tablet computer, or an image capture device such as a camera or a scanning device. When the electronic device is a terminal device, the acquisition of the first image in this step, the determination and output of the second pose information in subsequent steps, and the determination and updating of the first pose information may all likewise be executed by the terminal device. The object to be scanned may be a three-dimensional object targeted by augmented reality technology.
When the electronic device scans the object to be scanned, it can continuously obtain multiple frames of first images, i.e., an image sequence; the first image is any frame of that sequence, which is to say that the pose acquisition method provided by the embodiments of the present application can be executed for any frame of the sequence. In some possible implementations, the method can be executed for every frame of first image obtained while the electronic device scans the object, i.e., second pose information is obtained for every frame. During scanning, the object to be scanned may remain stationary while the electronic device moves around it. For example, the example of FIG. 2 shows the capture of three image frames as the electronic device moves around the object to be scanned 21: the electronic device captures one frame at the position of the frame before last 22, moves to the position of the previous frame 23 and captures another frame, then moves to the position of the current frame 24 and captures a third.
The space model includes the point cloud of the object to be scanned, at least one image frame, and pose information corresponding to each image frame (the sixth pose information mentioned below). An image frame can be understood as an image of the object to be scanned captured by the electronic device under the corresponding sixth pose information. Each image frame corresponds to part of the point cloud; the correspondence can be determined by the triangulation relations of image feature points during modeling, and also by the pose information.
In step S102, in response to first pose information being missing or invalid, a second image is acquired, and the first pose information is determined according to the second image and the space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned.
When the method first runs, the first pose information is missing and therefore needs to be determined; during operation, if the first pose information becomes invalid, it needs to be re-determined, i.e., updated.
The pose information of the electronic device may be its pose in the world coordinate system (Tcw), i.e., the pose of the electronic device relative to the origin of the world coordinate system. The pose information of the object to be scanned may be its pose in the world coordinate system (Tow), i.e., the pose of the object relative to the origin of the world coordinate system. The pose information of the electronic device and the object to be scanned may be the pose of the electronic device relative to the object to be scanned (Tco).
In step S103, second pose information is determined according to the first image, the space model, and the first pose information, where the second pose information is pose information of the electronic device and/or the object to be scanned.
For each frame of first image, the first pose information is used when determining the corresponding second pose information, and the first pose information can be reused until it is updated. Because the first pose information is reused, the user's manual alignment of the model and the object to be scanned can be avoided, which improves the efficiency and accuracy of obtaining the second pose information and hence of tracking the object to be scanned.
The first pose information may be determined by a detector or detection module, which acquires an image scanned by the electronic device as the second image and determines the first pose information from the second image and the space model; that is, the detector derives the tracking starting point and guides the tracker in tracking the object to be scanned. The second pose information may be determined by a tracker or tracking module, which acquires an image scanned by the electronic device as the first image and determines the second pose information using the first image, the space model, and the first pose information; that is, the tracker tracks the object to be scanned. When the first pose information is determined, only the second image and the space model are available, with no other guiding information; when the second pose information is determined, the guidance of the first pose information is added on top of the first image and the space model. Determining the first pose information is therefore slower, i.e., less efficient, than determining the second pose information. The determination of the first pose information thus improves the accuracy of the second pose information, and reusing the first pose information for the second pose information improves efficiency.
Note that a frame scanned by the electronic device can serve as the first image, as the second image, or as both at once. When the first pose information is missing or invalid, i.e., needs to be determined or updated, the scanned image can be used as the second image; when the first pose information exists and is valid, i.e., needs no determination or update, the scanned image can be used as the first image. When a frame has been used as the second image to determine the first pose information and the electronic device has not yet scanned the next frame (for example, the device has not moved relative to the object to be scanned, or the next capture period has not yet arrived), that frame can further be used as the first image to determine the second pose information.
In step S104, the second pose information is output in response to the second pose information and the first pose information meeting a preset first condition.
In a possible implementation, an error threshold can be preset, and the first condition set as the error between the second pose information and the first pose information being smaller than that threshold. When comparing the error between the first pose information and the second pose information, poses of the same type are compared: the world-frame pose of the electronic device in the first pose information against that in the second pose information, the world-frame pose of the object to be scanned against its counterpart, or the pose of the electronic device relative to the object against its counterpart.
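As an illustration of the first condition, like-for-like poses can be compared and the residual tested against preset thresholds. The split into a rotational and a translational threshold below is an assumption; the patent only requires the error to be below a preset error threshold.

```python
import numpy as np

def poses_consistent(T_a, T_b, max_deg=5.0, max_m=0.05):
    # Relative transform between the two same-type poses (4x4 matrices).
    dT = np.linalg.inv(T_a) @ T_b
    # Rotation angle of dT from the trace of its rotation block.
    angle = np.degrees(np.arccos(np.clip((np.trace(dT[:3, :3]) - 1) / 2, -1, 1)))
    return angle < max_deg and np.linalg.norm(dT[:3, 3]) < max_m
```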
The second pose information meeting the first condition with the first pose information can indicate that the two are consistent and both are valid poses; the second pose information is therefore output, i.e., the second pose information of this frame of first image is output, while the first pose information can continue to be used to determine the second pose information of the next frame of first image. The second pose information is more comprehensive than the first pose information, more specific to each frame of first image, and more efficient to determine, so outputting the second pose information is more convenient for tracking the object to be scanned.
The second pose information not meeting the first condition with the first pose information can indicate that the two are inconsistent, so at least one of the two poses is invalid. The second pose information therefore cannot be output as a valid pose, i.e., this frame of first image obtains no valid pose; at the same time, the first pose information cannot continue to be used to determine the second pose information of the next frame, i.e., it needs updating, at which point the first pose information can be determined to be invalid. Updating the first pose information means re-acquiring a second image, re-determining the first pose information with the newly acquired second image, and deleting the original first pose information.
In addition, after the second pose information is output, a corresponding augmented reality rendering effect can be presented according to the second pose information.
According to the above embodiments, a first image obtained by the electronic device scanning the object to be scanned and the spatial model of the object are acquired; in response to first pose information being missing or invalid, a second image is acquired and the first pose information is determined according to the second image and the space model; second pose information is then determined according to the first image, the space model, and the first pose information; finally, the second pose information is output in response to it meeting the preset first condition with the first pose information, and otherwise the first pose information is determined to be invalid. Since the first pose information is determined from the second image and the space model, and once determined can be used continuously to determine the second pose information of multiple frames of first images, being updated only when the second pose information no longer meets the first condition with it, the efficiency and accuracy of pose information acquisition can be improved, which benefits the efficiency and accuracy of recognizing three-dimensional objects with augmented reality technology.
In some embodiments of the present application, the first pose information can be determined according to the second image and the space model as follows: first, acquire at least one image frame in the space model corresponding to the second image, and determine the first matching information between the feature points of the second image and of the at least one image frame (since the feature points of both the second image and the image frame are two-dimensional points, the first matching information is 2-Dimensional-2-Dimensional (2D-2D) matching); next, acquire the point cloud in the space model corresponding to the at least one image frame and, according to the first matching information, determine the second matching information between the feature points of the second image and the three-dimensional points of the point cloud (since the feature points of the second image are two-dimensional points, the second matching information is 2-Dimensional-3-Dimensional (2D-3D) matching); finally, determine the first pose information according to the first matching information and the second matching information.
FIG. 1B shows a schematic diagram of a system architecture to which the pose acquisition method of embodiments of the present disclosure can be applied. As shown in FIG. 1B, the architecture includes a pose acquisition terminal 201, a network 202, and an electronic device 203. To support an exemplary application, the pose acquisition terminal 201 and the electronic device 203 establish a communication connection through the network 202, and the electronic device 203 reports images scanned for the object to be scanned to the pose acquisition terminal 201 through the network 202. Having acquired the first image and the spatial model of the object to be scanned, the pose acquisition terminal 201 first acquires a second image in response to the first pose information being missing or invalid and determines the first pose information from the second image and the space model; next, it determines the second pose information from the first image, the space model, and the first pose information; then it outputs the second pose information in response to the second pose information meeting the preset first condition with the first pose information. Finally, the pose acquisition terminal 201 uploads the output second pose information to the network 202.
As an example, the electronic device 203 may include an image capture device or an image scanning device, and the pose acquisition terminal 201 may include a vision processing device with visual information processing capability or a remote server. The network 202 may use a wired or wireless connection. When the pose acquisition terminal 201 is a vision processing device, the electronic device 203 may connect to the vision processing device by wire, for example communicating data over a bus; when the pose acquisition terminal 201 is a remote server, the electronic device 203 may exchange data with the remote server over a wireless network.
Alternatively, in some scenarios, the electronic device 203 may be a vision processing device with a video capture module, or a host with a camera. In that case, the pose acquisition method of the embodiments of the present disclosure may be executed by the electronic device 203, and the above system architecture may contain neither the network 202 nor the server.
When acquiring at least one image frame in the space model corresponding to the second image: the similarity between each image frame in the space model and the second image can be determined first, and then image frames whose similarity with the second image exceeds a preset similarity threshold are determined as frames corresponding to the second image. The similarity threshold is preset in advance; the higher the threshold, the fewer frames are screened out as corresponding to the second image, and the lower the threshold, the more. The pose information of an image frame corresponding to the second image is the same as or close to that of the second image. In one example, the similarity between an image frame and the second image can be determined by computing the Euclidean distance between the feature points of the image frame and of the second image, and deriving the similarity from that distance.
In some possible implementations, the image frames of the space model can be converted into image retrieval information, sufficiently many feature points extracted from the second image, and image retrieval used to find the frames whose similarity with the second image exceeds the similarity threshold. A clustering algorithm (for example the k-means algorithm) can cluster the descriptors of all image frames layer by layer, yielding image retrieval information composed of words that represent these descriptors. Image retrieval here means fixing the condition that the similarity with the feature points of the second image exceeds the similarity threshold, traversing each entry of the image retrieval information with this condition, screening out the entries that satisfy it, and taking the image frames corresponding to the screened entries as the frames whose similarity with the second image exceeds the threshold.
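In the spirit of the retrieval scheme just described, a toy bag-of-words index might be built as follows. This is a single-level simplification (the patent describes layer-by-layer clustering, as in a vocabulary tree), and running k-means over descriptors cast to float is only an approximation of clustering binary descriptors; thresholds and k are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocab(all_descs, k=64):
    # Cluster every frame's descriptors into k visual "words".
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(
        np.vstack(all_descs).astype(np.float32))

def bow(vocab, descs):
    # Normalized word histogram describing one image.
    h = np.bincount(vocab.predict(descs.astype(np.float32)),
                    minlength=vocab.n_clusters).astype(float)
    return h / (np.linalg.norm(h) + 1e-9)

def retrieve(vocab, query_descs, frame_descs_list, sim_thresh=0.6):
    q = bow(vocab, query_descs)
    sims = [q @ bow(vocab, d) for d in frame_descs_list]
    # Frames above the similarity threshold "correspond to" the second image.
    return [i for i, s in enumerate(sims) if s > sim_thresh]
```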
When determining the first matching information between the feature points of the second image and of the at least one image frame: the feature points and descriptors of the second image and of the image frame can be acquired first; initial matching information between the feature points of the second image and of the image frame is then determined according to the descriptors of the second image and the descriptors of the image frame; the fundamental matrix and/or essential matrix between the second image and the image frame is then determined according to the initial matching information; finally, the initial matching information is filtered according to the fundamental matrix and/or essential matrix to obtain the first matching information.
In some possible implementations, the initial matching information can be determined by first finding, for each descriptor of the second image, the descriptor in the image frame with the smallest Hamming distance, and then, conversely, finding, for each descriptor of the image frame, the descriptor in the second image with the smallest Hamming distance. If a descriptor of the second image and a descriptor of the image frame are each other's nearest descriptor in Hamming distance, the two descriptors are considered matched, and so are their two feature points; all mutually matched feature points form the initial matching information.
In some possible implementations, the fundamental matrix and/or essential matrix can be computed with the Random Sample Consensus (RANSAC) algorithm. Preferably, multiple fundamental and/or essential matrices can be computed with RANSAC and the 5-point algorithm, the inliers of each matrix determined, and the matrix with the most inliers taken as the final result. If two mutually matched feature points conform to the fundamental and/or essential matrix, they are inliers; conversely, if they do not, they are outliers. Filtering the initial matching information with the fundamental and/or essential matrix likewise retains the inliers of the initial matching information, i.e., removes its outliers.
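The inlier-preserving filter can be sketched with OpenCV's RANSAC essential-matrix estimator, which internally uses the 5-point algorithm; the fundamental-matrix variant (cv2.findFundamentalMat) is analogous. The probability and pixel-threshold values below are illustrative, not taken from the patent.

```python
import cv2
import numpy as np

def filter_matches(pts_a, pts_b, K):
    """pts_a, pts_b: Nx2 float32 arrays of mutually matched feature points."""
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K, cv2.RANSAC, 0.999, 1.0)
    inliers = mask.ravel().astype(bool)
    # Outliers are removed; the inliers form the first matching information.
    return pts_a[inliers], pts_b[inliers]
```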
When determining, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud: the feature points of the second image that match feature points of the image frame can be matched with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, to obtain the second matching information. That is, with the feature points of the image frame as an intermediary, the feature points of the second image are matched with the three-dimensional points of the point cloud.
When determining the first pose information according to the first matching information and the second matching information: the gravitational acceleration of the electronic device can be obtained first; the first pose information is then determined according to the first matching information, the second matching information, and the gravitational acceleration.
In some possible implementations, the electronic device may have an acceleration sensor and/or a gyroscope, from which the gravitational acceleration can be obtained. In computer vision, the PnP (perspective-n-point) algorithm can solve for the first pose information from the second matching information (2D-3D matches), and decomposing the fundamental matrix and/or essential matrix can solve for it from the first matching information (2D-2D matches). In both solving processes, the gravitational acceleration can be added as a constraint, i.e., the gravitational acceleration constrains the rotation angles (such as the roll and pitch angles) in the pose of the electronic device. The two solving processes can then be combined in hybrid form to solve for the first pose information, i.e., the first matching information, the second matching information, and the gravitational acceleration are used together. This solution requires six different degrees of freedom: each first-matching (2D-2D) correspondence provides a constraint of 1 degree of freedom, each second-matching (2D-3D) correspondence provides a constraint of 2 degrees of freedom, and the gravitational acceleration provides 1 degree of freedom; a certain number of first matches and a certain number of second matches can be randomly selected and combined with the gravitational acceleration to make up six degrees of freedom and solve for the first pose information. When solving, equations can be constructed from the first matching information through Plücker coordinate relations and from the second matching information through the camera projection matrix model, and the simultaneous equations solved with a solver (for example, a Gröbner basis solver). Alternatively, the two solving processes can be used independently within RANSAC for a robust solution: in different proportions, the first matching information plus gravitational acceleration and the second matching information plus gravitational acceleration are alternately selected to solve for the first pose information, the error of each solved first pose is computed against all of the matching information, and when the inlier count is large enough (for example, exceeds a certain threshold), the current first pose information is deemed accurate and the solving ends.
Because the gravitational acceleration constraint is added and the first matching information (2D-2D matching) and the second matching information (2D-3D matching) are combined, the resulting first pose information is relatively accurate, which in turn makes the second pose information derived from it relatively accurate.
In the above embodiments, the first pose information can be determined by the detector or detection module for use by the tracker or tracking module.
In some embodiments of the present application, the second pose information can be determined according to the first image, the space model, and the first pose information as follows: first, determine, according to the first pose information and the first image, the third pose information corresponding to the first image, where the third pose information is the pose information of the electronic device relative to the object to be scanned; next, determine, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model (since the feature points of the first image are two-dimensional points, the third matching information is 2D-3D matching); next, in response to the third matching information meeting a preset second condition, determine, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model (since the feature points of both the first image and the image frame are two-dimensional points, the fourth matching information is 2D-2D matching); finally, determine the second pose information according to the third matching information and the fourth matching information.
The first pose information may include fourth pose information, where the fourth pose information is the coordinate information of the object to be scanned in the world coordinate system (Tow). When the position of the object to be scanned is stationary, the fourth pose information remains unchanged. On this basis, when determining the third pose information corresponding to the first image according to the first pose information and the first image: fifth pose information can first be obtained from the positioning module according to the first image, where the fifth pose information is the pose information of the electronic device in the world coordinate system (Tcw); the third pose information is then determined according to the fourth pose information and the fifth pose information.
In some possible implementations, the positioning module may be a Visual Inertial Simultaneous Localization and Mapping (VISLAM) module, which can output the device's pose in the world coordinate system in real time during operation. The world-frame pose of the object to be scanned is its absolute pose, and the world-frame pose of the electronic device is the device's absolute pose, so from the absolute poses of the two in a common coordinate system their relative pose can be determined: the pose of the electronic device relative to the object to be scanned (Tco), or the pose of the object relative to the electronic device (Toc). The above steps take the pose of the electronic device relative to the object (Tco) as the third pose information, though the pose of the object relative to the device (Toc) could equally be chosen.
When determining, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model: the point cloud of the space model can first be projected onto the first image according to the third pose information to form a plurality of projection points, and the descriptor of each projection point extracted; the feature points and descriptors of the first image are then extracted; finally, the third matching information between the feature points and the three-dimensional points of the point cloud is determined according to the descriptors corresponding to the feature points and the descriptors of the projection points.
Since the third pose information can characterize the relative pose of the electronic device that captured the first image and the object to be scanned, i.e., the direction and angle between them, the camera model can be used to project the point cloud onto the first image.
Since the three-dimensional points of the point cloud may have been obtained during modeling through the matching and triangulation of image-frame feature points, each three-dimensional point of the point cloud corresponds to feature points of at least one image frame; the descriptors of all feature points corresponding to a three-dimensional point are extracted and fused to obtain the descriptor of that point's projection point.
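The patent leaves open how the observation descriptors of a three-dimensional point are fused. One plausible choice for binary descriptors, shown below purely as an assumption, is the medoid: the observed descriptor with the smallest total Hamming distance to the others.

```python
import numpy as np

def fuse_descriptors(descs):
    """descs: (n, 32) uint8 array, n binary descriptors of one 3D point."""
    bits = np.unpackbits(descs, axis=1)                  # (n, 256) bit matrix
    dist = (bits[:, None, :] != bits[None, :, :]).sum(-1)  # pairwise Hamming
    return descs[dist.sum(axis=1).argmin()]              # medoid descriptor
```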
In some possible implementations, the third matching information can be determined by first finding, for the descriptor of each feature point, the projection-point descriptor with the smallest Hamming distance, and then, conversely, finding, for the descriptor of each projection point, the feature-point descriptor with the smallest Hamming distance. If the descriptor of a feature point and the descriptor of a projection point are each other's nearest descriptor in Hamming distance, the two descriptors are considered matched, and so are the corresponding feature point and three-dimensional point; all mutually matched feature points and three-dimensional points form the third matching information.
In embodiments of the present application, the second condition may be that the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold, where a matching combination includes a mutually matched pair of a feature point and a three-dimensional point. The number of matching combinations characterizes, to a certain extent, the validity of the first pose information: if the first pose information is invalid, the number of matching combinations necessarily decreases or vanishes; if it is valid, the number is necessarily larger. The second-condition judgment is a pre-check performed before step S104 judges the validity of the first pose information. If the third matching information does not meet the second condition, i.e., the number of matching combinations is less than or equal to the preset number threshold, then the first pose information would necessarily fail the first condition against the second pose information, so the subsequent step of solving for the second pose information is unnecessary and the first pose information can be directly judged invalid. If the third matching information meets the second condition, i.e., the number of matching combinations is greater than the threshold, the validity of the first pose information cannot yet be decided directly; the second pose information is therefore solved, and the validity of the first pose information is judged by whether the first and second pose information meet the first condition.
On this basis, when determining, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model: at least one image frame matching the third pose information can first be determined according to the third pose information and the pose information of each image frame of the space model; the feature points and descriptors of the first image, and of each image frame matching the third pose information, are then acquired; finally, the fourth matching information between the feature points of the first image and the feature points of the image frame is determined according to the descriptors of the first image and the descriptors of the image frame.
Each image frame has pose information (the sixth pose information below), which characterizes the relative pose between the electronic device that captured the image frame and the object to be scanned, i.e., the image frame can be obtained when the electronic device is at that relative pose; the third pose information likewise characterizes the relative pose between the electronic device that captured the first image and the object to be scanned. When the pose information of an image frame is the same as or close to that of a first image (for example, the angle difference is within a preset range), the image frame can be determined to match that first image.
The fourth matching information can be determined by first finding, for each descriptor in the first image, the descriptor in the image frame with the smallest Hamming distance, and then, conversely, finding, for each descriptor in the image frame, the descriptor in the first image with the smallest Hamming distance. If a descriptor in the first image and a descriptor in the image frame are each other's nearest descriptor in Hamming distance, the two descriptors are considered matched, and so are their two feature points; all mutually matched feature points form the fourth matching information.
When determining the second pose information according to the third matching information and the fourth matching information, the gravitational acceleration of the electronic device can be obtained first; the second pose information is then determined according to the third matching information, the fourth matching information, and the gravitational acceleration.
In some possible implementations, the electronic device may have an acceleration sensor and/or a gyroscope, from which the gravitational acceleration can be obtained. In computer vision, the PnP algorithm can solve for the second pose information from the third matching information (2D-3D matches), and decomposing the fundamental matrix and/or essential matrix can solve for it from the fourth matching information (2D-2D matches). In both solving processes, the gravitational acceleration can be added as a constraint, i.e., the gravitational acceleration constrains the rotation angles (such as the roll and pitch angles) in the pose of the electronic device. The two solving processes can then be combined in hybrid form to solve for the second pose information, i.e., the third matching information, the fourth matching information, and the gravitational acceleration are used together. This solution requires six different degrees of freedom: each fourth-matching (2D-2D) correspondence provides a constraint of 1 degree of freedom, each third-matching (2D-3D) correspondence provides a constraint of 2 degrees of freedom, and the gravitational acceleration provides 1 degree of freedom; a certain number of third matches and a certain number of fourth matches can be randomly selected and combined with the gravitational acceleration to make up six degrees of freedom and solve for the second pose information. When solving, equations can be constructed from the fourth matching information through Plücker coordinate relations and from the third matching information through the camera projection matrix model, and the simultaneous equations solved with a solver (for example, a Gröbner basis solver). Alternatively, the two solving processes can be used independently within RANSAC for a robust solution: in different proportions, the third matching information plus gravitational acceleration and the fourth matching information plus gravitational acceleration are alternately selected to solve for the second pose information, the error of each solved second pose is computed against all of the matching information, and when the inlier count is large enough (for example, exceeds a certain threshold), the current second pose information is deemed accurate and the solving ends.
In the above embodiments, the second pose information can be determined by the tracker or tracking module, using in that determination the first pose information obtained by the detector or detection module. Since the detector or detection module determines the first pose information with higher accuracy but lower efficiency than the tracker or tracking module, the detector is used to determine the (reusable) first pose information while the tracker frequently outputs the second pose information. The detector thus supplies the tracker's tracking starting point, improving the accuracy of pose acquisition, avoiding the cumbersome operation and inaccurate tracking caused by manually aligning the space model with the object to be scanned, and guaranteeing the efficiency of pose acquisition.
In some embodiments of the present disclosure, the spatial model of the object to be scanned can be obtained as follows: first, obtain multiple frames of modeling images scanned by the electronic device for the object to be scanned, and synchronously obtain the sixth pose information corresponding to each frame of modeling image; next, match the feature points of the multi-frame modeling images and triangulate the feature points according to the matching results to form a point cloud; next, determine at least one image frame from the multi-frame modeling images and determine the point cloud corresponding to each image frame; finally, construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model.
During feature matching, inter-frame descriptor matching or optical-flow tracking can be used. During triangulation, frame-to-frame matching makes it possible to track a landmark position in three-dimensional space across consecutive frames; from the matching relations among these consecutive frames and the pose information of each frame, a system of equations can be constructed, and solving it yields the depth information of the landmark position.
The electronic device scans modeling images at a high frequency (for example, 30 hertz (Hz)), while only part of the modeling images need be selected as image frames, so that the file size of the whole model does not grow too large; this benefits subsequent file sharing and reduces the memory consumption of the model when it runs on a mobile phone.
In one example, the acquisition process of the space model is shown in FIG. 3. During actual scanning, the user can obtain a three-dimensional bounding box surrounding the object through the application's interactive interface, which guides the user to model around the selected three-dimensional object 301. As the user moves, the system builds the point clouds and image key-frame information of the model at various angles (for example, model image frames 31, 32 through 38 shown in FIG. 3). Finally, all the point cloud information within the three-dimensional bounding box is saved, which is the three-dimensional point cloud model of the object. The space model includes the point cloud within the three-dimensional box and the modeling image frames, and each image frame is annotated with sixth pose information. The sixth pose information may be the pose information of the electronic device relative to the object to be scanned; the device's world-frame pose can first be obtained from a positioning module in the electronic device, such as a VISLAM module, and then combined with the pre-acquired world-frame pose of the object to be scanned to obtain the sixth pose information.
In some embodiments, a terminal device can scan a product using the pose information acquisition method provided by this application. The product carries a product description and an effect demonstration; the terminal device can be used to start a scanning program that runs the pose acquisition method provided by this application, so that the first pose information is obtained and the second pose information is output while the terminal device scans the product. When the second pose information is output, the program can, according to the mapping between the second pose information and the product description and/or effect demonstration, present the corresponding product description and/or effect demonstration on the terminal's display screen using augmented reality technology. For example, when the product is a refrigerator, and the second pose information indicates that the terminal device is facing the refrigerator's human-machine interaction interface, augmented display technology can present an explanation and/or demonstration of the interaction process.
According to a second aspect of the embodiments of the present application, a pose acquisition device is provided; please refer to FIG. 4, which shows the schematic structural diagram of the pose acquisition device 400, including:
an acquisition module 401, configured to acquire a first image and a spatial model of an object to be scanned, where the first image is an image obtained by an electronic device scanning the object to be scanned;
a first pose module 402, configured to acquire a second image in response to first pose information being missing or invalid, and to determine the first pose information according to the second image and the space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
a second pose module 403, configured to determine second pose information according to the first image, the space model, and the first pose information, where the second pose information is pose information of the electronic device and/or the object to be scanned;
an output module 404, configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition, and otherwise to determine that the first pose information is invalid.
In some embodiments of the present disclosure, the first pose module is further configured to: acquire at least one image frame in the space model corresponding to the second image, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire the point cloud in the space model corresponding to the at least one image frame and, according to the first matching information, determine second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
In some embodiments of the present disclosure, when configured to acquire at least one image frame in the space model corresponding to the second image, the first pose module is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
In some embodiments of the present disclosure, when configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, the first pose module is further configured to: acquire the feature points and descriptors of the second image and of the image frame; determine initial matching information between their feature points according to the descriptors of the second image and the descriptors of the image frame; determine the fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
In some embodiments of the present disclosure, when configured to determine the second matching information according to the first matching information, the first pose module is further configured to: match the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud corresponding to the feature points of the image frame, to obtain the second matching information.
In some embodiments of the present disclosure, when configured to determine the first pose information according to the first matching information and the second matching information, the first pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
In some embodiments of the present disclosure, the second pose module is further configured to: determine, according to the first pose information and the first image, third pose information corresponding to the first image, where the third pose information is the pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
In some embodiments of the present disclosure, the first pose information includes fourth pose information, where the fourth pose information is the pose information of the object to be scanned in the world coordinate system; when configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, the second pose module is further configured to: obtain, according to the first image, fifth pose information from a positioning module, where the fifth pose information is the pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
In some embodiments of the present disclosure, when configured to determine the third matching information according to the third pose information, the second pose module is further configured to: project the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extract the descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
In some embodiments of the present disclosure, when configured to determine the fourth matching information according to the third pose information, the second pose module is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image, and the feature points and descriptors of the image frames matching the third pose information; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
In some embodiments of the present disclosure, when configured to determine the second pose information according to the third matching information and the fourth matching information, the second pose module is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
In some embodiments of the present disclosure, the second pose information and the first pose information meeting the preset first condition includes: the error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, where a matching combination includes a mutually matched pair of a feature point and a three-dimensional point.
In some embodiments of the present disclosure, when configured to acquire the spatial model of the object to be scanned, the acquisition module is further configured to: acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire the sixth pose information corresponding to each frame of modeling image; match the feature points of the multi-frame modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multi-frame modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud into a space model.
Regarding the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
In a third aspect, at least one embodiment of the present application provides an electronic device; please refer to FIG. 5, which shows the structure of the electronic device. The electronic device 500 includes a memory 501 and a processor 502; the memory is for storing computer instructions executable on the processor, and the processor is for acquiring pose information based on the method of any item of the first aspect when executing the computer instructions.
In a fourth aspect, at least one embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method of any item of the first aspect is implemented. The computer-readable storage medium may be a volatile or non-volatile computer-readable storage medium.
In a fifth aspect, at least one embodiment of the present application provides a computer program product including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the method of any item of the first aspect.
In the present application, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance. The term "plurality" means two or more, unless explicitly defined otherwise.
Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practicing the disclosure herein. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art not disclosed in the present application. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present application indicated by the following claims.
It should be understood that the present application is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present application is limited only by the appended claims.
INDUSTRIAL APPLICABILITY
The present application relates to a pose acquisition method and device, an electronic device, and a storage medium. The method includes: acquiring a first image, where the first image is an image obtained by an electronic device scanning an object to be scanned; in response to first pose information being missing or invalid, acquiring a second image and determining the first pose information according to the second image and a space model, where the second image is an image obtained by the electronic device scanning the object to be scanned, the first pose information is pose information of the electronic device and/or the object to be scanned, and the second pose information is pose information of the electronic device and/or the object to be scanned; determining the second pose information according to the first image, the space model, and the first pose information; and outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.

Claims (31)

  1. A pose acquisition method, comprising:
    acquiring a first image and a spatial model of an object to be scanned, wherein the first image is an image obtained by an electronic device scanning the object to be scanned;
    in response to first pose information being missing or invalid, acquiring a second image, and determining the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
    determining second pose information according to the first image, the space model and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned;
    outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
  2. The pose acquisition method according to claim 1, further comprising: determining that the first pose information is invalid in response to the second pose information and the first pose information not meeting the preset first condition.
  3. The pose acquisition method according to claim 1 or 2, wherein determining the first pose information according to the second image and the space model comprises: acquiring at least one image frame in the space model corresponding to the second image, and determining first matching information between feature points of the second image and feature points of the at least one image frame; acquiring a point cloud in the space model corresponding to the at least one image frame, and determining, according to the first matching information, second matching information between the feature points of the second image and three-dimensional points of the point cloud; and determining the first pose information according to the first matching information and the second matching information.
  4. The pose acquisition method according to claim 3, wherein acquiring at least one image frame in the space model corresponding to the second image comprises: determining a similarity between each image frame in the space model and the second image; and determining an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
  5. The pose acquisition method according to claim 3 or 4, wherein determining the first matching information between the feature points of the second image and the feature points of the at least one image frame comprises: acquiring the feature points and descriptors of the second image, and the feature points and descriptors of the image frame; determining initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determining a fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filtering the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  6. The pose acquisition method according to any one of claims 3 to 5, wherein determining, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud comprises: matching the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud corresponding to the feature points of the image frame, to obtain the second matching information.
  7. The pose acquisition method according to any one of claims 3 to 6, wherein determining the first pose information according to the first matching information and the second matching information comprises: acquiring a gravitational acceleration of the electronic device; and determining the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
  8. The pose acquisition method according to any one of claims 1 to 7, wherein determining the second pose information according to the first image, the space model and the first pose information comprises: determining, according to the first pose information and the first image, third pose information corresponding to the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determining, according to the third pose information, third matching information between feature points of the first image and three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determining, according to the third pose information, fourth matching information between the feature points of the first image and feature points of at least one image frame of the space model; and determining the second pose information according to the third matching information and the fourth matching information.
  9. The pose acquisition method according to claim 8, wherein the first pose information comprises fourth pose information, and the fourth pose information is pose information of the object to be scanned in a world coordinate system; determining the third pose information corresponding to the first image according to the first pose information and the first image comprises: obtaining, according to the first image, fifth pose information from a positioning module, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determining the third pose information according to the fourth pose information and the fifth pose information.
  10. The pose acquisition method according to claim 8 or 9, wherein determining, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model comprises: projecting the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extracting a descriptor of each projection point; extracting the feature points and descriptors of the first image; and determining the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
  11. The pose acquisition method according to any one of claims 8 to 10, wherein determining, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model comprises: determining at least one image frame matching the third pose information according to the third pose information and pose information of the image frames of the space model; acquiring the feature points and descriptors of the first image, and the feature points and descriptors of the image frames matching the third pose information; and determining the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
  12. The pose acquisition method according to any one of claims 8 to 11, wherein determining the second pose information according to the third matching information and the fourth matching information comprises: acquiring a gravitational acceleration of the electronic device; and determining the second pose information according to the third matching information, the fourth matching information and the gravitational acceleration.
  13. The pose acquisition method according to any one of claims 8 to 12, wherein the second pose information and the first pose information meeting the preset first condition comprises: an error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition comprises: a number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination comprises a mutually matched pair of a feature point and a three-dimensional point.
  14. The pose acquisition method according to any one of claims 1 to 13, wherein acquiring the spatial model of the object to be scanned comprises: acquiring multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquiring sixth pose information corresponding to each frame of modeling image; matching feature points of the multiple frames of modeling images, and triangulating the feature points according to the matching results to form a point cloud; determining at least one image frame from the multiple frames of modeling images, and determining the point cloud corresponding to each image frame; and constructing the at least one image frame, the sixth pose information corresponding to each image frame and the point cloud into a space model.
  15. A pose acquisition device, comprising:
    an acquisition module, configured to acquire a first image and a spatial model of an object to be scanned, wherein the first image is an image obtained by an electronic device scanning the object to be scanned;
    a first pose module, configured to acquire a second image in response to first pose information being missing or invalid, and to determine the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
    a second pose module, configured to determine second pose information according to the first image, the space model and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned;
    an output module, configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
  16. The pose acquisition device according to claim 15, wherein the output module is further configured to determine that the first pose information is invalid in response to the second pose information and the first pose information not meeting the preset first condition.
  17. The pose acquisition device according to claim 15 or 16, wherein the first pose module is further configured to: acquire at least one image frame in the space model corresponding to the second image, and determine first matching information between feature points of the second image and feature points of the at least one image frame; acquire a point cloud in the space model corresponding to the at least one image frame, and determine, according to the first matching information, second matching information between the feature points of the second image and three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
  18. The pose acquisition device according to claim 17, wherein, when configured to acquire at least one image frame in the space model corresponding to the second image, the first pose module is further configured to: determine a similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
  19. The pose acquisition device according to claim 17 or 18, wherein, when configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, the first pose module is further configured to: acquire the feature points and descriptors of the second image, and the feature points and descriptors of the image frame; determine initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determine a fundamental matrix and/or essential matrix between the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
  20. The pose acquisition device according to any one of claims 17 to 19, wherein, when configured to determine the second matching information according to the first matching information, the first pose module is further configured to: match the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud corresponding to the feature points of the image frame, to obtain the second matching information.
  21. The pose acquisition device according to any one of claims 17 to 20, wherein, when configured to determine the first pose information according to the first matching information and the second matching information, the first pose module is further configured to: acquire a gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
  22. The pose acquisition device according to any one of claims 15 to 21, wherein the second pose module is further configured to: determine, according to the first pose information and the first image, third pose information corresponding to the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between feature points of the first image and three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
  23. The pose acquisition device according to claim 22, wherein the first pose information comprises fourth pose information, and the fourth pose information is pose information of the object to be scanned in a world coordinate system; when configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, the second pose module is further configured to: obtain, according to the first image, fifth pose information from a positioning module, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
  24. The pose acquisition device according to claim 22 or 23, wherein, when configured to determine the third matching information according to the third pose information, the second pose module is further configured to: project the point cloud of the space model onto the first image according to the third pose information to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
  25. The pose acquisition device according to any one of claims 22 to 24, wherein, when configured to determine the fourth matching information according to the third pose information, the second pose module is further configured to: determine at least one image frame matching the third pose information according to the third pose information and pose information of the image frames of the space model; acquire the feature points and descriptors of the first image, and the feature points and descriptors of the image frames matching the third pose information; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
  26. The pose acquisition device according to any one of claims 22 to 25, wherein, when configured to determine the second pose information according to the third matching information and the fourth matching information, the second pose module is further configured to: acquire a gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information and the gravitational acceleration.
  27. The pose acquisition device according to any one of claims 22 to 26, wherein the second pose information and the first pose information meeting the preset first condition comprises: an error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition comprises: a number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination comprises a mutually matched pair of a feature point and a three-dimensional point.
  28. The pose acquisition device according to any one of claims 15 to 27, wherein, when configured to acquire the spatial model of the object to be scanned, the acquisition module is further configured to: acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire sixth pose information corresponding to each frame of modeling image; match feature points of the multiple frames of modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multiple frames of modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame and the point cloud into a space model.
  29. An electronic device, comprising a memory and a processor, wherein the memory is configured to store computer instructions executable on the processor, and the processor is configured to implement the method according to any one of claims 1 to 14 when executing the computer instructions.
  30. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 14.
  31. A computer program comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor of the electronic device executes instructions configured to implement the pose acquisition method according to any one of claims 1 to 14.
PCT/CN2021/121034 2021-05-11 2021-09-27 位姿获取方法、装置、电子设备、存储介质及程序 WO2022237048A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020227017413A KR102464271B1 (ko) 2021-05-11 2021-09-27 포즈 획득 방법, 장치, 전자 기기, 저장 매체 및 프로그램

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110510890.0A CN113190120B (zh) 2021-05-11 2021-05-11 位姿获取方法、装置、电子设备及存储介质
CN202110510890.0 2021-05-11

Publications (1)

Publication Number Publication Date
WO2022237048A1 true WO2022237048A1 (zh) 2022-11-17

Family

ID=76981167

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/121034 WO2022237048A1 (zh) 2021-05-11 2021-09-27 位姿获取方法、装置、电子设备、存储介质及程序

Country Status (4)

Country Link
KR (1) KR102464271B1 (zh)
CN (1) CN113190120B (zh)
TW (1) TW202244680A (zh)
WO (1) WO2022237048A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758157A (zh) * 2023-06-14 2023-09-15 深圳市华赛睿飞智能科技有限公司 一种无人机室内三维空间测绘方法、系统及存储介质

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113190120B (zh) * 2021-05-11 2022-06-24 浙江商汤科技开发有限公司 位姿获取方法、装置、电子设备及存储介质
CN113808196A (zh) * 2021-09-09 2021-12-17 浙江商汤科技开发有限公司 平面融合定位方法、装置、电子设备及存储介质
CN116352323B (zh) * 2023-04-10 2024-07-30 深圳市晨东智能家居有限公司 一种交互式的焊接环境建模系统及方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120082319A (ko) * 2011-01-13 2012-07-23 주식회사 팬택 윈도우 형태의 증강현실을 제공하는 장치 및 방법
CN109087359A (zh) * 2018-08-30 2018-12-25 网易(杭州)网络有限公司 位姿确定方法、位姿确定装置、介质和计算设备
CN112197764A (zh) * 2020-12-07 2021-01-08 广州极飞科技有限公司 实时位姿确定方法、装置及电子设备
CN113190120A (zh) * 2021-05-11 2021-07-30 浙江商汤科技开发有限公司 位姿获取方法、装置、电子设备及存储介质

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10515259B2 (en) * 2015-02-26 2019-12-24 Mitsubishi Electric Research Laboratories, Inc. Method and system for determining 3D object poses and landmark points using surface patches
US10970425B2 (en) * 2017-12-26 2021-04-06 Seiko Epson Corporation Object detection and tracking
CN109463003A (zh) * 2018-03-05 2019-03-12 香港应用科技研究院有限公司 对象识别
CN109947886B (zh) * 2019-03-19 2023-01-10 腾讯科技(深圳)有限公司 图像处理方法、装置、电子设备及存储介质
CN110930453B (zh) * 2019-10-30 2023-09-08 北京迈格威科技有限公司 目标物体定位方法、装置及可读存储介质
CN110866496B (zh) * 2019-11-14 2023-04-07 合肥工业大学 基于深度图像的机器人定位与建图方法和装置
CN111199564B (zh) * 2019-12-23 2024-01-05 中国科学院光电研究院 智能移动终端的室内定位方法、装置与电子设备
CN111311758A (zh) * 2020-02-24 2020-06-19 Oppo广东移动通信有限公司 增强现实处理方法及装置、存储介质和电子设备
CN111833457A (zh) * 2020-06-30 2020-10-27 北京市商汤科技开发有限公司 图像处理方法、设备及存储介质
CN112637665B (zh) * 2020-12-23 2022-11-04 北京市商汤科技开发有限公司 增强现实场景下的展示方法、装置、电子设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120082319A (ko) * 2011-01-13 2012-07-23 주식회사 팬택 윈도우 형태의 증강현실을 제공하는 장치 및 방법
CN109087359A (zh) * 2018-08-30 2018-12-25 网易(杭州)网络有限公司 位姿确定方法、位姿确定装置、介质和计算设备
CN112197764A (zh) * 2020-12-07 2021-01-08 广州极飞科技有限公司 实时位姿确定方法、装置及电子设备
CN113190120A (zh) * 2021-05-11 2021-07-30 浙江商汤科技开发有限公司 位姿获取方法、装置、电子设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758157A (zh) * 2023-06-14 2023-09-15 深圳市华赛睿飞智能科技有限公司 一种无人机室内三维空间测绘方法、系统及存储介质
CN116758157B (zh) * 2023-06-14 2024-01-30 深圳市华赛睿飞智能科技有限公司 一种无人机室内三维空间测绘方法、系统及存储介质

Also Published As

Publication number Publication date
CN113190120B (zh) 2022-06-24
KR102464271B1 (ko) 2022-11-07
CN113190120A (zh) 2021-07-30
TW202244680A (zh) 2022-11-16

Similar Documents

Publication Publication Date Title
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
WO2022237048A1 (zh) 位姿获取方法、装置、电子设备、存储介质及程序
US10810734B2 (en) Computer aided rebar measurement and inspection system
WO2020206903A1 (zh) 影像匹配方法、装置及计算机可读存储介质
EP3008694B1 (en) Interactive and automatic 3-d object scanning method for the purpose of database creation
JP6430064B2 (ja) データを位置合わせする方法及びシステム
JP5950973B2 (ja) フレームを選択する方法、装置、及びシステム
JP7017689B2 (ja) 情報処理装置、情報処理システムおよび情報処理方法
JP5722502B2 (ja) モバイルデバイスのための平面マッピングおよびトラッキング
JP6184271B2 (ja) 撮像管理装置、撮像管理システムの制御方法およびプログラム
US11094079B2 (en) Determining a pose of an object from RGB-D images
WO2019042426A1 (zh) 增强现实场景的处理方法、设备及计算机存储介质
CN110986969B (zh) 地图融合方法及装置、设备、存储介质
JP6571108B2 (ja) モバイル機器用三次元ジェスチャのリアルタイム認識及び追跡システム
KR20180005168A (ko) 로컬화 영역 설명 파일에 대한 프라이버시-민감 질의
CN111127524A (zh) 一种轨迹跟踪与三维重建方法、系统及装置
CN108958469B (zh) 一种基于增强现实的在虚拟世界增加超链接的方法
US10249058B2 (en) Three-dimensional information restoration device, three-dimensional information restoration system, and three-dimensional information restoration method
CN105809664B (zh) 生成三维图像的方法和装置
CN112750164B (zh) 轻量化定位模型的构建方法、定位方法、电子设备
JP2016021097A (ja) 画像処理装置、画像処理方法、およびプログラム
CN116243837A (zh) 一种画面显示方法、系统、设备及计算机可读存储介质
CN114694257A (zh) 多人实时三维动作识别评估方法、装置、设备及介质
CN112614166A (zh) 基于cnn-knn的点云匹配方法和装置
KR102249380B1 (ko) 기준 영상 정보를 이용한 cctv 장치의 공간 정보 생성 시스템

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2022528237

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21941614

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21941614

Country of ref document: EP

Kind code of ref document: A1