WO2022237048A1 - Pose acquisition method and apparatus, electronic device, storage medium and program
- Publication number
- WO2022237048A1 (application PCT/CN2021/121034)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- information
- pose
- pose information
- matching
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/01—Indexing scheme relating to G06F3/01
- G06F2203/012—Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment
Definitions
- the present application relates to the technical field of object recognition, and in particular to a pose acquisition method, device, electronic equipment, storage medium and program.
- Augmented reality (AR) based three-dimensional object recognition can present an augmented reality rendering effect based on the recognition results; in the related art, however, using augmented reality technology to recognize three-dimensional objects has low efficiency and poor accuracy.
- the present application provides a pose acquisition method, device, electronic equipment, storage medium and program.
- a pose acquisition method including:
- in response to the first pose information being missing or invalid, acquiring a second image, and determining the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
- the method further includes: in response to the fact that the second pose information and the first pose information do not meet a preset first condition, determining that the first pose information is invalid. In this way, the efficiency and accuracy of pose information acquisition can be improved, which is also conducive to improving the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
- determining the first pose information according to the second image and the space model includes: acquiring at least one image frame corresponding to the second image in the space model, and determining first matching information between the feature points of the second image and the feature points of the at least one image frame; acquiring the point cloud corresponding to the at least one image frame in the space model, and determining, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determining the first pose information according to the first matching information and the second matching information.
- the acquiring at least one image frame corresponding to the second image in the space model includes: determining the similarity between each image frame in the space model and the second image, and determining an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image. In this way, the image frames corresponding to the second image can be selected more accurately.
- the determining the first matching information between the feature points of the second image and the feature points of the at least one image frame includes: acquiring the feature points and descriptors of the second image, and the feature points and descriptors of the image frame; determining initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determining a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filtering the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
- the initial matching information is filtered by using the fundamental matrix and/or the essential matrix, so that the inliers in the initial matching information can be completely preserved in the first matching information.
- the determining the second matching information between the feature points of the second image and the three-dimensional points of the point cloud according to the first matching information includes: matching the feature points of the second image that match feature points of the image frame with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, to obtain the second matching information. In this way, by using the feature points of the image frame as a medium, the feature points of the second image are matched with the three-dimensional points of the point cloud.
- the determining the first pose information according to the first matching information and the second matching information includes: acquiring the gravitational acceleration of the electronic device, and determining the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
- the obtained first pose information is relatively accurate, and furthermore, the second pose information obtained based on the first pose information can be relatively accurate.
- the determining the second pose information according to the first image, the space model and the first pose information includes: determining third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determining, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determining, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determining the second pose information according to the third matching information and the fourth matching information.
- the second pose information can be further accurately determined.
- the first pose information includes fourth pose information, wherein the fourth pose information is the pose information of the object to be scanned in the world coordinate system; the determining the third pose information corresponding to the first image according to the first pose information and the first image includes: acquiring fifth pose information from a positioning module according to the first image, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determining the third pose information according to the fourth pose information and the fifth pose information. In this way, through the absolute poses of the object to be scanned and the electronic device in a unified coordinate system, the relative pose of the two can be determined quickly and accurately.
- the determining the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model according to the third pose information includes: projecting, according to the third pose information, the point cloud of the space model onto the first image to form a plurality of projection points, and extracting a descriptor of each projection point; extracting the feature points and descriptors of the first image; and determining the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
- the camera model can be used to project the point cloud onto the first image.
- the determining fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information includes: determining at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquiring the feature points and descriptors of the first image, and the feature points and descriptors of the image frame matched with the third pose information; and determining the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
- if the pose information of an image frame is the same as or similar to that of the first image (for example, the angle difference is within a preset range), the image frame is determined to match the third pose information.
- the determining the second pose information according to the third matching information and the fourth matching information includes: acquiring the gravitational acceleration of the electronic device, and determining the second pose information according to the third matching information, the fourth matching information and the gravitational acceleration. In this way, by introducing the gravitational acceleration, the second pose information can be determined more accurately.
- the second pose information and the first pose information meeting a preset first condition includes: an error between the second pose information and the first pose information being smaller than a preset error threshold; and/or, the third matching information meeting the preset second condition includes: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination includes a feature point and a three-dimensional point that match each other.
- the number of matching combinations between the first image and the point cloud of the space model is used to set the second condition, so that the matching degree of the third matching information can be judged more reasonably.
- the obtaining the space model of the object to be scanned includes: obtaining multiple frames of modeling images scanned by the electronic device for the object to be scanned, and synchronously obtaining the sixth pose information corresponding to each frame of the modeling images; matching the feature points of the multi-frame modeling images, and triangulating the feature points according to the matching results to form a point cloud; determining at least one image frame from the multi-frame modeling images, and determining the point cloud corresponding to each image frame; and constructing the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model. In this way, the constructed space model carries more detailed information.
- a pose acquisition device including:
- An acquisition module configured to acquire a first image and a spatial model of the object to be scanned, wherein the first image is an image scanned by the electronic device for the object to be scanned;
- the first pose module is configured to acquire a second image in response to the first pose information being missing or invalid, and determine the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned;
- the second pose module is configured to determine second pose information according to the first image, the space model and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned;
- An output module configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
- the output module is further configured to: determine that the first pose information is invalid in response to the second pose information and the first pose information not meeting a preset first condition.
- the first pose module is further configured to: acquire at least one image frame corresponding to the second image in the space model, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire the point cloud corresponding to the at least one image frame in the space model, and determine, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
- when the first pose module is configured to acquire at least one image frame corresponding to the second image in the space model, it is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
- when the first pose module is configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, it is further configured to: acquire the feature points and descriptors of the second image, and the feature points and descriptors of the image frame; determine initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determine a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or essential matrix to obtain the first matching information.
- when the first pose module is configured to determine the second matching information between the feature points of the second image and the three-dimensional points of the point cloud according to the first matching information, it is further configured to: match the feature points of the second image that match feature points of the image frame with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, to obtain the second matching information.
- when the first pose module is configured to determine the first pose information according to the first matching information and the second matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information and the gravitational acceleration.
- the second pose module is further configured to: determine third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; determine, in response to the third matching information meeting a preset second condition and according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
- the first pose information includes fourth pose information, wherein the fourth pose information is pose information of the object to be scanned in a world coordinate system;
- when the second pose module is configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, it is further configured to: acquire fifth pose information from a positioning module according to the first image, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
- when the second pose module is configured to determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model, it is further configured to: project, according to the third pose information, the point cloud of the space model onto the first image to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
- when the second pose module is configured to determine fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information, it is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image, and the feature points and descriptors of the image frame matched with the third pose information; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
- when the second pose module is configured to determine the second pose information according to the third matching information and the fourth matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information and the gravitational acceleration.
- the second pose information and the first pose information meet a preset first condition, including:
- the error between the second pose information and the first pose information is smaller than a preset error threshold; and/or,
- the third matching information meets the preset second condition, including:
- the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold, wherein the matching combination includes a pair of feature points and three-dimensional points that match each other.
- when the acquisition module is configured to acquire the space model of the object to be scanned, it is further configured to: obtain multiple frames of modeling images scanned by the electronic device for the object to be scanned, and synchronously obtain the sixth pose information corresponding to each frame of the modeling images; match the feature points of the multi-frame modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multi-frame modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
- an electronic device, including a memory and a processor, wherein the memory is configured to store computer instructions runnable on the processor, and the processor is configured to implement the method described in the first aspect when executing the computer instructions.
- a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method described in the first aspect is implemented.
- a computer program, including computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor of the electronic device executes instructions configured to implement the method described in the first aspect.
- in response to the first pose information being missing or invalid, the second image is acquired, the first pose information is determined according to the second image and the space model, the second pose information is then determined according to the first image, the space model and the first pose information, and finally the second pose information is output in response to the second pose information and the first pose information meeting a preset first condition.
- the first pose information is determined according to the second image obtained by the electronic device scanning the object to be scanned and the space model; once determined, it can be continuously used to determine the second pose information corresponding to each first image, and it is not updated until the second pose information and the first pose information do not meet the first condition. Thus the efficiency and accuracy of pose information acquisition can be improved, which in turn improves the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
- FIG. 1A is a flowchart of a pose acquisition method shown in an embodiment of the present application;
- FIG. 1B is a schematic diagram of a system architecture to which the pose acquisition method of an embodiment of the present application can be applied;
- FIG. 2 is a schematic diagram of an image collected by an electronic device shown in an embodiment of the present application;
- FIG. 3 is a schematic diagram of the acquisition process of the space model shown in an embodiment of the present application;
- FIG. 4 is a schematic structural diagram of a pose acquisition device shown in an embodiment of the present application;
- FIG. 5 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
- Although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
- when augmented reality technology is used to recognize three-dimensional objects, the electronic device displays the space model while presenting the preview image scanned of the object to be scanned.
- The user needs to adjust the angle of view so that the outline of the object to be scanned on the electronic device matches the outline of the space model; on this basis, the object to be scanned can be tracked by scanning. Once tracking fails, the user has to return to the originally found suitable viewing angle and re-align the space model with the preview image of the object to be scanned. The efficiency and accuracy of tracking the object to be scanned are therefore low, user operation is difficult, and the user experience is poor.
- At least one embodiment of the present application provides a pose acquisition method. Please refer to FIG. 1A , which shows the flow of the method, including steps S101 to S103.
- the method may be performed by electronic equipment such as a terminal device or a server
- the terminal device may be a user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
- the method can be implemented by calling the computer-readable instructions stored in the memory by the processor.
- the method may be executed by a server, and the server may be a local server, a cloud server, or the like.
- step S101 a first image and a spatial model of the object to be scanned are acquired, wherein the first image is an image obtained by scanning the object to be scanned by an electronic device.
- the electronic device may be a terminal device such as a mobile phone or a tablet computer, or may be an image acquisition device such as a camera or a scanning device.
- the acquisition of the first image in this step, the determination and output of the second pose information in the subsequent steps, and the determination and update of the first pose information may also be performed by the terminal device.
- the object to be scanned may be a three-dimensional object targeted by augmented reality technology.
- when the electronic device scans the object to be scanned, it can continuously obtain multiple frames of the first image, that is, an image sequence; the first image is any frame in this image sequence, that is, the pose acquisition method provided by the embodiments of the present application can be performed for any frame in the image sequence. In some possible implementations, the method can be performed for each frame of the first image obtained when the electronic device scans the object to be scanned, so that the second pose information corresponding to each frame of the first image is obtained.
- the object to be scanned may be stationary while the electronic device moves around it. For example, the example shown in FIG. 2 illustrates the acquisition process of three image frames as the electronic device moves around the object to be scanned 21: the electronic device acquires an image frame at the position of the previous image frame 22, then moves to the position of the previous image frame 23 to acquire an image frame, and then moves to the position of the current image frame 24 to capture an image frame.
- the space model includes a point cloud of the object to be scanned, at least one image frame, and pose information corresponding to each image frame (such as the sixth pose information mentioned below).
- the image frame can be understood as an image captured by the electronic device under the corresponding sixth pose information of the object to be scanned.
- Each image frame corresponds to a part of the point cloud, and the corresponding relationship can be determined by the triangulation relationship of the image feature points during the modeling process, and can also be determined by the pose information.
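As a purely illustrative picture of this structure, the space model can be held in a container like the one sketched below; all class and field names are assumptions of this sketch, not terms from the patent:

```python
from dataclasses import dataclass, field

import numpy as np

@dataclass
class ImageFrame:
    """One image frame kept in the space model."""
    pose: np.ndarray         # 4x4 pose when the frame was taken
                             # (the "sixth pose information")
    keypoints: np.ndarray    # N x 2 feature-point coordinates
    descriptors: np.ndarray  # N x 32 binary descriptors (e.g. ORB)
    point_ids: np.ndarray    # N indices into the point cloud, -1 = none

@dataclass
class SpaceModel:
    """Point cloud of the object plus the image frames observing it."""
    points: np.ndarray                           # M x 3 three-dimensional points
    frames: list = field(default_factory=list)   # list of ImageFrame
```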
- step S102 in response to the first pose information being missing or invalid, a second image is obtained, and the first pose information is determined according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned.
- if the first pose information is missing, it needs to be determined; if the first pose information is invalid, it needs to be re-determined, that is, updated.
- the pose information of the electronic device may be pose information (Tcw) of the electronic device in the world coordinate system, that is, pose information of the electronic device relative to the origin of the world coordinate system.
- the pose information of the object to be scanned may be the pose information (Tow) of the object to be scanned in the world coordinate system, that is, the pose information of the object to be scanned relative to the origin of the world coordinate system.
- the pose information of the electronic device and the object to be scanned may be pose information (Tco) of the electronic device relative to the object to be scanned.
- step S103 second pose information is determined according to the first image, the space model and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned.
- for each frame of the first image, the first pose information is used when determining the corresponding second pose information, and the first pose information can be reused until it is updated. Because the first pose information is used, the user can avoid manually aligning the model with the object to be scanned, which improves the efficiency and accuracy of obtaining the second pose information, and thereby the efficiency and accuracy of tracking the object to be scanned.
- the first pose information can be determined by a detector or detection module, which obtains an image scanned by the electronic device as the second image and determines the first pose information according to the second image and the space model; that is, the detector or detection module is used to obtain the tracking starting point and thereby guide the tracker to track the object to be scanned.
- the second pose information can be determined by a tracker or tracking module, which obtains an image scanned by the electronic device as the first image and uses the first image, the space model and the first pose information to determine the second pose information; that is, the tracker or tracking module is used to track the object to be scanned.
- when determining the first pose information, only the second image and the space model are used, with no other guidance information; when determining the second pose information, the first pose information is added on top of the first image and the space model. Therefore, determining the first pose information is slower, that is, less efficient, than determining the second pose information. The determination of the first pose information improves the accuracy of the second pose information, while the reuse of the first pose information when determining the second pose information improves efficiency.
- a frame of image scanned by the electronic device can be used not only as the first image, but also as the second image, or as the first image and the second image at the same time.
- when the first pose information is missing or invalid, the image scanned by the electronic device can be used as the second image; when the first pose information exists and is valid, there is no need to determine or update the first pose information, and the image scanned by the electronic device can be used as the first image. When a frame of image scanned by the electronic device has been used as the second image to determine the first pose information and the electronic device has not yet scanned the next frame (for example, the electronic device has not moved relative to the object to be scanned, or the collection period of the next frame has not yet arrived), that frame can continue to be used as the first image for determining the second pose information.
- step S104 the second pose information is output in response to the second pose information and the first pose information meeting the preset first condition.
- an error threshold may be preset, and a first condition may be preset that an error between the second pose information and the first pose information is smaller than the above error threshold.
- poses of the same type are compared: the pose information of the electronic device in the world coordinate system in the first pose information can be compared with that in the second pose information, and the pose information of the object to be scanned in the world coordinate system in the first pose information can be compared with that in the second pose information.
- the second pose information and the first pose information meeting the first condition means that the second pose information is consistent with the first pose information and both are valid poses, so outputting the second pose information means outputting the second pose information of this frame of the first image; meanwhile, the first pose information can continue to be used to determine the second pose information of the next frame of the first image.
- the second pose information is more comprehensive than the first pose information, and has strong pertinence and high determination efficiency for each frame of the first image, so outputting the second pose information is more convenient for tracking the object to be scanned.
- if the second pose information and the first pose information do not meet the first condition, the second pose information is inconsistent with the first pose information, so at least one of the two is an invalid pose. The second pose information therefore cannot be output as a valid pose, that is, no valid pose is obtained for this frame of the first image, and the first pose information cannot continue to be used to determine the second pose information of the next frame of the first image; that is, the first pose information needs to be updated, and at this point it can be determined that the first pose information is invalid. Updating the first pose information means reacquiring the second image, using the reacquired second image to re-determine the first pose information, and deleting the original first pose information.
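The patent leaves the error metric of the first condition open; one plausible realisation (an assumption of this sketch, not the patent's definition) compares the rotation angle and translation distance of the two poses separately:

```python
import numpy as np

def poses_consistent(T_a, T_b, max_rot_deg=5.0, max_trans=0.05):
    """Check whether two 4x4 poses agree within preset thresholds
    (one possible realisation of the 'first condition')."""
    T_rel = np.linalg.inv(T_a) @ T_b            # relative transform
    cos_a = (np.trace(T_rel[:3, :3]) - 1.0) / 2.0
    rot_err = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    trans_err = np.linalg.norm(T_rel[:3, 3])    # relative translation length
    return rot_err < max_rot_deg and trans_err < max_trans
```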
- a corresponding augmented reality rendering effect may be presented according to the second pose information.
- in response to the first pose information being missing or invalid, the second image is acquired, the first pose information is determined according to the second image and the space model, the second pose information is then determined according to the first image, the space model and the first pose information, and finally the second pose information is output in response to the second pose information and the first pose information meeting a preset first condition; otherwise, it is determined that the first pose information is invalid.
- the first pose information is determined according to the second image obtained by the electronic device scanning the object to be scanned and the space model; once determined, it can be continuously used to determine the second pose information corresponding to each first image, and it is not updated until the second pose information and the first pose information do not meet the first condition. The efficiency and accuracy of pose information acquisition can thus be improved, which is beneficial to improving the efficiency and accuracy of recognizing three-dimensional objects using augmented reality technology.
- in some embodiments, the first pose information may be determined according to the second image and the space model in the following manner: first, at least one image frame corresponding to the second image in the space model is obtained, and the first matching information between the feature points of the second image and the feature points of the at least one image frame is determined (since the feature points of the second image and of the image frame are both two-dimensional points, the first matching information is a two-dimensional to two-dimensional (2D-2D) matching); next, the point cloud corresponding to the at least one image frame in the space model is obtained, and the second matching information between the feature points of the second image and the three-dimensional points of the point cloud is determined according to the first matching information (since the feature points of the second image are two-dimensional points, the second matching information is a two-dimensional to three-dimensional (2D-3D) matching); finally, the first pose information is determined according to the first matching information and the second matching information.
- FIG. 1B shows a schematic diagram of a system architecture to which the pose acquisition method of the embodiment of the present disclosure can be applied; as shown in FIG. 1B , the system architecture includes: a pose acquisition terminal 201 , a network 202 and an electronic device 203 .
- the pose acquisition terminal 201 and the electronic device 203 establish a communication connection through the network 202, and the electronic device 203 reports the image scanned for the object to be scanned to the pose acquisition terminal 201 through the network 202; the pose acquisition terminal 201 acquires the first image and the space model of the object to be scanned.
- the pose acquisition terminal 201 uploads the output second pose information to the network 202 .
- the electronic device 203 may include an image acquisition device or an image scanning device, and the pose acquisition terminal 201 may include a vision processing device capable of processing visual information or a remote server.
- the network 202 may be connected in a wired or wireless manner.
- the electronic device 203 can communicate with the visual processing device through a wired connection, such as performing data communication through a bus;
- the electronic device 203 can perform data interaction with a remote server through a wireless network.
- the electronic device 203 may be a vision processing device with a video capture module, or a host with a camera.
- the pose acquisition method of the embodiment of the present disclosure may be executed by the electronic device 203, and the above-mentioned system architecture may not include the network 202 and the server.
- in some embodiments, the similarity between each image frame in the space model and the second image can be determined first, and an image frame whose similarity with the second image is higher than a preset similarity threshold is then determined as an image frame corresponding to the second image.
- the similarity threshold is preset: the higher the threshold, the fewer image frames corresponding to the second image are screened out; the lower the threshold, the more image frames are screened out.
- the pose information of the image frame corresponding to the second image is the same as or similar to the pose information of the second image.
- the Euclidean distance between the feature points of the image frame and the feature points of the second image can be calculated, and then the similarity can be obtained according to the Euclidean distance.
- the image frames in the space model can be converted into image retrieval information, enough feature points of the second image can be extracted, and image retrieval can then be used to find the image frames that meet the similarity threshold.
- Descriptors of all image frames can be clustered layer by layer through a clustering algorithm (such as the k-means clustering algorithm), so as to obtain image retrieval information composed of words representing these descriptors.
- image retrieval here means determining the condition that the similarity with the feature points of the second image exceeds the similarity threshold, traversing each entry in the image retrieval information with this condition, filtering out the entries that meet it, and taking the image frames corresponding to the filtered entries as the image frames whose similarity with the second image is higher than the similarity threshold.
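The retrieval described above (clustering descriptors into words, then thresholding a similarity score) can be sketched as follows. The single-level vocabulary, scikit-learn k-means, cosine similarity and the 0.6 threshold are illustrative assumptions, and `f.descriptors` reuses the frame structure sketched earlier:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, n_words=256):
    """Cluster the descriptors of all image frames into visual words
    (a single k-means level; the patent mentions layer-by-layer clustering)."""
    return KMeans(n_clusters=n_words, n_init=4).fit(np.float64(all_descriptors))

def bow_histogram(descriptors, vocab):
    """Normalised word histogram of one image."""
    words = vocab.predict(np.float64(descriptors))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

def retrieve_frames(query_descriptors, frames, vocab, sim_threshold=0.6):
    """Return the image frames whose similarity to the second image
    exceeds the preset similarity threshold."""
    q = bow_histogram(query_descriptors, vocab)
    return [f for f in frames
            if float(q @ bow_histogram(f.descriptors, vocab)) > sim_threshold]
```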
- in some embodiments, when determining the first matching information between the feature points of the second image and the feature points of the at least one image frame: first, the feature points and descriptors of the second image and the feature points and descriptors of the image frame are obtained; then the initial matching information between the feature points of the second image and the feature points of the image frame is determined according to the descriptors of the second image and the descriptors of the image frame; then the fundamental matrix and/or essential matrix of the second image and the image frame is determined according to the initial matching information; finally, the initial matching information is filtered according to the fundamental matrix and/or essential matrix to obtain the first matching information.
- for each descriptor in the second image, the descriptor with the closest Hamming distance can be found in the image frame; conversely, for each descriptor in the image frame, the descriptor with the closest Hamming distance is found in the second image. If a descriptor in the second image and a descriptor in the image frame are each other's closest in Hamming distance, the two descriptors are considered to match, and the two feature points corresponding to them are determined to match; all matched feature points constitute the initial matching information.
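OpenCV's brute-force matcher with cross-checking implements exactly this mutual nearest-neighbour test in Hamming distance; a minimal sketch:

```python
import cv2

def mutual_hamming_matches(desc_a, desc_b):
    """Initial matching information: keep a pair only when the two
    descriptors are each other's nearest neighbour in Hamming distance
    (crossCheck=True performs exactly this mutual test)."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return matcher.match(desc_a, desc_b)   # list of cv2.DMatch
```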
- when determining the fundamental matrix and/or the essential matrix, they may be calculated by the random sample consensus (RANSAC) algorithm.
- multiple fundamental matrices and/or essential matrices can also be calculated by RANSAC with the 5-point algorithm; the inliers of each fundamental matrix and/or essential matrix are determined, and the fundamental matrix and/or essential matrix with the largest number of inliers is taken as the final result. If two matched feature points conform to the fundamental matrix and/or essential matrix, they are inliers; conversely, if they do not conform, they are outliers. When the fundamental matrix and/or essential matrix is used to filter the initial matching information, the inliers in the initial matching information are retained, that is, the outliers are deleted.
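A minimal sketch of this filtering step using OpenCV's RANSAC five-point essential-matrix estimator (the intrinsic matrix `K` and the thresholds are assumptions of the sketch):

```python
import cv2
import numpy as np

def filter_matches_ransac(pts_a, pts_b, K):
    """Estimate an essential matrix with RANSAC (5-point algorithm) and
    keep only the inliers of the initial matching information."""
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    keep = mask.ravel().astype(bool)       # inlier flags from RANSAC
    return pts_a[keep], pts_b[keep], E
```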
- in some embodiments, according to the first matching information, the feature points of the second image that match feature points of the image frame can be matched with the three-dimensional points of the point cloud corresponding to those feature points of the image frame, so as to obtain the second matching information. That is, the feature points of the second image are matched with the three-dimensional points of the point cloud by using the feature points of the image frame as a medium, as sketched below.
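The chaining can be sketched in a few lines, assuming each frame stores, for every feature point, the index of the 3D point it observes (-1 when there is none, as in the ImageFrame sketch above) and that the 2D-2D matches are cv2.DMatch objects:

```python
def chain_2d3d(matches_2d2d, frame_point_ids):
    """Turn 2D-2D matches (second image <-> image frame) into 2D-3D
    matches (second image <-> point cloud), using the frame's feature
    points as the medium."""
    matches_2d3d = []
    for m in matches_2d2d:                 # cv2.DMatch: queryIdx = second image,
        pid = frame_point_ids[m.trainIdx]  # trainIdx = image frame feature point
        if pid >= 0:                       # -1 means the frame point has no 3D point
            matches_2d3d.append((m.queryIdx, int(pid)))
    return matches_2d3d
```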
- in some embodiments, the gravitational acceleration of the electronic device may be obtained first; the first pose information is then determined according to the first matching information, the second matching information and the gravitational acceleration.
- the electronic device may have an acceleration sensor and/or a gyroscope, and thus may obtain the acceleration of gravity from the acceleration sensor and/or the gyroscope.
- the first pose information can be solved from the second matching information using the perspective-n-point (PnP) algorithm, and from the first matching information by decomposing the fundamental matrix and/or essential matrix.
- the constraint condition of the acceleration of gravity can be added, that is, the acceleration of gravity is used to constrain the rotation angle (such as roll angle and pitch angle) in the pose of the electronic device.
- the above two solving processes can be combined in the Hybrid form to solve the first pose information, that is, the first pose information is solved by comprehensively using the first matching information, the second matching information and the acceleration of gravity.
- the first matching information can provide a constraint of one degree of freedom, the second matching information can provide constraints of two degrees of freedom, and the gravitational acceleration provides one degree of freedom; a certain number of first matching information entries, a certain number of second matching information entries and the gravitational acceleration can therefore be randomly selected and combined to form six degrees of freedom to solve the first pose information.
- an equation can be constructed from the first matching information through the Plücker coordinate relationship, and an equation can be constructed from the second matching information through the camera projection matrix model; the simultaneous equations are then solved by a solver (such as a Gröbner basis solver). Alternatively, the two solving processes can be used independently within RANSAC to solve robustly for the first pose information: according to different frequency ratios, the first matching information plus the gravitational acceleration and the second matching information plus the gravitational acceleration are alternately selected to solve the first pose information, and the error between the obtained first pose information and all the matching information is computed.
- if the number of inliers is large enough (for example, exceeds a certain threshold), the first pose information at this time is determined to be accurate and the solving ends.
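The patent's hybrid Plücker/Gröbner solver is involved; as a simpler stand-in (an assumption of this sketch, not the patent's method), a pose can be solved from the 2D-3D matches alone with RANSAC PnP and accepted by the same inlier-count test:

```python
import cv2
import numpy as np

def solve_pose_pnp(obj_pts, img_pts, K, min_inliers=30):
    """Solve a pose from 2D-3D matches with RANSAC PnP and accept it
    only when enough inliers support it."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(obj_pts, np.float32), np.asarray(img_pts, np.float32),
        K, None, reprojectionError=4.0, iterationsCount=200)
    if not ok or inliers is None or len(inliers) < min_inliers:
        return None                        # pose rejected, keep searching
    R, _ = cv2.Rodrigues(rvec)             # rotation vector -> matrix
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T
```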
- in this way, the obtained first pose information is more accurate, which in turn makes the second pose information obtained from the first pose information more accurate.
- the first pose information may be determined by the detector or the detection module for use by the tracker or the tracking module.
- in some embodiments, the second pose information may be determined according to the first image, the space model and the first pose information in the following manner: first, the third pose information corresponding to the first image is determined according to the first pose information and the first image, wherein the third pose information is the pose information of the electronic device relative to the object to be scanned; next, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model is determined according to the third pose information (since the feature points of the first image are two-dimensional points, the third matching information is a 2D-3D matching); next, in response to the third matching information meeting the preset second condition, the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model is determined according to the third pose information (since the feature points of the first image and of the image frame are both two-dimensional points, the fourth matching information is a 2D-2D matching); finally, the second pose information is determined according to the third matching information and the fourth matching information.
- the first pose information may include fourth pose information, where the fourth pose information is the pose information (Tow) of the object to be scanned in the world coordinate system; since the object to be scanned is stationary, the fourth pose information remains unchanged.
- in some embodiments, the fifth pose information can first be obtained from a positioning module according to the first image, wherein the fifth pose information is the pose information (Tcw) of the electronic device in the world coordinate system; the third pose information is then determined according to the fourth pose information and the fifth pose information.
- the positioning module can be a Visual Inertial Simultaneous Localization and Mapping (VISLAM) module, and VISLAM can output the pose information of the electronic device in the world coordinate system in real time during operation.
- the pose information of the object to be scanned in the world coordinate system is the absolute pose of the object to be scanned, and the pose information of the electronic device in the world coordinate system is the absolute pose of the electronic device.
- from the absolute poses in the unified coordinate system, the relative pose of the two can be determined, that is, the pose information (Tco) of the electronic device relative to the object to be scanned, or the pose information (Toc) of the object to be scanned relative to the electronic device. In the above, the pose information (Tco) of the electronic device relative to the object to be scanned is selected as the third pose information; of course, the pose information (Toc) of the object to be scanned relative to the electronic device can also be selected as the third pose information.
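Under the usual convention that Tcw maps world coordinates to camera coordinates and Tow maps world coordinates to object coordinates (a convention the patent does not spell out, so an assumption here), the third pose information is a single matrix product:

```python
import numpy as np

def relative_pose(T_cw, T_ow):
    """Third pose information Tco from the two absolute poses:
    object -> world (inverse of T_ow) followed by world -> camera."""
    return T_cw @ np.linalg.inv(T_ow)
```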
- when determining the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model: first, the point cloud of the space model is projected onto the first image according to the third pose information to form a plurality of projection points, and a descriptor of each projection point is extracted; then the feature points and descriptors of the first image are extracted; finally, the third matching information between the feature points and the three-dimensional points of the point cloud is determined according to the descriptors corresponding to the feature points and the descriptors of the projection points.
- the third pose information represents the relative pose of the electronic device that took the first image and the object to be scanned, that is, the direction and angle between the electronic device and the object to be scanned, so the camera model can be used to project the point cloud onto the first image, as sketched below.
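A sketch of this projection step with OpenCV's pinhole camera model (`K` is the intrinsic matrix, assumed known; lens distortion is ignored in the sketch):

```python
import cv2
import numpy as np

def project_point_cloud(points_3d, T_co, K):
    """Project the model point cloud onto the first image using the
    third pose information and the pinhole camera model."""
    rvec, _ = cv2.Rodrigues(T_co[:3, :3])          # rotation as a vector
    tvec = T_co[:3, 3]
    proj, _ = cv2.projectPoints(np.asarray(points_3d, np.float32),
                                rvec, tvec, K, None)
    return proj.reshape(-1, 2)                     # one 2D projection per point
```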
- each 3D point of the point cloud corresponds to at least one feature point of an image frame; the descriptors of all the feature points corresponding to a 3D point are extracted and fused to obtain the descriptor of that 3D point's projection point.
- when determining the third matching information, for the descriptor of each feature point, the descriptor of the projection point with the closest Hamming distance can first be found; conversely, for the descriptor of each projection point, the descriptor of the feature point with the closest Hamming distance is found. If the descriptor of a feature point and the descriptor of a projection point are each other's closest in Hamming distance, the two descriptors are considered to match, and the corresponding feature point and 3D point are matched; all matched feature points and 3D points constitute the third matching information.
- the second condition may be that the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold.
- the matching combination includes a pair of feature points and three-dimensional points that match each other.
- the number of matching combinations represents the validity of the first pose information to a certain extent: if the first pose information is invalid, the number of matching combinations inevitably decreases or drops to zero; if the first pose information is valid, the number of matching combinations is necessarily larger.
- the judgment of the second condition is a pre-judgment step before the validity of the first pose information is judged in step S104. If the third matching information does not meet the second condition, that is, the number of matching combinations is less than or equal to the preset number threshold, the first pose information and the second pose information cannot meet the first condition, so there is no need to perform the subsequent steps of solving the second pose information, and the first pose information can be directly determined to be invalid. If the third matching information meets the second condition, that is, the number of matching combinations is greater than the preset number threshold, it cannot be directly determined whether the first pose information is valid, so the second pose information continues to be solved, and the validity of the first pose information is judged by whether the first pose information and the second pose information meet the first condition.
- when determining the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model according to the third pose information: first, at least one image frame matching the third pose information is determined according to the third pose information and the pose information of each image frame of the space model; then the feature points and descriptors of the first image, and the feature points and descriptors of each image frame matched with the third pose information, are acquired; finally, the fourth matching information between the feature points of the first image and the feature points of the image frame is determined according to the descriptors of the first image and the descriptors of the image frame.
- Each image frame has pose information (such as the sixth pose information below), which represents the relative pose of the electronic device and the object to be scanned when the image frame was acquired, that is, the image frame can be obtained when the electronic device is in this relative pose; the third pose information represents the relative pose of the electronic device and the object to be scanned when the first image was obtained, that is, the first image can be obtained when the electronic device is in that relative pose.
- if the pose information of an image frame is the same as or similar to that of the first image (for example, the angle difference is within a preset range), it can be determined that the image frame matches the first image, as sketched below.
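A sketch of this pose-based frame selection, reusing the 4x4 `pose` field of the frame structure sketched earlier and comparing only the rotation angle (the concrete 30-degree threshold is an assumption):

```python
import numpy as np

def matching_frames(T_co, frames, max_angle_deg=30.0):
    """Select image frames whose stored pose is the same as or similar to
    the third pose information (rotation difference within a preset range)."""
    selected = []
    for f in frames:
        R_rel = T_co[:3, :3].T @ f.pose[:3, :3]
        cos_a = (np.trace(R_rel) - 1.0) / 2.0
        if np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))) <= max_angle_deg:
            selected.append(f)
    return selected
```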
- for each descriptor in the first image, the descriptor with the closest Hamming distance can be found in the image frame; conversely, for each descriptor in the image frame, the descriptor with the closest Hamming distance is found in the first image. If a descriptor in the first image and a descriptor in the image frame are each other's closest in Hamming distance, the two descriptors are considered to match, and the two corresponding feature points are determined to match; all matched feature points constitute the fourth matching information.
- in some embodiments, the gravitational acceleration of the electronic device may be obtained first; the second pose information is then determined according to the third matching information, the fourth matching information and the gravitational acceleration.
- the electronic device may have an acceleration sensor and/or a gyroscope, and thus may obtain the acceleration of gravity from the acceleration sensor and/or the gyroscope.
- The PnP algorithm can be used to solve the second pose information from the third matching information (2D-3D correspondences), and the second pose information can also be solved from the fourth matching information (2D-2D correspondences) by decomposing the fundamental matrix and/or the essential matrix.
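- A hedged sketch of these two solvers using OpenCV (the patent does not prescribe these functions; all array names are illustrative). PnP consumes 2D-3D correspondences, while essential-matrix decomposition consumes 2D-2D correspondences and recovers the translation only up to scale:

```python
import cv2
import numpy as np

def pose_from_2d3d(pts3d, pts2d, K):
    """PnP (with RANSAC) on the third matching information:
    pts3d is Nx3 point-cloud points, pts2d is Nx2 image pixels."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None)
    return (rvec, tvec) if ok else None

def pose_from_2d2d(pts_a, pts_b, K):
    """Essential-matrix decomposition on the fourth matching information:
    pts_a/pts_b are matched Nx2 pixels from the first image and a model
    image frame; translation is recovered only up to scale."""
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=mask)
    return R, t
```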
- In the above solving processes, the constraint of the gravitational acceleration can be added; that is, the gravitational acceleration is used to constrain the rotation angles (such as the roll angle and the pitch angle) in the pose of the electronic device.
- The above two solving processes can also be combined in hybrid form to solve the second pose information; that is, the second pose information can be solved by jointly using the third matching information, the fourth matching information, and the gravitational acceleration.
- Since each item of 2D-2D matching information can provide a constraint of 1 degree of freedom, each item of 2D-3D matching information can provide constraints of 2 degrees of freedom, and the gravitational acceleration provides a constraint of 1 degree of freedom, a certain number of items of the third matching information and a certain number of items of the fourth matching information can be randomly selected and combined with the gravitational acceleration so that all six degrees of freedom are constrained, and the second pose information can be solved.
- In implementation, the fourth matching information can be used to construct equations through relations in Plücker coordinates, and the third matching information can be used to construct equations through the camera projection matrix model; the simultaneous equations are then solved by a solver (such as a Gröbner basis solver). Alternatively, the above two solving processes can be used independently within RANSAC to solve the problem in a robust manner.
- That is, according to a chosen frequency ratio, candidate second pose information is alternately solved from the third matching information together with the gravitational acceleration and from the fourth matching information together with the gravitational acceleration, and the error between each candidate and all of the matching information is calculated.
- When the number of inliers of a candidate is large enough (for example, exceeds a certain threshold), that candidate can be taken as the second pose information.
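- The alternating RANSAC scheme can be pictured with the following schematic sketch; the minimal-sample sizes, the threshold, and the solver callables are assumptions standing in for the actual minimal solvers described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def hybrid_ransac(m_2d3d, m_2d2d, gravity, solve_2d3d, solve_2d2d,
                  count_inliers, freq_ratio=0.5, iterations=200,
                  min_inliers=50):
    """Alternate two minimal solvers by a frequency ratio and keep the
    candidate pose with the most inliers over ALL matching information."""
    best_pose, best_count = None, 0
    for _ in range(iterations):
        if rng.random() < freq_ratio:
            # minimal sample of 2D-3D matches (third matching information)
            idx = rng.choice(len(m_2d3d), size=3, replace=False)
            pose = solve_2d3d([m_2d3d[i] for i in idx], gravity)
        else:
            # minimal sample of 2D-2D matches (fourth matching information)
            idx = rng.choice(len(m_2d2d), size=4, replace=False)
            pose = solve_2d2d([m_2d2d[i] for i in idx], gravity)
        if pose is None:
            continue
        count = count_inliers(pose, m_2d3d, m_2d2d)  # error vs. all matches
        if count > best_count:
            best_pose, best_count = pose, count
    # accept only when the number of inliers is large enough
    return best_pose if best_count >= min_inliers else None
```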
- In some embodiments, the second pose information may be determined by a tracker or tracking module, and the first pose information obtained by a detector or detection module is used in that determination. The first pose information determined by the detector or detection module is more accurate than the output of the tracker or tracking module, but is produced less efficiently. Therefore, the detector or detection module is used to determine (reusable) first pose information, while the tracker or tracking module frequently outputs the second pose information. In this way, the detector or detection module determines the tracking starting point of the tracker, which improves the accuracy of pose acquisition; at the same time, the cumbersome operation of manually aligning the spatial model with the object to be scanned, and the inaccurate tracking that results from it, are avoided, which ensures the efficiency of pose acquisition.
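- Purely as an illustration of this detector/tracker split (the split is the disclosure's; the API below is not, and for brevity detection here reuses the current frame rather than acquiring a separate second image), the control flow might look like:

```python
def pose_acquisition_loop(frames, space_model, detector, tracker,
                          consistent):
    """Illustrative control flow: the slow, accurate detector
    (re)establishes the first pose information; the fast tracker outputs
    the second pose every frame; the first pose is invalidated whenever
    the two poses fail the consistency test (the first condition).
    `detector`, `tracker` and `consistent` are assumed callables."""
    first_pose = None
    for image in frames:
        if first_pose is None:            # missing or invalidated
            first_pose = detector(image, space_model)
            continue
        second_pose = tracker(image, space_model, first_pose)
        if second_pose is not None and consistent(first_pose, second_pose):
            yield second_pose             # output the second pose
        else:
            first_pose = None             # invalid; re-detect next frame
```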
- In some embodiments, the space model of the object to be scanned can be obtained in the following manner: first, acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire the sixth pose information corresponding to each frame of modeling image; next, match the feature points of the multiple frames of modeling images, and triangulate the feature points according to the matching results to form a point cloud; next, determine at least one image frame from the multiple frames of modeling images, and determine the point cloud corresponding to each image frame; finally, construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
- For the matching, inter-frame descriptor matching or optical-flow tracking can be used.
- Through the matching between two frames, the position of a certain landmark in three-dimensional space can be tracked across consecutive frames. From the matching relationships between these consecutive frames and the pose information of each frame, a system of equations can be constructed, and by solving this system of equations, the depth of the landmark position can be obtained.
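- A minimal sketch of this triangulation step, assuming OpenCV and two frames with known poses (the disclosure does not prescribe this function; all names are illustrative):

```python
import cv2
import numpy as np

def triangulate(K, R1, t1, R2, t2, pts1, pts2):
    """Triangulate matched landmarks from two modeling frames with known
    poses. K is the 3x3 intrinsic matrix, (R, t) are world-to-camera
    poses, pts1/pts2 are matched pixels of shape 2xN. Returns Nx3 points
    forming part of the point cloud."""
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])   # 3x4 projection matrix
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])
    pts_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous
    return (pts_h[:3] / pts_h[3]).T
```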
- The electronic device scans modeling images at a high frequency (for example, 30 hertz (Hz)); when selecting image frames, only part of the modeling images need be selected, so that the file size of the entire model does not become too large. This is beneficial to subsequent file sharing and also reduces the memory consumed by the model when running on a mobile phone.
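- One possible keyframe-selection heuristic, given here only as an assumed sketch (the thresholds and the motion-based criterion are not specified by the disclosure), is to keep a modeling image only when the camera has moved or turned enough since the last kept frame:

```python
import numpy as np

def rot_angle_deg(R_a, R_b):
    cos = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def select_keyframes(poses, min_trans=0.05, min_rot_deg=10.0):
    """poses: list of (R, t) per modeling image, in capture order.
    Returns indices of the frames retained as image frames."""
    kept = [0]
    for i in range(1, len(poses)):
        R_last, t_last = poses[kept[-1]]
        R_i, t_i = poses[i]
        if (np.linalg.norm(t_i - t_last) > min_trans
                or rot_angle_deg(R_last, R_i) > min_rot_deg):
            kept.append(i)
    return kept
```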
- the acquisition process of the spatial model is shown in Figure 3.
- The user can obtain a three-dimensional bounding box surrounding the object through the application program interface, which guides the user to move around the selected three-dimensional object 301 for modeling.
- During modeling, the system establishes point clouds and image key-frame information of the model at various angles (for example, model image frames 31 and 32 through model image frame 38 shown in FIG. 3).
- After modeling, all of the point cloud information within the 3D bounding box is saved; this constitutes the 3D point cloud model of the object.
- the space model includes a point cloud in a three-dimensional frame and a modeling image frame, and each image frame is marked with sixth pose information.
- the sixth pose information can be the pose information of the electronic device relative to the object to be scanned.
- To obtain the sixth pose information, the pose information of the electronic device in the world coordinate system can first be obtained from a positioning module in the electronic device, such as a VISLAM module; this pose information is then combined with the pre-acquired pose information of the object to be scanned in the world coordinate system to obtain the sixth pose information.
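- A sketch of this composition, assuming poses are expressed as 4x4 homogeneous transforms (the representation is an assumption; the names are illustrative):

```python
import numpy as np

def sixth_pose(T_world_device, T_world_object):
    """Compose the device pose relative to the object from two
    world-frame poses: T_device_object = inv(T_world_device) @
    T_world_object. T_world_device comes from the positioning module
    (e.g. VISLAM); T_world_object is acquired in advance."""
    return np.linalg.inv(T_world_device) @ T_world_object
```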
- In an application scenario, a terminal device can use the pose acquisition method provided in this application to scan a product.
- The product comes with a product description and an effect display. The terminal device can start a scanning program that runs the pose acquisition method provided by this application, so that when the terminal device scans the product, the first pose information is obtained and the second pose information is output. When the second pose information is output, the program can use augmented reality technology to present the corresponding product description and/or effect display on the display screen of the terminal device, according to the mapping between the second pose information and the product description and/or effect display.
- Augmented reality technology may likewise be used to present an explanation and/or display effect of the interaction process.
- FIG. 4 shows a schematic structural diagram of the pose acquisition device 400, including:
- the obtaining module 401 is configured to obtain a first image and a spatial model of the object to be scanned, wherein the first image is an image scanned by the electronic device for the object to be scanned;
- the first pose module 402 is configured to acquire a second image in response to the first pose information being missing or invalid, and to determine the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is the pose information of the electronic device and/or the object to be scanned;
- the second pose module 403 is configured to determine second pose information according to the first image, the space model, and the first pose information, wherein the second pose information is the pose information of the electronic device and/or the object to be scanned;
- an output module 404 configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition, and otherwise to determine that the first pose information is invalid.
- In some embodiments, the first pose module is further configured to: acquire at least one image frame corresponding to the second image in the space model, and determine first matching information between the feature points of the second image and the feature points of the at least one image frame; acquire a point cloud corresponding to the at least one image frame in the space model, and determine, according to the first matching information, second matching information between the feature points of the second image and the three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
- When the first pose module is configured to acquire the at least one image frame corresponding to the second image in the space model, it is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
- When the first pose module is configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, it is further configured to: acquire the feature points and descriptors of the second image and the feature points and descriptors of the image frame; determine initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors; determine the fundamental matrix and/or the essential matrix of the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or the essential matrix to obtain the first matching information.
- When the first pose module is configured to determine, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud, it is further configured to: match the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud that correspond to the feature points of the image frame, to obtain the second matching information.
- When the first pose module is configured to determine the first pose information according to the first matching information and the second matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
- In some embodiments, the second pose module is further configured to: determine third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is the pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
- In some embodiments, the first pose information includes fourth pose information, wherein the fourth pose information is the pose information of the object to be scanned in a world coordinate system. When the second pose module is configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, it is further configured to: acquire fifth pose information from the positioning module according to the first image, wherein the fifth pose information is the pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
- When the second pose module is configured to determine, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model, it is further configured to: project, according to the third pose information, the point cloud of the space model onto the first image to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
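- A hedged sketch of the projection step, assuming OpenCV's point projection (the disclosure does not prescribe this function; rvec/tvec encode the third pose and K the intrinsics):

```python
import cv2
import numpy as np

def project_point_cloud(point_cloud, rvec, tvec, K):
    """Project the space-model point cloud (Nx3) into the first image
    under the third pose, yielding one 2D projection point per 3D point.
    Descriptors sampled at these locations can then be compared with the
    first image's feature descriptors to build the third matching
    information."""
    proj, _ = cv2.projectPoints(point_cloud.astype(np.float64),
                                rvec, tvec, K, None)
    return proj.reshape(-1, 2)
```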
- When the second pose module is configured to determine, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of at least one image frame of the space model, it is further configured to: determine at least one image frame matching the third pose information according to the third pose information and the pose information of the image frames of the space model; acquire the feature points and descriptors of the first image and of the matched image frames; and determine the fourth matching information according to the descriptors of the first image and the descriptors of the image frames.
- When the second pose module is configured to determine the second pose information according to the third matching information and the fourth matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
- the second pose information and the first pose information meet a preset first condition, including:
- the error between the second pose information and the first pose information is smaller than a preset error threshold; and/or,
- the third matching information meets the preset second condition, including:
- the number of matching combinations between the first image and the point cloud of the space model is greater than a preset number threshold, wherein each matching combination includes a pair of a feature point and a three-dimensional point that match each other.
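- By way of illustration only, the first condition (pose-error test) could be implemented as below; the threshold values are assumptions, not values from this disclosure:

```python
import numpy as np

def meets_first_condition(R1, t1, R2, t2,
                          max_trans=0.05, max_rot_deg=5.0):
    """Compare the second pose (R2, t2) against the first pose (R1, t1)
    and accept only if both translation and rotation errors fall under
    the preset thresholds."""
    trans_err = np.linalg.norm(t1 - t2)
    cos = (np.trace(R1.T @ R2) - 1.0) / 2.0
    rot_err = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return trans_err < max_trans and rot_err < max_rot_deg
```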
- When the acquisition module is configured to acquire the space model of the object to be scanned, it is further configured to: acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire the sixth pose information corresponding to each frame of modeling image; match the feature points of the multiple frames of modeling images, and triangulate the feature points according to the matching results to form a point cloud; determine at least one image frame from the multiple frames of modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
- At least one embodiment of the present application provides an electronic device. Please refer to FIG. 5, which shows the structure of the electronic device.
- the electronic device 500 includes a memory 501 and a processor 502.
- The memory stores computer instructions executable on the processor, and the processor is configured to acquire pose information based on the method described in any one of the embodiments of the first aspect when executing the computer instructions.
- At least one embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the method described in any one of the first aspect is implemented.
- Computer readable storage media may be volatile or nonvolatile computer readable storage media.
- At least one embodiment of the present application provides a computer program product, including computer-readable code. When the computer-readable code runs on a device, a processor in the device executes instructions for implementing the method described in any one of the embodiments of the first aspect.
- the present application relates to a pose acquisition method, device, electronic equipment, and storage medium.
- The method includes: acquiring a first image, wherein the first image is an image obtained by an electronic device scanning an object to be scanned; in response to first pose information being missing or invalid, acquiring a second image, and determining the first pose information according to the second image and a space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is the pose information of the electronic device and/or the object to be scanned; determining second pose information according to the first image, the space model, and the first pose information, wherein the second pose information is the pose information of the electronic device and/or the object to be scanned; and outputting the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
- Television Signal Processing For Recording (AREA)
- Studio Devices (AREA)
Abstract
Description
Claims (31)
- A pose acquisition method, wherein the method comprises: acquiring a first image and a space model of an object to be scanned, wherein the first image is an image obtained by an electronic device scanning the object to be scanned; in response to first pose information being missing or invalid, acquiring a second image, and determining the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned; determining second pose information according to the first image, the space model, and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned; and in response to the second pose information and the first pose information meeting a preset first condition, outputting the second pose information.
- The pose acquisition method according to claim 1, wherein the method further comprises: in response to the second pose information and the first pose information not meeting the preset first condition, determining that the first pose information is invalid.
- The pose acquisition method according to claim 1 or 2, wherein determining the first pose information according to the second image and the space model comprises: acquiring at least one image frame corresponding to the second image in the space model, and determining first matching information between feature points of the second image and feature points of the at least one image frame; acquiring a point cloud corresponding to the at least one image frame in the space model, and determining, according to the first matching information, second matching information between the feature points of the second image and three-dimensional points of the point cloud; and determining the first pose information according to the first matching information and the second matching information.
- The pose acquisition method according to claim 3, wherein acquiring the at least one image frame corresponding to the second image in the space model comprises: determining the similarity between each image frame in the space model and the second image; and determining an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
- The pose acquisition method according to claim 3 or 4, wherein determining the first matching information between the feature points of the second image and the feature points of the at least one image frame comprises: acquiring the feature points and descriptors of the second image and the feature points and descriptors of the image frame; determining initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determining a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filtering the initial matching information according to the fundamental matrix and/or the essential matrix to obtain the first matching information.
- The pose acquisition method according to any one of claims 3 to 5, wherein determining, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud comprises: matching the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud that correspond to the feature points of the image frame, to obtain the second matching information.
- The pose acquisition method according to any one of claims 3 to 6, wherein determining the first pose information according to the first matching information and the second matching information comprises: acquiring the gravitational acceleration of the electronic device; and determining the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
- The pose acquisition method according to any one of claims 1 to 7, wherein determining the second pose information according to the first image, the space model, and the first pose information comprises: determining third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determining, according to the third pose information, third matching information between feature points of the first image and three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determining, according to the third pose information, fourth matching information between the feature points of the first image and feature points of at least one image frame of the space model; and determining the second pose information according to the third matching information and the fourth matching information.
- The pose acquisition method according to claim 8, wherein the first pose information comprises fourth pose information, the fourth pose information being pose information of the object to be scanned in a world coordinate system; and determining the third pose information corresponding to the first image according to the first pose information and the first image comprises: acquiring fifth pose information from a positioning module according to the first image, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determining the third pose information according to the fourth pose information and the fifth pose information.
- The pose acquisition method according to claim 8 or 9, wherein determining, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model comprises: projecting, according to the third pose information, the point cloud of the space model onto the first image to form a plurality of projection points, and extracting a descriptor of each projection point; extracting the feature points and descriptors of the first image; and determining the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
- The pose acquisition method according to any one of claims 8 to 10, wherein determining, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of the at least one image frame of the space model comprises: determining at least one image frame matching the third pose information according to the third pose information and pose information of the image frames of the space model; acquiring the feature points and descriptors of the first image and the feature points and descriptors of the image frame matching the third pose information; and determining the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
- The pose acquisition method according to any one of claims 8 to 11, wherein determining the second pose information according to the third matching information and the fourth matching information comprises: acquiring the gravitational acceleration of the electronic device; and determining the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
- The pose acquisition method according to any one of claims 8 to 12, wherein the second pose information and the first pose information meeting the preset first condition comprises: an error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition comprises: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination comprises a pair of a feature point and a three-dimensional point that match each other.
- The pose acquisition method according to any one of claims 1 to 13, wherein acquiring the space model of the object to be scanned comprises: acquiring multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquiring sixth pose information corresponding to each frame of modeling image; matching feature points of the multiple frames of modeling images, and triangulating the feature points according to the matching result to form a point cloud; determining at least one image frame from the multiple frames of modeling images, and determining the point cloud corresponding to each image frame; and constructing the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
- A pose acquisition apparatus, comprising: an acquisition module configured to acquire a first image and a space model of an object to be scanned, wherein the first image is an image obtained by an electronic device scanning the object to be scanned; a first pose module configured to, in response to first pose information being missing or invalid, acquire a second image and determine the first pose information according to the second image and the space model, wherein the second image is an image obtained by the electronic device scanning the object to be scanned, and the first pose information is pose information of the electronic device and/or the object to be scanned; a second pose module configured to determine second pose information according to the first image, the space model, and the first pose information, wherein the second pose information is pose information of the electronic device and/or the object to be scanned; and an output module configured to output the second pose information in response to the second pose information and the first pose information meeting a preset first condition.
- The pose acquisition apparatus according to claim 15, wherein the output module is further configured to: in response to the second pose information and the first pose information not meeting the preset first condition, determine that the first pose information is invalid.
- The pose acquisition apparatus according to claim 15 or 16, wherein the first pose module is further configured to: acquire at least one image frame corresponding to the second image in the space model, and determine first matching information between feature points of the second image and feature points of the at least one image frame; acquire a point cloud corresponding to the at least one image frame in the space model, and determine, according to the first matching information, second matching information between the feature points of the second image and three-dimensional points of the point cloud; and determine the first pose information according to the first matching information and the second matching information.
- The pose acquisition apparatus according to claim 17, wherein when the first pose module is configured to acquire the at least one image frame corresponding to the second image in the space model, it is further configured to: determine the similarity between each image frame in the space model and the second image; and determine an image frame whose similarity with the second image is higher than a preset similarity threshold as an image frame corresponding to the second image.
- The pose acquisition apparatus according to claim 17 or 18, wherein when the first pose module is configured to determine the first matching information between the feature points of the second image and the feature points of the at least one image frame, it is further configured to: acquire the feature points and descriptors of the second image and the feature points and descriptors of the image frame; determine initial matching information between the feature points of the second image and the feature points of the image frame according to the descriptors of the second image and the descriptors of the image frame; determine a fundamental matrix and/or an essential matrix of the second image and the image frame according to the initial matching information; and filter the initial matching information according to the fundamental matrix and/or the essential matrix to obtain the first matching information.
- The pose acquisition apparatus according to any one of claims 17 to 19, wherein when the first pose module is configured to determine, according to the first matching information, the second matching information between the feature points of the second image and the three-dimensional points of the point cloud, it is further configured to: match the feature points of the second image that match the feature points of the image frame with the three-dimensional points of the point cloud that correspond to the feature points of the image frame, to obtain the second matching information.
- The pose acquisition apparatus according to any one of claims 17 to 20, wherein when the first pose module is configured to determine the first pose information according to the first matching information and the second matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the first pose information according to the first matching information, the second matching information, and the gravitational acceleration.
- The pose acquisition apparatus according to any one of claims 15 to 21, wherein the second pose module is further configured to: determine third pose information corresponding to the first image according to the first pose information and the first image, wherein the third pose information is pose information of the electronic device relative to the object to be scanned; determine, according to the third pose information, third matching information between feature points of the first image and three-dimensional points of the point cloud of the space model; in response to the third matching information meeting a preset second condition, determine, according to the third pose information, fourth matching information between the feature points of the first image and feature points of at least one image frame of the space model; and determine the second pose information according to the third matching information and the fourth matching information.
- The pose acquisition apparatus according to claim 22, wherein the first pose information comprises fourth pose information, the fourth pose information being pose information of the object to be scanned in a world coordinate system; and when the second pose module is configured to determine the third pose information corresponding to the first image according to the first pose information and the first image, it is further configured to: acquire fifth pose information from a positioning module according to the first image, wherein the fifth pose information is pose information of the electronic device in the world coordinate system; and determine the third pose information according to the fourth pose information and the fifth pose information.
- The pose acquisition apparatus according to claim 22 or 23, wherein when the second pose module is configured to determine, according to the third pose information, the third matching information between the feature points of the first image and the three-dimensional points of the point cloud of the space model, it is further configured to: project, according to the third pose information, the point cloud of the space model onto the first image to form a plurality of projection points, and extract a descriptor of each projection point; extract the feature points and descriptors of the first image; and determine the third matching information between the feature points and the three-dimensional points of the point cloud according to the descriptors corresponding to the feature points and the descriptors of the projection points.
- The pose acquisition apparatus according to any one of claims 22 to 24, wherein when the second pose module is configured to determine, according to the third pose information, the fourth matching information between the feature points of the first image and the feature points of the at least one image frame of the space model, it is further configured to: determine at least one image frame matching the third pose information according to the third pose information and pose information of the image frames of the space model; acquire the feature points and descriptors of the first image and the feature points and descriptors of the image frame matching the third pose information; and determine the fourth matching information between the feature points of the first image and the feature points of the image frame according to the descriptors of the first image and the descriptors of the image frame.
- The pose acquisition apparatus according to any one of claims 22 to 25, wherein when the second pose module is configured to determine the second pose information according to the third matching information and the fourth matching information, it is further configured to: acquire the gravitational acceleration of the electronic device; and determine the second pose information according to the third matching information, the fourth matching information, and the gravitational acceleration.
- The pose acquisition apparatus according to any one of claims 22 to 26, wherein the second pose information and the first pose information meeting the preset first condition comprises: an error between the second pose information and the first pose information being smaller than a preset error threshold; and/or the third matching information meeting the preset second condition comprises: the number of matching combinations between the first image and the point cloud of the space model being greater than a preset number threshold, wherein a matching combination comprises a pair of a feature point and a three-dimensional point that match each other.
- The pose acquisition apparatus according to any one of claims 15 to 27, wherein when the acquisition module is configured to acquire the space model of the object to be scanned, it is further configured to: acquire multiple frames of modeling images obtained by the electronic device scanning the object to be scanned, and synchronously acquire sixth pose information corresponding to each frame of modeling image; match feature points of the multiple frames of modeling images, and triangulate the feature points according to the matching result to form a point cloud; determine at least one image frame from the multiple frames of modeling images, and determine the point cloud corresponding to each image frame; and construct the at least one image frame, the sixth pose information corresponding to each image frame, and the point cloud as the space model.
- An electronic device, wherein the device comprises a memory and a processor, the memory being configured to store computer instructions executable on the processor, and the processor being configured to implement the method according to any one of claims 1 to 14 when executing the computer instructions.
- A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 14.
- A computer program, comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor of the electronic device executes instructions configured to implement the pose acquisition method according to any one of claims 1 to 14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227017413A KR102464271B1 (ko) | 2021-05-11 | 2021-09-27 | 포즈 획득 방법, 장치, 전자 기기, 저장 매체 및 프로그램 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110510890.0 | 2021-05-11 | ||
CN202110510890.0A CN113190120B (zh) | 2021-05-11 | 2021-05-11 | 位姿获取方法、装置、电子设备及存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022237048A1 true WO2022237048A1 (zh) | 2022-11-17 |
Family
ID=76981167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/121034 WO2022237048A1 (zh) | 2021-05-11 | 2021-09-27 | 位姿获取方法、装置、电子设备、存储介质及程序 |
Country Status (4)
Country | Link |
---|---|
KR (1) | KR102464271B1 (zh) |
CN (1) | CN113190120B (zh) |
TW (1) | TW202244680A (zh) |
WO (1) | WO2022237048A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113190120B (zh) * | 2021-05-11 | 2022-06-24 | 浙江商汤科技开发有限公司 | 位姿获取方法、装置、电子设备及存储介质 |
CN116352323B (zh) * | 2023-04-10 | 2024-07-30 | 深圳市晨东智能家居有限公司 | 一种交互式的焊接环境建模系统及方法 |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10515259B2 (en) * | 2015-02-26 | 2019-12-24 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for determining 3D object poses and landmark points using surface patches |
US10970425B2 (en) * | 2017-12-26 | 2021-04-06 | Seiko Epson Corporation | Object detection and tracking |
CN109463003A (zh) * | 2018-03-05 | 2019-03-12 | 香港应用科技研究院有限公司 | 对象识别 |
CN109947886B (zh) * | 2019-03-19 | 2023-01-10 | 腾讯科技(深圳)有限公司 | 图像处理方法、装置、电子设备及存储介质 |
CN110930453B (zh) * | 2019-10-30 | 2023-09-08 | 北京迈格威科技有限公司 | 目标物体定位方法、装置及可读存储介质 |
CN110866496B (zh) * | 2019-11-14 | 2023-04-07 | 合肥工业大学 | 基于深度图像的机器人定位与建图方法和装置 |
CN111199564B (zh) * | 2019-12-23 | 2024-01-05 | 中国科学院光电研究院 | 智能移动终端的室内定位方法、装置与电子设备 |
CN111311758A (zh) * | 2020-02-24 | 2020-06-19 | Oppo广东移动通信有限公司 | 增强现实处理方法及装置、存储介质和电子设备 |
CN111833457A (zh) * | 2020-06-30 | 2020-10-27 | 北京市商汤科技开发有限公司 | 图像处理方法、设备及存储介质 |
CN112637665B (zh) * | 2020-12-23 | 2022-11-04 | 北京市商汤科技开发有限公司 | 增强现实场景下的展示方法、装置、电子设备及存储介质 |
- 2021-05-11 CN CN202110510890.0A patent/CN113190120B/zh active Active
- 2021-09-27 KR KR1020227017413A patent/KR102464271B1/ko active IP Right Grant
- 2021-09-27 WO PCT/CN2021/121034 patent/WO2022237048A1/zh active Application Filing
- 2022-03-22 TW TW111110513A patent/TW202244680A/zh unknown
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20120082319A (ko) * | 2011-01-13 | 2012-07-23 | 주식회사 팬택 | 윈도우 형태의 증강현실을 제공하는 장치 및 방법 |
CN109087359A (zh) * | 2018-08-30 | 2018-12-25 | 网易(杭州)网络有限公司 | 位姿确定方法、位姿确定装置、介质和计算设备 |
CN112197764A (zh) * | 2020-12-07 | 2021-01-08 | 广州极飞科技有限公司 | 实时位姿确定方法、装置及电子设备 |
CN113190120A (zh) * | 2021-05-11 | 2021-07-30 | 浙江商汤科技开发有限公司 | 位姿获取方法、装置、电子设备及存储介质 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116758157A (zh) * | 2023-06-14 | 2023-09-15 | 深圳市华赛睿飞智能科技有限公司 | 一种无人机室内三维空间测绘方法、系统及存储介质 |
CN116758157B (zh) * | 2023-06-14 | 2024-01-30 | 深圳市华赛睿飞智能科技有限公司 | 一种无人机室内三维空间测绘方法、系统及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN113190120A (zh) | 2021-07-30 |
TW202244680A (zh) | 2022-11-16 |
KR102464271B1 (ko) | 2022-11-07 |
CN113190120B (zh) | 2022-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11928800B2 (en) | Image coordinate system transformation method and apparatus, device, and storage medium | |
US10810734B2 (en) | Computer aided rebar measurement and inspection system | |
EP3008694B1 (en) | Interactive and automatic 3-d object scanning method for the purpose of database creation | |
WO2020206903A1 (zh) | 影像匹配方法、装置及计算机可读存储介质 | |
JP5950973B2 (ja) | フレームを選択する方法、装置、及びシステム | |
JP5722502B2 (ja) | モバイルデバイスのための平面マッピングおよびトラッキング | |
JP7017689B2 (ja) | 情報処理装置、情報処理システムおよび情報処理方法 | |
CN104885098B (zh) | 基于移动装置的文本检测及跟踪 | |
CN110986969B (zh) | 地图融合方法及装置、设备、存储介质 | |
JP6184271B2 (ja) | 撮像管理装置、撮像管理システムの制御方法およびプログラム | |
WO2022237048A1 (zh) | 位姿获取方法、装置、电子设备、存储介质及程序 | |
US11094079B2 (en) | Determining a pose of an object from RGB-D images | |
WO2019042426A1 (zh) | 增强现实场景的处理方法、设备及计算机存储介质 | |
JP6571108B2 (ja) | モバイル機器用三次元ジェスチャのリアルタイム認識及び追跡システム | |
KR20180005168A (ko) | 로컬화 영역 설명 파일에 대한 프라이버시-민감 질의 | |
JP2018523881A (ja) | データを位置合わせする方法及びシステム | |
CN111127524A (zh) | 一种轨迹跟踪与三维重建方法、系统及装置 | |
US10249058B2 (en) | Three-dimensional information restoration device, three-dimensional information restoration system, and three-dimensional information restoration method | |
CN110866497A (zh) | 基于点线特征融合的机器人定位与建图方法和装置 | |
CN105809664B (zh) | 生成三维图像的方法和装置 | |
CN115049731B (zh) | 一种基于双目摄像头的视觉建图和定位方法 | |
CN112750164B (zh) | 轻量化定位模型的构建方法、定位方法、电子设备 | |
JP6305856B2 (ja) | 画像処理装置、画像処理方法、およびプログラム | |
CN116243837A (zh) | 一种画面显示方法、系统、设备及计算机可读存储介质 | |
CN112614166A (zh) | 基于cnn-knn的点云匹配方法和装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2022528237 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21941614 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21941614 Country of ref document: EP Kind code of ref document: A1 |
|