CN114721511A - Method and device for positioning three-dimensional object - Google Patents

Method and device for positioning three-dimensional object

Info

Publication number
CN114721511A
Authority
CN
China
Prior art keywords
dimensional object
virtual
posture
virtual object
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210261706.8A
Other languages
Chinese (zh)
Inventor
葛凯麟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pilosmart Technology Beijing LLC
Original Assignee
Pilosmart Technology Beijing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pilosmart Technology Beijing LLC filed Critical Pilosmart Technology Beijing LLC
Priority to CN202210261706.8A priority Critical patent/CN114721511A/en
Publication of CN114721511A publication Critical patent/CN114721511A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/012Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Abstract

The invention discloses a method for positioning a three-dimensional object, comprising: acquiring an image of the three-dimensional object with a single camera and identifying its posture and position, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas differ in color; displaying a virtual object corresponding to the three-dimensional object; and, after a change of the posture and/or position of the three-dimensional object in three-dimensional space is detected, adjusting the virtual object according to the amount of change in the posture and/or position. Correspondingly, an embodiment of the invention provides a device for positioning a three-dimensional object, which solves the prior-art problem that, in a single-camera scene, the virtual object cannot be adaptively adjusted according to dynamic changes of the three-dimensional object.

Description

Method and device for positioning three-dimensional object
Technical Field
The invention belongs to the technical field of augmented reality, and particularly relates to a method and a device for positioning a three-dimensional object.
Background
In current augmented reality applications, an image of a three-dimensional object may be acquired through a camera, the object recognized, and a corresponding virtual three-dimensional object displayed in a virtual scene. For example, an image of a hexahedron may be acquired, and a corresponding virtual object, such as a virtual hexahedron, a virtual character, or a virtual globe, may be displayed on a display screen.
In the prior art, a virtual object can be displayed by capturing a three-dimensional object, but in a single-camera scene the virtual object cannot be adaptively adjusted according to dynamic changes of the three-dimensional object; existing solutions are also costly, and the position and posture of the three-dimensional object are difficult to recognize finely.
Disclosure of Invention
The invention provides a method and a device for positioning a three-dimensional object, which solve the problem that the change of a virtual object cannot be adaptively adjusted according to the dynamic change of the three-dimensional object in a single-camera scene in the prior art.
In order to achieve the above object, the present invention provides a method for positioning a three-dimensional object, comprising:
acquiring an image of a three-dimensional object by using a single camera, and identifying the posture and the position of the three-dimensional object, wherein the outer surface of the three-dimensional object is divided into a plurality of areas, and the colors of adjacent areas are different;
displaying a virtual object corresponding to the three-dimensional object;
and after a change of the posture and/or position of the three-dimensional object in three-dimensional space is detected, adjusting the virtual object according to the amount of change in the posture and/or position.
In one embodiment, the adjusting the virtual object includes:
rotating the virtual object in a virtual space, wherein the rotation angle and the angular velocity of the virtual object correspond to the posture variation of the three-dimensional object, or,
and moving the position of the virtual object in the virtual space, wherein the displacement amount of the virtual object corresponds to the position variation amount of the three-dimensional object.
In one embodiment, after the adjusting the virtual object, the method further includes:
when the camera acquires different combinations of different color surfaces of the three-dimensional object, different virtual scenes are entered according to a preset instruction, or,
displaying a first virtual scene when the angular velocity of rotation of the virtual object exceeds a first preset threshold, or,
and when the rotation angular velocity of the virtual object is lower than a second preset threshold value, displaying a second virtual scene, wherein the first preset threshold value is larger than the second preset threshold value.
In one embodiment, the adjusting the virtual object according to the amount of change in the posture and/or the position includes:
and adjusting the size of the virtual object according to the depth-of-field distance between the three-dimensional object and the camera.
In one embodiment, recognizing the pose and position of the three-dimensional object comprises:
carrying out color block segmentation on an image, and decomposing the image into a plurality of areas with different colors;
averaging the colors of each region, and traversing all adjacent color block pairs;
screening the color block pairs by using a table look-up method, and screening out an area matched with a preset model;
and calculating orientation data of the matching area, and acquiring the position and the posture of the three-dimensional object.
In one embodiment, after the calculating the orientation data of the matching area, the method further includes:
calculating a candidate solution corresponding to the matching area;
comparing the candidate solutions pairwise for compatibility and, where two candidate solutions are compatible, discarding either one of them;
identifying edge pixels between the pair of color patches using an edge detection algorithm;
optimizing the position and pose of the three-dimensional object using an optimization formula, the optimization formula being:
P* = argmin_P Σ_i E( f(P, X_i), x_i, θ_i )

where P is the position and attitude parameter of the three-dimensional object, including the position coordinates (x, y, z) and the attitude angle (q_w, q_x, q_y, q_z); f is a projection function that computes the image position of a point X_i on the surface of the three-dimensional object when the object is in pose P; E is a cost function that computes the difference between the projected position and the observed position; x_i and θ_i describe an edge pixel detected in the image, x_i being the coordinate of the edge point in the image and θ_i the tangent angle of the edge point.
In one embodiment, after the color block segmentation is performed on the image, the method further comprises:
establishing a model of the three-dimensional object;
traversing adjacent surfaces in the model, and recording the color pairs and the surface orientations of the adjacent surfaces;
and recording the coordinate information of the boundary of all adjacent surfaces.
The embodiment of the invention also provides a method for positioning the three-dimensional object, which comprises the following steps:
acquiring an image of a three-dimensional object by using a single camera, wherein the outer surface of the three-dimensional object is divided into a plurality of areas, and the colors of the adjacent areas are different;
recording two or more adjacent surfaces in the three-dimensional object;
carrying out color block segmentation on an image, and decomposing the image into a plurality of areas with different colors;
averaging the colors of each region, and traversing all adjacent color block pairs;
screening the color block pairs by using a table look-up method, and screening out an area matched with a preset model;
calculating orientation data of the matching area, and acquiring the position and the posture of the three-dimensional object;
and displaying a virtual object corresponding to the three-dimensional object.
In one embodiment, after the calculating the orientation data of the matching area, the method further includes:
calculating a candidate solution corresponding to the matching area;
comparing the candidate solutions pairwise for compatibility and, where two candidate solutions are compatible, discarding either one of them;
identifying edge pixels between the pair of color patches using an edge detection algorithm;
optimizing the position and pose of the three-dimensional object using an optimization formula, the optimization formula being:
P* = argmin_P Σ_i E( f(P, X_i), x_i, θ_i )

where P is the position and attitude parameter of the three-dimensional object, including the position coordinates (x, y, z) and the attitude angle (q_w, q_x, q_y, q_z); f is a projection function that computes the image position of a point X_i on the surface of the three-dimensional object when the object is in pose P; E is a cost function that computes the difference between the projected position and the observed position; x_i and θ_i describe an edge pixel detected in the image, x_i being the coordinate of the edge point in the image and θ_i the tangent angle of the edge point.
Embodiments of the present invention further provide a positioning apparatus for a three-dimensional object, the apparatus including a processor and a memory for storing a computer program capable of running on the processor; wherein the processor is configured to execute the method for positioning a three-dimensional object when running the computer program.
The embodiment of the invention also provides a computer-readable storage medium, on which computer-executable instructions are stored, and the computer-executable instructions are used for executing the method for positioning the three-dimensional object.
The embodiment of the invention provides a method and a device for positioning a three-dimensional object, wherein the method identifies different color combination areas of the three-dimensional object through a single camera, determines the spatial attitude and the position of the three-dimensional object, displays a corresponding virtual object, can also determine the spatial attitude and the position of the three-dimensional object in a refined manner, and can also measure the depth of field of the three-dimensional object through the single camera, so that the method and the device are low in cost and high in precision.
Drawings
FIG. 1 is a flow chart of a method for locating a three-dimensional object in an embodiment of the invention;
FIG. 2 is a flow chart of a method for recognizing the pose and position of a three-dimensional object according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of three-dimensional object recognition in an embodiment of the present invention;
FIG. 4 is a further illustration of three-dimensional object recognition in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of the present invention for optimizing pose by using an edge of a color block of a three-dimensional object;
FIG. 6 is a schematic structural diagram of a three-dimensional object positioning device according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a three-dimensional object positioning device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
To achieve the above object, as shown in fig. 1, an embodiment of the present invention provides a method for positioning a three-dimensional object, including:
S101, acquiring an image of a three-dimensional object by using a single camera, and identifying the posture and position of the three-dimensional object, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas differ in color;
In the embodiment of the present invention, three-dimensional object positioning and virtual object display may be implemented by a three-dimensional object positioning device. Optionally, the positioning device includes a single camera, a processing unit, and a display unit. The single camera collects an image of the three-dimensional object in a detection area; the processing unit performs the image processing, finally determines the posture and position of the three-dimensional object, and creates a virtual object identical or corresponding to the three-dimensional object in a virtual scene by means of augmented reality (AR) technology. The three-dimensional object may be a solid such as a sphere, a cylinder, a tetrahedron, or a hexahedron. In the embodiment of the invention, the orientation and posture of the surface of the three-dimensional object facing the camera can be determined from the color combination of the visible surfaces while the object rotates or moves. For convenience of description, the embodiments of the present invention take a hexahedron as an example; other three-dimensional objects (with different combinations of colored surfaces) also fall within the scope of the embodiments of the present invention.
As shown in fig. 2, in one embodiment, the recognizing the pose and the position of the three-dimensional object may specifically be:
S201, carrying out color block segmentation on the image, and decomposing the image into a plurality of areas with different colors;
As shown in fig. 3, the image may be decomposed into, for example, a yellow region, a green region, a blue region, a black region, a white region, and a red region.
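For illustration, the segmentation step can be sketched as follows. This is a minimal pure-NumPy sketch under an assumed RGB palette; the patent does not specify a color space or segmentation algorithm, and a production system would more likely use HSV thresholds and a vision library.

```python
import numpy as np

# Hypothetical RGB palette for the six face colours named in the embodiment.
PALETTE = {
    "yellow": (255, 255, 0), "green": (0, 255, 0), "blue": (0, 0, 255),
    "black": (0, 0, 0), "white": (255, 255, 255), "red": (255, 0, 0),
}

def quantize(img):
    """Assign every pixel to the nearest palette colour (Euclidean in RGB)."""
    names = list(PALETTE)
    centers = np.array([PALETTE[n] for n in names], dtype=float)
    d = np.linalg.norm(img[..., None, :] - centers, axis=-1)  # H x W x 6
    return np.array(names)[d.argmin(axis=-1)]

def label_regions(quantized):
    """4-connected components of equal-colour pixels (stack-based flood fill)."""
    h, w = quantized.shape
    labels = -np.ones((h, w), dtype=int)
    regions = []  # list of (label, colour name)
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            lab = len(regions)
            colour = quantized[sy, sx]
            regions.append((lab, colour))
            stack = [(sy, sx)]
            labels[sy, sx] = lab
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1 \
                            and quantized[ny, nx] == colour:
                        labels[ny, nx] = lab
                        stack.append((ny, nx))
    return labels, regions
```

Averaging the colour of each region (step S202) would then amount to averaging the original pixels under each label before the nearest-palette assignment.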
Furthermore, after color block segmentation, a model of the three-dimensional object may also be built; traversing adjacent surfaces in the model, and recording the color pairs and the surface orientations of the adjacent surfaces; and recording the coordinate information of the boundary of all adjacent surfaces. That is, in the embodiment of the present invention, adjacent faces in the three-dimensional object may be traversed, color pairs (such as "green-red", "red-yellow", and the like) of the adjacent faces and orientations of the faces may be recorded, and a table may be built for use in a subsequent algorithm; at the same time, the coordinate information of the boundary of all adjacent surfaces (represented by a series of discrete sampling points) is recorded for subsequent accurate measurement.
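The adjacency table described above can be sketched as follows, with an illustrative colour assignment for the hexahedron (the actual colours are a design choice, not mandated by the text); each entry maps a colour pair to the orientations (face normals) of the two adjacent faces.

```python
# Hypothetical colour assignment: face normal -> colour of that face.
HEX_FACES = {
    (1, 0, 0): "red", (-1, 0, 0): "yellow",
    (0, 1, 0): "green", (0, -1, 0): "blue",
    (0, 0, 1): "white", (0, 0, -1): "black",
}

def build_pair_table(faces):
    """Traverse adjacent faces of the model and record colour pair -> the
    orientations of the two faces, as the pre-built lookup table."""
    table = {}
    normals = list(faces)
    for i, a in enumerate(normals):
        for b in normals[i + 1:]:
            # Cube faces are adjacent unless their normals are opposite.
            if tuple(-x for x in a) == b:
                continue
            table[frozenset((faces[a], faces[b]))] = (a, b)
    return table

PAIR_TABLE = build_pair_table(HEX_FACES)
```

A hexahedron has twelve edges, so the table holds twelve adjacent colour pairs.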
When the camera can see two or more surfaces simultaneously, the position and posture (namely, the orientation) of the marker can be preliminarily judged from the color combination, and more accurate position and posture data can then be obtained through an iterative algorithm.
S202, averaging the colors of each region, and traversing all adjacent color block pairs;
S203, screening the color block pairs by using a table look-up method, and screening out the areas matched with a preset model;
and (3) screening color block pairs by using a table look-up method, for example, "red-white" is matched with the three-dimensional object model, but "red-blue" and "green-purple" are not matched with the model (the former is because red and blue are not adjacent in the model, and the latter is because purple is not available in the model), discarding the color block pairs which are not matched, and screening out an area (color block pair) matched with a preset model.
S204, calculating orientation data of the matching area, and acquiring the position and posture of the three-dimensional object.
After screening, some candidate color block pairs remain; for example, "red-white", "red-black", "black-white", and the like are obtained in fig. 4. The orientation data of these faces can be found by table look-up, giving the approximate direction of the camera relative to the marker. Meanwhile, because data for two surfaces are available simultaneously, the approximate rotation angle of the camera (the rotation about the axis joining the camera and the marker) can also be calculated, and from this the approximate position and posture of the three-dimensional object can be computed.
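One plausible way to obtain the "approximate direction of the camera relative to the marker" is to average the outward normals of the two visible faces looked up from the table. This is a hedged sketch of such a heuristic, not the patent's exact computation:

```python
import math

def approx_view_direction(n1, n2):
    """Normalized average of the two visible face normals, used as a rough
    marker-to-camera direction when two faces are seen simultaneously."""
    v = [a + b for a, b in zip(n1, n2)]
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)
```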
In the embodiment of the invention, the position and the posture can be further optimized, so that the posture and the position of the three-dimensional object in the space can be obtained in a refined manner. The specific method comprises the following steps:
after calculating the orientation data of the matching area, the embodiment of the present invention further includes:
S2041, calculating the candidate solutions corresponding to the matching areas;
S2042, comparing the candidate solutions pairwise for compatibility and, where two candidate solutions are compatible, discarding either one of them;
When the marker positions and postures obtained from two candidate solutions are close (the differences in distance and angle are below a certain threshold), the two candidate solutions can be considered compatible, that is, they are color block pairs on the same marker. One of the two solutions may then be discarded.
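The compatibility test can be sketched as follows; the numeric thresholds and the quaternion angle metric are illustrative assumptions (the text only requires that distance and angle differences fall below some threshold):

```python
import math

def compatible(sol_a, sol_b, dist_tol=0.05, ang_tol_deg=5.0):
    """Two candidate solutions (position, unit quaternion) are compatible
    when the marker poses they imply agree within the thresholds."""
    (pa, qa), (pb, qb) = sol_a, sol_b
    dist = math.dist(pa, pb)
    # Angle between unit quaternions: 2 * acos(|<qa, qb>|)
    dot = abs(sum(x * y for x, y in zip(qa, qb)))
    ang = 2.0 * math.degrees(math.acos(min(1.0, dot)))
    return dist < dist_tol and ang < ang_tol_deg

def deduplicate(solutions):
    """Pairwise comparison: keep a solution only if it is not compatible
    with one already kept (i.e. discard one of each compatible pair)."""
    kept = []
    for s in solutions:
        if not any(compatible(s, k) for k in kept):
            kept.append(s)
    return kept
```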
S2043, identifying edge pixels between the color block pairs by using an edge detection algorithm;
S2044, optimizing the position and posture of the three-dimensional object by using an optimization formula, the optimization formula being:

P* = argmin_P Σ_i E( f(P, X_i), x_i, θ_i )

where P is the position and attitude parameter of the three-dimensional object, including the position coordinates (x, y, z) and the attitude angle (q_w, q_x, q_y, q_z); f is a projection function that computes the image position of a point X_i on the surface of the three-dimensional object when the object is in pose P; E is a cost function that computes the difference between the projected position and the observed position; x_i and θ_i describe an edge pixel detected in the image, x_i being the coordinate of the edge point in the image and θ_i the tangent angle of the edge point. After iterative optimization of the above formula with the Levenberg-Marquardt algorithm, a better P* is obtained, giving the optimized position and posture of the marker.
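A toy sketch of the optimization loop follows. To stay self-contained it substitutes a translation-only 2-D "pose" and a trivial projection function for the full six-degree-of-freedom camera model and tangent-angle cost, but the Levenberg-Marquardt structure (residuals, numerical Jacobian, damped normal equations) is the same:

```python
import numpy as np

def f(P, X):
    """Toy projection: a pure 2-D translation stands in for the real
    six-degree-of-freedom camera projection."""
    return X + P

def residuals(P, Xs, xs):
    """E: projected position minus observed position, flattened."""
    return (f(P, Xs) - xs).ravel()

def levenberg_marquardt(Xs, xs, P0, iters=50, lam=1e-3):
    """Minimize the sum of squared residuals over the pose parameters P."""
    P = P0.astype(float)
    for _ in range(iters):
        r = residuals(P, Xs, xs)
        # Numerical Jacobian of the residual vector w.r.t. P.
        J = np.empty((r.size, P.size))
        eps = 1e-6
        for j in range(P.size):
            dP = P.copy(); dP[j] += eps
            J[:, j] = (residuals(dP, Xs, xs) - r) / eps
        A = J.T @ J + lam * np.eye(P.size)   # damped normal equations
        step = np.linalg.solve(A, -J.T @ r)
        P = P + step
        if np.linalg.norm(step) < 1e-9:
            break
    return P
```

In practice a library implementation (e.g. a least-squares solver with the LM method) would replace this hand-rolled loop.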
After the attitude optimization is completed through an iterative algorithm, accurate attitude data of the three-dimensional object can be obtained.
In addition, for video data, kalman filtering (kalman filter) can be used to smooth the poses of successive frames, and finally a relatively stable sequence of poses can be obtained.
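For the Kalman smoothing of successive frames, a minimal one-dimensional constant-state filter applied independently to each pose component might look as follows; the noise parameters are illustrative, and a production filter would track the full state with a motion model:

```python
def kalman_smooth(seq, q=1e-3, r=1e-2):
    """1-D constant-state Kalman filter over a sequence of measurements
    of one pose component; q is process noise, r is measurement noise."""
    est, p = seq[0], 1.0
    out = [est]
    for z in seq[1:]:
        p = p + q                   # predict: uncertainty grows
        k = p / (p + r)             # Kalman gain
        est = est + k * (z - est)   # update toward the measurement
        p = (1 - k) * p
        out.append(est)
    return out
```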
S102, displaying a virtual object corresponding to the three-dimensional object;
in the embodiment of the invention, the AR technology can be used for displaying the virtual object corresponding to the three-dimensional object in the virtual scene after the posture and the position of the three-dimensional object are recognized. For example, a hexahedron having the same size and shape as the three-dimensional object is displayed, and the color of each surface is identical to that of the actual three-dimensional object.
S103, after the change of the posture and/or the position of the three-dimensional object in the three-dimensional space is detected, the virtual object is adjusted according to the change of the posture and/or the position.
In one embodiment, the adjusting the virtual object may specifically be:
rotating the virtual object in a virtual space, wherein the rotation angle and the angular velocity of the virtual object correspond to the posture variation of the three-dimensional object, or,
and moving the position of the virtual object in the virtual space, wherein the displacement amount of the virtual object corresponds to the position variation amount of the three-dimensional object.
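The adjustment step — mirroring the detected change onto the virtual object — can be sketched with a quaternion composition. The convention used here (pre-multiplying a world-frame rotation delta) is an assumption; the patent does not fix a representation:

```python
def quat_mul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def apply_delta(virtual_pos, virtual_quat, d_pos, d_quat):
    """Translate by the detected position delta and pre-multiply by the
    detected rotation delta, so the virtual object tracks the real one."""
    new_pos = tuple(p + d for p, d in zip(virtual_pos, d_pos))
    return new_pos, quat_mul(d_quat, virtual_quat)
```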
In one embodiment, after the adjusting the virtual object, the method further includes:
when the camera acquires different combinations of different color surfaces of the three-dimensional object, different virtual scenes are entered according to a preset instruction, or,
displaying a first virtual scene when the rotation angular velocity of the virtual object exceeds a first preset threshold, or,
and when the rotation angular velocity of the virtual object is lower than a second preset threshold value, displaying a second virtual scene, wherein the first preset threshold value is larger than the second preset threshold value.
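The two-threshold scene selection above is straightforward to sketch; the numeric thresholds are purely illustrative (the text only requires the first threshold to exceed the second):

```python
FIRST_THRESHOLD = 2.0   # rad/s, illustrative
SECOND_THRESHOLD = 0.5  # rad/s, illustrative; FIRST_THRESHOLD > SECOND_THRESHOLD

def pick_scene(angular_velocity):
    """Map the virtual object's rotation speed to a virtual scene."""
    if angular_velocity > FIRST_THRESHOLD:
        return "first_scene"
    if angular_velocity < SECOND_THRESHOLD:
        return "second_scene"
    return None  # between the thresholds: no scene change
```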
In one embodiment, the adjusting the virtual object according to the variation of the posture and/or the position includes:
and adjusting the size of the virtual object according to the depth-of-field distance between the three-dimensional object and the camera.
For example, in the field of games, the three-dimensional object may take the shape of a polyhedron such as a magic wand, and the three-dimensional positioning method can identify the combinations of face colors to determine which particular pose currently faces the camera. When the user operates the magic wand, the virtual magic wand changes adaptively: when the user rotates the wand, the virtual wand rotates synchronously, and when the user moves the wand, the virtual wand moves synchronously. Meanwhile, the depth of field of the actual wand relative to the camera can be measured with the single camera; as the depth of field changes, the size of the virtual wand changes, and users can build different game experiences on this characteristic.
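Measuring depth with a single camera relies on the known physical size of the object. A pinhole-model sketch: the focal length and sizes are illustrative, and the inverse-depth scaling rule is one possible design choice, not a mapping prescribed by the patent:

```python
def depth_from_size(focal_px, real_size_m, apparent_size_px):
    """Pinhole model: depth = focal length * real size / apparent size.
    The known physical size of the marker stands in for a second camera."""
    return focal_px * real_size_m / apparent_size_px

def virtual_scale(depth_m, reference_depth_m=1.0):
    """Illustrative rule: scale the virtual object inversely with depth."""
    return reference_depth_m / depth_m
```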
Meanwhile, a user can design different game experiences around different characteristics of the magic wand. For example, rotating the wand so that a different side faces the camera (that is, the camera collects a different color combination) can trigger new interactive experiences: switching to side A starts one game and switching to side B starts another, or switching to side A generates one interactive virtual scene and switching to side B generates a different one. Likewise, different rotation speeds and angles can be mapped to different games or different game mechanics.
The embodiment of the invention provides a method and a device for positioning a three-dimensional object, wherein the method identifies different color combination areas of the three-dimensional object through a single camera, determines the spatial attitude and the position of the three-dimensional object, displays a corresponding virtual object, can also determine the spatial attitude and the position of the three-dimensional object in a refined manner, and can also measure the depth of field of the three-dimensional object through the single camera, so that the method and the device are low in cost and high in precision.
In addition, an embodiment of the invention provides a method for positioning a three-dimensional object in which information about the marker is recorded in advance for the positioning algorithm, including the coordinates of all vertices, the parametric equations of the edges, and the color of each face. The model is entered as follows:
1. traversing adjacent faces in the model, recording the color pairs (such as 'green-red', 'red-yellow' and the like) of the adjacent faces and the orientation of the faces, and establishing a table for a subsequent algorithm;
2. the coordinate information of the boundary of all adjacent surfaces (represented by a series of discrete sample points) is recorded for subsequent accurate measurement.
When the camera can see two or more surfaces simultaneously, the position and posture (namely, the orientation) of the marker can be preliminarily judged from the color combination, and more accurate position and posture data can then be obtained through an iterative algorithm.
The method specifically comprises the following steps:
S501, acquiring an image of a three-dimensional object by using a single camera, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas differ in color;
S502, recording two or more adjacent surfaces in the three-dimensional object;
S503, carrying out color block segmentation on the image, and decomposing the image into a plurality of areas with different colors;
S504, averaging the colors of each region, and traversing all adjacent color block pairs;
For example, "red-white" is consistent with the model, but "red-blue" and "green-purple" are not (the former because red and blue are not adjacent in the model, the latter because there is no purple in the model).
S505, screening the color block pairs by using a table look-up method, and screening out an area matched with a preset model;
For example, "red-white", "red-black", "black-white", and the like are obtained in fig. 4, and the orientation data of these faces can be found by table look-up, giving the approximate orientation of the camera with respect to the marker. Meanwhile, because data for two surfaces are available simultaneously, the approximate rotation angle of the camera (the rotation about the axis joining the camera and the marker) can be calculated, and from this the approximate position and posture of the marker can be further calculated.
S506, calculating orientation data of the matching area, and acquiring the position and the posture of the three-dimensional object;
When the marker positions and postures obtained from two candidate solutions are close (the differences in distance and angle are below a certain threshold), the two candidate solutions can be considered compatible, that is, they are color block pairs on the same marker. One of the two solutions may then be discarded.
S507, displaying the virtual object corresponding to the three-dimensional object.
In one embodiment, after the calculating the orientation data of the matching area, the method further includes:
S5061, calculating the candidate solutions corresponding to the matching areas;
S5062, comparing the candidate solutions pairwise for compatibility and, where two candidate solutions are compatible, discarding either one of them;
S5063, identifying edge pixels between the color block pairs by using an edge detection algorithm;
S5064, optimizing the position and posture of the three-dimensional object by using an optimization formula, the optimization formula being:
P* = argmin_P Σ_i E( f(P, X_i), x_i, θ_i )

where P is the position and attitude parameter of the three-dimensional object, including the position coordinates (x, y, z) and the attitude angle (q_w, q_x, q_y, q_z), the attitude angle here being expressed as a quaternion; f is a projection function that computes where a point X_i on the marker falls in the image captured by the camera when the marker is in pose P; E is a cost function that computes the difference between the projected position and the observed position (the larger the difference, the larger the cost); x_i and θ_i describe an edge pixel detected in the image, x_i being the coordinate of the edge point in the image and θ_i the tangent angle of the edge point. After iterative optimization of the above formula with the Levenberg-Marquardt algorithm, a better P* is obtained, giving the optimized position and posture of the marker.
In addition, for video data, kalman filtering (kalman filter) may be used to smooth the poses of successive frames, resulting in a relatively stable sequence of poses.
Fig. 5 illustrates the principle of posture optimization using the color block edges of the three-dimensional object: the dotted line is the color block edge predicted from the currently estimated posture, and the solid line is the color block edge obtained from the actual image. During optimization, the posture is adjusted so that the dotted line approaches the solid line.
In addition, in the embodiment of the invention, the system can be combined with a remote control module for human-computer interaction with the virtual object. A conventional remote controller has a fixed set of operation instructions: pressing a button generates an instruction signal and transmits it, so that a terminal such as a television or an air conditioner responds to the instruction. In the embodiment of the invention, a series of operation protocols can be defined so that remote control commands combined with the spatial posture/position of the three-dimensional object form a new set of human-computer interaction protocols. For example, a series of instruction sets may be provided in the remote control module, and the resulting interaction may differ at different spatial postures/positions. After a key for moving right is pressed, if no update of the three-dimensional object's position is detected, the virtual object moves one cell to the right; if an update of the position is detected while the key is pressed, the virtual object moves three cells to the right.
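The combined remote-control/pose protocol in the example can be sketched as a small dispatch function; the one-cell/three-cell rule comes from the example above, while the command name and data shapes are illustrative:

```python
def handle_command(command, pose_updated, position, step=1):
    """Combined protocol sketch: the same remote command moves the virtual
    object one cell normally, three cells while the physical object's
    position is simultaneously being updated."""
    if command == "move_right":
        dx = 3 * step if pose_updated else step
        return (position[0] + dx, position[1])
    return position  # unrecognized commands leave the position unchanged
```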
FIG. 6 is a schematic view of one of the three-dimensional positioning devices in an embodiment of the present invention. A user may operate the application with a device such as a mobile phone, tablet, or computer. The device may include information-input components such as a camera, a microphone, application props, and/or an AR device, and may also include output components such as a display device and speakers.
The embodiment of the invention also provides a storage medium storing computer instructions which, when executed by a processor, implement the above method of three-dimensional object positioning.
Fig. 7 is a schematic structural diagram of a system according to an embodiment of the present invention. The system 600 may include one or more central processing units (CPUs) 610 (e.g., one or more processors), memory 620, and one or more storage media 630 (e.g., one or more mass storage devices) storing applications 632 or data 634. The memory 620 and the storage medium 630 may be transient or persistent storage. A program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the apparatus. Further, the central processing unit 610 may be configured to communicate with the storage medium 630 and execute the series of instruction operations in the storage medium 630 on the system 600. The system 600 may further include one or more power supplies 640, one or more wired or wireless network interfaces 650, and one or more input/output interfaces 660. The steps performed by the above method embodiment for three-dimensional positioning may be based on the system architecture shown in Fig. 7.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic thereof, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The parts of this specification are described in a progressive manner: identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus and system embodiments are substantially similar to the method embodiments, their description is relatively brief and reference may be made to the description of the method embodiments where relevant.
Finally, it is to be noted that: the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. To the extent that such modifications and variations of the present application fall within the scope of the claims and their equivalents, they are intended to be included within the scope of the present application.

Claims (7)

1. A method of three-dimensional object localization, comprising:
acquiring an image of a three-dimensional object by using a single camera, and identifying the posture and the position of the three-dimensional object, wherein the outer surface of the three-dimensional object is divided into a plurality of areas, and the colors of adjacent areas are different;
displaying a virtual object corresponding to the three-dimensional object;
after the change of the posture and/or the position of the three-dimensional object in the three-dimensional space is detected, adjusting the virtual object according to the change quantity of the posture and/or the position;
receiving an input instruction and generating an instruction code;
and adjusting the virtual object according to the instruction code and the variation of the posture and/or the position.
2. The method of claim 1, wherein the adjusting the virtual object comprises:
rotating the virtual object in a virtual space, wherein the rotation angle and the angular velocity of the virtual object correspond to the posture variation of the three-dimensional object, or,
and moving the position of the virtual object in the virtual space, wherein the displacement amount of the virtual object corresponds to the position variation amount of the three-dimensional object.
3. The method of claim 2, wherein after the adjusting the virtual object, further comprising:
when the camera acquires different combinations of different color surfaces of the three-dimensional object, different virtual scenes are entered according to a preset instruction, or,
displaying a first virtual scene when the rotation angular velocity of the virtual object exceeds a first preset threshold, or,
and when the rotation angular velocity of the virtual object is lower than a second preset threshold value, displaying a second virtual scene, wherein the first preset threshold value is larger than the second preset threshold value.
4. The method of claim 1, wherein the adjusting the virtual object according to the amount of change in the pose and/or position comprises:
and adjusting the size of the virtual object according to the depth-of-field distance between the three-dimensional object and the camera.
5. The method of claim 1, wherein identifying the pose and position of the three-dimensional object comprises:
carrying out color block segmentation on an image, and decomposing the image into a plurality of areas with different colors;
averaging the colors of each region, and traversing all adjacent color block pairs;
screening the color block pairs by using a table look-up method, and screening out an area matched with a preset model;
and calculating orientation data of the matching area, and acquiring the position and the posture of the three-dimensional object.
6. The method of claim 5, wherein after calculating orientation data for the matching region, further comprising:
calculating a candidate solution corresponding to the matching area;
pairwise comparing the compatibility of the candidate solutions, discarding any one of the two compatible candidate solutions;
identifying edge pixels between the pair of color patches using an edge detection algorithm;
optimizing the position and attitude of the three-dimensional object using an optimization formula, the optimization formula being:
P* = argmin_P Σ_i E( f(P, X_i), x_i, θ_i )
where P is the position-and-attitude parameter of the three-dimensional object, comprising the position coordinates (x, y, z) and the attitude angle (q_w, q_x, q_y, q_z); f is a projection function that computes the position of a point X_i on the three-dimensional object surface when the three-dimensional object is in pose P; E is a cost function that computes the difference between the projected position and the observed position; x_i and θ_i describe an edge pixel point detected in the image, x_i being the coordinates of the edge point in the image and θ_i the tangent angle at the edge point.
7. A three-dimensional object positioning apparatus, characterized by an apparatus processor and a memory for storing a computer program operable on the processor; wherein the processor is adapted to perform the method of three-dimensional object localization of any of claims 1 to 6 when running the computer program.
CN202210261706.8A 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object Pending CN114721511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210261706.8A CN114721511A (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910335679.2A CN110134234B (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object
CN202210261706.8A CN114721511A (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201910335679.2A Division CN110134234B (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object

Publications (1)

Publication Number Publication Date
CN114721511A true CN114721511A (en) 2022-07-08

Family

ID=67570956

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210261706.8A Pending CN114721511A (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object
CN201910335679.2A Active CN110134234B (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201910335679.2A Active CN110134234B (en) 2019-04-24 2019-04-24 Method and device for positioning three-dimensional object

Country Status (1)

Country Link
CN (2) CN114721511A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112440882B (en) * 2019-09-04 2024-03-08 梅赛德斯-奔驰集团股份公司 System and method for placing articles in car trunk and car comprising system
CN110928414A (en) * 2019-11-22 2020-03-27 上海交通大学 Three-dimensional virtual-real fusion experimental system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2477793A (en) * 2010-02-15 2011-08-17 Sony Corp A method of creating a stereoscopic image in a client device
CN102074012B (en) * 2011-01-22 2012-06-27 四川农业大学 Method for three-dimensionally reconstructing tender shoot state of tea by combining image and computation model
JP2013101045A (en) * 2011-11-08 2013-05-23 Fanuc Ltd Recognition device and recognition method of three-dimensional position posture of article
CN102737245B (en) * 2012-06-06 2014-06-11 清华大学 Three-dimensional scene object boundary detection method and device
US10509865B2 (en) * 2014-09-18 2019-12-17 Google Llc Dress form for three-dimensional drawing inside virtual reality environment
CN104680519B (en) * 2015-02-06 2017-06-23 四川长虹电器股份有限公司 Seven-piece puzzle recognition methods based on profile and color
CN106447725B (en) * 2016-06-29 2018-02-09 北京航空航天大学 Spatial target posture method of estimation based on the matching of profile point composite character
US10055882B2 (en) * 2016-08-15 2018-08-21 Aquifi, Inc. System and method for three-dimensional scanning and for capturing a bidirectional reflectance distribution function
CN107274453A (en) * 2017-06-12 2017-10-20 哈尔滨理工大学 Video camera three-dimensional measuring apparatus, system and method for a kind of combination demarcation with correction
CN108537149B (en) * 2018-03-26 2020-06-02 Oppo广东移动通信有限公司 Image processing method, image processing device, storage medium and electronic equipment
CN109615655A (en) * 2018-11-16 2019-04-12 深圳市商汤科技有限公司 A kind of method and device, electronic equipment and the computer media of determining gestures of object

Also Published As

Publication number Publication date
CN110134234B (en) 2022-05-10
CN110134234A (en) 2019-08-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination