CN114004890B - Attitude determination method and apparatus, electronic device, and storage medium - Google Patents

Info

Publication number
CN114004890B
Authority
CN
China
Prior art keywords
camera model
image
camera
planar structure
target image
Prior art date
Legal status
Active
Application number
CN202111300669.9A
Other languages
Chinese (zh)
Other versions
CN114004890A (en)
Inventor
周杰
饶童
胡洋
陶宁
黄乘风
Current Assignee
You Can See Beijing Technology Co ltd AS
Original Assignee
You Can See Beijing Technology Co ltd AS
Application filed by You Can See Beijing Technology Co ltd AS
Priority to CN202111300669.9A
Publication of CN114004890A
Application granted
Publication of CN114004890B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/75: Determining position or orientation of objects or cameras using feature-based methods involving models

Abstract

The embodiments of the present disclosure disclose a pose determination method and apparatus, an electronic device, and a storage medium. The pose determination method includes: acquiring a target image captured by an image capturing device when the angle value between a first direction and a second direction is less than or equal to a preset angle threshold, wherein the photographic entity indicated by the photographic object contained in the target image has a planar structure, the first direction represents the direction from the planar structure to the image capturing device, and the second direction represents the orientation of the planar structure; converting the target image into a camera model image under a preset camera model; and determining, based on the camera model image, relative pose information between a point in the planar structure and the corresponding pixel in the target image. The embodiments of the present disclosure broaden the usage scenarios and application range of methods that compute the pose of a photographic entity from an image, and help improve the accuracy of pose computation.

Description

Attitude determination method and apparatus, electronic device, and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a pose determination method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development of computer technology, determining the relative pose between a point in the world coordinate system and its corresponding pixel in the camera coordinate system has become a widely studied topic in fields such as computer vision and image processing.
For example, during VR (Virtual Reality) data acquisition, the positioning functionality of a mobile computing device may be used to obtain the pose of an image capturing device (e.g., a panoramic camera); to this end, the image capturing device is typically rigidly connected to the mobile computing device. However, the relative orientation of such a rigid connection is not unique, and problems such as flipping may occur. Furthermore, computing the relative pose is difficult for some mounting arrangements. Existing implementations of pose computation therefore often depend on a specific connection arrangement, which limits their usage scenarios and application range.
Therefore, how to broaden the usage scenarios and application range of pose computation methods and improve the accuracy of pose computation is a problem worth attention.
Disclosure of Invention
The embodiments of the present disclosure provide a pose determination method and apparatus, an electronic device, and a storage medium, so as to broaden the usage scenarios and application range of pose computation methods and improve the accuracy of pose computation.
According to a first aspect of the embodiments of the present disclosure, there is provided a pose determination method, including:
acquiring a target image captured by an image capturing device when the angle value between a first direction and a second direction is less than or equal to a preset angle threshold; wherein a photographic entity indicated by a photographic object contained in the target image has a planar structure; the first direction represents a direction from the planar structure to the image capturing device, and the second direction represents an orientation of the planar structure;
converting the target image into a camera model image under a preset camera model;
determining, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, wherein the corresponding pixel in the target image is: an image of a point in the planar structure in the target image.
Optionally, in the method according to any embodiment of the present disclosure, the converting the target image into a camera model image under a preset camera model and the determining, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image include:
S1: determining first rotation information;
S2: taking a predetermined initial focal length as the camera focal length;
S3: converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length;
S4: detecting the planar structure in the camera model image and updating the camera focal length according to the detection result;
S5: repeating S3 with the updated camera focal length to obtain an updated camera model image;
S6: detecting the planar structure in the updated camera model image and calculating second rotation information;
S7: determining pose information between a point in the planar structure and a corresponding pixel in the target image according to the first rotation information and the second rotation information.
Optionally, in the method according to any embodiment of the present disclosure, the acquiring a target image captured by an image capturing apparatus when an angle value between a first direction and a second direction is smaller than or equal to a preset angle threshold includes:
acquiring a target image shot by image shooting equipment under the condition that preset shooting conditions are met;
wherein the preset shooting condition includes: an angle value between the first direction and the second direction is less than or equal to a preset angle threshold, and a distance between the image capturing apparatus and the capturing entity is less than or equal to a preset distance threshold.
Optionally, in the method of any embodiment of the present disclosure, the preset camera model is a pinhole camera model; and
the converting the target image into a camera model image under a preset camera model includes:
and converting the target image into a camera model image under a pinhole camera model.
Optionally, in the method of any embodiment of the present disclosure, the determining the first rotation information includes:
determining first rotation information from the projection direction to the optical axis direction based on the projection direction of the photographic entity in a first camera coordinate system of the image capturing device and the Z-axis direction of the first camera coordinate system;
and converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length.
Optionally, in a method according to any embodiment of the present disclosure, the converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length includes:
determining homogeneous coordinates of pixels in the camera model image in a projection coordinate system based on coordinates of the pixels in the camera model image in a pixel coordinate system;
determining the camera coordinates of the homogeneous coordinate in the first camera coordinate system based on the first rotation information and the camera focal length;
and determining the pixel value of a pixel in a camera model image under a preset camera model based on the first camera coordinate to obtain the camera model image under the pinhole camera model.
Optionally, in the method according to any embodiment of the present disclosure, the determining, based on the first camera coordinates, a pixel value of a pixel in a camera model image under a preset camera model includes:
normalizing the first camera coordinate to a unit spherical surface to obtain a spherical coordinate;
determining floating point number pixel coordinates in the target image based on the spherical coordinates;
and determining the pixel value of a pixel in a camera model image under a preset camera model based on the floating point number pixel coordinate in the target image by adopting a bilinear interpolation algorithm.
Optionally, in the method of any embodiment of the present disclosure, the planar structure includes at least one quadrilateral mark, and the camera model image includes mark objects of at least some of the quadrilateral marks; and
the detecting the planar structure in the camera model image and updating the camera focal length according to the detection result includes:
detecting each corner object in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral markers;
determining the maximum distance among the distances from the corner point objects in the camera model image to the center of the camera model image;
and updating the camera focal length of the pinhole camera model based on this maximum distance and the current camera focal length.
Optionally, in the method of any embodiment of the present disclosure, the planar structure includes at least one quadrilateral mark, and the camera model image includes mark objects of at least some of the quadrilateral marks; and
the detecting the planar structure in the camera model image, and calculating second rotation information, including:
determining second rotation information between a corner point of each quadrilateral mark in the planar structure and a corresponding pixel in the camera model image based on the camera model image, wherein the corresponding pixel in the camera model image is: and in the camera model image, the image of the corner points of the quadrilateral marks.
Optionally, in the method of any embodiment of the present disclosure, the quadrilateral mark includes a direction identifier; and
the determining, based on the camera model image, second rotation information between a corner point of each quadrilateral mark in the planar structure and a corresponding pixel in the camera model image includes:
detecting the corners of each quadrilateral mark in the camera model image to obtain a corner coordinate sequence of the quadrilateral mark, wherein the corner coordinates in the corner coordinate sequence are arranged in a predetermined order, and the first corner coordinate in the corner coordinate sequence is determined based on the direction indicated by the direction identifier contained in the quadrilateral mark;
calculating a homography transformation matrix from the corner point coordinate sequence of the quadrilateral mark to a preset corner point coordinate sequence under an ideal coordinate system;
second rotation information between the corner points of the quadrilateral marking in the planar structure and corresponding pixels in the camera model image is determined based on the homography transformation matrix.
Optionally, in the method according to any embodiment of the present disclosure, the planar structure is a display screen, the image presented by the display screen includes mark objects of quadrilateral marks arranged in rows and columns, and a size of the mark object in the image presented by the display screen is determined based on a resolution of the display screen.
Optionally, in the method according to any embodiment of the present disclosure, the determining, according to the first rotation information and the second rotation information, pose information between a point in the planar structure and a corresponding pixel in the target image includes:
determining the pose information between a point in the planar structure and the corresponding pixel in the target image as the product of the first rotation information and the second rotation information.
According to a second aspect of the embodiments of the present disclosure, there is provided an attitude determination apparatus including:
an acquisition unit configured to acquire a target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to a preset angle threshold; wherein a photographic entity indicated by a photographic object contained in the target image has a planar structure; the first direction represents a direction from the planar structure to the image capturing device, and the second direction represents an orientation of the planar structure;
a conversion unit configured to convert the target image into a camera model image under a preset camera model;
a determination unit configured to determine, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, wherein the corresponding pixel in the target image is: an image of a point in the planar structure in the target image.
Optionally, in the apparatus according to any embodiment of the present disclosure, the obtaining unit includes:
an acquisition subunit configured to acquire a target image captured by the image capturing apparatus in a case where a preset capturing condition is satisfied;
wherein the preset shooting condition includes: an angle value between the first direction and the second direction is less than or equal to a preset angle threshold, and a distance between the image capturing apparatus and the capturing entity is less than or equal to a preset distance threshold.
Optionally, in the apparatus according to any embodiment of the present disclosure, the preset camera model is a pinhole camera model; and
the conversion unit includes:
a conversion subunit configured to convert the target image into a camera model image under a pinhole camera model.
Optionally, in an apparatus according to any embodiment of the present disclosure, the conversion subunit includes:
a first determination module configured to determine first rotation information of the projection direction to the optical axis direction based on a projection direction of the photographic entity in a first camera coordinate system of the image photographing apparatus and a Z-axis direction of the first camera coordinate system;
a conversion module configured to convert the target image into a camera model image under a camera model based on the first rotation information and the camera focal length.
Optionally, in an apparatus according to any embodiment of the present disclosure, the conversion module includes:
a first determination sub-module configured to determine homogeneous coordinates of pixels in the camera model image in a projection coordinate system based on coordinates of the pixels in the camera model image in a pixel coordinate system;
a second determination sub-module configured to determine camera coordinates of the homogeneous coordinates in the first camera coordinate system based on the first rotation information and the camera focal length;
and the third determining submodule is configured to determine pixel values of pixels in a camera model image under a preset camera model based on the first camera coordinates to obtain the camera model image under the pinhole camera model.
Optionally, in the apparatus according to any embodiment of the present disclosure, the third determining sub-module is specifically configured to:
normalizing the first camera coordinates to a unit spherical surface to obtain spherical coordinates;
determining floating point number pixel coordinates in the target image based on the spherical coordinates;
and determining the pixel value of a pixel in a camera model image under a preset camera model based on the floating point number pixel coordinate in the target image by adopting a bilinear interpolation algorithm.
Optionally, in the apparatus of any embodiment of the present disclosure,
the detecting the planar structure in the camera model image and updating the focal length of the camera according to the detection result includes:
detecting each corner object in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral marks;
determining the maximum distance among the distances from the corner point objects in the camera model image to the center of the camera model image;
and updating the camera focal length of the pinhole camera model based on this maximum distance and the current camera focal length.
Optionally, in the apparatus according to any embodiment of the present disclosure, the planar structure includes at least one quadrilateral mark, and the camera model image includes mark objects of at least some of the quadrilateral marks; and
the detecting the planar structure in the camera model image, and calculating second rotation information, including:
determining second rotation information between the corner point of each quadrilateral mark in the planar structure and a corresponding pixel in the camera model image based on the camera model image, wherein the corresponding pixel in the camera model image is: and in the camera model image, the image of the corner point of the quadrilateral mark.
Optionally, in the apparatus of any embodiment of the present disclosure, the quadrilateral mark includes a direction identifier; and
the first determining subunit includes:
a detection module configured to detect corner points of each quadrilateral mark in the camera model image to obtain a corner point coordinate sequence of the quadrilateral mark, wherein the corner point coordinates in the corner point coordinate sequence are arranged according to a predetermined sequence, and a first corner point coordinate in the corner point coordinate sequence is determined based on a direction indicated by a direction identifier contained in the quadrilateral mark;
a calculation module configured to calculate a homography transformation matrix from the corner coordinate sequence of the quadrilateral marker to a preset corner coordinate sequence in an ideal coordinate system;
a second determination module configured to determine second rotation information between the corner points of the quadrilateral marking in the planar structure and corresponding pixels in the camera model image based on the homography transformation matrix.
Optionally, in the apparatus according to any embodiment of the present disclosure, the planar structure is a display screen, the image presented by the display screen includes mark objects of quadrilateral marks arranged in rows and columns, and a size of the mark object in the image presented by the display screen is determined based on a resolution of the display screen.
Optionally, in the apparatus according to any embodiment of the present disclosure, the determining, according to the first rotation information and the second rotation information, pose information between a point in the planar structure and a corresponding pixel in the target image includes:
determining the pose information between a point in the planar structure and the corresponding pixel in the target image as the product of the first rotation information and the second rotation information.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a memory for storing a computer program;
a processor for executing the computer program stored in the memory, and when the computer program is executed, implementing the method of any embodiment of the attitude determination method of the first aspect of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the pose determination method of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program comprising computer readable code which, when run on a device, causes a processor in the device to execute instructions for implementing the steps in the method as in any of the embodiments of the pose determination method of the first aspect described above.
Based on the pose determination method and apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure, a target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to a preset angle threshold can be acquired, where the photographic entity indicated by the photographic object contained in the target image has a planar structure, the first direction represents the direction from the planar structure to the image capturing device, and the second direction represents the orientation of the planar structure; the target image is converted into a camera model image under a preset camera model; and relative pose information between a point in the planar structure and the corresponding pixel in the target image is determined based on the camera model image, where the corresponding pixel in the target image is the image of the point in the planar structure in the target image. In this way, the embodiments of the present disclosure can determine the relative pose information between a point in the planar structure and the corresponding pixel in the target image from any target image captured while the angle value between the first direction and the second direction is less than or equal to the preset angle threshold, which broadens the usage scenarios and application range of methods that compute the pose of a photographic entity from an image and helps improve the accuracy of pose computation.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a first embodiment of the attitude determination method of the present disclosure.
FIG. 2 is a flow chart of a second embodiment of the attitude determination method of the present disclosure.
FIG. 3 is a flow chart of a third embodiment of the attitude determination method of the present disclosure.
Fig. 4A-4D are schematic diagrams of relationships between coordinate systems involved in the pose determination method of the present disclosure.
Fig. 4E-4G are schematic application scenarios of an embodiment of the pose determination method of the present disclosure.
Fig. 5 is a schematic structural diagram of an embodiment of the attitude determination apparatus of the present disclosure.
Fig. 6 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that terms such as "first" and "second" in the embodiments of the present disclosure are used merely to distinguish one element from another, and do not imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that, for convenience of description, the various parts shown in the drawings are not drawn to scale.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to at least one of a terminal device, a computer system, and a server, which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with at least one electronic device of the group consisting of a terminal device, computer system, and server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above, and the like.
At least one of the terminal device, the computer system, and the server may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Referring to fig. 1, a flow 100 of a first embodiment of the pose determination method of the present disclosure is shown. The pose determination method includes the following steps:
and 101, acquiring a target image shot by the image shooting device under the condition that the angle value between the first direction and the second direction is smaller than or equal to a preset angle threshold value.
In this embodiment, an execution subject of the pose determination method (e.g., a server, a terminal device, an image processing unit having an image processing function, or the pose determination apparatus of the present disclosure) may acquire, from another electronic device or locally, through a wired or wireless connection, the target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to the preset angle threshold.
In the present embodiment, the photographic subject indicated by the photographic subject included in the target image has a planar structure.
The photographic entity may be a physical entity having a planar structure existing in a three-dimensional space. As an example, the photographing entity may be a photo, a polyhedral shaped object, a mobile terminal having a display screen, and the like.
The photographic object may be the representation of the photographic entity in the target image.
The target image may be an image captured via the image capturing apparatus in a case where an angle value between the first direction and the second direction is less than or equal to a preset angle threshold.
The image capturing apparatus may be an apparatus having an image capturing function. As an example, the image capturing apparatus may be a panoramic camera provided with two fisheye lenses, a panoramic camera provided with only one fisheye lens, a non-panoramic camera, or the like, and the present embodiment is not limited herein.
In this embodiment, the first direction represents a direction from the planar structure to the image capturing device. The second direction characterizes an orientation of the planar structure.
Optionally, the first direction and the second direction may be respectively characterized in a vector form. Here, the first direction may be a vector in which a predetermined position in the planar structure (for example, the center of the planar structure, or any predetermined point in the planar structure) points to a predetermined position in the image capturing apparatus (for example, the center of the image capturing apparatus, or any predetermined point in the image capturing apparatus). The second direction may be a direction perpendicular to the planar structure and directed toward the image capturing apparatus.
The preset angle threshold may be a predetermined angle value; as an example, the preset angle threshold is 60 degrees. If the angle between the first direction and the second direction exceeds the preset angle threshold, the image capturing device is likely to fail to capture an image of the planar structure.
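As a concrete illustration of this capture condition, the angle check can be written as follows. This is a minimal sketch assuming the two directions are available as 3-vectors in a common coordinate system (the vector-form characterization described above); the 60-degree default mirrors the example threshold.

    import numpy as np

    def capture_condition_met(plane_point, camera_point, plane_normal,
                              angle_threshold_deg=60.0):
        """Return True if the angle between the first direction (planar
        structure -> image capturing device) and the second direction (the
        orientation of the planar structure) is within the preset threshold.
        Inputs are illustrative 3-vectors in a shared coordinate system."""
        first_direction = np.asarray(camera_point, float) - np.asarray(plane_point, float)
        second_direction = np.asarray(plane_normal, float)
        cos_angle = first_direction @ second_direction / (
            np.linalg.norm(first_direction) * np.linalg.norm(second_direction))
        angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
        return angle_deg <= angle_threshold_deg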
Step 102: convert the target image into a camera model image under a preset camera model.
In this embodiment, the executing body may convert the target image into a camera model image under a preset camera model.
The preset camera model may include, but is not limited to: pinhole camera model, actual imaging model.
Here, the above 102 is explained by taking the pinhole camera model as the preset camera model as an example; for each pixel coordinate in the camera model image, the RGB value of the pixel P corresponding to that pixel coordinate is calculated in the following manner to obtain the camera model image:
Specifically, the pinhole camera in the pinhole camera model is an ideal virtual camera. A projection direction d (i.e., the direction of the central line of sight of the virtual camera) is first given; let the coordinates of the vector d in the coordinate system of the second camera (e.g., the pinhole camera model) be (x_d, y_d, z_d), and let the Z-axis unit vector e_z of the coordinate system have coordinates (0, 0, 1) in the world coordinate system. The rotation matrix R_cd from the vector d to e_z is then calculated according to the Rodrigues formula. Here, if the photographic entity is located directly below the image capturing device, the coordinates of d may be determined as (0, 1, 0). The world coordinate system, i.e., the coordinate system of the three-dimensional world, is introduced to describe the position of an object in the real world.
Then, given the initial focal length f and the resolution width w and resolution height h of the target image, the RGB (Red Green Blue) value of the pixel P in the camera model image is calculated. Let the coordinates of the pixel P in the pixel coordinate system of the camera model image be (u_p, v_p), let f_x = f, f_y = f, c_x = w/2, and c_y = h/2, and form the matrix K:

    K = | f_x  0   c_x |   | f  0  w/2 |
        | 0   f_y  c_y | = | 0  f  h/2 |
        | 0    0    1  |   | 0  0   1  |

Then the homogeneous coordinate P' of the pixel P in the projection coordinate system is calculated, where the coordinates of P' are (x_p, y_p, z_p). The projection coordinate system, also called a non-earth projection coordinate system, or plane coordinates, is two-dimensional: if the world coordinate system is characterized by (x, y, z), the projection coordinate system can be characterized as (x, y).

    P' = K^(-1) · (u_p, v_p, 1)^T

Subsequently, the representation P_d = (x_c, y_c, z_c) of P' in the first camera coordinate system of the image capturing device is calculated according to the following formula. The first camera coordinate system is a three-dimensional rectangular coordinate system with the focusing center of the panoramic camera as its origin and the optical axis as its Z axis.

    P_d = R_cd^(-1) · P'

Next, P_d is normalized onto the unit sphere to obtain the normalized P_c:

    P_c = P_d / ||P_d||

Finally, the floating-point pixel coordinates (u', v') in the target image are calculated from the spherical coordinate P_c; for example, if the spherical coordinate P_c is characterized by (a, b, 1), then the pixel coordinates (u', v') are (a, b). Then, among the 4 adjacent integer pixels of the target image surrounding the coordinates (u', v'), the RGB value corresponding to the pixel P in the camera model image is calculated using bilinear interpolation. For example, if the floating-point pixel coordinates in the target image obtained from the coordinates of the pixel P in the camera model image are (10.4, 3.75), the 4 corresponding integer pixel coordinates are (10, 3), (10, 4), (11, 3) and (11, 4); bilinear interpolation over the RGB values of these 4 integer pixels gives the RGB value at the floating-point pixel coordinates (10.4, 3.75), which is taken as the RGB value of the point P in the camera model image. Obtaining the RGB value of each pixel P in this way yields the camera model image.
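The per-pixel computation above can be sketched in Python as follows, assuming the target image is an equirectangular panorama. The sphere-to-panorama mapping convention used here is an assumption (the example mapping (a, b, 1) -> (a, b) given above is schematic), and OpenCV's remap is used to carry out the bilinear interpolation step over the 4 neighbouring integer pixels.

    import numpy as np
    import cv2  # OpenCV, used here only for the bilinear remapping

    def panorama_to_pinhole(panorama, R_cd, f, w, h):
        """Render a w x h pinhole-model image from an equirectangular panorama.

        A minimal sketch of the steps above; the equirectangular layout of the
        panorama is an assumption, since the disclosure only specifies the
        mapping through the unit sphere."""
        K = np.array([[f, 0, w / 2.0],
                      [0, f, h / 2.0],
                      [0, 0, 1.0]])
        # Homogeneous pixel grid (u_p, v_p, 1) of the camera model image.
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
        # P' = K^-1 (u, v, 1)^T, then rotate into the first camera coordinate
        # system: R_cd maps d onto the z axis, so its transpose (= inverse for
        # a rotation) maps the pinhole optical axis back onto d.
        P = R_cd.T @ np.linalg.inv(K) @ pix
        # Normalize onto the unit sphere: P_c = P_d / ||P_d||.
        P = P / np.linalg.norm(P, axis=0, keepdims=True)
        x, y, z = P
        # Unit sphere -> floating-point panorama coordinates (assumed convention).
        H, W = panorama.shape[:2]
        lon = np.arctan2(x, z)                   # in [-pi, pi]
        lat = np.arcsin(np.clip(y, -1.0, 1.0))   # in [-pi/2, pi/2]
        map_u = ((lon / (2 * np.pi) + 0.5) * W).astype(np.float32).reshape(h, w)
        map_v = ((lat / np.pi + 0.5) * H).astype(np.float32).reshape(h, w)
        # Bilinear interpolation over the 4 neighbouring integer pixels.
        return cv2.remap(panorama, map_u, map_v, interpolation=cv2.INTER_LINEAR)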
Optionally, the executing body may further use a predetermined transformation matrix to transform the target image into a camera model image under a preset camera model (e.g., a pinhole camera model, an actual imaging model, a fish-eye camera model).
After the target image is converted into the camera model image under the preset camera model, the relationship between the photographed object and its image can be determined through the preset camera model (for example, under the pinhole imaging model the photographed object and its image satisfy a similar-triangle relationship), which simplifies the computation of the relative pose information between a point in the planar structure and the corresponding pixel in the target image.
Step 103: determine relative pose information between a point in the planar structure and the corresponding pixel in the target image based on the camera model image.
In this embodiment, the execution subject may determine relative pose information between a point in the planar structure and a corresponding pixel in the target image based on the camera model image. Wherein, the corresponding pixels in the target image are: and in the target image, the image of the point in the plane structure.
Here, the relative pose information may represent a relative pose between any point in the planar structure and a corresponding pixel of the point in the target image. The relative pose information may include, but is not limited to, at least one of: rotation information, distance information, orientation information, and the like.
For example, when the relative pose information includes distance information, the execution subject may determine the distance information between the point in the planar structure and the corresponding pixel in the target image based on the number of corresponding pixels in the camera model image and the target image, or the proportion of corresponding pixels in the camera model image, thereby obtaining the relative pose information.
As yet another example, when the relative pose information includes orientation information, the execution subject may determine the orientation information between the point in the planar structure and the corresponding pixel in the target image based on the position of the corresponding pixel in the camera model image, thereby obtaining the relative pose information.
In some optional implementations of the embodiment, the pose information between a point in the planar structure and a corresponding pixel in the target image is a product of the first rotation information and the second rotation information.
According to the pose determination method provided by this embodiment of the present disclosure, a target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to a preset angle threshold can be acquired, where the photographic entity indicated by the photographic object contained in the target image has a planar structure, the first direction represents the direction from the planar structure to the image capturing device, and the second direction represents the orientation of the planar structure; the target image is converted into a camera model image under a preset camera model; and relative pose information between a point in the planar structure and the corresponding pixel in the target image is determined based on the camera model image, where the corresponding pixel in the target image is the image of the point in the planar structure in the target image. In this way, this embodiment can determine the relative pose information between a point in the planar structure and the corresponding pixel in the target image from any target image captured while the angle value between the first direction and the second direction is less than or equal to the preset angle threshold, which broadens the usage scenarios and application range of methods that compute the pose of a photographic entity from an image and helps improve the accuracy of pose computation.
In some optional implementations of this embodiment, the execution subject may perform the above 101 as follows to acquire the target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to the preset angle threshold:
and acquiring a target image shot by the image shooting equipment under the condition of meeting a preset shooting condition. Wherein, the preset shooting condition comprises: an angle value between the first direction and the second direction is less than or equal to a preset angle threshold, and a distance between the image capture device and the capture entity is less than or equal to a preset distance threshold.
It is to be understood that, in the above alternative implementation, the captured target image must satisfy not only "the angle value between the first direction and the second direction is less than or equal to the preset angle threshold" but also "the distance between the image capturing device and the photographic entity is less than or equal to the preset distance threshold". By constraining the distance between the image capturing device and the photographic entity in this way, the texture information in the target image and the camera model image can be made richer and more accurate, which in turn improves the accuracy of pose computation.
In some optional implementations of this embodiment, the planar structure is a display screen, the image presented by the display screen includes mark objects of quadrilateral marks arranged in rows and columns, and the size of the mark objects in the image presented by the display screen is determined based on a resolution of the display screen.
It is to be understood that, in the above alternative implementation, the brightness of the display screen may be adjusted to control the contrast of the target image and the camera model image; the number of mark objects is determined by the resolution of the display screen, which in turn adjusts the sharpness of the target image and the camera model image. Presenting the mark objects of multiple quadrilateral marks on the display screen in a row-column arrangement ensures that the target image and the camera model image contain the image of at least one quadrilateral mark's mark object, so that the relative pose information between a point in the planar structure and the corresponding pixel in the target image can be determined from the geometric relationships presented by those images, further improving the accuracy of pose computation.
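A toy sketch of such a display image follows: a row-column grid of black square (quadrilateral) marks on a white background, with the mark side length derived from the screen resolution. The grid dimensions and spacing are illustrative assumptions, not values prescribed by the disclosure; a practical system would typically use fiducial marks carrying a direction identifier, as described below.

    import numpy as np

    def marker_grid(screen_w, screen_h, rows=3, cols=4):
        """Render a white screen image containing rows x cols black square
        (quadrilateral) marks; the mark size is derived from the screen
        resolution, and the layout here is purely illustrative."""
        img = np.full((screen_h, screen_w), 255, dtype=np.uint8)
        side = min(screen_w // (2 * cols + 1), screen_h // (2 * rows + 1))
        for i in range(rows):
            for j in range(cols):
                top = (2 * i + 1) * side
                left = (2 * j + 1) * side
                img[top:top + side, left:left + side] = 0
        return img

    # e.g., for a 1920 x 1080 display: marker_grid(1920, 1080)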
In some optional implementation manners of this embodiment, the target image may be captured by the image capturing device in a case that an angle value between the first direction and the second direction is any angle value smaller than or equal to a preset angle threshold.
It can be understood that, in the above alternative implementation, the angle value between the first direction and the second direction does not need to be fixed; it only needs to be less than or equal to the preset angle threshold. This broadens the usage scenarios and application range of methods that compute the pose of a photographic entity from an image, and to a certain extent avoids pose computation errors caused by the relative position between the image capturing device and the photographic entity failing to meet a fixed requirement.
With further reference to fig. 2, fig. 2 is a flowchart of a second embodiment of the pose determination method of the present disclosure. The process 200 of the pose determination method includes:
and 201, acquiring a target image shot by the image shooting device under the condition that the angle value between the first direction and the second direction is less than or equal to a preset angle threshold value.
In this embodiment, an execution subject of the pose determination method (e.g., a server, a terminal device, an image processing unit having an image processing function, or a pose determination apparatus) may acquire, from another electronic device or locally, through a wired or wireless connection, the target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to the preset angle threshold.
Wherein the photographic entity indicated by the photographic object contained in the target image has a planar structure. The first direction represents the direction from the planar structure to the image capturing device, and the second direction represents the orientation of the planar structure.
In this embodiment, 201 is substantially the same as 101 in the corresponding embodiment of fig. 1, and is not described here again.
Step 202: convert the target image into a camera model image under a pinhole camera model.
In this embodiment, the executing subject may convert the target image into a camera model image under a pinhole camera model.
In some optional implementations of the embodiment, the executing body may execute the step 202 in the following manner to convert the target image into a camera model image under a pinhole camera model:
the first step is to determine first rotation information from the projection direction to the optical axis direction based on the projection direction of the shooting entity in a first camera coordinate system of the image shooting device and the Z-axis direction of the first camera coordinate system.
As an example, the first rotation information of the projection direction to the optical axis direction may be calculated according to the rodgers formula.
And secondly, converting the target image into a camera model image under a pinhole camera model based on the first rotation information.
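The following is a sketch of the first step above: computing the rotation that takes the projection direction d to the optical axis direction, using the axis-angle form of the Rodrigues formula via OpenCV. The handling of the degenerate case where d is already (anti)parallel to the z axis is a simplification of my own, not specified by the disclosure.

    import numpy as np
    import cv2

    def rotation_from_projection_to_axis(d):
        """Rotation matrix taking the projection direction d onto the z axis,
        computed from an axis-angle vector via the Rodrigues formula."""
        d = np.asarray(d, dtype=np.float64)
        d = d / np.linalg.norm(d)
        e_z = np.array([0.0, 0.0, 1.0])
        axis = np.cross(d, e_z)
        norm = np.linalg.norm(axis)
        if norm < 1e-12:                 # d (anti)parallel to the z axis
            return np.eye(3) if d[2] > 0 else np.diag([1.0, -1.0, -1.0])
        angle = np.arccos(np.clip(d @ e_z, -1.0, 1.0))
        R, _ = cv2.Rodrigues(axis / norm * angle)   # axis-angle -> 3x3 matrix
        return R

    # For a planar structure directly below the camera, d = (0, 1, 0) as in
    # the example given earlier in the description.
    R_cd = rotation_from_projection_to_axis([0.0, 1.0, 0.0])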
In some application scenarios of the above alternative implementation, the executing body may execute the second step in the following manner to convert the target image into a camera model image under a pinhole camera model based on the first rotation information:
a first substep of determining homogeneous coordinates of pixels in the camera model image in a projection coordinate system based on coordinates of the pixels in the camera model image in a pixel coordinate system. Wherein, the origin of the pixel coordinate system x-y is the midpoint of the pixel coordinate system. The abscissa u and the ordinate v of the pixel coordinate system u-v are the row and the column, respectively, in which the image is located. In the visual processing library OpenCV, u corresponds to x, and v corresponds to y.
A second substep of determining the camera coordinates of the homogeneous coordinates in the first camera coordinate system based on the first rotation information and the camera focal length.
And a third substep of determining a pixel value of a pixel in the camera model image under the preset camera model based on the first camera coordinates to obtain the camera model image under the preset camera model.
It is to be understood that, in the above application scenario, the camera model image under the pinhole camera model may be obtained based on the first rotation information and the coordinates of the pixels in the camera model image under the pixel coordinate system.
In some cases of the application scenario, the executing body may execute the third substep as follows to determine pixel values of pixels in a camera model image under a preset camera model based on the first camera coordinates:
firstly, normalizing the first camera coordinates to a unit spherical surface to obtain spherical coordinates.
Then, floating point pixel coordinates in the target image are determined based on the spherical coordinates.
And finally, determining the pixel value of the pixel in the camera model image under the preset camera model based on the floating point number pixel coordinate in the target image by adopting a bilinear interpolation algorithm.
It is to be understood that, in the above case, the pixel values of the pixels in the camera model image under the preset camera model may be determined through the normalization process and the bilinear interpolation algorithm. In this way, the accuracy of the pose estimation can be further improved by performing the pose estimation based on the camera model image.
In some application scenarios of the above alternative implementation, the planar structure includes at least one quadrilateral mark, and the camera model image includes mark objects of at least some of the quadrilateral marks. On this basis, the execution subject may detect the planar structure in the camera model image and update the camera focal length according to the detection result as follows:
firstly, detecting each corner object in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral marking.
Then, the maximum distance among the distances from the corner point objects in the camera model image to the center of the camera model image is determined.
The camera focal length of the pinhole camera model is then updated based on this maximum distance and the current camera focal length.
Here, the target focal length f' may be calculated by the following formula:
f' = (h ÷ 2) ÷ r × f
where h denotes the resolution height of the target image, r denotes the maximum distance, and f denotes the initial focal length.
And finally, converting the target image into a camera model image under a pinhole camera model based on the target focal length.
Here, converting the target image into a camera model image under the pinhole camera model based on the target focal length may be implemented in the same way as the conversion based on the initial focal length; that is, the initial focal length is replaced with the target focal length obtained here, which yields the camera model image under the pinhole camera model.
It can be understood that, in this application scenario, the target focal length of the pinhole camera model may be recalculated from the initial focal length and used as the optimal focal length for converting the target image into the camera model image under the pinhole camera model. Performing pose estimation based on this camera model image can further improve the accuracy of the pose estimation.
Here, in the above alternative implementation, the target image may be converted into a camera model image under a pinhole camera model based on a projection direction of the shooting entity in a first camera coordinate system of the image shooting device and a Z-axis direction of the first camera coordinate system.
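A minimal sketch of the focal-length update f' = (h ÷ 2) ÷ r × f described above, assuming the corner objects have already been detected and are supplied as pixel coordinates (the corner detection itself is not shown here):

    import numpy as np

    def update_focal_length(corner_points, image_width, image_height, f):
        """Update the pinhole focal length so the farthest detected corner
        object lands near the image border: f' = (h / 2) / r * f, where r is
        the maximum distance from a corner object to the image center.
        corner_points is an illustrative (N, 2) array of pixel coordinates."""
        center = np.array([image_width / 2.0, image_height / 2.0])
        r = np.max(np.linalg.norm(np.asarray(corner_points, float) - center, axis=1))
        return (image_height / 2.0) / r * f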
Step 203: determine relative pose information between a point in the planar structure and the corresponding pixel in the target image based on the camera model image.
In this embodiment, the execution subject may determine relative pose information between a point in the planar structure and a corresponding pixel in the target image based on the camera model image. Wherein, the corresponding pixels in the target image are: and in the target image, the image of the point in the plane structure.
In this embodiment, step 203 is substantially the same as step 103 in the embodiment corresponding to fig. 1, and is not described herein again.
It should be noted that, besides the above-mentioned contents, the embodiment of the present application may further include the same or similar features and effects as the embodiment corresponding to fig. 1, and details are not repeated herein.
As can be seen from fig. 2, the process 200 of the pose determination method in this embodiment converts the target image into a camera model image under the pinhole camera model, which simplifies the computation required to obtain the camera model image, reduces the consumption of computing resources, and increases the speed of pose computation.
Continuing to refer to fig. 3, fig. 3 is a flowchart of a third embodiment of the pose determination method of the present disclosure. The process 300 of the pose determination method includes:
Step 301: acquire a target image captured by the image capturing device when the angle value between the first direction and the second direction is less than or equal to a preset angle threshold.
In this embodiment, an execution subject (e.g., a server, a terminal device, an image processing unit with an image processing function, an attitude determination device, etc.) of the attitude determination method may acquire, from other electronic devices or locally, a target image captured by the image capturing device in a case where an angle value between the first direction and the second direction is less than or equal to a preset angle threshold value, by a wired connection manner or a wireless connection manner. Wherein, the shooting entity indicated by the shooting object contained in the target image has a plane structure; the first direction represents a direction from the planar structure to the image capturing device, and the second direction represents an orientation of the planar structure.
In some optional implementations of this embodiment, the planar structure includes at least one quadrilateral mark, and the camera model image includes mark objects of at least some of the quadrilateral marks; and
the detecting the planar structure in the camera model image and updating the focal length of the camera according to the detection result includes:
detecting each corner object in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral marking.
Then, the maximum distance among the distances from the corner point objects in the camera model image to the center of the camera model image is determined.
The camera focal length of the pinhole camera model is then updated based on this maximum distance and the current camera focal length.
Here, the target focal length f' may be calculated by the following formula:
f' = (h ÷ 2) ÷ r × f
where h denotes the resolution height of the target image, r denotes the maximum distance, and f denotes the camera focal length.
And finally, converting the target image into a camera model image under a pinhole camera model based on the target focal length.
Here, converting the target image into a camera model image under the pinhole camera model based on the target focal length may be implemented in the same way as the conversion based on the initial focal length; that is, the camera focal length is replaced with the target focal length obtained here, which yields the camera model image under the pinhole camera model.
It can be understood that, in this application scenario, the target focal length of the pinhole camera model may be recalculated from the initial focal length and used as the optimal focal length for converting the target image into the camera model image under the pinhole camera model. Performing pose estimation based on this camera model image can further improve the accuracy of the pose estimation.
Here, in the above alternative implementation, the target image may be converted into a camera model image under a pinhole camera model based on a projection direction of the shooting entity in a first camera coordinate system of the image shooting device and a Z-axis direction of the first camera coordinate system.
In some optional implementations of this embodiment, the planar structure includes at least one quadrilateral mark, and the camera model image includes a mark object of at least a partial quadrilateral mark of the quadrilateral marks; and
the detecting the planar structure in the camera model image, and calculating second rotation information, including:
second rotation information between the corner points of each quadrilateral mark in the planar structure and the corresponding pixels in the camera model image is determined based on the camera model image. The corresponding pixels in the camera model image are the images, in the camera model image, of the corner points of the quadrilateral marks.
Preferably, the quadrilateral marking contains a directional marker. On this basis, the determining, based on the camera model image, second rotation information between the corner point of each quadrilateral mark in the planar structure and the corresponding pixel in the camera model image includes:
Firstly, the corner points of each quadrilateral mark in the camera model image are detected to obtain a corner point coordinate sequence of the quadrilateral mark. The first corner point coordinate in the corner point coordinate sequence is determined based on the direction indicated by the direction identifier contained in the quadrilateral mark.
Secondly, a homography transformation matrix from the corner point coordinate sequence of the quadrilateral mark to a preset corner point coordinate sequence under an ideal coordinate system is calculated.
Thirdly, second rotation information between the corner points of the quadrilateral marks in the planar structure and corresponding pixels in the camera model image is determined based on the homography transformation matrix.
In this embodiment, the planar structure includes at least one quadrangular marker, and the camera model image includes a marker object of at least a part of the quadrangular markers. Otherwise, step 301 may be substantially the same as step 101 in the corresponding embodiment of fig. 1, and is not described herein again.
302, first rotation information is determined.
In this embodiment, the execution body may determine the first rotation information.
In some optional implementations of this embodiment, the determining the first rotation information includes:
First rotation information from the projection direction to the optical axis direction is determined based on the projection direction of the shooting entity in a first camera coordinate system of the image shooting device and the Z-axis direction of the first camera coordinate system.
As an example, the first rotation information from the projection direction to the optical axis direction may be calculated according to the Rodrigues formula.
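As a minimal sketch of this step (the function name is illustrative and not the patent's implementation; the degenerate anti-parallel case is deliberately left out):

```python
import numpy as np

def rotation_to_z_axis(d):
    """Rodrigues formula: rotation matrix R such that R @ d_unit = e_z = (0, 0, 1)."""
    d = np.asarray(d, dtype=float)
    d = d / np.linalg.norm(d)
    e_z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(d, e_z)
    s = np.linalg.norm(axis)            # sin of the rotation angle
    c = float(np.dot(d, e_z))           # cos of the rotation angle
    if s < 1e-12:
        # d is already (anti-)parallel to e_z; the anti-parallel case would
        # need an arbitrary perpendicular axis and is omitted in this sketch
        return np.eye(3)
    k = axis / s                        # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

R_cd = rotation_to_z_axis([0.0, 1.0, 0.0])  # d = (0, 1, 0), as in the later example
```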
303, a predetermined initial focal length is used as the camera focal length.
In this embodiment, the execution subject may take the predetermined initial focal length as the camera focal length.
304, the target image is converted into a camera model image under a camera model based on the first rotation information and the camera focal length.
In this embodiment, the executing body may convert the target image into a camera model image under a camera model based on the first rotation information and the camera focal length.
In some optional implementations of this embodiment, the converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length includes:
first, homogeneous coordinates of pixels in the camera model image in a projection coordinate system are determined based on coordinates of the pixels in the camera model image in a pixel coordinate system. Wherein, the origin of the pixel coordinate system x-y is the midpoint of the pixel coordinate system. The abscissa u and the ordinate v of the pixel coordinate system u-v are the row and the column, respectively, in which the image is located. In the visual processing library OpenCV, u corresponds to x, and v corresponds to y.
And then, determining first camera coordinates of the homogeneous coordinates in the first camera coordinate system based on the first rotation information and the camera focal length.
And finally, determining the pixel value of a pixel in the camera model image under the preset camera model based on the first camera coordinate to obtain the camera model image under the preset camera model.
As an example, determining pixel values of pixels in the camera model image under the preset camera model based on the first camera coordinates includes: normalizing the first camera coordinates to a unit sphere to obtain spherical coordinates; determining floating-point pixel coordinates in the target image based on the spherical coordinates; and determining the pixel values of the pixels in the camera model image under the preset camera model from the floating-point pixel coordinates in the target image by a bilinear interpolation algorithm. In this way, the pixel values are obtained through normalization and bilinear interpolation, and performing pose estimation on the resulting camera model image can further improve its accuracy.
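A minimal sketch of this normalization and interpolation for one point, assuming the target image is an equirectangular panorama (the sphere-to-pixel mapping below is an assumption; the disclosure does not fix a particular formula):

```python
import numpy as np

def sample_panorama(pano, P_d):
    """Normalize P_d onto the unit sphere, map it to floating-point pixel
    coordinates of an equirectangular panorama, and sample bilinearly."""
    P_c = np.asarray(P_d, dtype=float)
    P_c = P_c / np.linalg.norm(P_c)                  # spherical coordinates P_c
    lon = np.arctan2(P_c[0], P_c[2])                 # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(P_c[1], -1.0, 1.0))      # latitude in [-pi/2, pi/2]
    h, w = pano.shape[:2]
    u = (lon / (2.0 * np.pi) + 0.5) * (w - 1)        # floating-point pixel coords
    v = (lat / np.pi + 0.5) * (h - 1)
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    # bilinear interpolation over the 4 neighbouring integer pixels
    return ((1 - du) * (1 - dv) * pano[v0, u0] + du * (1 - dv) * pano[v0, u1] +
            (1 - du) * dv * pano[v1, u0] + du * dv * pano[v1, u1])
```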
It is understood that, in the application scenario, the camera model image under the pinhole camera model may be obtained based on the first rotation information and the coordinates of the pixels in the target image under the pixel coordinate system.
305, the planar structure in the camera model image is detected, and the camera focal length is updated according to the detection result.
In this embodiment, the execution subject may detect the planar structure in the camera model image, and update the camera focal length according to the detection result.
As an example, the largest distance may be determined among the distances from each corner object in the camera model image to the center of the camera model image. Here, a corner object is an image, in the camera model image, of a corner point of a quadrilateral mark. Then, the target focal length of the pinhole camera model is calculated based on the largest distance and the initial focal length.
Here, the updated camera focal length f' can be calculated by the following formula:
f’=(h÷2)÷r×f
where h represents the height of the target image's resolution, r represents the largest distance, and f represents the initial focal length.
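A minimal sketch of this update, assuming the corner objects have already been detected as pixel coordinates (the function name and numeric values are illustrative):

```python
import numpy as np

def update_focal_length(corner_points, image_center, h, f):
    """Target focal length f' = (h / 2) / r * f, where r is the largest
    distance from a detected corner object to the image center."""
    r = max(np.linalg.norm(np.asarray(p, dtype=float) - image_center)
            for p in corner_points)
    return (h / 2.0) / r * f

# illustrative usage for a 960x960 camera model image
corners = [(310.0, 205.0), (650.0, 198.0), (655.0, 540.0), (305.0, 545.0)]
f_target = update_focal_length(corners, np.array([480.0, 480.0]), h=960, f=800.0)
```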
306, the target image is converted into a camera model image under the camera model based on the first rotation information and the updated camera focal length, so as to obtain the updated camera model image.
In this embodiment, the executing body may convert the target image into a camera model image under a camera model based on the first rotation information and the camera focal length, so as to obtain the updated camera model image.
Here, converting the target image into the camera model image under the pinhole camera model based on the target focal length may be implemented in the same manner as the conversion based on the initial focal length; that is, the initial focal length is replaced with the target focal length obtained here, yielding the updated camera model image under the pinhole camera model.
307, the planar structure in the camera model image is detected, and second rotation information is calculated.
In this embodiment, the executing body may detect the planar structure in the camera model image and calculate second rotation information.
As an example, the execution subject may determine second rotation information between a corner point of each quadrangular mark in the planar structure and a corresponding pixel in the camera model image based on the camera model image.
The corresponding pixels in the camera model image are the images, in the camera model image, of the corner points of the quadrilateral marks.
308, pose information between a point in the planar structure and a corresponding pixel in the target image is determined according to the first rotation information and the second rotation information.
In this embodiment, the execution body may determine the pose information between the point in the planar structure and the corresponding pixel in the target image according to the first rotation information and the second rotation information.
In some optional implementations of the embodiment, the pose information between a point in the planar structure and a corresponding pixel in the target image is a product of the first rotation information and the second rotation information.
In some alternative implementations of this embodiment, the quadrilateral mark contains a direction identifier. On this basis, the executing body may execute 307 to determine second rotation information between the corner point of each quadrilateral mark in the planar structure and the corresponding pixel in the camera model image as follows:
firstly, detecting the corner points of each quadrilateral mark in the camera model image to obtain a corner point coordinate sequence of the quadrilateral mark. Wherein the corner coordinates in the sequence of corner coordinates are arranged in a predetermined order. The first corner coordinate in the sequence of corner coordinates is determined based on the direction indicated by the direction identifier contained in the quadrilateral marking.
Here, for example, the direction identifier may be used to indicate an initial direction of the quadrilateral mark. For instance, the direction identifier may be placed below the quadrilateral mark (or at another position). Thus, when the direction identifier is located below the quadrilateral mark, the orientation of the quadrilateral mark can be determined to be the initial direction, realizing direction identification of the shooting entity. Further, the upper-left corner point (or another corner point) of the quadrilateral mark, when the mark is oriented in the initial direction, may be taken as the first corner point coordinate in the corner point coordinate sequence; the remaining corner point coordinates are then arranged from it in a clockwise (or counterclockwise) direction to obtain the corner point coordinate sequence.
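A sketch of one way to realize this ordering, assuming the 4 corner points and the index (after sorting) of the first corner chosen via the direction identifier are already known; the helper name is hypothetical:

```python
import numpy as np

def order_corners_clockwise(corners, first_idx):
    """Sort 4 corner points clockwise around their centroid, then rotate the
    sequence so that the corner selected via the direction identifier is first."""
    pts = np.asarray(corners, dtype=float)
    center = pts.mean(axis=0)
    # with image y growing downward, ascending atan2 angle is clockwise on screen
    ang = np.arctan2(pts[:, 1] - center[1], pts[:, 0] - center[0])
    clockwise = pts[np.argsort(ang)]
    return np.roll(clockwise, -first_idx, axis=0)
```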
Then, a Homography transformation matrix from the corner point coordinate sequence of the quadrilateral mark to a preset corner point coordinate sequence under an ideal coordinate system is calculated, where the ideal coordinate system is the coordinate system without distortion.
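For instance, with OpenCV the homography from the 4 ordered corners to the preset ideal square can be computed as below (the numeric corner values are illustrative placeholders):

```python
import cv2
import numpy as np

# detected, ordered corner coordinate sequence of one quadrilateral mark
corner_seq = np.float32([[412.3, 167.8], [498.1, 170.2],
                         [495.7, 255.9], [409.9, 252.4]])
# preset corner coordinate sequence under the ideal (distortion-free) system
ideal_seq = np.float32([[0, 1], [1, 1], [1, 0], [0, 0]])

H = cv2.getPerspectiveTransform(corner_seq, ideal_seq)  # 3x3 homography matrix
```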
As an example, please refer to fig. 4A-4C. Fig. 4A-4C are schematic diagrams of the relationships between the coordinate systems involved in the pose determination method of the present disclosure. The origins and coordinate axes of the world coordinate system, image coordinate system, pixel coordinate system, camera coordinate system, and ideal coordinate system are set as follows:
in fig. 4A, M is a three-dimensional space point, and M is an image point projected by M on an image plane.
Wherein:
The world coordinate system, i.e., the panoramic camera coordinate system (first camera coordinate system), is an absolute coordinate system of the objective three-dimensional world, also called the objective coordinate system. Because the camera is placed in three-dimensional space, this reference coordinate system is needed to describe the position of the camera and of any other object placed in the three-dimensional environment; its coordinate values are denoted by (x_c, y_c, z_c).
The virtual camera coordinate system (optical center coordinate system, second camera coordinate system) takes the optical center of the virtual camera as the coordinate origin; its X axis and Y axis are parallel to the X axis and Y axis of the pixel coordinate system, respectively, and the optical axis of the camera is the Z axis; its coordinate values are denoted by (x_d, y_d, z_d).
The image coordinate system takes the center of the image plane as the coordinate origin; its X axis and Y axis are parallel to two perpendicular edges of the image plane, respectively; its coordinate values are denoted by (x, y).
The pixel coordinate system takes the vertex at the upper-left corner of the image plane as the origin; its X axis and Y axis are parallel to the X axis and Y axis of the image coordinate system, respectively; its coordinate values are denoted by (u, v).
Referring to fig. 4B, fig. 4B shows the relationship between the pixel coordinate system and the image coordinate system, where (u_0, v_0) are the coordinates of the origin of the image coordinate system in the pixel coordinate system.
With continued reference to fig. 4C, the ideal perspective model is the pinhole imaging model, under which object and image satisfy the relation of similar triangles. In practice, however, due to processing and assembly errors of the camera's optical system, the lens cannot satisfy this relation exactly, so distortion exists between the image actually formed on the camera image plane and the ideal image. This distortion is a geometric imaging distortion: different regions of the focal plane have different image magnifications, producing picture deformation that grows gradually from the center of the picture toward its edge, where it is most evident. Here, the ideal coordinate system is the coordinate system without distortion; m_r(x_r, y_r) denotes the physical coordinates of the actual projection point in the image plane coordinate system, and m_i(x_i, y_i) denotes the physical coordinates of the ideal projection point in the image plane coordinate system.
Here, the relationship between coordinates (x_c, y_c, z_c) in the panoramic camera coordinate system and coordinates (x_d, y_d, z_d) in the virtual camera coordinate system can be as shown in fig. 4D.
In the above embodiments, (u, v) is any pixel coordinate, represented by the point P_d in the virtual camera normalization plane; the conversion process can refer to the above description and is not detailed here.
And then, determining second rotation information between the corner points of the quadrilateral marks in the planar structure and corresponding pixels in the camera model image based on the homography transformation matrix.
Here, the execution body may decompose the second rotation information from the homography transformation matrix.
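One possible realization with OpenCV is sketched below; it assumes H maps marker-plane coordinates to camera-model-image pixels (the inverse of the rectifying homography above) and that K holds the current pinhole intrinsics. The choice among the candidate solutions is application-specific:

```python
import cv2
import numpy as np

H_plane_to_image = np.eye(3)       # placeholder: plane-to-image homography
K = np.array([[800.0, 0.0, 480.0],
              [0.0, 800.0, 480.0],
              [0.0, 0.0, 1.0]])    # placeholder pinhole intrinsic matrix

# decomposeHomographyMat returns up to 4 candidate (R, t, n) triples
num, rotations, translations, normals = cv2.decomposeHomographyMat(H_plane_to_image, K)
# the physically valid rotation Rm is typically chosen by requiring that
# reconstructed marker points lie in front of the camera
Rm = rotations[0]
```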
It can be understood that, in the above alternative implementation manner, the orientation of the shooting entity can be determined through the direction identifier, so that a subsequent attitude calculation result can be more accurate.
308, relative pose information between the point in the planar structure and the corresponding pixel in the target image is determined based on the determined second rotation information.
In this embodiment, the execution body may determine the relative pose information between the point in the planar structure and the corresponding pixel in the target image based on the determined pieces of second rotation information.
Wherein, the corresponding pixels in the target image are: and in the target image, the image of the point in the plane structure. The second rotation information may be characterized by a matrix form.
As an example, the executing entity may first take the mean of the determined second rotation information to obtain a fused rotation matrix, and then determine the relative pose information between a point in the planar structure and the corresponding pixel in the target image based on that rotation matrix.
As a further example, the executing entity may first take the mean of the determined second rotation information to obtain a fused rotation matrix, and then compute the product of the first rotation matrix obtained as described above and the fused rotation matrix as the relative pose information between a point in the planar structure and the corresponding pixel in the target image.
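A sketch of the fusion step under one common reading: an element-wise mean of rotation matrices generally falls outside SO(3), so this sketch re-projects the mean with an SVD (the projection is an added assumption, not stated in the disclosure):

```python
import numpy as np

def fuse_rotations(rotation_list):
    """Average rotation matrices, then project the mean back onto SO(3)."""
    mean = np.mean(rotation_list, axis=0)
    u, _, vt = np.linalg.svd(mean)
    r = u @ vt
    if np.linalg.det(r) < 0:        # guard against picking a reflection
        u[:, -1] *= -1
        r = u @ vt
    return r

# per the further example: relative pose = first rotation x fused rotation
# R = R_cd @ fuse_rotations(all_second_rotations)
```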
It should be noted that, besides the above-mentioned contents, the embodiment of the present application may further include the same or similar features and effects as those of the embodiment corresponding to fig. 1 or fig. 2, and details are not repeated herein.
As can be seen from fig. 3, in the flow 300 of the pose determination method in this embodiment, the relative pose information between the point in the planar structure and the corresponding pixel in the target image is determined based on the second rotation information between the corner point of each quadrilateral mark in the planar structure and the corresponding pixel in the camera model image. Therefore, the texture information in the camera model image is fully utilized for attitude estimation, and the accuracy of attitude estimation can be improved.
As an example, please refer to fig. 4E-4G. Fig. 4E-4G are schematic application scenarios of an embodiment of the attitude determination method of the present disclosure.
In the application scenarios shown in fig. 4E-4G, the executing entity determines the relative pose information between the point in the planar structure and the corresponding pixel in the target image as follows:
the method comprises a first step of acquiring a target image shot by an image shooting device under the condition that an angle value between a first direction and a second direction is smaller than or equal to a preset angle threshold value.
The shooting entity indicated by the shooting object contained in the target image (in the figure, a mobile computing device such as a mobile phone) has a planar structure, i.e., a display screen. The first direction represents the direction from the planar structure to the image capturing device, and the second direction represents the orientation of the planar structure. The planar structure comprises at least one quadrilateral mark, each containing a direction identifier, and the camera model image includes mark objects of at least some of the quadrilateral marks. The image displayed by the display screen comprises mark objects of quadrilateral marks arranged in rows and columns, and the size of the mark objects in the displayed image is determined based on the size of the display screen.
Here, referring to fig. 4E, fig. 4E shows a rigid connection diagram of an image capture device (e.g., a panoramic camera) and a mobile computing device (i.e., the capture entity). The display screen of the mobile computing device 402 is oriented in the second direction d_1, and the direction from the mobile computing device 402 to the connection point of the panoramic camera (i.e., image capture device) 401 is the first direction d_2. Here, the included angle a between the second direction d_1 and the first direction d_2 does not exceed 70 degrees (i.e., the above-mentioned preset angle threshold). The greater the included angle a, the less image information the subsequently captured target image contains, and the harder the marks are to detect.
Thereafter, the corresponding quadrilateral marks may be designed according to the aspect ratio of the display screen of the mobile computing device 402 and stored as graphics in the mobile computing device 402. The quadrilateral mark can be any marker pattern that can be identified and used for pose calculation, such as ArUco, AprilTag, a custom image, or custom text.
Referring to fig. 4F, taking text as an example: in this application scenario, the outer frame of a single mark is a square whose interior carries a black text graphic, and a white bar (reference numeral 403 in the drawing denotes the white bar of one of the quadrilateral marks) is placed under the content inside the outer frame to identify the direction. The display screen presents 8 quadrilateral marks; this multi-mark arrangement ensures that enough marks can still be detected when an improper installation position of the panoramic camera causes stitching problems, and it likewise ensures adequate recall. Since panoramic stitching seams are mostly straight lines, when any such line cuts through marks arranged as above, at least one mark remains uncut and is detected in subsequent processing.
Next, after the rigid connection is completed in the shooting site and before the panoramic camera shoots the target image, the brightness of the display screen needs to be adjusted to an appropriate value, and the currently appropriate brightness is 70% of the maximum brightness of the display screen. Here, too much brightness will result in overexposure of subsequently obtained images; too little brightness will result in a lower contrast in subsequently acquired images.
After the setting is performed, the image including the 8 quadrilateral marks may be displayed on a display screen of the mobile computing device. And then, the user can send a shooting instruction to the panoramic camera, and can remove the image displayed on the display screen after the shooting is finished, so that the shooting process is finished.
Thus, the target image can be obtained.
A second step of determining first rotation information from the projection direction to the optical axis direction based on the projection direction of the photographic entity in the first camera coordinate system of the image photographing apparatus and the Z-axis direction of the first camera coordinate system.
Specifically, given the projection direction d of the capturing entity in the first camera coordinate system of the image capturing device, let the coordinates of d in the second camera (e.g., pinhole camera model) coordinate system be (x_d, y_d, z_d), and let the unit vector e_z in the Z-axis direction of the first camera coordinate system have coordinates (0, 0, 1). The first rotation information R_cd from vector d to vector e_z is then calculated according to the Rodrigues formula. Here, the coordinates of d are taken to be (0, 1, 0).
Thus, the first rotation information can be obtained.
And a third step of determining homogeneous coordinates of the pixels in the camera model image in a projection coordinate system based on the coordinates of the pixels in the camera model image in a pixel coordinate system.
Specifically, the initial focal length f and the width w and height h of the target image's resolution may be set first. The RGB pixel value of every pixel P(u_p, v_p) in the camera model image is to be calculated. With f_x determined as f, f_y as f, c_x as w/2 and c_y as h/2, the matrix K is:

K = [ f_x  0    c_x
      0    f_y  c_y
      0    0    1  ]

Then, the homogeneous coordinates P' in the projection coordinate system are calculated. The coordinates (x_p, y_p, z_p) of P' are determined as the product of the inverse of K and the vector (u_p, v_p, 1):

P' = K^(-1) · (u_p, v_p, 1)^T

A fourth step of determining the first camera coordinates P_d of the homogeneous coordinates in the first camera coordinate system as the product of the first rotation matrix R_cd and P'. Let the coordinates of P_d in the first camera coordinate system be (x_c, y_c, z_c):

P_d = R_cd · P'
A fifth step of normalizing the first camera coordinates P_d to the unit sphere to obtain the spherical coordinates P_c, and determining, based on P_c, the floating-point pixel coordinates (u', v') in the target image.
Here, P_c = P_d ÷ ||P_d||.
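Collecting the third to fifth steps, the per-pixel chain can be sketched as follows (all numeric values are illustrative placeholders; R_cd stands for the first rotation matrix computed earlier):

```python
import numpy as np

f, w, h = 800.0, 1920, 960           # initial focal length and target resolution
u_p, v_p = 1000.0, 480.0             # one pixel of the virtual pinhole image
R_cd = np.eye(3)                     # placeholder first rotation matrix

K = np.array([[f, 0.0, w / 2.0],
              [0.0, f, h / 2.0],
              [0.0, 0.0, 1.0]])      # f_x = f_y = f, c_x = w/2, c_y = h/2
P_prime = np.linalg.inv(K) @ np.array([u_p, v_p, 1.0])  # homogeneous P'
P_d = R_cd @ P_prime                 # coordinates in the first camera system
P_c = P_d / np.linalg.norm(P_d)      # spherical coordinates on the unit sphere
```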
And a sixth step of determining a pixel value of a pixel in a preset camera model image under the camera model by using a bilinear interpolation algorithm based on the floating point number pixel coordinates (u ', v') in the target image to obtain an initial image under the pinhole camera model (namely, the camera model image obtained by calculation according to the initial focal length f).
Specifically, the RGB value at the floating-point pixel coordinates (u', v') can be computed by bilinear interpolation over the 4 adjacent integer pixel coordinates, which gives the pixel value of the point P(u_p, v_p) to be calculated in the initial image.
And a seventh step of detecting the corner points of each quadrilateral mark in the initial image to obtain a corner point coordinate sequence of the quadrilateral mark.
Wherein the corner coordinates in the sequence of corner coordinates are arranged in a predetermined order. The first corner coordinate in the sequence of corner coordinates is determined based on the direction indicated by the direction identifier contained in the quadrilateral marking.
Specifically, a parallelogram in the initial image may be detected, and 4 coordinates {Px} arranged in order may be obtained with any vertex of the parallelogram as the starting point. Thereafter, the position of the white reference line in the rectified square image is detected, and the side to which the reference line is attached is determined to be the bottom side of the square. The vertex at the upper-left corner is then determined to be the first point according to the position of the bottom edge, and the 4 vertices are arranged clockwise to obtain a new arrangement of {Px}, yielding the corner coordinate sequence {Px'}.
And an eighth step of calculating a homography transformation matrix H from the corner point coordinate sequence of the quadrilateral mark to a preset corner point coordinate sequence under an ideal coordinate system.
Specifically, the homography transformation matrix from the corner coordinate sequence {Px'} to the square vertex sequence {Px0} under the ideal coordinate system (i.e., the above-mentioned preset corner coordinate sequence) is calculated, and the region of pixels enclosed by the quadrilateral is mapped to a square image of fixed size. Here, the square vertex sequence under the ideal coordinate system may be specified as {[0,1], [1,1], [1,0], [0,0]}.
A ninth step of determining second rotation information between the corner points of the quadrilateral marks in the planar structure and the corresponding pixels in the camera model image based on the homography transformation matrix H. The corresponding pixels in the camera model image are the images, in the camera model image, of the corner points of the quadrilateral marks.
Here, the rotation information Rm from the quadrilateral mark in the planar structure to the virtual pinhole model camera, that is, the above-described second rotation information, may be decomposed from the homography transformation matrix H.
A tenth step of determining the largest distance r among the distances from each corner object in the initial image to the center of the initial image.
Here, a corner object is an image, in the initial image, of a corner point of a quadrilateral mark.
An eleventh step of updating a camera focal length f' of the pinhole camera model based on the distance with the largest numerical value and the camera focal length f.
Specifically, the target focal length f' may be calculated with reference to fig. 4G, which shows the geometric relationship between the target focal length f', the initial focal length f, the maximum distance r, and the resolution height h. As shown in fig. 4G, triangle ABC is similar to triangle OBD: the angles DOB and A are both equal to a, point O lies on segment AB, point D lies on segment BC, segment AB has length f', segment OB has length f, segment BD has length r, and segment BC has length h/2. A virtual pinhole-model photograph, i.e., the above camera model image, is obtained by the method described above, and the corner coordinate sequence {Px'} is extracted according to the detection method described above. The target focal length f' can then be calculated by the following formula:
f’=(h÷2)÷r×f
and a twelfth step of converting the target image into a camera model image under the pinhole camera model based on the target focal length f'.
Specifically, the value of the initial focal length f may be replaced with the value of the target focal length f' obtained here, so that the camera model image under the pinhole camera model is obtained by using the method for obtaining the initial image. That is, what is obtained based on the initial focal length using the above method is an initial image; the camera model image under the pinhole camera model is obtained based on the target focal length.
Here, the rotation information Rm from the marks to the virtual pinhole-model camera may also be calculated again according to the method described above, and the previously calculated Rm is updated with the Rm calculated here.
A thirteenth step of determining relative pose information between the point in the planar structure and the corresponding pixel in the target image based on the determined pieces of second rotation information (i.e., the updated pieces of second rotation information Rm). The corresponding pixels in the target image are the images, in the target image, of the points in the planar structure.
Specifically, the mean of all the second rotation information may be computed to obtain the fused rotation matrix Rm_fuse, which is the final value of the relative rotation from the marks to the virtual pinhole camera. The relative pose information R between a point in the planar structure and the corresponding pixel in the target image can then be calculated according to the following formula:
R = R_cd × Rm_fuse
in the application scene, a novel method for detecting the characteristic mark of the screen of the control device from the picture of the panoramic camera and a method for calculating the relative pose are provided. In addition, the mark arrangement method can enable the control device and the panoramic camera to work normally within a large range, and the marks on the screen of the device can be detected at a high probability. According to the focal length adjusting method, the markers can be detected as much as possible in a large range of the connection distance between the panoramic camera and the mobile computing device, and the field installation requirement is reduced.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an attitude determination apparatus, which corresponds to the method embodiment shown in fig. 1 to 3, and which may include the same or corresponding features as the method embodiment shown in fig. 1 to 3, in addition to the features described below, and produce the same or corresponding effects as the method embodiment shown in fig. 1 to 3. The device can be applied to various electronic equipment in particular.
As shown in fig. 5, the posture determining apparatus 500 of the present embodiment includes: an acquisition unit 501, a conversion unit 502, and a determination unit 503. The acquiring unit 501 is configured to acquire a target image captured by the image capturing apparatus in a case where an angle value between the first direction and the second direction is smaller than or equal to a preset angle threshold; wherein, the shooting entity indicated by the shooting object contained in the target image has a plane structure; the first direction represents a direction from the planar structure to the image capture device, and the second direction represents an orientation of the planar structure. A conversion unit 502 configured to convert the target image into a camera model image under a preset camera model. A determining unit 503 configured to determine, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, where the corresponding pixel in the target image is: and in the target image, the image of the point in the plane structure.
In this embodiment, the obtaining unit 501 of the posture determination apparatus 500 may obtain a target image captured by the image capturing device in a case where an angle value between the first direction and the second direction is less than or equal to a preset angle threshold; wherein, the shooting entity indicated by the shooting object contained in the target image has a plane structure; the first direction represents a direction from the planar structure to the image capturing device, and the second direction represents an orientation of the planar structure.
In this embodiment, the conversion unit 502 may convert the target image into a camera model image under a preset camera model.
In this embodiment, the determining unit 503 may determine the relative posture information between the point in the planar structure and the corresponding pixel in the target image based on the camera model image. Wherein, the corresponding pixels in the target image are: and in the target image, an image of a point in the planar structure.
In some optional implementation manners of this embodiment, the obtaining unit 501 includes:
an acquisition subunit (not shown in the drawings) configured to acquire a target image captured by the image capturing apparatus in a case where a preset capturing condition is satisfied;
wherein, the preset shooting condition comprises: an angle value between the first direction and the second direction is less than or equal to a preset angle threshold, and a distance between the image capture device and the capture entity is less than or equal to a preset distance threshold.
In some optional implementation manners of this embodiment, the preset camera model is a pinhole camera model; and
the conversion unit 502 includes:
a conversion subunit (not shown in the figure) configured to convert the above target image into a camera model image under the pinhole camera model.
In some optional implementations of this embodiment, the converting subunit includes:
a first determining module (not shown in the figure) configured to determine first rotation information of the projection direction to the optical axis direction based on a projection direction of the photographic entity in a first camera coordinate system of the image capturing apparatus and a Z-axis direction of the first camera coordinate system;
and a conversion module (not shown in the figure) configured to convert the target image into a camera model image under a pinhole camera model based on the first rotation information.
In some optional implementations of this embodiment, the conversion module includes:
a first determination submodule (not shown in the drawings) configured to determine homogeneous coordinates of pixels in the camera model image in a projection coordinate system based on coordinates of the pixels in the camera model image in a pixel coordinate system;
a second determining sub-module (not shown in the figure) configured to determine the first camera coordinates of the homogeneous coordinates in the first camera coordinate system based on the first rotation information and the camera focal length;
and a third determining submodule (not shown in the figure) configured to determine pixel values of pixels in a camera model image under a preset camera model based on the first camera coordinates, so as to obtain the camera model image under the pinhole camera model.
In some optional implementations of this embodiment, the converting the target image into a camera model image under a preset camera model; determining, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, including:
S1: determining first rotation information;
S2: taking a predetermined initial focal length as the camera focal length;
S3: converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length;
S4: detecting the planar structure in the camera model image, and updating the camera focal length according to the detection result;
S5: performing S3 again to obtain the updated camera model image;
S6: detecting the planar structure in the camera model image, and calculating second rotation information;
S7: determining pose information between a point in the planar structure and a corresponding pixel in the target image according to the first rotation information and the second rotation information.
In some optional implementations of this embodiment, the determining of pixel values of pixels in the camera model image under the preset camera model based on the first camera coordinates is specifically configured to:
normalizing the first camera coordinates to a unit spherical surface to obtain spherical coordinates;
determining floating point number pixel coordinates in the target image based on the spherical coordinates;
and determining the pixel value of the pixel in the camera model image under the preset camera model based on the floating point number pixel coordinate in the target image by adopting a bilinear interpolation algorithm.
In some optional implementations of the embodiment, the planar structure includes at least one quadrilateral mark, and the camera model image includes a mark object of at least a partial quadrilateral mark of the quadrilateral marks; and
the conversion module comprises:
a first conversion sub-module (not shown in the figure) configured to detect corner objects in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral markers;
a fourth determining submodule (not shown in the drawings) configured to determine the largest distance among the distances from each corner object in the camera model image to the center of the camera model image;
a computation submodule (not shown in the figure) configured to update a camera focal length of the pinhole camera model based on the distance at which the numerical value is maximum and the camera focal length;
and a second conversion sub-module (not shown in the figure) configured to convert the target image into a camera model image under the pinhole camera model based on the target focal length.
In some optional implementations of the embodiment, the planar structure includes at least one quadrilateral mark, and the camera model image includes a mark object of at least a partial quadrilateral mark of the quadrilateral marks; and
the detecting the planar structure in the camera model image and calculating second rotation information includes:
a first determining subunit (not shown in the figure) configured to determine, based on the camera model image, second rotation information between the corner point of each quadrilateral mark in the planar structure and the corresponding pixel in the camera model image.
In some optional implementations of this embodiment, the quadrilateral indicia includes a direction identifier; and
the first determining subunit includes:
a detection module (not shown in the figures) configured to detect a corner of each quadrilateral mark in the camera model image to obtain a corner coordinate sequence of the quadrilateral mark, wherein corner coordinates in the corner coordinate sequence are arranged according to a predetermined sequence, and a first corner coordinate in the corner coordinate sequence is determined based on a direction indicated by a direction identifier included in the quadrilateral mark;
a calculation module (not shown in the figures) configured to calculate a homography transformation matrix of the corner coordinate sequence of the quadrilateral marking to a preset corner coordinate sequence under an ideal coordinate system;
a second determining module (not shown in the figures) configured to determine second rotation information between the corner points of the quadrilateral mark in the planar structure and the corresponding pixels in the camera model image based on the homography transformation matrix.
In some optional implementations of this embodiment, the planar structure is a display screen, the image presented by the display screen includes mark objects of quadrilateral marks arranged in rows and columns, and the size of the mark objects in the image presented by the display screen is determined based on a resolution of the display screen.
In some optional implementations of this embodiment, the determining, from the first rotation information and the second rotation information, pose information between a point in the planar structure and a corresponding pixel in the target image includes:
pose information between a point in the planar structure and a corresponding pixel in the target image is a product of the first rotation information and the second rotation information.
In the posture determination apparatus 500 provided in the above embodiment of the present disclosure, the obtaining unit 501 may obtain a target image that is captured by the image capturing device in a case where an angle value between the first direction and the second direction is less than or equal to a preset angle threshold; wherein, the shooting entity indicated by the shooting object contained in the target image has a plane structure; the first direction represents a direction from the planar structure to the image capturing apparatus, the second direction represents an orientation of the planar structure, the converting unit 502 may convert the target image into a camera model image under a preset camera model, and the determining unit 503 may determine relative pose information between a point in the planar structure and a corresponding pixel in the target image based on the camera model image, where the corresponding pixel in the target image is: and in the target image, the image of the point in the plane structure. Therefore, the embodiment of the disclosure can determine the relative posture information between the point in the planar structure and the corresponding pixel in the target image based on the target image shot by the image shooting device under the condition that the angle value between the first direction and the second direction is less than or equal to the preset angle threshold, so that the use scenes and the application range of the method for calculating the posture of the shooting entity through the image are enriched, and the accuracy of posture calculation is improved.
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 6. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
FIG. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
As shown in fig. 6, the electronic device 6 includes one or more processors 601 and memory 602.
The processor 601 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
Memory 602 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 601 to implement the attitude determination methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device may further include: an input device 603 and an output device 604, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is a first device or a second device, the input device 603 may be the microphone or the microphone array described above for capturing the input signal of the sound source. When the electronic device is a stand-alone device, the input means 603 may be a communication network connector for receiving the acquired input signals from the first device and the second device.
The input device 603 may also include, for example, a keyboard, a mouse, and the like. The output device 604 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 604 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device relevant to the present disclosure are shown in fig. 6, omitting components such as buses, input/output interfaces, and so forth. In addition, the electronic device may include any other suitable components, depending on the particular application.
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the pose determination method according to various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
The computer program product may write program code for carrying out operations for embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
The computer readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (10)

1. A method of attitude determination, the method comprising:
acquiring a target image shot by image shooting equipment; wherein a photographic entity indicated by a photographic subject contained in the target image has a planar structure; the direction from the planar structure to the image shooting equipment is a first direction, the orientation of the planar structure is a second direction, and the angle value between the first direction and the second direction is smaller than or equal to a preset angle threshold value;
converting the target image into a camera model image under a preset camera model;
determining, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, wherein the corresponding pixel in the target image is: an image of a point in the planar structure in the target image.
2. The method according to claim 1, wherein the converting the target image into a camera model image under a preset camera model; determining, based on the camera model image, relative pose information between a point in the planar structure and a corresponding pixel in the target image, including:
S1: determining first rotation information;
S2: taking a predetermined initial focal length as a camera focal length;
S3: converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length;
S4: detecting the planar structure in the camera model image, and updating the camera focal length according to the detection result;
S5: performing S3 again to obtain the updated camera model image;
S6: detecting the planar structure in the camera model image, and calculating second rotation information;
S7: determining pose information between a point in the planar structure and a corresponding pixel in the target image according to the first rotation information and the second rotation information.
3. The method of claim 2, wherein determining the first rotation information comprises:
determining first rotation information from the projection direction to the optical axis direction based on the projection direction of the shooting entity in a first camera coordinate system of the image shooting device and the Z-axis direction of the first camera coordinate system.
4. The method of claim 3, wherein the converting the target image into a camera model image under a camera model based on the first rotation information and the camera focal length comprises:
determining the homogeneous coordinates of the pixels in the camera model image in a projection coordinate system based on the coordinates of the pixels in the camera model image in a pixel coordinate system;
determining first camera coordinates of the homogeneous coordinates in the first camera coordinate system based on the first rotation information and the camera focal length;
determining a pixel value of a pixel in a camera model image under the preset camera model based on the first camera coordinates to obtain the camera model image under the preset camera model.
5. The method of claim 4, wherein determining pixel values of pixels in a camera model image under the preset camera model based on the first camera coordinates comprises:
normalizing the first camera coordinate to a unit spherical surface to obtain a spherical coordinate;
determining floating point number pixel coordinates in the target image based on the spherical coordinates;
and determining the pixel value of the pixel in the camera model image under the preset camera model based on the floating point number pixel coordinate in the target image by adopting a bilinear interpolation algorithm.
6. The method according to any one of claims 2 to 5, wherein the planar structure includes at least one quadrangular marker, and the camera model image includes marker objects of at least part of the quadrangular markers; and
the detecting the planar structure in the camera model image and updating the camera focal length according to the detection result includes:
detecting each corner object in the camera model image, wherein the corner objects are: an image of the corner points of the quadrilateral markers;
determining a distance with a maximum numerical value from distances from each corner point object in the camera model image to the center of the camera model image;
and updating the camera focal length of the pinhole camera model based on the distance with the maximum numerical value and the camera focal length.
7. The method according to any one of claims 2 to 5, wherein the planar structure includes at least one quadrilateral marker, and the camera model image includes marker objects of at least some of the quadrilateral markers; and
detecting the planar structure in the camera model image and calculating second rotation information includes:
determining, based on the camera model image, second rotation information between the corner points of each quadrilateral marker in the planar structure and the corresponding pixels in the camera model image, wherein a corresponding pixel in the camera model image is an image, in the camera model image, of a corner point of the quadrilateral marker;
wherein the quadrilateral marker includes a direction identifier; and
determining, based on the camera model image, the second rotation information between the corner points of each quadrilateral marker in the planar structure and the corresponding pixels in the camera model image includes:
detecting the corner points of each quadrilateral marker in the camera model image to obtain a corner point coordinate sequence of the quadrilateral marker, wherein the corner point coordinates in the sequence are arranged in a predetermined order, and the first corner point coordinate in the sequence is determined based on the direction indicated by the direction identifier contained in the quadrilateral marker;
calculating a homography transformation matrix from the corner point coordinate sequence of the quadrilateral marker to a preset corner point coordinate sequence in an ideal coordinate system;
determining the second rotation information between the corner points of the quadrilateral marker in the planar structure and the corresponding pixels in the camera model image based on the homography transformation matrix.
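A sketch of claim 7's homography step, under assumptions of ours rather than the claim's: OpenCV's `findHomography` for the estimation, a known pinhole intrinsic matrix `K`, and the planar structure lying on Z=0 in the ideal coordinate system; scale and sign ambiguities of the homography are glossed over.

```python
import cv2
import numpy as np

def second_rotation_from_corners(img_corners, ideal_corners, K):
    # Corner sequences are assumed consistently ordered, with the first
    # corner fixed by the marker's direction identifier (claim 7).
    H, _ = cv2.findHomography(np.asarray(img_corners, np.float32),
                              np.asarray(ideal_corners, np.float32))
    # H maps image corners to the ideal layout, as the claim states; its
    # inverse maps the ideal plane into the image and factors (up to scale)
    # as K [r1 r2 t] when the plane is Z=0 in the ideal frame.
    M = np.linalg.inv(K) @ np.linalg.inv(H)
    s = np.linalg.norm(M[:, 0])            # recover the scale from column 1
    r1, r2 = M[:, 0] / s, M[:, 1] / s
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    # Project onto the nearest proper rotation via SVD orthonormalisation.
    U, _, Vt = np.linalg.svd(R)
    R = U @ Vt
    if np.linalg.det(R) < 0:               # guard against an improper rotation
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
    return R
```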
8. The method according to any one of claims 1 to 5, wherein the planar structure is a display screen presenting an image that includes marker objects of quadrilateral markers arranged in rows and columns.
9. The method of claim 2, wherein determining pose information between a point in the planar structure and a corresponding pixel in the target image from the first rotation information and the second rotation information comprises:
determining the pose information between the point in the planar structure and the corresponding pixel in the target image as the product of the first rotation information and the second rotation information.
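In the sketches above this composition is the final line of `determine_pose`; written out on its own (3x3 rotation matrices assumed):

```python
import numpy as np

def compose_pose(R1: np.ndarray, R2: np.ndarray) -> np.ndarray:
    # Claim 9: the pose information is the product of the first and second
    # rotation information.
    return R1 @ R2
```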
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of claims 1 to 9.
CN202111300669.9A 2021-11-04 2021-11-04 Attitude determination method and apparatus, electronic device, and storage medium Active CN114004890B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111300669.9A CN114004890B (en) 2021-11-04 2021-11-04 Attitude determination method and apparatus, electronic device, and storage medium

Publications (2)

Publication Number Publication Date
CN114004890A (en) 2022-02-01
CN114004890B (en) 2023-03-24

Family

ID=79927649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111300669.9A Active CN114004890B (en) 2021-11-04 2021-11-04 Attitude determination method and apparatus, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN114004890B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114615410B (en) * 2022-03-09 2023-05-02 张磊 Natural disaster panoramic helmet and shooting gesture determining method for images of natural disaster panoramic helmet
CN115272483B (en) * 2022-07-22 2023-07-07 北京城市网邻信息技术有限公司 Image generation method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829947A (en) * 2019-02-25 2019-05-31 北京旷视科技有限公司 Pose determines method, tray loading method, apparatus, medium and electronic equipment
CN110610465A (en) * 2019-08-26 2019-12-24 Oppo广东移动通信有限公司 Image correction method and device, electronic equipment and computer readable storage medium
CN112528831A (en) * 2020-12-07 2021-03-19 深圳市优必选科技股份有限公司 Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN113329179A (en) * 2021-05-31 2021-08-31 维沃移动通信有限公司 Shooting alignment method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114004890A (en) 2022-02-01

Similar Documents

Publication Publication Date Title
JP5830546B2 (en) Determination of model parameters based on model transformation of objects
US11748906B2 (en) Gaze point calculation method, apparatus and device
KR100653200B1 (en) Method and apparatus for providing panoramic view with geometry correction
JP6902122B2 (en) Double viewing angle Image calibration and image processing methods, equipment, storage media and electronics
CN114004890B (en) Attitude determination method and apparatus, electronic device, and storage medium
CN110111388B (en) Three-dimensional object pose parameter estimation method and visual equipment
JPWO2018235163A1 (en) Calibration apparatus, calibration chart, chart pattern generation apparatus, and calibration method
EP1453001A1 (en) IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, STORAGE MEDIUM, AND COMPUTER PROGRAM
CN112444242A (en) Pose optimization method and device
CN104994367A (en) Image correcting method and camera
CN109479082A (en) Image processing method and device
US11380016B2 (en) Fisheye camera calibration system, method and electronic device
CN113643414B (en) Three-dimensional image generation method and device, electronic equipment and storage medium
CN113689578B (en) Human body data set generation method and device
CN113029128A (en) Visual navigation method and related device, mobile terminal and storage medium
US20220405968A1 (en) Method, apparatus and system for image processing
CN114640833A (en) Projection picture adjusting method and device, electronic equipment and storage medium
KR101868740B1 (en) Apparatus and method for generating panorama image
CN113129346B (en) Depth information acquisition method and device, electronic equipment and storage medium
CN113344789B (en) Image splicing method and device, electronic equipment and computer readable storage medium
CN114511447A (en) Image processing method, device, equipment and computer storage medium
CN111383264A (en) Positioning method, positioning device, terminal and computer storage medium
CN114549289A (en) Image processing method, image processing device, electronic equipment and computer storage medium
WO2018150086A2 (en) Methods and apparatuses for determining positions of multi-directional image capture apparatuses
CN114926316A (en) Distance measuring method, distance measuring device, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220325

Address after: 100085 8th floor, building 1, Hongyuan Shouzhu building, Shangdi 6th Street, Haidian District, Beijing

Applicant after: As you can see (Beijing) Technology Co.,Ltd.

Address before: 101300 room 24, 62 Farm Road, Erjie village, Yangzhen Town, Shunyi District, Beijing

Applicant before: Beijing fangjianghu Technology Co.,Ltd.

GR01 Patent grant