CN111754558A - Matching method for an RGB-D camera system and a binocular imaging system, and system and computing system thereof


Info

Publication number: CN111754558A
Authority: CN (China)
Prior art keywords: rgb, monocular, axis, image data, binocular
Legal status: Granted
Application number: CN201910231993.6A
Other languages: Chinese (zh)
Other versions: CN111754558B (en)
Inventors: 丁建雄, 倪志钢, 张本好
Current Assignee: Sunny Optical Zhejiang Research Institute Co Ltd
Original Assignee: Sunny Optical Zhejiang Research Institute Co Ltd
Application filed by Sunny Optical Zhejiang Research Institute Co Ltd
Priority to CN201910231993.6A
Publication of CN111754558A
Application granted
Publication of CN111754558B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 Stereo camera calibration

Landscapes: Engineering & Computer Science; Computer Vision & Pattern Recognition; Physics & Mathematics; General Physics & Mathematics; Theoretical Computer Science; Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions

Abstract

A matching method for an RGB-D camera system and a binocular imaging system, together with a matching system and a computing system, is provided. The matching method comprises the following steps: acquiring RGB depth image data collected by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data collected by an RGB monocular camera system of the RGB-D camera system and depth image data collected by a depth camera system of the RGB-D camera system; performing monocular matching between a left eye imaging system of the binocular imaging system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain RGB left eye image data; and mapping the RGB left eye image data to the right eye imaging system by a second binocular mapping model based on the baseline of the binocular imaging system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form the RGB binocular image data of the binocular imaging system.

Description

Matching method for RGB-D camera system and binocular imaging system, system and computing system thereof
Technical Field
The invention relates to the technical field of 3D vision, in particular to a matching method for an RGB-D camera system and a binocular imaging system, a system and a computing system thereof.
Background
In recent years, as 3D technology has matured, the 3D movie industry has developed rapidly. Since a 3D movie is ultimately presented to the human eyes for viewing, the 3D movie playback system (imaging system) is a binocular system (binocular imaging system), and the existing 3D space shooting-playback pipeline is therefore mainly a binocular camera to binocular imaging pipeline, i.e., the 3D movie shooting system is a binocular camera system. In other words, in order to facilitate matching of the picture and the depth information, a binocular camera system strictly matched with the binocular imaging system is usually selected for shooting, so as to minimize unnecessary format and parameter conversion and allow direct presentation at the imaging end. In addition, depending on the visualization technology, 3D presentation takes forms such as multiplexing and polarization, and different visualization technologies place different requirements on the binocular camera system, which makes selecting a suitable, matched binocular camera system even more demanding.
However, the existing matching methods between camera systems and display systems usually assume a controlled setting for a specific application scene, in which the camera end can be required to satisfy parameter requirements imposed by the display end; that is, the parameters of the camera system to be matched are determined from the parameters of the display system. For uncontrolled, large-scale consumer application scenes, however, the existing matching methods are unsuitable because the shooting means are diverse. For example, the camera end may use an RGB-D camera system (e.g., an RGB-D camera) to capture a 3D scene, while the display end uses a binocular imaging system to present the scene to the human eyes. How to match the RGB-D camera system with the binocular imaging system, so that images captured by the RGB-D camera can be displayed by the binocular imaging system, is therefore a difficult problem.
Disclosure of Invention
An object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, which can solve the problem of conversion between RGB depth image data and RGB binocular image data, so that an image photographed by the RGB-D camera system can be displayed in the binocular imaging system.
Another object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, wherein in an embodiment of the present invention, the matching method for the RGB-D camera system and the binocular imaging system can ensure that depth information of a virtual scene displayed by the binocular imaging system is consistent with depth information of a three-dimensional scene photographed by the RGB-D camera system, which is helpful for improving the viewing experience of a user.
Another object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, wherein in an embodiment of the present invention, the matching method for the RGB-D camera system and the binocular imaging system can ensure that a virtual scene displayed by the binocular imaging system based on RGB depth image data captured by the RGB-D camera system is not distorted in a depth direction, which is helpful for improving a user's viewing experience.
Another object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, wherein in an embodiment of the present invention, the matching method for the RGB-D camera system and the binocular imaging system can complete conversion between RGB depth image data and RGB binocular image data through only one mapping, which is helpful for simplifying a matching process between the RGB-D camera system and the binocular imaging system.
Another object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, wherein, in an embodiment of the present invention, the matching method can ensure, through two mappings, that the three-dimensional scene photographed by the RGB-D camera system does not appear shifted in the binocular imaging system.
Another object of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, wherein in an embodiment of the present invention, the matching method for the RGB-D camera system and the binocular imaging system can complete system matching only through an algorithm, and does not require any hardware matching operation, which is helpful for reducing matching cost.
Another objective of the present invention is to provide a matching method for an RGB-D camera system and a binocular imaging system, and a system and a computing system thereof, which can be widely applied to matching any depth camera and the binocular imaging system, and have strong universality.
To achieve at least one of the above objects or other objects and advantages, the present invention provides a matching method for an RGB-D camera system and a binocular imaging system, comprising the steps of:
acquiring RGB depth image data acquired by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data acquired by an RGB monocular camera system of the RGB-D camera system and depth image data acquired by a depth camera system of the RGB-D camera system;
performing monocular matching between a monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain RGB depth image reference data, wherein the RGB depth image reference data comprises RGB monocular image reference data and depth image reference data;
mapping the RGB monocular image reference data to an image plane of a left eye imaging system of the binocular imaging system based on the pose of the left eye imaging system relative to the monocular imaging reference system by using a first binocular mapping model to obtain RGB left eye image data; and
mapping the RGB monocular image reference data to an image plane of the right eye imaging system of the binocular imaging system by a second binocular mapping model based on the pose of the right eye imaging system relative to the monocular imaging reference system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form the RGB binocular image data of the binocular imaging system.
In an embodiment of the present invention, performing the monocular matching between the monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by the monocular matching model to obtain the RGB depth image reference data, wherein the RGB depth image reference data comprises the RGB monocular image reference data and the depth image reference data, comprises the steps of:
setting the monocular imaging reference system based on the internal parameters of the binocular imaging system so that the internal parameters of the monocular imaging reference system are equal to the internal parameters of the left eye imaging system of the binocular imaging system; and
performing monocular matching between the monocular imaging reference system and the RGB monocular camera system by the monocular matching model to obtain the RGB depth image reference data, so that the three-dimensional scene shot by the RGB-D camera system does not exhibit depth distortion in the monocular imaging reference system.
In an embodiment of the present invention, the optical center of the monocular imaging reference system is centered between the optical center of the left eye imaging system and the optical center of the right eye imaging system.
In one embodiment of the present invention, the monocular matching model is

$$\begin{cases} U' = \dfrac{\alpha' f'}{\alpha f}\,(U - c_x) + c'_x \\ V' = \dfrac{\beta' f'}{\beta f}\,(V - c_y) + c'_y \\ Z' = Z \end{cases}$$

where U' and V' are the pixel coordinates of a pixel point in the monocular imaging reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular imaging reference system on the x-axis and y-axis; c'_x and c'_y are the pixel translation amounts of the monocular imaging reference system on the x-axis and y-axis; f' is the focal length of the monocular imaging reference system; U and V are the pixel coordinates of the corresponding pixel point in the RGB monocular camera system on the x-axis and y-axis; α and β are the scale ratios of the RGB monocular camera system on the x-axis and y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and y-axis; f is the focal length of the RGB monocular camera system; and Z' and Z are the depth values corresponding to the pixel point in the monocular imaging reference system and the RGB monocular camera system, respectively.
In another embodiment of the present invention, the monocular matching model is

$$\begin{cases} U' = \dfrac{\alpha' f' Z}{\alpha f Z'}\,(U - c_x) + c'_x \\ V' = \dfrac{\beta' f' Z}{\beta f Z'}\,(V - c_y) + c'_y \\ Z' = Z - Z_0 + Z_0\,\dfrac{\tan\theta}{\tan\theta'} \end{cases}$$

where U' and V' are the pixel coordinates of a pixel point in the monocular imaging reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular imaging reference system on the x-axis and y-axis; c'_x and c'_y are the translation amounts of the monocular imaging reference system on the x-axis and y-axis; f' and θ' are the focal length and half field angle of the monocular imaging reference system; U and V are the pixel coordinates of the corresponding pixel point in the RGB monocular camera system on the x-axis and y-axis; α and β are the scale ratios of the RGB monocular camera system on the x-axis and y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and y-axis; f and θ are the focal length and half field angle of the RGB monocular camera system; Z' and Z are the depth values of the corresponding pixel point in the monocular imaging reference system and the RGB monocular camera system, respectively; and Z_0 is the depth value of the reference datum plane in the RGB monocular camera system.
In one embodiment of the present invention, the first binocular mapping model is

$$\begin{cases} U_L = \alpha_L\left(\dfrac{b_L f'}{Z'} - \dfrac{U' - c'_x}{\alpha'}\right) + c^L_x \\ V_L = \dfrac{\beta_L}{\beta'}\,(V' - c'_y) + c^L_y \end{cases}$$

where U_L and V_L are the pixel coordinates of a pixel point in the left eye imaging system on the x-axis and y-axis; α_L and β_L are the scale ratios of the left eye imaging system on the x-axis and y-axis; c^L_x and c^L_y are the translation amounts of the left eye imaging system on the x-axis and y-axis; b_L is the baseline between the left eye imaging system and the monocular imaging reference system; U' and V' are the pixel coordinates of the corresponding pixel point in the monocular imaging reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular imaging reference system on the x-axis and y-axis; c'_x and c'_y are the translation amounts of the monocular imaging reference system on the x-axis and y-axis; f' is the focal length of the monocular imaging reference system; and Z' is the depth value corresponding to the pixel point in the monocular imaging reference system.
In an embodiment of the present invention, the second binocular mapping model is

$$\begin{cases} U_R = \alpha_R\left(\dfrac{U' - c'_x}{\alpha'} - \dfrac{b_R f'}{Z'}\right) + c^R_x \\ V_R = \dfrac{\beta_R}{\beta'}\,(V' - c'_y) + c^R_y \end{cases}$$

where U_R and V_R are the pixel coordinates of a pixel point in the right eye imaging system on the x-axis and y-axis; α_R and β_R are the scale ratios of the right eye imaging system on the x-axis and y-axis; c^R_x and c^R_y are the pixel translation amounts of the right eye imaging system on the x-axis and y-axis; b_R is the baseline between the right eye imaging system and the monocular imaging reference system; U' and V' are the pixel coordinates of the corresponding pixel point in the monocular imaging reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular imaging reference system on the x-axis and y-axis; c'_x and c'_y are the pixel translation amounts of the monocular imaging reference system on the x-axis and y-axis; f' is the focal length of the monocular imaging reference system; and Z' is the depth value corresponding to the pixel point in the monocular imaging reference system.
According to another aspect of the present invention, the present invention further provides a matching method for an RGB-D camera system and a binocular imaging system, comprising the steps of:
acquiring RGB depth image data acquired by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data acquired by an RGB monocular camera system of the RGB-D camera system and depth image data acquired by a depth camera system of the RGB-D camera system;
performing monocular matching between a left eye imaging system of the binocular imaging system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain RGB left eye image data; and
mapping the RGB left eye image data to the right eye imaging system by a second binocular mapping model based on the baseline of the binocular imaging system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form the RGB binocular image data of the binocular imaging system.
In an embodiment of the present invention, the second binocular mapping model is

$$\begin{cases} U_R = \alpha_R\left(\dfrac{U_L - c^L_x}{\alpha_L} - \dfrac{b f'}{Z'}\right) + c^R_x \\ V_R = \dfrac{\beta_R}{\beta_L}\,(V_L - c^L_y) + c^R_y \end{cases}$$

where U_R and V_R are the pixel coordinates of a pixel point of the right eye imaging system on the x-axis and y-axis; α_R and β_R are the scale ratios of the right eye imaging system on the x-axis and y-axis; c^R_x and c^R_y are the translation amounts of the right eye imaging system on the x-axis and y-axis; U_L and V_L are the pixel coordinates of the corresponding pixel point of the left eye imaging system on the x-axis and y-axis; α_L and β_L are the scale ratios of the left eye imaging system on the x-axis and y-axis; c^L_x and c^L_y are the translation amounts of the left eye imaging system on the x-axis and y-axis; b and f' are the baseline and focal length of the binocular imaging system, respectively; and Z' is the depth value corresponding to the pixel point in the binocular imaging system.
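For illustration, here is a minimal Python sketch of this single-mapping model; the tuple-based intrinsics and all function and parameter names are illustrative assumptions, not notation from the patent:

```python
# A minimal sketch of the single-mapping model above: the RGB left eye image
# data is shifted by the disparity b*f'/Z' to synthesize the right eye view.
# All parameter names are illustrative assumptions, not the patent's notation.
def left_to_right_eye(U_L, V_L, Z_p, left, right, baseline, f_p):
    """Map a left eye pixel (U_L, V_L) with depth Z_p to the right eye
    imaging system; left/right = (alpha, beta, cx, cy), and baseline and
    f_p are the baseline b and focal length f' of the binocular system."""
    alpha_L, beta_L, cx_L, cy_L = left
    alpha_R, beta_R, cx_R, cy_R = right
    U_R = alpha_R * ((U_L - cx_L) / alpha_L - baseline * f_p / Z_p) + cx_R
    V_R = beta_R * (V_L - cy_L) / beta_L + cy_R  # no parallax on the y-axis
    return U_R, V_R
```

Because only this one mapping is applied, the conversion from RGB depth image data to RGB binocular image data is completed in a single step, as this embodiment describes.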
According to another aspect of the present invention, there is also provided a matching system for an RGB-D camera system and a binocular imaging system, comprising:
an acquisition module for acquiring RGB depth image data collected by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data collected by the RGB monocular camera system of the RGB-D camera system and depth image data collected by the depth camera system of the RGB-D camera system;
the monocular matching module is used for performing monocular matching on a monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by a monocular matching model so as to obtain RGB depth image reference data, wherein the RGB depth image reference data comprises RGB monocular image reference data and depth image reference data;
the left eye mapping module is used for mapping the RGB monocular image reference data to an image plane of the left eye imaging system based on the pose of the left eye imaging system of the binocular imaging system relative to the monocular imaging reference system by virtue of a first binocular mapping model so as to obtain RGB left eye image data; and
and the right eye mapping module is used for mapping the RGB monocular image reference data to the image plane of the right eye imaging system based on the pose of the right eye imaging system of the binocular imaging system relative to the monocular imaging reference system by virtue of a second binocular mapping model so as to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form RGB binocular image data of the binocular imaging system.
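As a hedged structural sketch, the four modules above can be pictured as one Python class whose mapping stages are injected as callables; the class, method, and parameter names are illustrative assumptions, not the patent's implementation:

```python
# A structural sketch of the matching system's modules; names and the
# callable-based wiring are illustrative assumptions, not the patent's API.
class RGBDToBinocularMatcher:
    def __init__(self, monocular_match, left_mapping, right_mapping):
        self.monocular_match = monocular_match  # monocular matching module
        self.left_mapping = left_mapping        # left eye mapping module
        self.right_mapping = right_mapping      # right eye mapping module

    def match(self, rgbd_data):
        """Acquisition module output in, RGB binocular image data out."""
        ref_data = self.monocular_match(rgbd_data)  # RGB depth image reference data
        left = self.left_mapping(ref_data)          # RGB left eye image data
        right = self.right_mapping(ref_data)        # RGB right eye image data
        return left, right

# Example wiring (with model functions defined elsewhere):
# matcher = RGBDToBinocularMatcher(monocular_match, left_map, right_map)
# left_img, right_img = matcher.match(rgbd_frame)
```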
In an embodiment of the present invention, the monocular matching module is further configured to set the monocular imaging reference system based on the internal parameters of the binocular imaging system so that the internal parameters of the monocular imaging reference system are equal to the internal parameters of the left eye imaging system of the binocular imaging system, and to perform monocular matching between the monocular imaging reference system and the RGB monocular camera system by the monocular matching model to obtain the RGB depth image reference data, so that the three-dimensional scene shot by the RGB-D camera system does not exhibit depth distortion in the monocular imaging reference system.
According to another aspect of the present invention, there is also provided a matching system for an RGB-D camera system and a binocular imaging system, comprising:
an acquisition module for acquiring RGB depth image data collected by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data collected by the RGB monocular camera system of the RGB-D camera system and depth image data collected by the depth camera system of the RGB-D camera system;
a monocular matching module for performing monocular matching between the left eye imaging system of the binocular imaging system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain the RGB left eye image data; and
a right eye mapping module for mapping the RGB left eye image data to the right eye imaging system by a second binocular mapping model based on the baseline of the binocular imaging system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form the RGB binocular image data of the binocular imaging system.
According to another aspect of the present invention, the present invention also provides a computing system comprising:
a logic machine; and
a storage machine, wherein the storage machine is configured to store instructions executable by the logic machine to implement any of the above matching methods for an RGB-D camera system and a binocular imaging system.
According to another aspect of the present invention, there is also provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a computing device, are operable to perform any of the above matching methods for an RGB-D camera system and a binocular imaging system.
Further objects and advantages of the invention will be fully apparent from the ensuing description and drawings.
These and other objects, features and advantages of the present invention will become more fully apparent from the following detailed description, the accompanying drawings and the claims.
Drawings
Fig. 1 shows a schematic diagram of the principle of pinhole imaging.
Fig. 2 shows a schematic view of binocular imaging.
Fig. 3 is a flowchart illustrating a matching method for an RGB-D camera system and a binocular imaging system according to an embodiment of the present invention.
Fig. 4 is a schematic flowchart illustrating a monocular matching step of the matching method for the RGB-D camera system and the binocular imaging system according to the above embodiment of the present invention.
Fig. 5 shows an example of the monocular matching principle in the matching method according to the above embodiment of the present invention.
Fig. 6 shows another example of the monocular matching principle in the matching method according to the above embodiment of the present invention.
Fig. 7 illustrates an example of the binocular mapping principle in the matching method according to the above-described embodiment of the present invention.
Fig. 8 is a block diagram illustrating a matching system for an RGB-D camera system and a binocular imaging system according to the above embodiment of the present invention.
Fig. 9 is a flowchart illustrating a matching method for an RGB-D camera system and a binocular imaging system according to another embodiment of the present invention.
Fig. 10 is a block diagram of a matching system for an RGB-D camera system and a binocular imaging system according to another embodiment of the present invention.
FIG. 11 illustrates a block diagram of a computing system in accordance with the present invention.
Detailed Description
The following description is presented to disclose the invention so as to enable any person skilled in the art to practice the invention. The preferred embodiments in the following description are given by way of example only, and other obvious variations will occur to those skilled in the art. The basic principles of the invention, as defined in the following description, may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
In the claims and the description of the present invention, the terms "a" and "an" should be understood as meaning "one or more"; that is, an element may appear once in one embodiment and more than once in another embodiment. The terms "a" and "an" should not be construed as limiting an element to a single instance unless the present disclosure explicitly recites that the number of the element is one.
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. It should also be noted that, unless explicitly stated or limited otherwise, the term "connected" is to be interpreted broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; and direct or indirect through an intermediary. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
In order to accurately describe a three-dimensional scene in 3D space, not only the color information and plane position information of a point in the scene but also its depth information is required. A monocular camera, however, maps a scene in three-dimensional space to an image in two-dimensional space by the principle of pinhole imaging, which inevitably loses one dimension (the depth information) and therefore cannot accurately describe a scene in 3D space. Therefore, an RGB monocular camera system (e.g., an RGB monocular camera), which describes the color information of a point P in the three-dimensional scene by the RGB data corresponding to the point P and describes the plane position information of the point P by the corresponding pixel position, is combined with a depth camera system (e.g., a depth camera), which supplements the missing depth information by obtaining the depth value of the point P, to form a complete RGB-D camera system. In other words, the RGB depth image data obtained by the RGB-D camera system includes the RGB image data (i.e., color information and position information) collected by the RGB monocular camera system and the depth image data (i.e., depth information) collected by the depth camera system. It is understood that the depth camera system in the RGB-D camera system of the present invention may be implemented with various types of depth cameras, such as a TOF camera or a structured light camera, but is not limited thereto.
Illustratively, as shown in fig. 1, for an RGB monocular camera, if the camera coordinate system is taken as the spatial coordinate system O-XYZ, the pixel plane coordinate system is taken as the plane coordinate system o-xy, and the distance between the pixel plane and the optical center of the camera (i.e., the focal length) is f, then a spatial point P(X, Y, Z) in the three-dimensional scene is mapped by pinhole imaging to the point p(x, y, f) on the image plane.
Therefore, from the trigonometric relationship:

$$\frac{x}{f} = \frac{X}{Z}, \qquad \frac{y}{f} = \frac{Y}{Z}$$

which rearranges to:

$$x = \frac{fX}{Z} \quad (1)$$

$$y = \frac{fY}{Z} \quad (2)$$
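For illustration, a minimal Python sketch of the pinhole projection in formulas (1) and (2) follows; the function name and example values are illustrative assumptions, not from the patent:

```python
# A minimal sketch of the pinhole projection of formulas (1) and (2).
def pinhole_project(X, Y, Z, f):
    """Map a space point P(X, Y, Z) to its image-plane point p(x, y, f)."""
    x = f * X / Z  # formula (1)
    y = f * Y / Z  # formula (2)
    return x, y

# Example: a point 2 m in front of a camera with f = 0.01 m images at (0.005, 0.0025).
print(pinhole_project(1.0, 0.5, 2.0, 0.01))
```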
for the binocular imaging system, the reason that the depth information of the virtual object can be obtained is that parallax exists between the left and right imaging systems of the binocular imaging system, and when the parameters of the binocular imaging system are fixed, the depth of the virtual object in the 3D space is determined by the size of the parallax. Therefore, the binocular imaging system requires the left and right RGB images to be able to present the depth information, in other words, the RGB binocular image data required by the binocular imaging system includes the RGB left eye image data corresponding to the left eye imaging system and the RGB right eye image data corresponding to the right eye imaging system, so that the depth information is determined by the parallax between the RGB left eye image data and the RGB right eye image data.
Exemplarily, as shown in fig. 2, AR glasses are taken as the binocular imaging system, where O_L and O_R are the optical centers of the left and right glasses in the AR glasses; P' is the virtual object point displayed in three-dimensional space corresponding to the real object point P; P_L and P_R are the imaging points of the point P' on the left and right image planes in the AR glasses; f' is the focal length of the AR glasses; and b is the baseline of the AR glasses. Denoting the coordinate offsets of the points P_L and P_R on the left and right image planes by u_L and u_R, and taking a rightward shift as positive and a leftward shift as negative, the offsets of P_L and P_R relative to their respective optical axes are u_L and -u_R. As such, the disparity of the binocular imaging system may be written as:

$$d = u_L - (-u_R) = u_L + u_R \quad (3)$$
As is apparent from fig. 2, ΔP'P_LP_R and ΔP'O_LO_R are similar triangles, so the similarity relationship gives:

$$\frac{b - d}{b} = \frac{Z' - f'}{Z'} \quad (4)$$

Rearranging formula (3) and formula (4) yields the relationship between the depth and the disparity of the binocular imaging system:

$$Z' = \frac{b f'}{d}, \qquad d = u_L + u_R \quad (5)$$

where Z' is the distance between the virtual object point P' displayed by the binocular imaging system and the optical-center plane of the binocular imaging system; f' is the focal length of the binocular imaging system; b is the baseline of the binocular imaging system; and d is the disparity of the binocular imaging system.
Because the RGB depth image data collected by the RGB-D camera system does not match the RGB binocular image data required by the binocular imaging system for display, the RGB-D camera system and the binocular imaging system must be matched, so that the RGB depth image data collected by the RGB-D camera system can be used by the binocular imaging system to present the corresponding three-dimensional image.
Referring to fig. 3 and 4, a matching method for an RGB-D camera system and a binocular imaging system according to an embodiment of the present invention is illustrated. Specifically, as shown in fig. 3, the matching method for the RGB-D camera system and the binocular imaging system includes the steps of:
s110: acquiring RGB depth image data acquired by an RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data acquired by an RGB monocular camera system of the RGB-D camera system and depth image data acquired by a depth camera system of the RGB-D camera system;
s120: performing monocular matching between the RGB-D camera system and a monocular visualization reference system based on the RGB depth image data by a monocular matching model to obtain RGB depth image reference data, wherein the RGB depth image reference data comprises RGB monocular image reference data and depth image reference data;
s130: mapping the RGB monocular image reference data to an image plane of the left eye imaging system of the binocular imaging system by a first binocular mapping model based on the pose of the left eye imaging system relative to the monocular visualization reference system to obtain RGB left eye image data; and
s140: mapping the RGB monocular image reference data to an image plane of the right eye imaging system of the binocular imaging system by a second binocular mapping model based on the pose of the right eye imaging system relative to the monocular visualization reference system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form the RGB binocular image data of the binocular imaging system.
It will be appreciated that after the RGB binocular image data is obtained, the binocular visualization system is capable of rendering a three-dimensional scene captured by the RGB-D camera system based on the RGB binocular image data for viewing by a user.
Specifically, in the step S120 of the matching method of the above embodiment of the present invention, the monocular matching model may be implemented as, but is not limited to:

$$\begin{cases} U' = \dfrac{\alpha' f'}{\alpha f}\,(U - c_x) + c'_x \\ V' = \dfrac{\beta' f'}{\beta f}\,(V - c_y) + c'_y \\ Z' = Z \end{cases}$$

where U' and V' are the pixel coordinates of a pixel point in the monocular visualization reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular visualization reference system on the x-axis and y-axis; c'_x and c'_y are the pixel translation amounts of the monocular visualization reference system on the x-axis and y-axis; f' is the focal length of the monocular visualization reference system; U and V are the pixel coordinates of the corresponding pixel point in the RGB monocular camera system on the x-axis and y-axis; α and β are the scale ratios of the RGB monocular camera system on the x-axis and y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and y-axis; f is the focal length of the RGB monocular camera system; and Z' and Z are the depth values corresponding to the pixel point in the monocular visualization reference system and the RGB monocular camera system, respectively.
It is to be noted that the RGB depth image reference data obtained by matching through the monocular matching model likewise includes RGB image reference data and corresponding depth image reference data. In particular, since Z' = Z in this monocular matching model, the depth image reference data in the RGB depth image reference data remains the same as the depth image data in the RGB depth image data.
Illustratively, as shown in fig. 5, the point O is the common optical center of the monocular visualization reference system and of the RGB monocular camera system in the RGB-D camera system; θ' and f' are the half field angle and focal length of the monocular visualization reference system; and θ and f are the half field angle and focal length of the RGB monocular camera system. The point P is an object point with depth value Z in the three-dimensional space of the RGB monocular camera system, and the point P' is a virtual object point with depth value Z' in the three-dimensional space of the monocular visualization reference system, where the point P coincides with the point P' and Z = Z'; the points p and p' are the corresponding image points on the image planes of the RGB monocular camera system and the monocular visualization reference system, respectively.
Then, according to the trigonometric relationship and formula (1):

$$x = \frac{fX}{Z}, \qquad x' = \frac{f'X'}{Z'}$$

Since the point P coincides with the point P', i.e., X = X' and Z = Z', the x-axis distance coordinate on the image plane of the monocular visualization reference system is obtained as:

$$x' = \frac{f'}{f}\,x \quad (6)$$

In the same way, the y-axis distance coordinate on the image plane of the monocular visualization reference system is:

$$y' = \frac{f'}{f}\,y \quad (7)$$

Further, the pixel coordinates on the image plane of the RGB monocular camera system satisfy:

$$U = \alpha x + c_x, \qquad V = \beta y + c_y$$

where U and V are the pixel coordinates of the RGB monocular camera system on the x-axis and y-axis, α and β are the scale ratios of the RGB monocular camera system on the x-axis and y-axis, and c_x and c_y are the translation amounts of the RGB monocular camera system on the x-axis and y-axis. The pixel coordinates on the image plane of the monocular visualization reference system satisfy:

$$U' = \alpha' x' + c'_x, \qquad V' = \beta' y' + c'_y$$

where U' and V' are the pixel coordinates of the monocular visualization reference system on the x-axis and y-axis, α' and β' are the scale ratios of the monocular visualization reference system on the x-axis and y-axis, and c'_x and c'_y are the translation amounts of the monocular visualization reference system on the x-axis and y-axis. Therefore, combining formula (6) and formula (7), the mapping relationship between the RGB monocular camera system and the monocular visualization reference system, i.e., the monocular matching model, is obtained:

$$\begin{cases} U' = \dfrac{\alpha' f'}{\alpha f}\,(U - c_x) + c'_x \\ V' = \dfrac{\beta' f'}{\beta f}\,(V - c_y) + c'_y \\ Z' = Z \end{cases}$$

with the variables as defined above.
It should be noted that, since the field of view of the RGB monocular camera system in the RGB-D camera system is usually larger than the field of view of the monocular visualization reference system (i.e., θ > θ'), the monocular visualization reference system cannot display the complete image captured by the RGB monocular camera system. Instead, through this monocular matching model, the image matched to the monocular visualization reference system is directly intercepted (cropped) from the image shot by the RGB monocular camera system, and no distortion (such as stretching or compression) occurs in the depth direction. Likewise, after mapping through this monocular matching model, the image frame presented by the monocular visualization reference system is not distorted on the x-axis or the y-axis.
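To make this interception-style model concrete, here is a minimal Python sketch under the assumption that intrinsics are bundled as (α, β, c_x, c_y, f) tuples; all names are illustrative, not from the patent:

```python
import numpy as np

# A minimal sketch of the "interception" monocular matching model above;
# the intrinsics tuples and the function name are illustrative assumptions.
def monocular_match_intercept(U, V, Z, cam, ref):
    """Map pixel coordinates (U, V) with depth Z from the RGB monocular
    camera system `cam` into the monocular visualization reference system
    `ref`; the depth value is preserved (Z' = Z)."""
    alpha, beta, cx, cy, f = cam
    alpha_p, beta_p, cx_p, cy_p, f_p = ref
    U_ref = (alpha_p * f_p) / (alpha * f) * (np.asarray(U) - cx) + cx_p
    V_ref = (beta_p * f_p) / (beta * f) * (np.asarray(V) - cy) + cy_p
    return U_ref, V_ref, Z  # accepts scalars or whole pixel grids
```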
It is worth mentioning that, in some cases, the user wants to view as complete an image as possible through the monocular visualization reference system, so a monocular matching model based on interception is unsuitable. Accordingly, in other examples of the present invention, the monocular matching model may be implemented as:

$$\begin{cases} U' = \dfrac{\alpha' f' Z}{\alpha f Z'}\,(U - c_x) + c'_x \\ V' = \dfrac{\beta' f' Z}{\beta f Z'}\,(V - c_y) + c'_y \\ Z' = Z - Z_0 + Z_0\,\dfrac{\tan\theta}{\tan\theta'} \end{cases}$$

where U' and V' are the pixel coordinates of a pixel point in the monocular visualization reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular visualization reference system on the x-axis and y-axis; c'_x and c'_y are the translation amounts of the monocular visualization reference system on the x-axis and y-axis; f' and θ' are the focal length and half field angle of the monocular visualization reference system; U and V are the pixel coordinates of the corresponding pixel point in the RGB monocular camera system on the x-axis and y-axis; α and β are the scale ratios of the RGB monocular camera system on the x-axis and y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and y-axis; f and θ are the focal length and half field angle of the RGB monocular camera system; Z' and Z are the depth values of the corresponding pixel point in the monocular visualization reference system and the RGB monocular camera system, respectively; and Z_0 is the depth value of the reference datum plane in the RGB monocular camera system.
Illustratively, as shown in fig. 6, the point O is the common optical center of the monocular visualization reference system and of the RGB monocular camera system in the RGB-D camera system; θ' and f' are the half field angle and focal length of the monocular visualization reference system; and θ and f are the half field angle and focal length of the RGB monocular camera system. The point P_0 lies at the field-of-view edge of the reference datum plane 101 of the RGB monocular camera system, whose depth value is Z_0, so the half frame width at the reference datum plane 101 is L_0 = Z_0 tanθ. The corresponding point P'_0 lies at the field-of-view edge of the matching reference plane 201 of the monocular visualization reference system, whose depth value is Z'_0, so the half frame width at the matching reference plane 201 is L'_0 = Z'_0 tanθ'.

Furthermore, since the corresponding point P'_0 and the point P_0 have equal object heights, i.e., L'_0 = L_0, the depth value Z'_0 of the matching reference plane 201 and the depth value Z_0 of the reference datum plane 101 satisfy the relationship:

$$Z'_0 = Z_0\,\frac{\tan\theta}{\tan\theta'} \quad (8)$$

By contrast, suppose a direct compression method were used to map a point P with depth value Z on the plane to be calibrated 102 of the RGB monocular camera system directly to a point P'' on the mapping plane 202 of the monocular visualization reference system, where the depth value of the mapping plane 202 is Z''. From fig. 6 and the geometric relationships it is easy to see that the depth difference ΔX'' = Z'' - Z'_0 between the mapping plane 202 and the matching reference plane 201 of the monocular visualization reference system is not equal to the depth difference ΔX = Z - Z_0 between the plane to be calibrated 102 and the reference datum plane 101 of the RGB monocular camera system. This means that if the monocular visualization reference system and the RGB monocular camera system were matched by direct compression, the image displayed by the monocular visualization reference system would inevitably be distorted in the depth direction relative to the image taken by the RGB monocular camera system (stretched when θ > θ'; compressed when θ < θ').
Therefore, in order to ensure that the image displayed by the monocular visualization reference system is not distorted in the depth direction, the point P with depth value Z on the plane to be calibrated 102 of the RGB monocular camera system is mapped to a point P' on the calibration plane 203 with depth value Z' in the monocular visualization reference system, such that the object height of the point P equals the object height of the point P' (i.e., L' = L) and the depth difference ΔX' = Z' - Z'_0 between the calibration plane 203 and the matching reference plane 201 of the monocular visualization reference system equals the depth difference ΔX = Z - Z_0 between the plane to be calibrated 102 and the reference datum plane 101 of the RGB monocular camera system, i.e., ΔX' = ΔX. Thus, the depth value of the calibration plane 203 in the monocular visualization reference system is:

$$Z' = Z'_0 + Z - Z_0 \quad (9)$$

Combining formula (8) and formula (9):

$$Z' = Z - Z_0 + Z_0\,\frac{\tan\theta}{\tan\theta'} \quad (10)$$
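For intuition, a worked example with illustrative values (not taken from the patent): with Z_0 = 2 m, θ = 40° and θ' = 20°, formula (8) gives Z'_0 = 2·tan 40°/tan 20° ≈ 4.61 m, and by formula (10) a point at Z = 3 m in the RGB monocular camera system maps to Z' ≈ 4.61 + (3 - 2) = 5.61 m in the monocular visualization reference system, preserving the 1 m offset from the reference plane.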
further, in the RGB monocular camera system, a point P corresponds to a point P on the imaging surface 100 of the RGB monocular camera system, and a pixel distance coordinate of the point P on the x-axis is x; in the RGB monocular visualization reference system, the point P 'corresponds to the point P' on the visualization surface 200 of the RGB monocular visualization reference system, and the pixel distance coordinate of the point P 'on the x-axis is x'. Therefore, from the similar triangle:
Figure BDA0002006997080000152
the x-axis distance coordinate on the imaging surface of the monocular imaging reference system is obtained by arranging the L' ═ L and the formula (10):
Figure BDA0002006997080000153
in the same way, the y-axis distance coordinate on the image display surface of the monocular imaging reference system is as follows:
Figure BDA0002006997080000154
and because of, the pixel coordinate on the image pick-up surface of the RGB monocular image pick-up system satisfies:
Figure BDA0002006997080000155
wherein U and V are pixel coordinates of the RGB monocular camera system on the x-axis and the y-axis, α and β are the RGB monocular cameraScaling of the image system in the x-axis and y-axis; c. CxAnd cyAnd the translation amounts of the RGB monocular camera system on the x axis and the y axis are obtained. The pixel coordinates on the image plane of the monocular visualization reference system satisfy:
Figure BDA0002006997080000161
wherein U ' and V ' are pixel coordinates of the monocular vision reference system on the x-axis and the y-axis, α ' and β ' are scale ratios of the monocular vision reference system on the x-axis and the y-axis, and c 'xAnd c'yAnd translating the monocular visualization reference system on an x axis and a y axis. Therefore, combining equation (11) or equation (12), and equation (10), a mapping relationship between the RGB monocular camera system and the monocular visualization reference system, i.e., the monocular matching model, can be obtained:
Figure BDA0002006997080000162
wherein U ' and V ' are pixel coordinates of pixel points in the monocular visualization reference system on an x axis and a y axis, α ' and β ' are scale ratios of the monocular visualization reference system on the x axis and the y axis, and c 'xAnd c'yThe translation amount of the monocular vision reference system on the x axis and the y axis, f 'and theta' are the focal length and the half field angle of the monocular vision reference system, U and V are the pixel coordinates of a pixel point in the RGB monocular camera system on the x axis and the y axis, α and β are the scale scaling ratio of the RGB monocular camera system on the x axis and the y axis, cxAnd cyThe pixel translation amount of the RGB monocular camera system on the x axis and the y axis is obtained; f and theta are the focal length and the half field angle of the RGB monocular camera system; z' and Z are the depth values of corresponding pixel points in the monocular imaging reference system and the RGB monocular camera system respectively; z0And the depth value of the reference datum plane in the RGB monocular camera system is obtained.
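A minimal Python sketch of this scaled monocular matching model follows, assuming intrinsics bundled as (α, β, c_x, c_y, f, θ) tuples; all names are illustrative assumptions:

```python
import math

# A minimal sketch of the scaled monocular matching model (formulas (10)-(12));
# the intrinsics tuples and function name are illustrative assumptions.
def monocular_match_scaled(U, V, Z, cam, ref, Z0):
    """Map a pixel (U, V) with depth Z from the RGB monocular camera system
    `cam` into the monocular visualization reference system `ref`, where Z0
    is the depth of the reference datum plane in the camera system."""
    alpha, beta, cx, cy, f, theta = cam
    alpha_p, beta_p, cx_p, cy_p, f_p, theta_p = ref
    # Formula (10): keep the depth offset from the reference plane unchanged.
    Z_ref = Z - Z0 + Z0 * math.tan(theta) / math.tan(theta_p)
    s = (f_p * Z) / (f * Z_ref)  # common scale factor of formulas (11)/(12)
    U_ref = alpha_p * s * (U - cx) / alpha + cx_p
    V_ref = beta_p * s * (V - cy) / beta + cy_p
    return U_ref, V_ref, Z_ref
```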
It should be noted that this monocular matching model of the present invention maps the RGB depth image data collected by the RGB monocular camera system in the RGB-D camera system to the visualization plane of the monocular visualization reference system to obtain the corresponding RGB depth image reference data. In this way, compared with the image shot by the RGB monocular camera system, the image displayed by the monocular visualization reference system has no distortion in the depth direction, which helps improve the perception experience of the user. In other words, since the depth difference between the calibration plane and the matching reference plane of the image displayed by the monocular visualization reference system based on the RGB depth image reference data equals the depth difference between the plane to be calibrated and the reference datum plane of the object photographed by the RGB monocular camera system, after the monocular visualization reference system is matched with the RGB monocular camera system by this matching method, the image displayed by the monocular visualization reference system is not distorted in the depth direction compared to the scene photographed by the RGB monocular camera system.
In particular, when the parameters (i.e., the field angle and the focal length) of the RGB monocular camera system in the RGB-D camera system are equal to the parameters of the left eye or right eye visualization system in the binocular visualization system, i.e., when the parameters of the monocular visualization reference system are equal to the parameters of the RGB monocular camera system, the RGB depth image reference data mapped by the monocular matching model is identical to the RGB depth image data. In other words, when the internal parameters of the RGB monocular camera system are equal to the internal parameters of the left eye or right eye visualization system, the step S120 in the above matching method of the present invention need not be performed; instead, the RGB depth image data is directly taken as the RGB depth image reference data.
It is worth mentioning that, according to the above embodiment of the present invention, in order to simplify the subsequent mapping process, as shown in fig. 4, the step S120 of the matching method for the RGB-D camera system and the binocular imaging system preferably includes the steps of:
s121: setting the monocular visualization reference system based on the internal parameters of the binocular visualization system so that the internal parameters of the monocular visualization reference system are equal to the internal parameters of the left eye visualization system in the binocular visualization system; and
s122: performing monocular matching on the monocular vision reference system and the RGB-D camera system by virtue of the monocular matching model to obtain the RGB depth image reference data, so that the three-dimensional scene shot by the RGB-D camera system does not appear to have depth distortion in the monocular vision reference system.
It is noted that the parameters of the monocular visualization reference system and of the left eye or right eye visualization system of the binocular visualization system may include, but are not limited to, the field angle and the focal length. After the RGB depth image reference data is obtained in step S120, the RGB image reference data therein is mapped to the image planes of the left eye and right eye visualization systems based on the poses of the left eye and right eye visualization systems of the binocular visualization system relative to the monocular visualization reference system, respectively, to obtain the RGB left eye image data and the RGB right eye image data.
In addition, although the pose of the monocular visualization reference system may in principle be arbitrary, in order to simplify the calculation the visualization plane of the monocular visualization reference system is usually set to coincide with the image plane of the binocular visualization system, with the optical center of the monocular visualization reference system and the optical centers of the left eye and right eye visualization systems all located on the x-axis, so that the left eye and right eye visualization systems of the binocular visualization system can each form a new binocular system with the monocular visualization reference system. In other words, when the left eye visualization system and the monocular visualization reference system are taken as a new binocular system, the pose of the left eye visualization system relative to the monocular visualization reference system reduces to the baseline between the left eye visualization system and the monocular visualization reference system; similarly, when the right eye visualization system and the monocular visualization reference system are taken as a new binocular system, the pose of the right eye visualization system relative to the monocular visualization reference system reduces to the baseline between the right eye visualization system and the monocular visualization reference system.
Specifically, in the step S130 of the matching method of the above embodiment of the present invention, the first binocular mapping model may be implemented as, but is not limited to:

$$\begin{cases} U_L = \alpha_L\left(\dfrac{b_L f'}{Z'} - \dfrac{U' - c'_x}{\alpha'}\right) + c^L_x \\ V_L = \dfrac{\beta_L}{\beta'}\,(V' - c'_y) + c^L_y \end{cases}$$

where U_L and V_L are the pixel coordinates of a pixel point in the left eye visualization system on the x-axis and y-axis; α_L and β_L are the scale ratios of the left eye visualization system on the x-axis and y-axis; c^L_x and c^L_y are the translation amounts of the left eye visualization system on the x-axis and y-axis; b_L is the baseline between the left eye visualization system and the monocular visualization reference system; U' and V' are the pixel coordinates of the corresponding pixel point in the monocular visualization reference system on the x-axis and y-axis; α' and β' are the scale ratios of the monocular visualization reference system on the x-axis and y-axis; c'_x and c'_y are the translation amounts of the monocular visualization reference system on the x-axis and y-axis; f' is the focal length of the monocular visualization reference system; and Z' is the depth value corresponding to the pixel point in the monocular visualization reference system.
Illustratively, as shown in fig. 7, the binocular imaging system is represented by a solid line, and the RGB monocular imaging reference system is represented by a dotted line, where OLAnd ORThe optical centers of a left eye imaging system and a right eye imaging system in the binocular imaging system are respectively; o' is the optical center of the RGB monocular imaging reference system; the point P 'is a virtual object point with the depth value Z' in the monocular visualization reference system; the point P 'corresponds to a point P' on the image plane of the RGB monocular imaging reference system. Further, b is a baseline of the binocular visualization system; bLAnd bRThe baselines between the left eye and right eye visualization systems and the RGB monocular visualization reference system, respectively. For the RGB monocular displayFor the binocular imaging system composed of the image reference system and the left eye imaging system, the pixel distance coordinate of the point P 'on the x axis is x' and is equal to the coordinate offset u 'of the point P' on the image plane of the RGB monocular imaging reference system, and the coordinate offset u 'is recorded as u'. Therefore, the parallax d between the left eye image display system and the RGB monocular image display reference system can be obtained by combining the formula (3)L=uL+u';
Further, the depth value of the point P' can be obtained according to the formula (5):
$$Z' = \frac{b_L f'}{d_L} = \frac{b_L f'}{u_L + u'}$$
Rearranging gives:
$$u_L = \frac{b_L f'}{Z'} - u' \qquad (13)$$
Notably, since the coordinate offset u_L of the point P' on the image plane of the left eye visualization system is equal to the pixel distance coordinate x_L of the point p_L on the x-axis, the pixel distance coordinate of the point p_L on the x-axis is obtained from formula (13):
$$x_L = \frac{b_L f'}{Z'} - x' \qquad (14)$$
Furthermore, since the binocular visualization system has no parallax on the y-axis, the pixel distance coordinate of the point p_L on the y-axis is: y_L = y' (15).
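As a purely illustrative numeric check of formulas (13)-(15), with assumed values b_L = 32 mm, f' = 20 mm, Z' = 2 m and x' = 1 mm (none of these taken from the embodiments): formula (14) gives x_L = (32 mm × 20 mm)/2000 mm - 1 mm = 0.32 mm - 1 mm = -0.68 mm, while y_L remains equal to y'.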
Moreover, the pixel coordinates on the image plane of the left eye visualization system satisfy:
$$U_L = \alpha_L x_L + c^L_x, \qquad V_L = \beta_L y_L + c^L_y$$
wherein U_L and V_L are the pixel coordinates of the left eye visualization system on the x-axis and the y-axis; α_L and β_L are the scale factors of the left eye visualization system on the x-axis and the y-axis; and c^L_x and c^L_y are the translation amounts of the left eye visualization system on the x-axis and the y-axis. The pixel coordinates on the image plane of the monocular visualization reference system satisfy:
$$U' = \alpha' x' + c'_x, \qquad V' = \beta' y' + c'_y$$
wherein U' and V' are the pixel coordinates of the monocular visualization reference system on the x-axis and the y-axis; α' and β' are the scale factors of the monocular visualization reference system on the x-axis and the y-axis; and c'_x and c'_y are the translation amounts of the monocular visualization reference system on the x-axis and the y-axis. Therefore, combining formulas (14) and (15) with these pixel-coordinate relations, the matching relationship between the left eye visualization system and the monocular visualization reference system, that is, the first binocular mapping model, is obtained:
$$
\begin{cases}
U_L = \alpha_L \dfrac{b_L f'}{Z'} - \dfrac{\alpha_L}{\alpha'}\left(U' - c'_x\right) + c^L_x \\[2mm]
V_L = \dfrac{\beta_L}{\beta'}\left(V' - c'_y\right) + c^L_y
\end{cases}
$$
wherein: u shapeLAnd VLα representing the pixel coordinates of the pixel points in the left eye image display system on the x axis and the y axisLAnd βLScaling the left eye visualization system in the x-axis and the y-axis;
Figure BDA0002006997080000199
and
Figure BDA00020069970800001910
the translation amount of the left eye visualization system on an x axis and a y axis is obtained; bLIs a base line between the left-eye visualization system and the monocular visualization reference system, U ' and V ' are pixel coordinates of pixel points in the monocular visualization reference system on an x-axis and a y-axis, α ' and β ' are scale ratios of the monocular visualization reference system on the x-axis and the y-axis, c 'xAnd c'yTranslating the monocular visualization reference system in an x-axis and a y-axis; f' is the focal length of the monocular visualization reference system; and Z' is the depth value corresponding to the pixel point in the monocular visualization reference system.
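For illustration only (not part of the claimed subject matter), the first binocular mapping model as reconstructed above can be exercised in code. The function name, the intrinsics dictionary layout and all conventions below are assumptions made for this example; the signs follow formulas (13)-(15) and the pixel-coordinate relations above.

```python
import numpy as np

def map_reference_to_left(U_ref, V_ref, Z_ref, ref, left, b_L):
    """Apply the first binocular mapping model (as reconstructed above).

    U_ref, V_ref -- pixel coordinates U', V' in the monocular
                    visualization reference system (scalars or arrays).
    Z_ref        -- depth Z' per pixel, in the same unit as b_L and f'.
    ref          -- reference-system intrinsics: {'alpha', 'beta',
                    'cx', 'cy', 'f'}.
    left         -- left eye intrinsics: {'alpha', 'beta', 'cx', 'cy'}.
    b_L          -- baseline between the left eye visualization system
                    and the monocular visualization reference system.
    """
    # U_L = alpha_L * b_L * f'/Z' - (alpha_L/alpha') * (U' - c'_x) + c_x^L
    U_L = (left['alpha'] * b_L * ref['f'] / np.asarray(Z_ref)
           - left['alpha'] / ref['alpha'] * (np.asarray(U_ref) - ref['cx'])
           + left['cx'])
    # No parallax on the y-axis: V_L only rescales V' between the systems.
    V_L = (left['beta'] / ref['beta'] * (np.asarray(V_ref) - ref['cy'])
           + left['cy'])
    return U_L, V_L
```

The right eye direction is symmetric; a corresponding sketch follows the second binocular mapping model below.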
Similarly, the parallax between the right eye visualization system and the RGB monocular visualization reference system is d_R = u' - u_R; further, the depth value of the point P' can be obtained according to formula (5):
$$Z' = \frac{b_R f'}{d_R} = \frac{b_R f'}{u' - u_R}$$
Rearranging gives:
$$u_R = u' - \frac{b_R f'}{Z'}$$
The second binocular mapping model is thus obtained as follows:
$$
\begin{cases}
U_R = \dfrac{\alpha_R}{\alpha'}\left(U' - c'_x\right) - \alpha_R \dfrac{b_R f'}{Z'} + c^R_x \\[2mm]
V_R = \dfrac{\beta_R}{\beta'}\left(V' - c'_y\right) + c^R_y
\end{cases}
$$
wherein: u shapeRAnd VRα representing the pixel coordinates of the pixel points in the right eye image display system on the x axis and the y axisRAnd βRScaling the right eye visualization system in the x-axis and y-axis;
Figure BDA0002006997080000204
and
Figure BDA0002006997080000205
the pixel translation amount of the right eye visualization system on the x axis and the y axis is obtained; bRIs a base line between the right-eye visualization system and the monocular visualization reference system, U ' and V ' are pixel coordinates of pixel points in the monocular visualization reference system on an x-axis and a y-axis, α ' and β ' are scale ratios of the monocular visualization reference system on the x-axis and the y-axis, c 'xAnd c'y(ii) pixel translation amounts in the x-axis and y-axis for the monocular visualization reference system; f' is the focal length of the monocular visualization reference system; and Z' is the depth value corresponding to the pixel point in the monocular visualization reference system.
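A matching sketch of this second binocular mapping model, under the same illustrative conventions as the previous example (again an assumption-laden illustration following the reconstructed signs with d_R = u' - u_R, not the patented implementation):

```python
def map_reference_to_right(U_ref, V_ref, Z_ref, ref, right, b_R):
    # U_R = (alpha_R/alpha') * (U' - c'_x) - alpha_R * b_R * f'/Z' + c_x^R
    U_R = (right['alpha'] / ref['alpha'] * (U_ref - ref['cx'])
           - right['alpha'] * b_R * ref['f'] / Z_ref
           + right['cx'])
    # V_R, like V_L, carries no parallax on the y-axis.
    V_R = right['beta'] / ref['beta'] * (V_ref - ref['cy']) + right['cy']
    return U_R, V_R
```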
In the above embodiments of the present invention, in order to ensure that the three-dimensional scene photographed by the RGB-D camera system does not appear shifted in the horizontal direction (i.e., along the x-axis) in the binocular visualization system, the monocular visualization reference system is preferably located exactly midway between the left eye and right eye visualization systems of the binocular visualization system, so that the baseline between the monocular visualization reference system and the left eye visualization system is equal to the baseline between the monocular visualization reference system and the right eye visualization system, each being half of the baseline of the binocular visualization system, i.e., 2b_L = 2b_R = b.
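For example, under an assumed, purely illustrative binocular baseline of b = 64 mm, this midpoint configuration gives b_L = b_R = 32 mm.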
It should be noted that, in another example of the present invention, after the RGB left eye image data is obtained by matching with the first binocular mapping model, the RGB right eye image data may be obtained directly from the disparity d = u_L - u_R between the right eye visualization system and the left eye visualization system together with the relation
$$Z' = \frac{b f'}{d}$$
The following can be obtained:
$$u_R = u_L - \frac{b f'}{Z'}$$
Then, another second binocular mapping model can be obtained:
$$
\begin{cases}
U_R = \dfrac{\alpha_R}{\alpha_L}\left(U_L - c^L_x\right) - \alpha_R \dfrac{b f'}{Z'} + c^R_x \\[2mm]
V_R = \dfrac{\beta_R}{\beta_L}\left(V_L - c^L_y\right) + c^R_y
\end{cases}
$$
wherein: u shapeRAnd VRα pixel coordinates of pixel points of the right eye visualization system on the x-axis and the y-axisRAnd βRScaling the right eye visualization system in the x-axis and y-axis;
Figure BDA0002006997080000212
and
Figure BDA0002006997080000213
the translation amount of the right eye visualization system on an x axis and a y axis is obtained; u shapeLAnd VLα pixel coordinates of corresponding pixel points of the left eye image display system on the x axis and the y axisLAnd βLScaling the left eye visualization system in the x-axis and the y-axis;
Figure BDA0002006997080000214
and
Figure BDA0002006997080000215
the translation amount of the left eye visualization system on an x axis and a y axis is obtained; b and f' are each independentlyA baseline and a focal length between the binocular visualization systems; and Z' is the depth value corresponding to the pixel point in the binocular imaging system.
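Viewed operationally, this left-to-right model is a depth-image-based rendering step: each left eye pixel is shifted along the x-axis by a depth-dependent disparity. The sketch below is a minimal forward-warping illustration assuming equal left and right intrinsics (so α_R = α_L = α and the model reduces to U_R = U_L - α·b·f'/Z', V_R = V_L); the function name and the z-buffer handling of collisions are choices made for this example, not taken from the disclosure.

```python
import numpy as np

def warp_left_to_right(rgb_left, Z, alpha, f, b):
    """Forward-warp an RGB left eye image into a right eye image.

    rgb_left -- (H, W, 3) array, the RGB left eye image data.
    Z        -- (H, W) array, depth Z' per left eye pixel
                (same length unit as b and f).
    alpha    -- pixel scale on the x-axis (pixels per unit length).
    f, b     -- focal length f' and binocular baseline b.
    """
    H, W, _ = rgb_left.shape
    rgb_right = np.zeros_like(rgb_left)
    z_buffer = np.full((H, W), np.inf)

    # With equal left/right intrinsics the model reduces to
    # U_R = U_L - alpha * b * f / Z' and V_R = V_L.
    disparity = np.rint(alpha * b * f / Z).astype(int)

    for v in range(H):
        for u in range(W):
            u_r = u - disparity[v, u]
            # Keep the nearest surface when several pixels collide.
            if 0 <= u_r < W and Z[v, u] < z_buffer[v, u_r]:
                z_buffer[v, u_r] = Z[v, u]
                rgb_right[v, u_r] = rgb_left[v, u]

    # Unfilled pixels are disocclusions; a full implementation
    # would inpaint them.
    return rgb_right
```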
According to another aspect of the present invention, the above embodiments of the present invention further provide a matching system for an RGB-D camera system and a binocular visualization system, which matches the RGB-D camera system with the binocular visualization system, so that the RGB depth image data collected by the RGB-D camera system can be converted into the RGB binocular image data used by the binocular visualization system, allowing the three-dimensional scene photographed by the RGB-D camera system to be presented in the binocular visualization system.
Specifically, as shown in fig. 8, the matching system 30 for the RGB-D camera system and the binocular visualization system includes an obtaining module 31, a monocular matching module 32, a left eye mapping module 33 and a right eye mapping module 34, wherein the obtaining module 31 is configured to obtain the RGB depth image data collected by the RGB-D camera system, the RGB depth image data including the RGB monocular image data collected by the RGB monocular camera system of the RGB-D camera system and the depth image data collected by the depth camera system of the RGB-D camera system; the monocular matching module 32 is configured to perform, by a monocular matching model, monocular matching between a monocular visualization reference system and the RGB monocular image data based on the RGB depth image data, so as to obtain RGB depth image reference data; the left eye mapping module 33 is configured to map, by a first binocular mapping model, the RGB monocular image reference data to the left eye visualization system based on the pose of the left eye visualization system of the binocular visualization system relative to the monocular visualization reference system, so as to obtain the RGB left eye image data in the RGB binocular image data; and the right eye mapping module 34 is configured to map, by a second binocular mapping model, the RGB monocular image reference data to the right eye visualization system based on the pose of the right eye visualization system of the binocular visualization system relative to the monocular visualization reference system, so as to obtain the RGB right eye image data in the RGB binocular image data, so that the binocular visualization system can present, based on the RGB binocular image data, the three-dimensional scene photographed by the RGB-D camera system.
Further, in the above embodiment of the present invention, the monocular matching module 32 may be configured to set the monocular visualization reference system based on the internal parameters of the binocular visualization system, such that the internal parameters of the monocular visualization reference system are equal to the internal parameters of the left eye or right eye visualization system of the binocular visualization system, and to perform, by the monocular matching model, monocular matching between the monocular visualization reference system and the RGB-D camera system so as to obtain the RGB depth image reference data, so that the three-dimensional scene captured by the RGB-D camera system does not appear depth-distorted in the monocular visualization reference system.
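For illustration only, the following is a minimal sketch of what the monocular matching module 32 might compute under a claim-4-style model (the formula is reconstructed here from the variable definitions in claim 4 and is therefore an assumption; Z' = Z expresses the "no depth distortion" condition, and all names are invented for this example):

```python
def monocular_match(U, V, Z, cam, ref):
    """Re-project RGB monocular camera pixels into the monocular
    visualization reference system while keeping depth unchanged."""
    s_x = ref['alpha'] * ref['f'] / (cam['alpha'] * cam['f'])
    s_y = ref['beta'] * ref['f'] / (cam['beta'] * cam['f'])
    U_ref = s_x * (U - cam['cx']) + ref['cx']
    V_ref = s_y * (V - cam['cy']) + ref['cy']
    return U_ref, V_ref, Z  # Z' = Z: no depth distortion is introduced
```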
It should be noted that, in another embodiment of the present invention, the left eye visualization system of the binocular visualization system can also be used directly as the monocular visualization reference system, i.e., b_L = 0 and b_R = b, so as to minimize the amount of computation in the matching method. In other words, since the monocular visualization reference system then coincides with the left eye visualization system, the first binocular mapping model is no longer needed: the RGB monocular image reference data in the RGB depth image reference data of the monocular visualization reference system serves directly as the RGB left eye image data, and the RGB right eye image data is then obtained from the second binocular mapping model.
Exemplarily, as shown in fig. 9, this other embodiment of the present invention provides a matching method for an RGB-D camera system and a binocular visualization system, comprising the steps of:
S210: Acquiring RGB depth image data collected by an RGB-D camera system, wherein the RGB depth image data includes RGB monocular image data collected by an RGB monocular camera system of the RGB-D camera system and depth image data collected by a depth camera system of the RGB-D camera system;

S220: Mapping, by a monocular matching model, the RGB depth image data to the left eye visualization system of the binocular visualization system to obtain RGB left eye image data; and

S230: Mapping, by a second binocular mapping model, the RGB left eye image data to the right eye visualization system based on the baseline of the binocular visualization system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data constitute RGB binocular image data, enabling the binocular visualization system to present, based on the RGB binocular image data, the three-dimensional scene photographed by the RGB-D camera system.
Specifically, in the step S230, the second binocular mapping model is implemented as:
$$
\begin{cases}
U_R = \dfrac{\alpha_R}{\alpha_L}\left(U_L - c^L_x\right) - \alpha_R \dfrac{b f'}{Z'} + c^R_x \\[2mm]
V_R = \dfrac{\beta_R}{\beta_L}\left(V_L - c^L_y\right) + c^R_y
\end{cases}
$$
wherein: u shapeRAnd VRα pixel coordinates of pixel points of the right eye visualization system on the x-axis and the y-axisRAnd βRScaling the right eye visualization system in the x-axis and y-axis;
Figure BDA0002006997080000231
and
Figure BDA0002006997080000232
the translation amount of the right eye visualization system on an x axis and a y axis is obtained; u shapeLAnd VLα pixel coordinates of corresponding pixel points of the left eye image display system on the x axis and the y axisLAnd βLScaling the left eye visualization system in the x-axis and the y-axis;
Figure BDA0002006997080000233
and
Figure BDA0002006997080000234
the translation amount of the left eye visualization system on an x axis and a y axis is obtained; b and f' are a baseline and a focal length between the binocular imaging systems respectively; and Z' is the depth value corresponding to the pixel point in the binocular imaging system.
It is to be noted that, in other examples of the present invention, in the step S220 the RGB depth image data may instead be mapped to the right eye visualization system of the binocular visualization system to obtain the RGB right eye image data in the RGB binocular image data, after which the RGB left eye image data in the RGB binocular image data is obtained by a first binocular mapping model; the RGB binocular image data of the binocular visualization system can likewise be obtained in this way, which will not be described again here.
According to another aspect of the present invention, as shown in fig. 10, the above other embodiment of the present invention further provides a matching system 40 for an RGB-D camera system and a binocular visualization system, which includes an obtaining module 41, a monocular matching module 42 and a right eye mapping module 43, wherein the obtaining module 41 is configured to obtain the RGB depth image data collected by the RGB-D camera system, the RGB-D camera system including an RGB monocular camera system and a depth camera system, and the RGB depth image data including the RGB monocular image data collected by the RGB monocular camera system and the depth image data collected by the depth camera system; the monocular matching module 42 is configured to map, by a monocular matching model, the RGB depth image data to the left eye visualization system of the binocular visualization system so as to obtain the RGB left eye image data; and the right eye mapping module 43 is configured to map, by a second binocular mapping model, the RGB left eye image data to the right eye visualization system based on the baseline between the right eye visualization system and the left eye visualization system of the binocular visualization system so as to obtain the RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data constitute the RGB binocular image data, enabling the binocular visualization system to present, based on the RGB binocular image data, the three-dimensional scene photographed by the RGB-D camera system.
Illustrative Computing System
FIG. 11 illustrates, in simplified form, a non-limiting embodiment of a computing system 900 that can perform one or more of the methods and processes described above. The computing system 900 may take the form of one or more head mounted display devices, or of one or more devices cooperating with a head mounted display device (e.g., personal computers, server computers, tablet computers, home entertainment computers, network computing devices, gaming devices, mobile computing devices, mobile communication devices such as smart phones, and/or other computing devices).
As shown in fig. 11, the computing system 900 includes a logic machine 901 and a storage machine 902, wherein the logic machine 901 is configured to execute instructions; the storage machine 902 is configured to store machine readable instructions executable by the logic machine 901 to implement any of the above-described matching methods for RGB-D camera systems and binocular visualization systems.
Of course, the computing system 900 may optionally include a display subsystem 903, an input subsystem 904, a communication subsystem 905, and/or other components not shown in fig. 11.
The logic machine 901 includes one or more physical devices configured to execute instructions. For example, the logic machine 901 may be configured to execute instructions that are part of: one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, implement a technical effect, or otherwise arrive at a desired result.
The logic machine 901 may include one or more processors configured to execute software instructions. Additionally or alternatively, the logic machine 901 may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. The processors of the logic machine 901 may be single core or multicore, and the instructions executed thereon may be configured for serial, parallel, and/or distributed processing. The various components of the logic machine 901 may optionally be distributed over two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of the logic machine 901 may be virtualized and executed by remotely accessible networked computing devices configured in a cloud computing configuration.
The storage machine 902 comprises one or more physical devices configured to hold machine-readable instructions executable by the logic machine 901 to implement the methods and processes described herein. In implementing these methods and processes, the state of the storage machine 902 may be transformed (e.g., to hold different data).
The storage machine 902 may include removable and/or built-in devices. The storage machine 902 may include optical memory (e.g., CD, DVD, HD-DVD, blu-ray disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. The storage machine 902 may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content-addressable devices.
It is understood that the storage machine 902 includes one or more physical devices. However, aspects of the instructions described herein may alternatively be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.) that is not held by a physical device for a limited period of time.
Aspects of the logic machine 901 and the storage machine 902 may be integrated together into one or more hardware logic components. These hardware logic components may include, for example, Field Programmable Gate Arrays (FPGAs), program and application specific integrated circuits (PASIC/ASIC), program and application specific standard products (PSSP/ASSP), system on a chip (SOC), and Complex Programmable Logic Devices (CPLDs).
Notably, when the computing system 900 includes the display subsystem 903, the display subsystem 903 can be used to present a visual representation of data held by the storage machine 902. The visual representation may take the form of a Graphical User Interface (GUI). As the herein described methods and processes change the data held by the storage machine, and thus transform the state of the storage machine 902, the state of the display subsystem 903 may likewise be transformed to visually represent changes in the underlying data. The display subsystem 903 may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with the logic machine 901 and/or the storage machine 902 in a shared enclosure, or such display devices may be peripheral display devices.
Further, when the computing system 900 includes the input subsystem 904, the input subsystem 904 may include or interface with one or more user input devices such as a keyboard, mouse, touch screen, or game controller. In some embodiments, the input subsystem 904 may include or interface with selected Natural User Input (NUI) components. Such components may be integrated or peripheral, and the transduction and/or processing of input actions may be handled on-board or off-board. Example NUI components may include a microphone for speech and/or voice recognition; an infrared, color, stereoscopic and/or depth camera for machine vision and/or gesture recognition; a head tracker, eye tracker, accelerometer and/or gyroscope for motion detection and/or intent recognition; an electric field sensing component for assessing brain activity and/or body movement; and/or any other suitable sensor.
When the computing system 900 includes the communication subsystem 905, the communication subsystem 905 may be configured to communicatively couple the computing system 900 with one or more other computing devices. The communication subsystem 905 may include wired and/or wireless communication devices compatible with one or more different communication protocols. As a non-limiting example, the communication subsystem may be configured for communication via a wireless telephone network or a wired or wireless local or wide area network. In some embodiments, the communication subsystem 905 may allow the computing system 900 to send and/or receive messages to/from other devices via a network, such as the internet.
It will be appreciated that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Also, the order of the above-described processes may be changed.
Illustrative Computer Program Product
In addition to the above-described methods and apparatus, embodiments of the present invention may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the methods according to various embodiments of the present invention described in the "exemplary methods" section above of this specification.
The computer program product may write program code for carrying out operations of embodiments of the present invention in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, and conventional procedural programming languages such as the C language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, an embodiment of the present invention may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform the steps of the above-described method of the present specification.
The computer readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above with reference to specific embodiments. It should be noted, however, that the advantages and effects mentioned in the present invention are only examples and are not limiting, and they should not be taken as necessarily possessed by every embodiment of the present invention. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description only and is not intended to be limiting, since the invention is not limited to the specific details described above.
The block diagrams of the devices, apparatuses and systems involved in the present invention are given only as illustrative examples, and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses and systems may be connected, arranged and configured in any manner. Words such as "including", "comprising" and "having" are open-ended words that mean "including, but not limited to" and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the word "and/or", unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the apparatus, devices and methods of the present invention, the components or steps may be broken down and/or re-combined. These decompositions and/or recombinations are to be regarded as equivalents of the present invention.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It will be appreciated by persons skilled in the art that the embodiments of the invention described above and shown in the drawings are given by way of example only and are not limiting of the invention. The objects of the invention have been fully and effectively accomplished. The functional and structural principles of the present invention have been shown and described in the examples, and any variations or modifications of the embodiments of the present invention may be made without departing from the principles.

Claims (14)

1. A matching method for an RGB-D camera system and a binocular imaging system is characterized by comprising the following steps:
acquiring RGB depth image data acquired by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data acquired by an RGB monocular camera system of the RGB-D camera system and depth image data acquired by a depth camera system of the RGB-D camera system;
performing monocular matching on a monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain RGB depth image reference data, wherein the RGB depth image reference data comprises RGB monocular image reference data and depth image reference data;
mapping the RGB monocular image reference data to an image plane of a left eye imaging system of the binocular imaging system based on the pose of the left eye imaging system relative to the monocular imaging reference system by using a first binocular mapping model to obtain RGB left eye image data; and
and mapping, by a second binocular mapping model, the RGB monocular image reference data to an image plane of a right eye imaging system of the binocular imaging system based on the pose of the right eye imaging system relative to the monocular imaging reference system, to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data constitute the RGB binocular image data of the binocular imaging system.
2. The matching method for the RGB-D camera system and the binocular imaging system as set forth in claim 1, wherein the step of performing monocular matching on a monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by a monocular matching model to obtain RGB depth image reference data, wherein the RGB depth image reference data includes RGB monocular image reference data and depth image reference data, comprises the steps of:
setting the monocular imaging reference system based on the internal parameters of the binocular imaging system, so that the internal parameters of the monocular imaging reference system are equal to the internal parameters of the left eye imaging system of the binocular imaging system; and
performing monocular matching on the monocular imaging reference system and the RGB monocular camera system by the monocular matching model to obtain the RGB depth image reference data, so that the three-dimensional scene photographed by the RGB-D camera system does not appear depth-distorted in the monocular imaging reference system.
3. The matching method for the RGB-D camera system and the binocular imaging system as set forth in claim 2, wherein the optical center of the monocular imaging reference system is located midway between the optical center of the left eye imaging system and the optical center of the right eye imaging system.
4. The matching method for the RGB-D camera system and the binocular imaging system as set forth in any one of claims 1 to 3, wherein the monocular matching model is
$$
\begin{cases}
U' = \dfrac{\alpha' f'}{\alpha f}\left(U - c_x\right) + c'_x \\[2mm]
V' = \dfrac{\beta' f'}{\beta f}\left(V - c_y\right) + c'_y \\[2mm]
Z' = Z
\end{cases}
$$
wherein U' and V' are the pixel coordinates of the pixel points in the monocular imaging reference system on the x-axis and the y-axis; α' and β' are the scale factors of the monocular imaging reference system on the x-axis and the y-axis; c'_x and c'_y are the pixel translation amounts of the monocular imaging reference system on the x-axis and the y-axis; f' is the focal length of the monocular imaging reference system; U and V are the pixel coordinates of the corresponding pixel point in the RGB monocular camera system on the x-axis and the y-axis; α and β are the scale factors of the RGB monocular camera system on the x-axis and the y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and the y-axis; f is the focal length of the RGB monocular camera system; and Z' and Z are the depth values of the corresponding pixel points in the monocular imaging reference system and the RGB monocular camera system, respectively.
5. The matching method for the RGB-D camera system and the binocular imaging system as set forth in any one of claims 1 to 3, wherein the monocular matching model is
[Monocular matching model formula (image FDA0002006997070000022), involving the half field angles θ and θ' and the reference depth Z_0 defined below]
wherein U' and V' are the pixel coordinates of the pixel points in the monocular imaging reference system on the x-axis and the y-axis; α' and β' are the scale factors of the monocular imaging reference system on the x-axis and the y-axis; c'_x and c'_y are the translation amounts of the monocular imaging reference system on the x-axis and the y-axis; f' and θ' are the focal length and the half field angle of the monocular imaging reference system; U and V are the pixel coordinates of the pixel point in the RGB monocular camera system on the x-axis and the y-axis; α and β are the scale factors of the RGB monocular camera system on the x-axis and the y-axis; c_x and c_y are the pixel translation amounts of the RGB monocular camera system on the x-axis and the y-axis; f and θ are the focal length and the half field angle of the RGB monocular camera system; Z' and Z are the depth values of the corresponding pixel points in the monocular imaging reference system and the RGB monocular camera system, respectively; and Z_0 is the depth value of the reference datum in the RGB monocular camera system.
6. The matching method for the RGB-D camera system and the binocular imaging system as set forth in any one of claims 1 to 3, wherein the first binocular mapping model is
$$
\begin{cases}
U_L = \alpha_L \dfrac{b_L f'}{Z'} - \dfrac{\alpha_L}{\alpha'}\left(U' - c'_x\right) + c^L_x \\[2mm]
V_L = \dfrac{\beta_L}{\beta'}\left(V' - c'_y\right) + c^L_y
\end{cases}
$$
wherein U_L and V_L are the pixel coordinates of the pixel points in the left eye imaging system on the x-axis and the y-axis; α_L and β_L are the scale factors of the left eye imaging system on the x-axis and the y-axis; c^L_x and c^L_y are the translation amounts of the left eye imaging system on the x-axis and the y-axis; b_L is the baseline between the left eye imaging system and the monocular imaging reference system; U' and V' are the pixel coordinates of the pixel points in the monocular imaging reference system on the x-axis and the y-axis; α' and β' are the scale factors of the monocular imaging reference system on the x-axis and the y-axis; c'_x and c'_y are the translation amounts of the monocular imaging reference system on the x-axis and the y-axis; f' is the focal length of the monocular imaging reference system; and Z' is the depth value corresponding to the pixel point in the monocular imaging reference system.
7. The matching method for the RGB-D camera system and the binocular imaging system as set forth in any one of claims 1 to 3, wherein the second binocular mapping model is
$$
\begin{cases}
U_R = \dfrac{\alpha_R}{\alpha'}\left(U' - c'_x\right) - \alpha_R \dfrac{b_R f'}{Z'} + c^R_x \\[2mm]
V_R = \dfrac{\beta_R}{\beta'}\left(V' - c'_y\right) + c^R_y
\end{cases}
$$
wherein U_R and V_R are the pixel coordinates of the pixel points in the right eye imaging system on the x-axis and the y-axis; α_R and β_R are the scale factors of the right eye imaging system on the x-axis and the y-axis; c^R_x and c^R_y are the pixel translation amounts of the right eye imaging system on the x-axis and the y-axis; b_R is the baseline between the right eye imaging system and the monocular imaging reference system; U' and V' are the pixel coordinates of the pixel points in the monocular imaging reference system on the x-axis and the y-axis; α' and β' are the scale factors of the monocular imaging reference system on the x-axis and the y-axis; c'_x and c'_y are the pixel translation amounts of the monocular imaging reference system on the x-axis and the y-axis; f' is the focal length of the monocular imaging reference system; and Z' is the depth value corresponding to the pixel point in the monocular imaging reference system.
8. A matching method for an RGB-D camera system and a binocular imaging system is characterized by comprising the following steps:
acquiring RGB depth image data acquired by the RGB-D camera system, wherein the RGB depth image data comprises RGB monocular image data acquired by an RGB monocular camera system of the RGB-D camera system and depth image data acquired by a depth camera system of the RGB-D camera system;
performing monocular matching on a left eye image system of the binocular image system and the RGB monocular image data based on the RGB depth image data by using a monocular matching model to obtain RGB left eye image data; and
and mapping, by a second binocular mapping model, the RGB left eye image data to the right eye imaging system based on the baseline of the binocular imaging system to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data constitute RGB binocular image data of the binocular imaging system.
9. The matching method for the RGB-D camera system and the binocular imaging system of claim 8, wherein the second binocular mapping model is
$$
\begin{cases}
U_R = \dfrac{\alpha_R}{\alpha_L}\left(U_L - c^L_x\right) - \alpha_R \dfrac{b f'}{Z'} + c^R_x \\[2mm]
V_R = \dfrac{\beta_R}{\beta_L}\left(V_L - c^L_y\right) + c^R_y
\end{cases}
$$
wherein U_R and V_R are the pixel coordinates of the pixel points of the right eye imaging system on the x-axis and the y-axis; α_R and β_R are the scale factors of the right eye imaging system on the x-axis and the y-axis; c^R_x and c^R_y are the translation amounts of the right eye imaging system on the x-axis and the y-axis; U_L and V_L are the pixel coordinates of the corresponding pixel points of the left eye imaging system on the x-axis and the y-axis; α_L and β_L are the scale factors of the left eye imaging system on the x-axis and the y-axis; c^L_x and c^L_y are the translation amounts of the left eye imaging system on the x-axis and the y-axis; b and f' are the baseline and the focal length of the binocular imaging system, respectively; and Z' is the depth value corresponding to the pixel point in the binocular imaging system.
10. A matching system for an RGB-D camera system and a binocular imaging system, comprising:
the system comprises an acquisition module, a monocular matching module, a left eye mapping module and a right eye mapping module, wherein the acquisition module is used for acquiring RGB depth image data acquired by the RGB-D camera system, and the RGB depth image data comprises RGB monocular image data acquired by the RGB monocular camera system of the RGB-D camera system and depth image data acquired by the depth camera system of the RGB-D camera system;
the monocular matching module is used for performing monocular matching on a monocular imaging reference system and the RGB monocular image data based on the RGB depth image data by a monocular matching model so as to obtain RGB depth image reference data, wherein the RGB depth image reference data comprises RGB monocular image reference data and depth image reference data;
the left eye mapping module is used for mapping the RGB monocular image reference data to an image plane of the left eye imaging system based on the pose of the left eye imaging system of the binocular imaging system relative to the monocular imaging reference system by virtue of a first binocular mapping model so as to obtain RGB left eye image data; and
and the right eye mapping module is used for mapping the RGB monocular image reference data to the image plane of the right eye imaging system based on the pose of the right eye imaging system of the binocular imaging system relative to the monocular imaging reference system by virtue of a second binocular mapping model so as to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data form RGB binocular image data of the binocular imaging system.
11. The matching system for the RGB-D camera system and the binocular imaging system as claimed in claim 10, wherein the monocular matching module is further configured to set the monocular imaging reference system based on the internal parameters of the binocular imaging system, so that the internal parameters of the monocular imaging reference system are equal to the internal parameters of the left eye imaging system of the binocular imaging system; and to perform monocular matching on the monocular imaging reference system and the RGB monocular camera system by the monocular matching model to obtain the RGB depth image reference data, so that the three-dimensional scene photographed by the RGB-D camera system does not appear depth-distorted in the monocular imaging reference system.
12. A matching system for an RGB-D camera system and a binocular imaging system, comprising:
the system comprises an acquisition module, a monocular matching module and a right eye mapping module, wherein the acquisition module is used for acquiring RGB depth image data acquired by the RGB-D camera system, and the RGB depth image data comprises RGB monocular image data acquired by the RGB monocular camera system of the RGB-D camera system and depth image data acquired by the depth camera system of the RGB-D camera system;
the monocular matching module is used for performing monocular matching on the left eye imaging system of the binocular imaging system and the RGB monocular image data based on the RGB depth image data by a monocular matching model so as to obtain the RGB left eye image data; and
the right eye mapping module is used for mapping, by a second binocular mapping model, the RGB left eye image data to the right eye imaging system based on the baseline of the binocular imaging system so as to obtain RGB right eye image data, so that the RGB left eye image data and the RGB right eye image data constitute RGB binocular image data of the binocular imaging system.
13. A computing system, comprising:
a logic machine; and
a storage machine, wherein the storage machine is configured to hold instructions executable by the logic machine to implement the matching method for the RGB-D camera system and the binocular imaging system as claimed in any one of claims 1 to 9.
14. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a computing device, perform the matching method for the RGB-D camera system and the binocular imaging system according to any one of claims 1 to 9.
CN201910231993.6A 2019-03-26 2019-03-26 Matching method for RGB-D camera system and binocular imaging system and related system thereof Active CN111754558B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910231993.6A CN111754558B (en) 2019-03-26 2019-03-26 Matching method for RGB-D camera system and binocular imaging system and related system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910231993.6A CN111754558B (en) 2019-03-26 2019-03-26 Matching method for RGB-D camera system and binocular imaging system and related system thereof

Publications (2)

Publication Number Publication Date
CN111754558A true CN111754558A (en) 2020-10-09
CN111754558B CN111754558B (en) 2023-09-26

Family

ID=72671215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910231993.6A Active CN111754558B (en) 2019-03-26 2019-03-26 Matching method for RGB-D camera system and binocular imaging system and related system thereof

Country Status (1)

Country Link
CN (1) CN111754558B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110175907A1 (en) * 2010-01-18 2011-07-21 Sony Corporation Image processing apparatus, image processing method, and program
US20120218266A1 (en) * 2011-02-24 2012-08-30 Nintendo Co., Ltd. Storage medium having stored therein display control program, display control apparatus, display control system, and display control method
US8355565B1 (en) * 2009-10-29 2013-01-15 Hewlett-Packard Development Company, L.P. Producing high quality depth maps
CN104504671A (en) * 2014-12-12 2015-04-08 浙江大学 Method for generating virtual-real fusion image for stereo display
CN107103626A (en) * 2017-02-17 2017-08-29 杭州电子科技大学 A kind of scene reconstruction method based on smart mobile phone
CN107635129A (en) * 2017-09-29 2018-01-26 周艇 Three-dimensional three mesh camera devices and depth integration method
CN107808407A (en) * 2017-10-16 2018-03-16 亿航智能设备(广州)有限公司 Unmanned plane vision SLAM methods, unmanned plane and storage medium based on binocular camera
CN107917701A (en) * 2017-12-28 2018-04-17 人加智能机器人技术(北京)有限公司 Measuring method and RGBD camera systems based on active binocular stereo vision
CN108064447A (en) * 2017-11-29 2018-05-22 深圳前海达闼云端智能科技有限公司 Method for displaying image, intelligent glasses and storage medium
WO2018094932A1 (en) * 2016-11-23 2018-05-31 北京清影机器视觉技术有限公司 Method and device for generating human eye observation image presented in stereoscopic vision
CN108596825A (en) * 2018-04-17 2018-09-28 宁波视睿迪光电有限公司 3D effect display methods and device
US20180332222A1 (en) * 2016-07-29 2018-11-15 Tencent Technology (Shenzhen) Company Limited Method and apparatus for obtaining binocular panoramic image, and storage medium

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8355565B1 (en) * 2009-10-29 2013-01-15 Hewlett-Packard Development Company, L.P. Producing high quality depth maps
US20110175907A1 (en) * 2010-01-18 2011-07-21 Sony Corporation Image processing apparatus, image processing method, and program
US20120218266A1 (en) * 2011-02-24 2012-08-30 Nintendo Co., Ltd. Storage medium having stored therein display control program, display control apparatus, display control system, and display control method
CN104504671A (en) * 2014-12-12 2015-04-08 浙江大学 Method for generating virtual-real fusion image for stereo display
US20180332222A1 (en) * 2016-07-29 2018-11-15 Tencent Technology (Shenzhen) Company Limited Method and apparatus for obtaining binocular panoramic image, and storage medium
WO2018094932A1 (en) * 2016-11-23 2018-05-31 北京清影机器视觉技术有限公司 Method and device for generating human eye observation image presented in stereoscopic vision
CN107103626A (en) * 2017-02-17 2017-08-29 杭州电子科技大学 A kind of scene reconstruction method based on smart mobile phone
CN107635129A (en) * 2017-09-29 2018-01-26 周艇 Three-dimensional three mesh camera devices and depth integration method
CN107808407A (en) * 2017-10-16 2018-03-16 亿航智能设备(广州)有限公司 Unmanned plane vision SLAM methods, unmanned plane and storage medium based on binocular camera
CN108064447A (en) * 2017-11-29 2018-05-22 深圳前海达闼云端智能科技有限公司 Method for displaying image, intelligent glasses and storage medium
CN107917701A (en) * 2017-12-28 2018-04-17 人加智能机器人技术(北京)有限公司 Measuring method and RGBD camera systems based on active binocular stereo vision
CN108596825A (en) * 2018-04-17 2018-09-28 宁波视睿迪光电有限公司 3D effect display methods and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU YANG; QI CHUN; YANG JINGYI: "An Algorithm for Converting 2D Video of Badminton Matches into 3D Video", COMPUTER SCIENCE, vol. 45, no. 8, pages 63 - 69 *
ZHANG ZIYING: "Research on RGB-D-Based Environment Modeling Methods and Stereoscopic Display Technology", China Master's Theses Full-text Database (Information Science and Technology), vol. 2019, no. 1, pages 140 - 1015 *

Also Published As

Publication number Publication date
CN111754558B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN109615703B (en) Augmented reality image display method, device and equipment
US9325968B2 (en) Stereo imaging using disparate imaging devices
JP5887267B2 (en) 3D image interpolation apparatus, 3D imaging apparatus, and 3D image interpolation method
JP6240963B2 (en) Generation of 3D perception from 2D images using motion parallax
US10169915B2 (en) Saving augmented realities
US9813693B1 (en) Accounting for perspective effects in images
US20120242795A1 (en) Digital 3d camera using periodic illumination
US20120140038A1 (en) Zero disparity plane for feedback-based three-dimensional video
WO2012153447A1 (en) Image processing device, image processing method, program, and integrated circuit
WO2018032841A1 (en) Method, device and system for drawing three-dimensional image
JP2020173529A (en) Information processing device, information processing method, and program
WO2023169283A1 (en) Method and apparatus for generating binocular stereoscopic panoramic image, device, storage medium, and product
JP2016504828A (en) Method and system for capturing 3D images using a single camera
US20230308631A1 (en) Perspective-dependent display of surrounding environment
CN111754558A (en) Matching method for RGB-D camera system and binocular imaging system, system and computing system thereof
US11150470B2 (en) Inertial measurement unit signal based image reprojection
CN111726602B (en) Matching method of different-field-angle camera-imaging system, system and computing system thereof
TWI572899B (en) Augmented reality imaging method and system
CN114020150A (en) Image display method, image display device, electronic apparatus, and medium
JP6099448B2 (en) Image processing program, information processing apparatus, information processing system, and image processing method
JP5868055B2 (en) Image processing apparatus and image processing method
KR102151250B1 (en) Device and method for deriving object coordinate
US20240078743A1 (en) Stereo Depth Markers
CN111726599B (en) Matching method of binocular different-internal-reference camera-imaging optical system, system and electronic equipment
US20230298278A1 (en) 3d photos

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20201009

Assignee: Zhejiang Shunwei Technology Co.,Ltd.

Assignor: SUNNY OPTICAL (ZHEJIANG) RESEARCH INSTITUTE Co.,Ltd.

Contract record no.: X2024330000055

Denomination of invention: Matching method and related systems for RGB-D camera system and binocular imaging system

Granted publication date: 20230926

License type: Common License

Record date: 20240515

EE01 Entry into force of recordation of patent licensing contract