CN111507894B

CN111507894B - Image stitching processing method and device

Info

Publication number: CN111507894B
Application number: CN202010307821.5A
Authority: CN
Inventors: 李超峰
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2020-04-17
Filing date: 2020-04-17
Publication date: 2023-06-13
Anticipated expiration: 2040-04-17
Also published as: CN111507894A

Abstract

The invention provides an image stitching processing method and device. The method comprises: adjusting the rotation angle of a dome camera in the horizontal direction at a target magnification, capturing a plurality of images, and determining a target focal length corresponding to the target magnification; projecting coordinate points of the plurality of images onto a spherical surface; back-projecting some of the pixels in the overlapping area on the spherical surface into the plurality of images, and determining a reprojection error between adjacent images after back projection; adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle; and back-projecting the projection points on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle to obtain a stitched image of the plurality of images. This addresses the problems in the related art that the scale-invariant feature transform (SIFT) algorithm is relatively time-consuming, that the images to be stitched must have rich texture features, and that stitching fails in areas with few features.

Description

Image stitching processing method and device

Technical Field

The invention relates to the technical field of image processing, in particular to an image stitching processing method and device.

Background

A Pan-Tilt-Zoom (PTZ) camera can adjust its focal length, pose, and the like. It is flexible, covers a wide field of view, and adapts well to varying illumination, and is therefore widely used in surveillance and security.

Camera calibration establishes the relationship between three-dimensional geometric positions and their corresponding points in an image, and can be used for distortion correction, three-dimensional reconstruction, and the like. Classical camera calibration algorithms, such as Tsai's two-step calibration method [1] and Zhang Zhengyou's plane calibration method [2], mainly calibrate the camera at a fixed focal length. They require a high-precision three-dimensional target or a planar calibration board, and obtain accurate calibration results by means of linear least squares and nonlinear optimization. However, camera parameters may change because of external environmental factors, intrinsic factors, and so on, and a zoom camera sometimes needs to be calibrated at different focal lengths. In outdoor working conditions, calibrating the camera with a reference calibration object is therefore difficult.

Image stitching is now widely used. Its general steps are: detecting and matching feature points across all images, refining the feature points and rejecting mismatches, estimating and refining the camera intrinsic and extrinsic parameters, applying the image projection transformation, and fusing the images. Feature point detection and matching are time-consuming, and feature-based stitching places requirements on image quality: if the scene has little texture, such as sky or a wall surface, it cannot be stitched accurately.

In the related art, an image stitching method is proposed, feature extraction is performed on a reference image to obtain a first feature point set, and feature extraction is performed on an image to be registered to obtain a second feature point set; character recognition is carried out on the reference image to obtain at least one first character area, character recognition is carried out on the image to be registered to obtain at least one second character area; removing the characteristic points located in the first character area from the first characteristic point set to obtain a third characteristic point set, and removing the characteristic points located in the second character area from the second characteristic point set to obtain a fourth characteristic point set; matching the characteristic points in the third characteristic point set with the characteristic points in the fourth characteristic point set to obtain model parameters of the image transformation model; and registering the image to be registered with the reference image according to the model parameters, and then splicing to obtain a spliced image.

In the above scheme, the scale-invariant feature transform (SIFT) algorithm is relatively time-consuming, and the images to be stitched must have rich texture features, such as characters in the images. In areas with sparse features, this approach may not work properly.

No solution has yet been proposed for the problems in the related art that the SIFT algorithm is relatively time-consuming, that the images to be stitched must have rich texture features, and that stitching may fail in areas with sparse features.

Disclosure of Invention

The embodiments of the invention provide an image stitching processing method and device, which at least solve the problems in the related art that the scale-invariant feature transform (SIFT) algorithm is relatively time-consuming, that the images to be stitched must have rich texture features, and that stitching may fail in areas with sparse features.

According to an embodiment of the present invention, there is provided an image stitching processing method including:

adjusting the rotation angle of the dome camera in the horizontal direction at a target magnification, capturing a plurality of images, and determining a target focal length corresponding to the target magnification, wherein an overlapping area exists between adjacent images in the plurality of images;

projecting coordinate points of the plurality of images onto a spherical surface according to a pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images, wherein the correspondence is determined according to the rotation angle;

back-projecting some of the pixels in the overlapping area on the spherical surface into the plurality of images, and determining a reprojection error between adjacent images after back projection;

adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle;

and back-projecting the projection points on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle to obtain a stitched image of the plurality of images.

Optionally, adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain the adjusted focal length and the adjusted roll angle includes:

and adjusting the target focal length and the rolling angle of the dome camera to obtain the adjusted focal length and the adjusted rolling angle, so that the reprojection error is minimum.
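The minimization described above can be sketched as follows. This is an illustration only: the text does not name an optimizer, so a simple derivative-free pattern search is swapped in here, and `error_fn`, `minimize_error`, and the step sizes are hypothetical names and values introduced for the example. The reprojection error is assumed to be available as a scalar function of the focal length and the roll angle.

```python
def minimize_error(error_fn, f0, roll0, f_step, roll_step, iterations=60):
    """Pattern search over (focal length, roll angle).

    error_fn(f, roll) is assumed to return the reprojection error
    between adjacent images for the given parameters.
    """
    best_f, best_roll = f0, roll0
    best_err = error_fn(best_f, best_roll)
    for _ in range(iterations):
        center_f, center_roll = best_f, best_roll
        improved = False
        # Try the 8 neighbours of the current best point.
        for df in (-f_step, 0.0, f_step):
            for dr in (-roll_step, 0.0, roll_step):
                err = error_fn(center_f + df, center_roll + dr)
                if err < best_err:
                    best_f, best_roll = center_f + df, center_roll + dr
                    best_err = err
                    improved = True
        if not improved:
            # No neighbour is better: refine the search resolution.
            f_step *= 0.5
            roll_step *= 0.5
    return best_f, best_roll, best_err


# Toy error surface with its minimum at f = 1003, roll = 0.01 (illustrative only).
def toy_error(f, roll):
    return (f - 1003.0) ** 2 + (1000.0 * (roll - 0.01)) ** 2


f_adj, roll_adj, err = minimize_error(toy_error, 1000.0, 0.0, 4.0, 0.008)
```

Starting from the focal length looked up for the target magnification and a roll angle of 0 keeps the search local, which matches the idea of refining an initial estimate rather than solving from scratch.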

Optionally, projecting coordinate points of the plurality of images onto the spherical surface according to the pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images includes:

establishing the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the rotation angle;

and projecting coordinate points of the plurality of images onto the spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images.

Optionally, establishing the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the rotation angle includes:

determining a horizontal yaw angle pose matrix of the dome camera according to the rotation angle;

acquiring a pitch angle pose matrix and a roll angle pose matrix of the dome camera, wherein the roll angle of the dome camera is set to 0 and the roll angle pose matrix is determined according to the roll angle of the dome camera;

determining a pose matrix of the dome camera according to the horizontal yaw angle pose matrix, the pitch angle pose matrix, and the roll angle pose matrix;

and establishing the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the pose matrix and the intrinsic matrix of the dome camera, wherein the intrinsic matrix of the dome camera is determined according to the target focal length.
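The steps above can be sketched as follows, under stated assumptions. The axis conventions (yaw about y, pitch about x, roll about the optical axis z), the multiplication order roll · pitch · yaw, and the pinhole relation between a pixel and a viewing ray are all assumptions made for illustration; the text does not fix them here. The function names are hypothetical.

```python
import math


def rot_x(a):
    """Pitch: rotation about the x-axis (assumed convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(a):
    """Yaw: rotation about the y-axis (assumed convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(a):
    """Roll: rotation about the optical axis z (assumed convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def pose_matrix(yaw, pitch, roll=0.0):
    """Compose the pose matrix; the order roll * pitch * yaw is an assumption."""
    return matmul3(rot_z(roll), matmul3(rot_x(pitch), rot_y(yaw)))


def intrinsic_matrix(f, cx, cy):
    """Intrinsic matrix built from the focal length (pixels) and principal point."""
    return [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]


def pixel_to_ray(x, y, R, K):
    """World-frame viewing ray of pixel (x, y): R^T K^-1 (x, y, 1)^T."""
    f, cx, cy = K[0][0], K[0][2], K[1][2]
    d = [(x - cx) / f, (y - cy) / f, 1.0]          # K^-1 applied to the pixel
    return [sum(R[k][i] * d[k] for k in range(3)) for i in range(3)]  # R^T d
```

With the roll angle initially set to 0, `pose_matrix(yaw, pitch)` depends only on the rotation angles reported by the camera, and `intrinsic_matrix` depends only on the target focal length.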

Optionally, projecting coordinate points of the plurality of images onto the spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images includes:

determining the horizontal field angle of the stitched image according to the rotation angle and the horizontal field angle of the dome camera;

determining the width of the stitched image according to the widths of the plurality of images;

and projecting coordinate points of the plurality of images onto the spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images, the width of the stitched image, and the horizontal field angle of the stitched image.
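One plausible way to obtain the stitched field angle and width is sketched below. The exact rule is not spelled out in this passage, so both formulas are assumptions: the stitched field angle is taken as the total rotation span plus one single-image field angle, and the stitched width is chosen so that the stitched image keeps the angular resolution of a single image. The function names are hypothetical.

```python
import math


def stitched_fov(rotation_span, hfov):
    """Assumed rule: total horizontal rotation plus one image's field angle (radians)."""
    return min(rotation_span + hfov, 2.0 * math.pi)


def stitched_width(image_width, hfov, xi):
    """Assumed rule: keep the same pixels-per-radian as a single image."""
    return round(image_width * xi / hfov)


xi = stitched_fov(math.radians(90.0), math.radians(60.0))
w_new = stitched_width(1920, math.radians(60.0), xi)
```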

Optionally, the method further comprises:

the correspondence between the dome camera coordinates and the world coordinates of the plurality of images is established according to the pose matrix and the intrinsic matrix of the dome camera in the following manner:

wherein (X, Y, Z) is the world coordinates, (x, y) is the dome camera coordinates of the plurality of images, R is the pose matrix, and K is the intrinsic matrix of the dome camera;

the coordinate points of the plurality of images are projected onto the spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images, the width of the stitched image, and the horizontal field angle of the stitched image in the following manner:

wherein w_new is the width of the stitched image, ξ is the horizontal field angle of the stitched image, and u and v are the projection coordinate values of the dome camera coordinates of the plurality of images after projection onto the spherical surface.
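Since the projection formula itself is not reproduced in this text, the following is only a sketch of one common spherical (equirectangular) mapping that uses the same quantities: w_new, ξ, and the world coordinates (X, Y, Z). The scale w_new / ξ (pixels per radian), the centering offsets, and the stitched height `h_new` are assumptions introduced for illustration.

```python
import math


def ray_to_panorama(X, Y, Z, w_new, xi, h_new):
    """Map a world-frame ray to stitched-image coordinates (u, v).

    Assumed mapping: longitude and latitude scaled by w_new / xi,
    with the optical axis (0, 0, 1) landing at the image centre.
    """
    scale = w_new / xi                       # pixels per radian
    theta = math.atan2(X, Z)                 # longitude about the vertical axis
    phi = math.atan2(Y, math.hypot(X, Z))    # latitude
    u = scale * theta + w_new / 2.0
    v = scale * phi + h_new / 2.0
    return u, v
```

For example, with `w_new = 1000`, `xi = math.pi / 2`, and `h_new = 400`, the ray (0, 0, 1) maps to the centre of the stitched image, and a ray 45 degrees to the right, (1, 0, 1), maps to the right edge.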

Optionally, before back-projecting a portion of pixels in the overlapping region on the sphere into the plurality of images, determining a re-projection error between adjacent images after back-projection, the method further comprises:

recording the projection coordinate values, on the spherical surface, of the upper left corner coordinate point and the lower right corner coordinate point of each of the plurality of images;

determining the average of the projection coordinate values of the lower right corner coordinate point of a first image and the projection coordinate values of the upper left corner coordinate point of a second image among the plurality of images, and taking the average as the center coordinates of the overlapping area of the two adjacent images, wherein the first image and the second image are adjacent, and for the first and the last of the plurality of images the center coordinates are the projection coordinate values of the upper left corner coordinate point and of the lower right corner coordinate point, respectively;

and acquiring the partial pixel points in the overlapping area according to the center coordinates of the overlapping area.
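The overlap-center computation above can be sketched as follows. The averaging of the lower right corner of one image with the upper left corner of the next follows the text; the grid sampling around the center in `sample_points` is an assumption about how the "partial pixel points" might be chosen, and both function names are hypothetical.

```python
def overlap_centers(corners):
    """corners: per image, ((u_tl, v_tl), (u_br, v_br)) on the sphere, in capture order.

    Returns one centre per pair of adjacent images.
    """
    centers = []
    for (_, br_a), (tl_b, _) in zip(corners, corners[1:]):
        centers.append(((br_a[0] + tl_b[0]) / 2.0, (br_a[1] + tl_b[1]) / 2.0))
    return centers


def sample_points(center, half_width, half_height, step):
    """Assumed sampling: a regular grid of points around the overlap centre."""
    cu, cv = center
    points = []
    dv = -half_height
    while dv <= half_height:
        du = -half_width
        while du <= half_width:
            points.append((cu + du, cv + dv))
            du += step
        dv += step
    return points


corners = [((0, 0), (100, 80)), ((60, 0), (160, 80)), ((120, 0), (220, 80))]
centers = overlap_centers(corners)
grid = sample_points(centers[0], 2, 2, 2)
```

Sampling only a small grid per overlap keeps the reprojection error cheap to evaluate, which is consistent with using only part of the overlapping pixels.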

Optionally, back projecting the projection point on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle, and obtaining the stitched image of the plurality of images includes:

determining an adjusted pose matrix and an adjusted intrinsic matrix using the adjusted roll angle and the adjusted focal length;

and back-projecting the projection points on the spherical surface into the plane according to the adjusted pose matrix and the adjusted intrinsic matrix to obtain the stitched image of the plurality of images.

Optionally, the method further comprises:

back-projecting the projection points on the spherical surface into the plane according to the adjusted pose matrix and the adjusted intrinsic matrix in the following manner to obtain the stitched image of the plurality of images:

wherein (X, Y, Z) is the world coordinates, R' is the adjusted pose matrix, K' is the adjusted intrinsic matrix, and u and v are the projection coordinate values of the coordinate points of the plurality of images after projection onto the spherical surface.
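The back projection can be illustrated with the sketch below. Because the formula is not reproduced in this text, the mapping is an assumption: (u, v) is first converted to a unit ray using an equirectangular convention with scale w_new / ξ, and the ray is then projected through the adjusted pose matrix R' and intrinsic matrix K' with a pinhole model. `h_new` and the function name are hypothetical.

```python
import math


def panorama_to_pixel(u, v, w_new, xi, h_new, R_adj, K_adj):
    """Map stitched-image coordinates (u, v) to a source-image pixel (x, y).

    Returns None when the ray points behind the camera.
    """
    scale = w_new / xi                         # pixels per radian (assumed)
    theta = (u - w_new / 2.0) / scale          # longitude
    phi = (v - h_new / 2.0) / scale            # latitude
    world = [math.cos(phi) * math.sin(theta),  # X
             math.sin(phi),                    # Y
             math.cos(phi) * math.cos(theta)]  # Z
    # Rotate the ray into the camera frame with the adjusted pose matrix.
    cam = [sum(R_adj[i][k] * world[k] for k in range(3)) for i in range(3)]
    if cam[2] <= 0.0:
        return None
    f, cx, cy = K_adj[0][0], K_adj[0][2], K_adj[1][2]
    return f * cam[0] / cam[2] + cx, f * cam[1] / cam[2] + cy
```

Iterating this mapping over every pixel of the stitched image and sampling the source image at the returned (x, y) is one way to fill the stitched image; interpolation and blending between overlapping images are outside this sketch.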

Optionally, determining the target focal length corresponding to the target magnification includes:

establishing a correspondence between the magnification and the focal length of the dome camera;

and determining the target focal length corresponding to the target magnification according to the correspondence between the magnification and the focal length.
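A minimal sketch of such a lookup is given below, assuming the correspondence is stored as a table of (magnification, focal length) pairs. Linear interpolation between calibrated magnifications is an assumption, and the table values are made-up placeholders rather than measured data.

```python
from bisect import bisect_left


def focal_for_magnification(table, magnification):
    """table: list of (magnification, focal_length_px), sorted by magnification."""
    mags = [m for m, _ in table]
    if magnification <= mags[0]:
        return table[0][1]
    if magnification >= mags[-1]:
        return table[-1][1]
    i = bisect_left(mags, magnification)
    (m0, f0), (m1, f1) = table[i - 1], table[i]
    t = (magnification - m0) / (m1 - m0)
    return f0 + t * (f1 - f0)      # linear interpolation (assumed)


# Placeholder calibration table, for illustration only.
table = [(1, 1000.0), (2, 2000.0), (4, 4100.0)]
target_f = focal_for_magnification(table, 3)
```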

Optionally, establishing the correspondence between the magnification and the focal length of the dome camera includes:

capturing a third image with the dome camera at each of a plurality of different magnifications, and capturing a fourth image after controlling the dome camera to rotate by a preset angle in the horizontal direction and in the vertical direction, wherein an overlapping area exists between the third image and the fourth image;

acquiring a first feature point from the third image and a second feature point from the fourth image;

determining the correspondence between the coordinates of the first feature point and the coordinates of the second feature point according to the intrinsic matrix of the dome camera and the pose matrix of the dome camera;

acquiring a preset number of matched point pairs from the third image and the fourth image through feature extraction, and determining a homography matrix from the matched point pairs, wherein the preset number is an integer greater than or equal to 4;

determining the focal length of the dome camera according to the homography matrix and the correspondence between the coordinates of the first feature point and the coordinates of the second feature point;

and establishing the correspondence between the magnification and the focal length according to the focal lengths corresponding to the different magnifications.

Optionally, determining the correspondence between the coordinates of the first feature point and the coordinates of the second feature point according to the intrinsic matrix of the dome camera and the pose matrix of the dome camera includes:

determining a horizontal yaw angle pose matrix and a pitch angle pose matrix of the dome camera according to the preset angles by which the dome camera rotates in the horizontal direction and the vertical direction, respectively;

determining the pose matrix of the dome camera according to the horizontal yaw angle pose matrix, the pitch angle pose matrix, and a roll angle pose matrix, wherein the roll angle of the dome camera is set to 0 and the roll angle pose matrix is determined according to the roll angle of the dome camera;

determining the correspondence between the coordinates of the first feature point and the coordinates of the second feature point according to the intrinsic matrix of the dome camera and the pose matrix of the dome camera in the following manner:

wherein p₁ is the coordinates of the first feature point, p₂ is the coordinates of the second feature point, G is a target matrix, R is the pose matrix, and F is determined according to the intrinsic matrix.

Optionally, determining the focal length of the dome camera according to the homography matrix and the correspondence between the coordinates of the first feature point and the coordinates of the second feature point includes:

determining the target matrix from the homography matrix as G = C H C^-1, wherein H is the homography matrix, C is determined according to the intrinsic matrix, and c_x and c_y represent the offset of the optical axis of the dome camera in the image coordinate system;

determining the focal length of the dome camera according to the target matrix and the pose matrix in the following manner:

wherein f is the focal length, G(i, j) and R(i, j) denote the element in row i and column j of G and R, respectively, and i and j are integers greater than 0.
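The relation G = C H C^-1 can be illustrated as follows. The concrete forms used here are assumptions, since the matrices C and F and the focal length formula are not reproduced in this text: C is taken as the translation that removes the principal point offset (c_x, c_y), F as diag(f, f, 1), and G as F R F^-1, so that G(1,3) = f·R(1,3), G(2,3) = f·R(2,3), G(3,1) = R(3,1)/f, and G(3,2) = R(3,2)/f. H is assumed to be already scaled so that its determinant is 1. The function names are hypothetical.

```python
def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def target_matrix(H, cx, cy):
    """G = C H C^-1, with C assumed to shift the principal point to the origin."""
    C = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]
    C_inv = [[1.0, 0.0, cx], [0.0, 1.0, cy], [0.0, 0.0, 1.0]]
    return matmul3(C, matmul3(H, C_inv))


def focal_from_target(G, R, eps=1e-9):
    """Average the focal length estimates implied by G = F R F^-1 (assumed form).

    Indices are 0-based here; the text uses 1-based G(i, j), R(i, j).
    """
    estimates = []
    for i, j in ((0, 2), (1, 2)):            # G = f * R for these entries
        if abs(R[i][j]) > eps:
            estimates.append(G[i][j] / R[i][j])
    for i, j in ((2, 0), (2, 1)):            # G = R / f for these entries
        if abs(R[i][j]) > eps and abs(G[i][j]) > 1e-12:
            estimates.append(R[i][j] / G[i][j])
    if not estimates:
        raise ValueError("rotation too small to constrain the focal length")
    return sum(estimates) / len(estimates)
```

Entries of R close to zero are skipped because they carry no information about f; this is why the camera is rotated by a known, non-zero preset angle between the two captures.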

According to another embodiment of the present invention, there is also provided an image stitching apparatus including:

the first determining module is used for adjusting the rotation angle of the dome camera in the horizontal direction under the target multiplying power, collecting a plurality of images and determining a target focal length corresponding to the target multiplying power, wherein a superposition area exists between adjacent images in the plurality of images;

the projection module is used for projecting coordinate points of the plurality of images to a spherical surface according to the pre-established corresponding relation between the spherical coordinates and the world coordinates of the plurality of images, wherein the corresponding relation between the spherical coordinates and the world coordinates of the plurality of images is determined according to the rotation angle;

the first back projection module is used for back-projecting some of the pixels in the overlapping area on the spherical surface into the plurality of images and determining a reprojection error between adjacent images after back projection;

the adjusting module is used for adjusting the target focal length and the rolling angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted rolling angle;

and the second back projection module is used for back projecting the projection point on the spherical surface into a plane according to the adjusted focal length and the adjusted rolling angle to obtain a spliced image of the plurality of images.

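Optionally, the adjusting module is further used for adjusting the target focal length and the roll angle of the dome camera to obtain the adjusted focal length and the adjusted roll angle, so that the reprojection error is minimized.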

Optionally, the projection module includes:

the first establishing submodule is used for establishing the corresponding relation between the spherical machine coordinates and the world coordinates of the plurality of images according to the rotation angle;

and the projection sub-module is used for projecting coordinate points of the plurality of images to a spherical surface according to the corresponding relation between the spherical coordinates and world coordinates of the plurality of images.

Optionally, the first building sub-module includes:

a first determining unit, configured to determine a horizontal yaw angle pose matrix of the dome camera according to the rotation angle;

a first acquisition unit, configured to acquire a pitch angle pose matrix and a roll angle pose matrix of the dome camera, wherein the roll angle of the dome camera is set to 0 and the roll angle pose matrix is determined according to the roll angle of the dome camera;

a second determining unit, configured to determine the pose matrix of the dome camera according to the horizontal yaw angle pose matrix, the pitch angle pose matrix, and the roll angle pose matrix;

a first establishing unit, configured to establish the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the pose matrix and the intrinsic matrix of the dome camera, wherein the intrinsic matrix of the dome camera is determined according to the target focal length.

Optionally, the projection submodule includes:

the third determining unit is used for determining the horizontal field angle after image splicing according to the rotation angle and the horizontal field angle of the dome camera;

a fourth determining unit configured to determine a width of the spliced image according to the widths of the plurality of images;

and a projection unit, configured to project coordinate points of the plurality of images onto the spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images, the width of the stitched image, and the horizontal field angle of the stitched image.

Optionally, the first establishing unit is further configured to establish the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the pose matrix and the intrinsic matrix of the dome camera in the following manner:

wherein (X, Y, Z) is the world coordinates, (x, y) is the dome camera coordinates of the plurality of images, R is the pose matrix, and K is the intrinsic matrix of the dome camera;

the projection unit is further configured to project coordinate points of the plurality of images onto a spherical surface according to a correspondence between spherical coordinates and world coordinates of the plurality of images, a width of the spliced image, and a horizontal field angle of the spliced image in the following manner:

Optionally, the apparatus further comprises:

the recording module is used for recording the projection coordinate values, on the spherical surface, of the upper left corner coordinate point and the lower right corner coordinate point of each of the plurality of images;

a second determining module, configured to determine an average value of projection coordinate values of the lower right corner coordinate point of a first image and projection coordinate values of the upper left corner coordinate point of a second image in the plurality of images, and determine the average value as a center coordinate of a superposition area of two adjacent images, where the first image and the second image are two adjacent images, and center coordinates of a first image and a last image in the plurality of images are respectively the projection coordinate values of the upper left corner coordinate point and the projection coordinate values of the lower right corner coordinate point;

and the acquisition module is used for acquiring the partial pixel points in the overlapping area according to the central coordinates of the overlapping area.

Optionally, the second back projection module includes:

a third determining submodule, configured to determine an adjusted pose matrix and an adjusted internal reference matrix by using the adjusted roll angle and the adjusted focal length;

and the reverse projection sub-module is used for reversely projecting the projection points on the spherical surface into the plane according to the adjusted posture matrix and the adjusted internal reference matrix to obtain a spliced image of the plurality of images.

Optionally, the back projection submodule is further configured to back project a projection point on the spherical surface into the plane according to the adjusted pose matrix and the adjusted internal reference matrix in the following manner, so as to obtain a stitched image of the multiple images:

Optionally, the first determining module includes:

the second establishing submodule is used for establishing the corresponding relation between the multiplying power and the focal length of the ball machine;

and the fourth determining submodule is used for determining a target focal length corresponding to the target multiplying power according to the corresponding relation between the multiplying power and the focal length.

Optionally, the second building sub-module includes:

the acquisition unit is used for acquiring third images through the ball machine under different multiplying powers respectively, and acquiring fourth images after controlling the ball machine to rotate in the horizontal direction and the vertical direction by a preset angle, wherein an overlapping area exists between the third images and the fourth images;

a second obtaining unit, configured to obtain a first feature point from the third image, and obtain a second feature point from the fourth image;

a fifth determining unit, configured to determine the correspondence between the coordinates of the first feature point and the coordinates of the second feature point according to the intrinsic matrix of the dome camera and the pose matrix of the dome camera;

a sixth determining unit, configured to obtain a predetermined number of paired points from the third image and the fourth image through feature extraction, and determine a homography matrix according to the predetermined number of paired points, where the predetermined number is an integer greater than or equal to 4;

a seventh determining unit, configured to determine a focal length of the spherical machine according to the homography matrix, a correspondence between coordinates of the first feature point and coordinates of the second feature point;

and the second establishing unit is used for establishing the corresponding relation between the multiplying power and the focal length according to the focal length corresponding to the different multiplying powers.

Optionally, the fifth determining unit is further configured to

wherein p₁ is the coordinates of the first feature point, p₂ is the coordinates of the second feature point, G is a target matrix, R is the pose matrix, and F is determined according to the intrinsic matrix.

Optionally, the seventh determining unit is further configured to

According to a further embodiment of the invention, there is also provided a computer-readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

According to the invention, the rotation angle of the dome camera in the horizontal direction is adjusted at the target magnification, a plurality of images are captured, and the target focal length corresponding to the target magnification is determined, wherein an overlapping area exists between adjacent images in the plurality of images; coordinate points of the plurality of images are projected onto a spherical surface according to a pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images, wherein the correspondence is determined according to the rotation angle; some of the pixels in the overlapping area on the spherical surface are back-projected into the plurality of images, and a reprojection error between adjacent images after back projection is determined; the target focal length and the roll angle of the dome camera are adjusted according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle; and the projection points on the spherical surface are back-projected into a plane according to the adjusted focal length and the adjusted roll angle to obtain a stitched image of the plurality of images. This solves the problems in the related art that the scale-invariant feature transform (SIFT) algorithm is relatively time-consuming, that the images to be stitched must have rich texture features, and that stitching may fail in areas with sparse features. Because the stitching is based on the camera pose and optimizes the pose angle and the focal length, it does not require SIFT-like feature extraction and matching, takes little time, and works normally in scenes with little texture.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a block diagram of the hardware structure of a mobile terminal for an image stitching processing method according to an embodiment of the present invention;

FIG. 2 is a flowchart of an image stitching method according to an embodiment of the present invention;

FIG. 3 is a block diagram of an image stitching apparatus according to an embodiment of the present invention.

Detailed Description

The invention will be described in detail hereinafter with reference to the drawings in conjunction with embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

Example 1

The method embodiment provided in Example 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal for the image stitching method according to an embodiment of the present invention. As shown in FIG. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may further include a transmission device 106 for communication functions and an input/output device 108. Those skilled in the art will appreciate that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the mobile terminal described above. For example, the mobile terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the image stitching processing method in an embodiment of the present invention. The processor 102 runs the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. The specific examples of networks described above may include wireless networks provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.

In this embodiment, an image stitching method operating on the mobile terminal or the network architecture is provided, and fig. 2 is a flowchart of the image stitching method according to an embodiment of the present invention, as shown in fig. 2, where the flowchart includes the following steps:

Step S202, adjusting the rotation angle of the dome camera in the horizontal direction at a target magnification, capturing a plurality of images, and determining a target focal length corresponding to the target magnification, wherein an overlapping area exists between adjacent images in the plurality of images;

Step S204, projecting coordinate points of the plurality of images onto a spherical surface according to a pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images, wherein the correspondence is determined according to the rotation angle;

Step S206, back-projecting partial pixels in the overlapping area on the spherical surface into the plurality of images, and determining a reprojection error between adjacent images after back projection;

Step S208, adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle;

Further, step S208 may specifically include: adjusting the target focal length and the roll angle of the dome camera so that the reprojection error is minimized, thereby obtaining the adjusted focal length and the adjusted roll angle;

Step S210, back-projecting the projection points on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle to obtain a stitched image of the plurality of images.

Steps S202 to S210 address the problems in the related art that the scale-invariant feature transform (SIFT) algorithm is relatively time-consuming, requires the images to be stitched to have rich texture features, and cannot work normally in areas with sparse features.

In the embodiment of the present invention, the step S204 may specifically include:

S2041, establishing the correspondence between the dome camera coordinates and the world coordinates of the plurality of images according to the rotation angle;

Further, a horizontal yaw angle attitude matrix of the dome camera is determined according to the rotation angle; a pitch angle attitude matrix and a roll angle attitude matrix of the dome camera are acquired, wherein the roll angle of the dome camera is set to 0 and the roll angle attitude matrix is determined according to the roll angle; the attitude matrix of the dome camera is determined according to the horizontal yaw angle attitude matrix, the pitch angle attitude matrix, and the roll angle attitude matrix; and the correspondence between the dome camera coordinates and the world coordinates of the plurality of images is established according to the attitude matrix and the intrinsic matrix of the dome camera, wherein the intrinsic matrix is determined according to the target focal length.

Specifically, the correspondence between the dome camera coordinates and the world coordinates of the plurality of images may be established according to the attitude matrix and the intrinsic matrix of the dome camera in the following manner:

The coordinate points of the plurality of images may be projected onto a spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images, the width of the stitched image, and the horizontal field of view of the stitched image in the following manner:

S2042, projecting coordinate points of the plurality of images onto a spherical surface according to the correspondence between the dome camera coordinates and the world coordinates of the plurality of images.

Further, step S2042 may specifically include:

In the embodiment of the present invention, before partial pixels in the overlapping area on the spherical surface are back-projected into the plurality of images and the reprojection error between adjacent images is determined, the projected coordinate values on the spherical surface of the upper-left corner point and the lower-right corner point of each of the plurality of images are recorded. The average of the projected coordinate value of the lower-right corner point of a first image and the projected coordinate value of the upper-left corner point of a second image is determined as the center coordinate of the overlapping area of the two adjacent images, wherein the first image and the second image are two adjacent images among the plurality of images, and the center coordinates for the first image and the last image of the plurality of images are the projected coordinate value of the upper-left corner point and the projected coordinate value of the lower-right corner point, respectively. The partial pixels in the overlapping area are then acquired according to the center coordinate of the overlapping area.
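The averaging rule for the overlap centers can be sketched as follows. This is a minimal illustration, assuming the projected corner coordinates of each image are already available as 2D tuples; the function name `overlap_centers` and the sample values are hypothetical and not taken from the patent.

```python
def overlap_centers(top_left, bottom_right):
    """Return the center coordinates described in the text.

    top_left[i] and bottom_right[i] are the projected corner coordinates of
    image i on the sphere. The result has n + 1 entries: the first image's
    top-left corner, one averaged center per adjacent pair, and the last
    image's bottom-right corner.
    """
    n = len(top_left)
    centers = [top_left[0]]
    for i in range(n - 1):
        (x1, y1), (x2, y2) = bottom_right[i], top_left[i + 1]
        centers.append(((x1 + x2) / 2, (y1 + y2) / 2))
    centers.append(bottom_right[n - 1])
    return centers


# Made-up projected corners for three horizontally adjacent images.
tl = [(0.0, 0.0), (80.0, 2.0), (160.0, 4.0)]
br = [(100.0, 50.0), (180.0, 52.0), (260.0, 54.0)]
centers = overlap_centers(tl, br)
```

Pixels to sample for the reprojection error would then be taken in a neighborhood of each interior center; the size of that neighborhood is not stated in the text.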

In the embodiment of the present invention, the step S210 may specifically include:

Specifically, the projection points on the spherical surface are back-projected into the plane according to the adjusted attitude matrix and the adjusted intrinsic matrix in the following manner, so as to obtain the stitched image of the plurality of images:

In the embodiment of the present invention, the step S202 may specifically include:

S2021, establishing a correspondence between the magnification and the focal length of the dome camera;

Further, this comprises the following steps:

S11, at each of a number of different magnifications, acquiring a third image through the dome camera, and acquiring a fourth image after controlling the dome camera to rotate by preset angles in the horizontal direction and the vertical direction, wherein an overlapping area exists between the third image and the fourth image; acquiring first feature points from the third image and second feature points from the fourth image; and determining the correspondence between the coordinates of the first feature points and the coordinates of the second feature points according to the intrinsic matrix and the attitude matrix of the dome camera;

Specifically, a horizontal yaw angle attitude matrix and a pitch angle attitude matrix of the dome camera are determined according to the preset angles by which the dome camera rotates in the horizontal direction and the vertical direction, respectively; the attitude matrix of the dome camera is determined according to the horizontal yaw angle attitude matrix, the pitch angle attitude matrix, and the roll angle attitude matrix, wherein the roll angle of the dome camera is set to 0 and the roll angle attitude matrix is determined according to the roll angle; and the correspondence between the coordinates of the first feature points and the coordinates of the second feature points is determined according to the intrinsic matrix and the attitude matrix of the dome camera in the following manner:

S12, acquiring a preset number of matching point pairs from the third image and the fourth image through feature extraction, and determining a homography matrix according to the preset number of matching point pairs, wherein the preset number is an integer greater than or equal to 4; determining the focal length of the dome camera according to the homography matrix and the correspondence between the coordinates of the first feature points and the coordinates of the second feature points; and establishing the correspondence between the magnification and the focal length according to the focal lengths corresponding to the different magnifications.

Specifically, the target matrix $G$ is determined from the homography matrix as $G = C^{-1}HC$, where $H$ is the homography matrix and $C$ is determined according to the intrinsic matrix.

S2022, determining the target focal length corresponding to the target magnification according to the correspondence between the magnification and the focal length.

The following describes embodiments of the present invention in detail.

According to the embodiment of the present invention, the camera intrinsic parameters can be obtained accurately when the PTZ dome camera changes magnification, and the focal length and the roll angle are optimized by minimizing the reprojection error based on the rotation angle information of the dome camera, so that stitched images at different magnifications are obtained.

$F_w$ and $F_c$ denote the world coordinate system and the dome camera coordinate system, respectively; the two are related through the camera intrinsic matrix and the camera extrinsic matrix. When the PTZ dome camera is in a given state, let $\alpha_1$ be the horizontal rotation angle and $\beta_1$ the vertical pitch angle. The camera intrinsic matrix $K$ is:

$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

where $f_x$ and $f_y$ are the normalized focal lengths along the x axis and the y axis of the image coordinate system, and $c_x$ and $c_y$ are the offsets of the camera optical axis in the image coordinate system, generally $c_x = w/2$ and $c_y = h/2$, with $w$ the width of the image and $h$ the height of the image.

Images are acquired by the dome camera at different magnifications to obtain a first image (corresponding to the third image above); the dome camera is then controlled to rotate by preset angles in the horizontal and vertical directions and an image is acquired again to obtain a second image (corresponding to the fourth image above). The dome camera captures the first image at a position where the scene in the field of view is rich in detail. It is then rotated by certain angles in the horizontal and vertical directions, denoted $\alpha_2$ and $\beta_2$, chosen so that the two adjacent images contain enough feature points and share a large overlapping area.

Feature points of the first image and the second image are extracted and matched using the ORB algorithm to obtain n pairs of matching feature points. The coordinates of each matched pair are related through the camera projection, where $p_1$ and $p_2$ denote the two paired feature points in the first image and the second image, respectively, $P_w$ is the coordinate of the feature point in the three-dimensional world coordinate system, and $s_1$ and $s_2$ denote the scale of the feature point in the corresponding camera coordinate systems. The projection for each camera state involves the rotation matrices corresponding to the horizontal yaw angle and the vertical pitch angle of that state.

Combining the projection relations of the two states and rearranging gives

$$p_2 = K R K^{-1} p_1,$$

where $R$ is the camera extrinsic rotation matrix, which can be obtained directly from the angle information of the dome camera.
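The step from the two single-view projections to $p_2 = KRK^{-1}p_1$ can be sketched as follows. This is a reconstruction under stated assumptions rather than the patent's own formulas: $R_1$ and $R_2$ are hypothetical names for the attitude matrices of the dome camera in the two states, the camera is assumed to rotate about its optical center so that no translation term appears, and $\simeq$ denotes equality up to the scale factor $s_1/s_2$.

```latex
\begin{aligned}
s_1\, p_1 &= K R_1 P_w, \qquad s_2\, p_2 = K R_2 P_w \\
P_w &= s_1\, R_1^{-1} K^{-1} p_1 \\
s_2\, p_2 &= s_1\, K R_2 R_1^{-1} K^{-1} p_1 \\
p_2 &\simeq K R K^{-1} p_1, \qquad R = R_2 R_1^{-1}
\end{aligned}
```

Under these assumptions $R$ depends only on the yaw and pitch angles of the two states, which is why it can be computed from the angle readout of the dome camera without any image information.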

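The camera intrinsic matrix $K$ may be decomposed as

$$K = CF,$$

where the matrices $C$ and $F$ are

$$C = \begin{bmatrix} 1 & 0 & c_x \\ 0 & 1 & c_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad F = \begin{bmatrix} f_x & 0 & 0 \\ 0 & f_y & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Substituting gives $p_2 = CFRF^{-1}C^{-1}p_1 = CGC^{-1}p_1 = Hp_1$.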

where the homography matrix $H$ and the matrix $G$ are:

$$H = CGC^{-1}, \qquad G = FRF^{-1}.$$
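The relations between $K$, $C$, $F$, $G$, and $H$ can be checked numerically with a short sketch. It assumes the decomposition $K = CF$ with $C$ holding the optical-axis offsets and $F$ the focal lengths; the numeric values and the choice of rotation axis are illustrative assumptions, not values from the patent.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# Illustrative values only (assumptions).
fx, fy = 1200.0, 1180.0      # focal lengths in pixels
cx, cy = 960.0, 540.0        # offsets for a 1920x1080 image

K = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
C = [[1.0, 0.0, cx], [0.0, 1.0, cy], [0.0, 0.0, 1.0]]   # offset part
F = [[fx, 0.0, 0.0], [0.0, fy, 0.0], [0.0, 0.0, 1.0]]   # focal-length part
C_inv = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]
F_inv = [[1.0 / fx, 0.0, 0.0], [0.0, 1.0 / fy, 0.0], [0.0, 0.0, 1.0]]
K_inv = matmul(F_inv, C_inv)  # (C F)^-1 = F^-1 C^-1

# Example rotation: 5 degrees about the vertical axis (axis choice assumed).
angle = math.radians(5.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]

G = matmul(matmul(F, R), F_inv)          # G = F R F^-1
H_from_G = matmul(matmul(C, G), C_inv)   # H = C G C^-1
H_from_K = matmul(matmul(K, R), K_inv)   # H = K R K^-1
```

Both routes to $H$ agree up to floating-point rounding, which is the property the derivation relies on when it later recovers $G$ from an estimated $H$.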

Let the coordinates of $p_1$ be $(u_1, v_1)$ and those of $p_2$ be $(u_2, v_2)$; then:

Converting this into matrix form:

Each matching point pair yields two equations, so at least 4 matching point pairs are required to solve for the 8-parameter matrix $H$, and no three of the four points may be collinear. Four or more matching point pairs can be extracted from two adjacent images by a feature extraction and matching algorithm, after which the homography matrix $H$ can be obtained by the least squares method. Combining the definitions of the matrices $F$ and $G$, $G$ can then be computed as $G = C^{-1}HC$.
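A least-squares estimate of the 8-parameter $H$ can be sketched as below. The sketch assumes $H$ is normalized so that its bottom-right entry is 1, in which case each pair $(u_1, v_1) \leftrightarrow (u_2, v_2)$ contributes the two linear equations written in the code comments. The function names are hypothetical, and solving the normal equations directly is chosen here only for brevity; it is not stated in the patent.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_homography(pairs):
    """Least-squares H from pairs [((u1, v1), (u2, v2)), ...], with H[2][2] = 1."""
    if len(pairs) < 4:
        raise ValueError("at least 4 matching point pairs are required")
    rows, rhs = [], []
    for (u1, v1), (u2, v2) in pairs:
        # h1*u1 + h2*v1 + h3 - u2*u1*h7 - u2*v1*h8 = u2
        rows.append([u1, v1, 1.0, 0.0, 0.0, 0.0, -u2 * u1, -u2 * v1])
        rhs.append(u2)
        # h4*u1 + h5*v1 + h6 - v2*u1*h7 - v2*v1*h8 = v2
        rows.append([0.0, 0.0, 0.0, u1, v1, 1.0, -v2 * u1, -v2 * v1])
        rhs.append(v2)
    # Normal equations (A^T A) h = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(8)]
    h = solve_linear(ata, atb)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a point (u, v) through H and divide by the homogeneous coordinate."""
    u, v = point
    d = H[2][0] * u + H[2][1] * v + H[2][2]
    return ((H[0][0] * u + H[0][1] * v + H[0][2]) / d,
            (H[1][0] * u + H[1][1] * v + H[1][2]) / d)
```

With real pixel coordinates the normal equations can become poorly conditioned, so a practical implementation would likely normalize the coordinates first or use an existing robust estimator.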

Then, using the matrix multiplication relationship $G = FRF^{-1}$, the focal length of the dome camera is determined, and $f_x$ and $f_y$ are calculated as follows:

where $G(i,j)$ and $R(i,j)$ ($i = 1, 2, 3$; $j = 1, 2, 3$) denote the elements in row $i$ and column $j$ of the matrices $G$ and $R$, respectively. The magnification of the dome camera is then stepped up one level at a time and the above method is repeated to calculate the focal length at each magnification and build a lookup table. For larger magnifications, a quadratic function is fitted by the least squares method to the data already in the lookup table:

$$f(x) = ax^2 + bx + c,$$

which yields the three coefficients $a$, $b$, and $c$ and hence the focal length at each magnification.
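The two computations in this passage, recovering the focal length from $G$ and $R$ and fitting the quadratic to the lookup table, can be sketched as follows. The patent's own expression for $f_x$ and $f_y$ is not reproduced in the text, so `focal_from_g` uses one relation that follows from $G = FRF^{-1}$ with $F = \mathrm{diag}(f_x, f_y, 1)$, namely $G(1,3) = f_x R(1,3)$ and $G(2,3) = f_y R(2,3)$; this choice, the function names, and the sample table are assumptions.

```python
def focal_from_g(G, R):
    """Recover (fx, fy) from G = F R F^-1 with F = diag(fx, fy, 1).

    G may carry an arbitrary scale because H is only defined up to scale,
    so it is rescaled using G(3,3) = R(3,3), which holds for the unscaled G.
    Needs R[0][2] != 0 and R[1][2] != 0, i.e. rotation in both directions.
    """
    scale = R[2][2] / G[2][2]
    fx = scale * G[0][2] / R[0][2]   # G(1,3) = fx * R(1,3)
    fy = scale * G[1][2] / R[1][2]   # G(2,3) = fy * R(2,3)
    return fx, fy


def fit_quadratic(xs, ys):
    """Least-squares fit of f(x) = a*x**2 + b*x + c; returns (a, b, c)."""
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the 3x3 normal equations.
    m = [ata[i] + [atb[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (m[i][3] - tail) / m[i][i]
    return tuple(coeffs)


def focal_at(coeffs, magnification):
    """Evaluate the fitted quadratic at a given magnification."""
    a, b, c = coeffs
    return a * magnification * magnification + b * magnification + c


# Hypothetical lookup table: magnification -> measured focal length (pixels).
table = {1: 1002.0, 2: 1108.0, 3: 1218.0, 4: 1332.0, 5: 1450.0, 6: 1572.0}
coeffs = fit_quadratic(list(table.keys()), list(table.values()))
```

Calling `focal_at(coeffs, 10)` would then give an extrapolated focal length for a magnification beyond the measured range, which matches the role the text assigns to the fitted function.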

Using the relationship between magnification and focal length in the lookup table, at a given magnification the angle $\Delta\alpha$ of each horizontal rotation of the dome camera is set so that two adjacent images share a certain overlapping area, and n images are captured. The width $w_{new}$ of the stitched image is determined. The horizontal field of view $\theta$ and the vertical field of view $\omega$ of the dome camera are obtained from the focal length $f$:

where arctan denotes the arctangent, $w$ is the width of the captured image, $h$ is the height of the captured image, and the units are pixels.
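The field-of-view computation can be sketched with the standard pinhole relation $\theta = 2\arctan\!\big(w/(2f)\big)$ and $\omega = 2\arctan\!\big(h/(2f)\big)$. The patent's own formula is not reproduced in the text, so this form is an assumption that is merely consistent with the symbols it names (arctan, $w$, $h$, $f$ in pixels); the function name is hypothetical.

```python
import math


def field_of_view(f, w, h):
    """Return (theta, omega) in radians for focal length f and image size w x h.

    All three inputs are in pixels. Uses the pinhole relation
    angle = 2 * atan(extent / (2 * f)), assumed here.
    """
    theta = 2.0 * math.atan(w / (2.0 * f))   # horizontal field of view
    omega = 2.0 * math.atan(h / (2.0 * f))   # vertical field of view
    return theta, omega


# Example: a 1920x1080 image with f = 960 px has a 90 degree horizontal view.
theta, omega = field_of_view(960.0, 1920, 1080)
```

A longer focal length (higher magnification) gives a narrower field of view, so the rotation step between images has to shrink accordingly to keep an overlap.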

The horizontal field of view $\xi$ of the stitched image is calculated as:

$$\xi = (n - 1) \times \Delta\alpha + \theta.$$

The height $h_{new}$ of the stitched image is calculated according to the spherical projection:

$$h_{new} = w_{new} \div \xi \times \omega.$$

The width and height of the stitched image are adjusted to be divisible by 2 by dividing by 2, rounding down, and multiplying by 2, namely:

$$w_{new} = [w_{new} \div 2] \times 2, \qquad h_{new} = [h_{new} \div 2] \times 2,$$

where "[ ]" denotes rounding down, i.e., $[x]$ is the largest integer less than or equal to $x$.
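The size computation above translates directly into a few lines of code. The sketch below follows the three formulas as written; the function name and the sample numbers are hypothetical, and the angles only need to share one unit because they enter $h_{new}$ as a ratio.

```python
import math


def stitched_size(n, delta_alpha, theta, omega, w_new):
    """Return (xi, even_width, even_height) for the stitched image.

    n: number of images; delta_alpha: horizontal rotation step;
    theta, omega: horizontal and vertical field of view of one image;
    w_new: chosen width of the stitched image in pixels.
    """
    xi = (n - 1) * delta_alpha + theta          # horizontal view of the panorama
    h_new = w_new / xi * omega                  # height from the angle ratio
    even_width = math.floor(w_new / 2) * 2      # force divisibility by 2
    even_height = math.floor(h_new / 2) * 2
    return xi, even_width, even_height


# Example with made-up values, angles in degrees.
xi, width, height = stitched_size(5, 20.0, 30.0, 17.0, 3301)
```

Even dimensions are a common requirement of downstream image pipelines such as chroma-subsampled encoders, which may be the motivation for the rounding step, although the text does not say so.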

Let the roll angle $\gamma$ of the dome camera be 0 degrees. The vertical pitch angle $\beta$ can be read from the SDK, and the horizontal yaw angle $\alpha$ is obtained from the horizontal rotation angle of the camera and differs at each position: $\alpha(i) = (i - 1) \times \Delta\alpha$, where $\alpha(i)$ is the horizontal yaw angle of the $i$-th image, $i = 1, 2, \ldots, n$. The roll angle attitude matrix $R_{roll}$, the pitch angle attitude matrix $R_{pitch}$, and the yaw angle attitude matrix $R_{yaw}$ are respectively:

The product of the pitch angle attitude matrix $R_{pitch}$, the yaw angle attitude matrix $R_{yaw}$, and the roll angle attitude matrix $R_{roll}$ is taken as the attitude matrix $R$ for a given state of the camera, i.e., $R = R_{pitch} \times R_{yaw} \times R_{roll}$.
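The composition $R = R_{pitch} \times R_{yaw} \times R_{roll}$ can be sketched as below. The individual matrices are not reproduced in the text, so the axis assignment (pitch about x, yaw about y, roll about the optical axis z) and the sign conventions are assumptions, as are all function names.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_x(angle):
    """Pitch rotation about the x axis (assumed convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle):
    """Yaw rotation about the y axis (assumed convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle):
    """Roll rotation about the z axis (assumed convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def yaw_of_image(i, delta_alpha):
    """Horizontal yaw angle of the i-th image (1-based): (i - 1) * delta_alpha."""
    return (i - 1) * delta_alpha


def attitude_matrix(yaw, pitch, roll=0.0):
    """R = R_pitch x R_yaw x R_roll, all angles in radians."""
    return matmul(matmul(rot_x(pitch), rot_y(yaw)), rot_z(roll))


# Attitude of the third image for a 20 degree step and a 10 degree pitch.
R3 = attitude_matrix(yaw_of_image(3, math.radians(20.0)), math.radians(10.0))
```

Keeping the roll angle as an explicit parameter with a default of 0 mirrors the text: it starts at zero and is only changed later when the reprojection error is minimized.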

Then, a relationship is established between a two-dimensional coordinate point $(x, y)$ in the image and its corresponding world coordinates $(X, Y, Z)$:

where $R$ is the attitude matrix and $K$ is the camera intrinsic matrix. Then, according to the forward projection formula of the spherical projection:

all points in the n images are projected onto the spherical surface.

The projected coordinate values on the spherical surface of the upper-left corner point $(0, 0)$ and the lower-right corner point $(w-1, h-1)$ of each image are recorded. The lower-right corner projected coordinate value of the $i$-th image is averaged with the upper-left corner projected coordinate value of the $(i+1)$-th image to obtain the center coordinate of the overlapping area of the two adjacent images; the center coordinates for the first image and the last image are their upper-left corner projected coordinate and lower-right corner projected coordinate, respectively. Then, according to the back projection formula of the spherical projection:

the pixel coordinates in the overlapping area of the $i$-th image and the $(i+1)$-th image are back-projected into each of the two images using these coordinate values, giving the pixel coordinates in the $i$-th image and in the $(i+1)$-th image, each with the upper-left corner of the image as the origin. The pixel values of the two images at these two coordinates are taken, and the absolute value of their difference is the reprojection error of one pixel. The reprojection errors of the remaining pixels and of the remaining adjacent-image overlapping areas are obtained in the same way and accumulated. The roll angle $\gamma$ is then fine-tuned to minimize the reprojection error, and the roll angle at that point is recorded as $\gamma_{best}$. Similarly, the focal length $f$ is fine-tuned to minimize the reprojection error, and the focal length at that point is recorded as $f_{best}$.
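The accumulate-then-minimize procedure can be sketched as follows. The text does not say how the fine-tuning search is carried out, so the grid search below, the function names, and the toy error function are assumptions for illustration only. In practice the error function would back-project the sampled overlap pixels with the candidate roll angle or focal length and compare the resulting pixel values.

```python
def reprojection_error(values_a, values_b):
    """Accumulated absolute difference between paired pixel values."""
    return sum(abs(a - b) for a, b in zip(values_a, values_b))


def fine_tune(error_fn, start, step, count):
    """Evaluate start + k*step for k in [-count, count]; return the best candidate."""
    candidates = [start + k * step for k in range(-count, count + 1)]
    return min(candidates, key=error_fn)


def toy_error(roll, focal):
    """Stand-in for the real error; minimal at roll 0.3, focal 1005 (made up)."""
    return abs(roll - 0.3) + abs(focal - 1005.0) / 100.0


# Tune the roll angle first with the focal length fixed, then the focal
# length with the best roll angle fixed, in the order the text describes.
f_initial = 1000.0
gamma_best = fine_tune(lambda g: toy_error(g, f_initial), 0.0, 0.1, 10)
f_best = fine_tune(lambda f: toy_error(gamma_best, f), f_initial, 1.0, 10)
```

Because only two scalar parameters are searched and the error is evaluated on a subset of overlap pixels, this kind of search stays cheap compared with feature extraction over whole images.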

Using the optimized roll angle $\gamma_{best}$ and focal length $f_{best}$, the attitude matrix $R$ and the intrinsic matrix $K$ are obtained. Each image is forward-projected onto the spherical surface, the projected points on the spherical surface are back-projected into the plane, and the overlapping areas are processed with a multi-resolution fusion algorithm to smooth the transitions, finally yielding the stitched image.

The camera calibration method provided by the embodiment of the present invention is a self-calibration method: calibration can be completed without a calibration target, the camera focal length can be obtained at different magnifications, and the method is widely applicable. Based on the attitude of the camera, the attitude angle and the focal length are optimized by minimizing the reprojection error, so stitching does not require SIFT-like feature extraction and matching, takes little time, and works normally in scenes with little texture.

Example 2

According to another embodiment of the present invention, there is also provided an image stitching apparatus, fig. 3 is a block diagram of the image stitching apparatus according to an embodiment of the present invention, as shown in fig. 3, including:

a first determining module 32, configured to adjust the rotation angle of the dome camera in the horizontal direction at a target magnification, capture a plurality of images, and determine a target focal length corresponding to the target magnification, where an overlapping area exists between adjacent images in the plurality of images;

a projection module 34, configured to project coordinate points of the plurality of images onto a spherical surface according to a pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images, where the correspondence is determined according to the rotation angle;

a first back projection module 36, configured to back project a portion of pixels in the overlapping region on the sphere into the plurality of images, and determine a re-projection error between adjacent images after the back projection;

an adjustment module 38, configured to adjust the target focal length and the roll angle of the dome camera according to the reprojection error, so as to obtain an adjusted focal length and an adjusted roll angle;

and the second back projection module 310 is configured to back project the projection point on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle, so as to obtain a stitched image of the plurality of images.

Optionally, the adjusting module 38 is further configured to

Optionally, the projection module 34 includes:

Optionally, the first building sub-module includes:

Optionally, the projection submodule includes:

where $(X, Y, Z)$ are the world coordinates, $(x, y)$ are the dome camera coordinates of the plurality of images, $R$ is the attitude matrix, and $K$ is the intrinsic matrix of the dome camera;

Optionally, the apparatus further comprises:

Optionally, the second back projection module 310 includes:

Optionally, the first determining module 32 includes:

Optionally, the second building sub-module includes:

Optionally, the fifth determining unit is further configured to

Optionally, the seventh determining unit is further configured to

It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.

Example 3

Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:

S1, adjusting the rotation angle of the dome camera in the horizontal direction at a target magnification, capturing a plurality of images, and determining a target focal length corresponding to the target magnification, wherein an overlapping area exists between adjacent images in the plurality of images;

S2, projecting coordinate points of the plurality of images onto a spherical surface according to a pre-established correspondence between the dome camera coordinates and the world coordinates of the plurality of images, wherein the correspondence is determined according to the rotation angle;

S3, back-projecting partial pixels in the overlapping area on the spherical surface into the plurality of images, and determining a reprojection error between adjacent images after back projection;

S4, adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle;

S5, back-projecting the projection points on the spherical surface into a plane according to the adjusted focal length and the adjusted roll angle to obtain a stitched image of the plurality of images.

Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.

Example 4

An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.

Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:

Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.

It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may alternatively be implemented in program code executable by computing devices, so that they may be stored in a memory device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module for implementation. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.

Claims

1. An image stitching method is characterized by comprising the following steps:

back-projecting a portion of pixels in the spherical overlapping region into the plurality of images, determining a re-projection error between adjacent images after back-projection, comprising: back projecting pixel coordinates in a superposition area of adjacent images in the plurality of images into each image of the adjacent images to respectively obtain pixel coordinates back projected onto each image of the adjacent images, and determining the re-projection error according to the pixel coordinates back projected onto each image of the adjacent images;

adjusting the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle, wherein the method comprises the following steps: adjusting the target focal length and the rolling angle of the dome camera to obtain an adjusted focal length and an adjusted rolling angle, so that the re-projection error is minimum;

2. The method of claim 1, wherein projecting coordinate points of the plurality of images onto a sphere according to a pre-established correspondence of spherical coordinates of the plurality of images to world coordinates comprises:

3. The method of claim 2, wherein establishing the correspondence of coordinate points of the plurality of images to world coordinates according to the rotation angle comprises:

4. The method of claim 3, wherein projecting coordinate points of the plurality of images onto a sphere according to their spherical coordinates to world coordinates comprises:

5. The method according to claim 4, wherein the method further comprises:

6. The method of claim 1, wherein prior to backprojecting a portion of pixels within the sphere of overlap into the plurality of images, determining a reprojection error between adjacent images after backprojection, the method further comprises:

7. The method of claim 1, wherein back projecting the projected point on the sphere into a plane based on the adjusted focal length and the adjusted roll angle, resulting in a stitched image of the plurality of images comprises:

8. The method of claim 7, wherein the method further comprises:

9. The method of any one of claims 1 to 6, wherein determining a target focal length corresponding to the target magnification comprises:

10. The method of claim 9, wherein establishing a correspondence of a magnification of the dome camera to a focal length comprises:

11. The method of claim 10, wherein determining the correspondence of the coordinates of the first feature point and the coordinates of the second feature point from the intrinsic matrix of the dome camera and the attitude matrix of the dome camera comprises:

12. The method of claim 11, wherein determining the focal length of the dome camera from the homography matrix, the correspondence of the coordinates of the first feature points and the coordinates of the second feature points comprises:

$c_x$ and $c_y$ represent the offsets of the optical axis of the dome camera in the image coordinate system;

13. An image stitching device, comprising:

the first back projection module is configured to back project a portion of pixels in the overlapping area on the sphere into the plurality of images, determine a re-projection error between adjacent images after the back projection, and includes: back projecting pixel coordinates in a superposition area of adjacent images in the plurality of images into each image of the adjacent images to respectively obtain pixel coordinates back projected onto each image of the adjacent images, and determining the re-projection error according to the pixel coordinates back projected onto each image of the adjacent images;

the adjusting module is configured to adjust the target focal length and the roll angle of the dome camera according to the reprojection error to obtain an adjusted focal length and an adjusted roll angle, including: adjusting the target focal length and the roll angle of the dome camera to obtain an adjusted focal length and an adjusted roll angle such that the reprojection error is minimized;

14. A computer-readable storage medium, characterized in that the storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 1 to 12 when run.

15. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of claims 1 to 12.