CN112422848A - Video splicing method based on depth map and color map

Video splicing method based on depth map and color map

Info

Publication number
CN112422848A
CN112422848A
Authority
CN
China
Prior art keywords
depth
color
map
cameras
point cloud
Prior art date
Legal status
Granted
Application number
CN202011288939.4A
Other languages
Chinese (zh)
Other versions
CN112422848B (en)
Inventor
王丹华
顾秋生
Current Assignee
Shenzhen Gehua Intelligent Technology Co ltd
Original Assignee
Shenzhen Gehua Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Gehua Intelligent Technology Co ltd filed Critical Shenzhen Gehua Intelligent Technology Co ltd
Priority to CN202011288939.4A
Publication of CN112422848A
Application granted
Publication of CN112422848B
Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • H04N 5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/265: Mixing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/10: Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/95: Computational photography systems, e.g. light-field imaging systems
    • H04N 23/951: Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio

Abstract

The invention discloses a video stitching method based on a depth map and a color map, which comprises the following steps: preparing a plurality of groups of color cameras and depth cameras; extracting frames from the captured color video streams and depth video streams to obtain the color image and depth image corresponding to a given moment; calibrating with a checkerboard method to obtain the poses of the groups of color cameras and depth cameras; obtaining, from the depth map, the three-dimensional coordinates of the captured scene relative to the depth camera coordinate system, generating point cloud data from the depth map, and deleting the points that lie outside the specified depth range together with their corresponding color-map pixels; stitching the images according to the filtered point cloud data; and repeating these operations for subsequent frames to obtain the stitched video stream. The video stitching method based on the depth map and the color map offers high stitching accuracy and efficiency.

Description

Video stitching method based on depth map and color map
Technical Field
The invention relates to the technical field of video stitching, and in particular to a video stitching method based on a depth map and a color map.
Background
Existing video stitching methods first extract frames from each video, then extract feature points from each frame, match features between corresponding frames of different videos, and finally solve for the camera extrinsic parameters, or compute a homography matrix, from the matching results in order to stitch adjacent frames. This approach is slow, and feature mismatches frequently occur in complex scenes, so the computed extrinsic parameters or homography matrices are wrong and the stitching result is incorrect. In addition, a frame often consists largely of scenes such as the sky or the sea surface that are of no interest to the user; because these regions look nearly identical everywhere, they often cause matching to fail or to be inaccurate.
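For reference, the conventional feature-matching pipeline criticized above can be sketched as follows (Python with OpenCV; the ORB detector, the number of retained matches, and the RANSAC threshold are illustrative assumptions rather than values taken from any prior-art document):

```python
# Minimal sketch of the conventional feature-based stitching pipeline described above
# (ORB features + homography via OpenCV); parameters are illustrative assumptions.
import cv2
import numpy as np

def stitch_pair_by_homography(img_a, img_b):
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)[:200]

    src = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)   # fails on low-texture scenes

    h, w = img_a.shape[:2]
    canvas = cv2.warpPerspective(img_b, H, (w * 2, h))      # warp img_b into img_a's frame
    canvas[0:h, 0:w] = img_a
    return canvas
```

When the overlapping region is dominated by sky or water, too few reliable matches survive and the estimated homography degenerates, which is exactly the failure mode the invention avoids.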
Disclosure of Invention
The invention provides a video stitching method based on a depth map and a color map, aiming to solve the inaccurate-matching problem of existing video stitching techniques.
According to an embodiment of the application, a video stitching method based on a depth map and a color map is provided, comprising the following steps:
preparing a plurality of groups of color cameras and depth cameras;
extracting frames from the captured color video stream and depth video stream to obtain the color image and depth image corresponding to a given moment;
calibrating by adopting a checkerboard method to acquire the poses of a plurality of groups of color cameras and depth cameras;
obtaining, from the depth map, the three-dimensional coordinates of the current scene relative to the depth camera coordinate system, generating point cloud data from the depth map, and deleting the point cloud data that lie outside the depth range together with the corresponding color-map pixels;
stitching the images according to the filtered point cloud data;
and repeating the above operations for subsequent frames to obtain the stitched video stream.
Preferably, the color image M and the depth image D at a given moment are obtained from each of the plurality of groups of color cameras and depth cameras;
multiple sets of matched color and depth map pairs (M1, D1), (M2, D2), …, (Mn, Dn) are thereby formed.
Preferably, the method comprises the following steps:
setting a plane perpendicular to the Z axis at the average depth as the projection plane;
projecting the point cloud data obtained from the depth map onto the projection plane to obtain a set of two-dimensional points;
filling each two-dimensional point on the plane with the pixel value obtained by projecting the corresponding three-dimensional point into the color map;
carrying out hole-filling optimization on the color image of the projection plane;
and obtaining a complete stitched image frame.
Preferably, the calibration is carried out with a checkerboard method to obtain the poses of the plurality of groups of color cameras and depth cameras, and comprises the following steps:
the multiple groups of color cameras and the depth cameras observe the same checkerboard or multiple checkerboards;
taking one of the color cameras as the reference to obtain the poses of all the color cameras;
calculating the relative poses of the plurality of groups of color cameras;
and, since the poses of the depth camera and the color camera on each device are fixed relative to each other, obtaining the poses of all the depth cameras.
Preferably, obtaining three-dimensional coordinates of the current shooting scene relative to a depth camera coordinate system from the depth map comprises the following steps:
converting the local three-dimensional coordinates into a three-dimensional point cloud according to the calculated relative poses;
fusing all three-dimensional point cloud data;
processing the three-dimensional point cloud data, including denoising and reprojection error optimization;
deleting the point cloud data which are not in the depth range;
and obtaining the information of the observation scene.
The technical scheme provided by the embodiments of the application can have the following beneficial effects: compared with conventional schemes, the video stitching method based on a depth map and a color map stitches only the scene content that lies within the specified depth range. The user therefore obtains an image of the desired observation range, interference from other scene content is eliminated, and stitching accuracy and efficiency are greatly improved. The method is well suited to fast real-time stitching and can focus on particular scene content in a targeted manner. Using the depth map brings accurate matching, high speed, and convenient selection of the scene of interest.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a video stitching method based on a depth map and a color map according to the present invention;
fig. 2 is a schematic flow chart of step S2 in the video stitching method based on depth map and color map according to the present invention;
FIG. 3 is a schematic flow chart of step S4 in the video stitching method based on depth map and color map according to the present invention;
FIG. 4 is a schematic flow chart of step S3 in the video stitching method based on the depth map and the color map according to the present invention;
fig. 5 is a schematic flowchart of step S4 in the video stitching method based on depth map and color map according to the present invention.
Description of reference numerals:
10. A video stitching method based on a depth map and a color map.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1, the present invention discloses a video stitching method 10 based on a depth map and a color map, which includes the following steps:
step S1: preparing a plurality of groups of color cameras and depth cameras;
step S2: extracting frames from the captured color video stream and depth video stream to obtain the color image and depth image corresponding to a given moment;
step S3: calibrating by adopting a checkerboard method to acquire the poses of a plurality of groups of color cameras and depth cameras;
step S4: obtaining, from the depth map, the three-dimensional coordinates of the current scene relative to the depth camera coordinate system, generating point cloud data from the depth map, and deleting the point cloud data that lie outside the depth range together with the corresponding color-map pixels;
step S5: stitching the images according to the filtered point cloud data;
step S6: and repeating the above operations for subsequent frames to obtain the stitched video stream.
Compared with conventional schemes, the video stitching method based on a depth map and a color map stitches only the scene content that lies within the specified depth range. The user therefore obtains an image of the desired observation range, interference from other scene content is eliminated, and stitching accuracy and efficiency are greatly improved. The method is well suited to fast real-time stitching and can focus on particular scene content in a targeted manner. Using the depth map brings accurate matching, high speed, and convenient selection of the scene of interest.
At present, many devices capable of acquiring depth images are available, such as the Microsoft Kinect and Orbbec 3D cameras. These devices can conveniently acquire a color image and the corresponding depth image at the same time. Since the color camera and the depth camera are fixed on the device, the relative pose between each color map and its depth map is fixed.
Referring to fig. 2, the step S2 includes the following steps:
step S21: acquiring a color image M and a depth image D of a certain moment in a plurality of groups of color cameras and depth cameras;
step S22: forming multiple sets of matched color and depth map pairs (M1, D1), (M2, D2), …, (Mn, Dn);
the 3D camera acquires the color image M and the depth image D at the same time every time of acquisition. The splicing requires multiple pictures at different positions, so that a picture of a larger range of scenes can be spliced. In practice, the color picture M1 … Mn is mainly used for stitching, and the purpose of the depth map is to assist stitching, so that stitching is more accurate, and meanwhile, the depth information can be used to remove an unnecessary range.
Referring to fig. 3, the step S4 includes the following steps (a code sketch of this projection-and-fill procedure follows the list):
step S41: setting a plane perpendicular to the Z axis at the average depth as the projection plane;
step S42: projecting the point cloud data obtained from the depth map onto the projection plane to obtain a set of two-dimensional points;
step S43: filling each two-dimensional point on the plane with the pixel value obtained by projecting the corresponding three-dimensional point into the color map;
step S44: carrying out hole-filling optimization on the color image of the projection plane;
step S45: and obtaining a complete stitched image frame.
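A minimal sketch of steps S41 to S43 in Python/NumPy; the orthographic reading of the projection onto the average-depth plane and the grid resolution are assumptions made for illustration, not details prescribed by the patent.

```python
# Minimal sketch of steps S41-S43: project the filtered, colored point cloud onto the
# plane perpendicular to Z at the average depth and rasterise it into a stitched frame.
import numpy as np

def project_to_plane(points_xyz, colors_bgr, pixels_per_meter=500):
    """points_xyz: (N, 3) points in the reference frame; colors_bgr: (N, 3) uint8."""
    # Draw far points first so that nearer points overwrite them.
    order = np.argsort(-points_xyz[:, 2])
    points_xyz, colors_bgr = points_xyz[order], colors_bgr[order]

    # S41/S42: the plane is perpendicular to Z, so the projection keeps only (x, y);
    # metric coordinates are then mapped to integer pixel positions on the plane.
    xy = points_xyz[:, :2]
    uv = np.round((xy - xy.min(axis=0)) * pixels_per_meter).astype(int)
    w, h = uv[:, 0].max() + 1, uv[:, 1].max() + 1

    # S43: fill each two-dimensional point with the colour of its three-dimensional point.
    canvas = np.zeros((h, w, 3), dtype=np.uint8)
    canvas[uv[:, 1], uv[:, 0]] = colors_bgr
    return canvas
```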
Each point in the point cloud has a three-dimensional coordinate, and its Z coordinate can be regarded as its depth value. A depth range is therefore set in advance, every point is traversed, and the points whose Z values are not within the range are deleted.
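A minimal sketch of this conversion and depth-range filtering in Python/NumPy, assuming a pinhole model with known intrinsics (fx, fy, cx, cy), depth stored in millimetres, and a depth map already registered to the color map; these assumptions are for illustration only.

```python
# Minimal sketch: back-project a depth map into the depth camera's coordinate system
# and delete points outside a preset depth range, keeping the matching colour pixels.
import numpy as np

def depth_to_points(depth_mm, fx, fy, cx, cy, z_min=0.5, z_max=3.0):
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_mm.astype(np.float32) / 1000.0           # depth in metres
    x = (u - cx) * z / fx                               # pinhole back-projection
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pixels = np.stack([u, v], axis=-1).reshape(-1, 2)   # corresponding colour-map pixels

    # Traverse every point and keep only those whose Z value lies in the preset range.
    keep = (points[:, 2] >= z_min) & (points[:, 2] <= z_max)
    return points[keep], pixels[keep]
```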
Referring to fig. 4, the step S3 includes the following steps:
step S31: the multiple groups of color cameras and the depth cameras observe the same checkerboard or multiple checkerboards;
step S32: taking one of the color cameras as the reference to obtain the poses of all the color cameras;
step S33: calculating the relative poses of the plurality of groups of color cameras;
step S34: and, since the poses of the depth camera and the color camera on each device are fixed relative to each other, obtaining the poses of all the depth cameras.
For example: taking the pose of the first 3D sensor as the reference, the pre-calibrated pose relationships of the other 3D sensors are (R2, T2), …, (Rn, Tn), where R is a 3 × 3 rotation matrix and T is a 3 × 1 translation vector; a three-dimensional point p on the i-th sensor is therefore converted to the reference coordinate system as Ri × p + Ti.
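A minimal sketch of this transform, together with one way the relative poses (Ri, Ti) could be derived from a shared checkerboard with OpenCV; the use of cv2.solvePnP here is an illustrative assumption consistent with, but not prescribed by, the checkerboard calibration of step S3.

```python
# Minimal sketch: per-camera pose from a common checkerboard, relative pose (Ri, Ti)
# to the reference sensor, and the mapping p_ref = Ri @ p + Ti from the text above.
import cv2
import numpy as np

def pose_from_checkerboard(gray, board_size, square_m, K, dist):
    """Camera pose w.r.t. the checkerboard: returns (R, t) with p_cam = R @ p_board + t."""
    found, corners = cv2.findChessboardCorners(gray, board_size)
    assert found, "checkerboard not visible"
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_m
    _, rvec, tvec = cv2.solvePnP(objp, corners, K, dist)
    return cv2.Rodrigues(rvec)[0], tvec

def relative_pose(R_ref, t_ref, R_i, t_i):
    """(Ri, Ti) mapping sensor-i coordinates into the reference sensor's frame."""
    Ri = R_ref @ R_i.T
    Ti = t_ref - Ri @ t_i
    return Ri, Ti

def to_reference(points_i, Ri, Ti):
    """Apply p_ref = Ri @ p + Ti to an (N, 3) point cloud."""
    return points_i @ Ri.T + Ti.ravel()
```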
Referring to fig. 5, the step S4 further includes the following steps (a fusion sketch follows the list):
step S46: converting the local three-dimensional coordinates into a three-dimensional point cloud according to the calculated relative poses;
step S47: fusing all three-dimensional point cloud data;
step S48: processing the three-dimensional point cloud data, including denoising and reprojection error optimization;
step S49: deleting the point cloud data which are not in the depth range;
step S50: and obtaining the information of the observation scene.
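A minimal fusion sketch of steps S46 to S49, using Open3D's statistical outlier removal as a stand-in for the unspecified denoising and reprojection-error optimization; the library choice and the parameters are assumptions made for illustration.

```python
# Minimal sketch of steps S46-S49: bring every sensor's filtered point cloud into the
# reference frame, concatenate, denoise, and cut to the observed depth range.
import numpy as np
import open3d as o3d

def fuse_clouds(clouds, colors, poses, z_min=0.5, z_max=3.0):
    """clouds[i]: (N_i, 3) points in sensor i's frame; poses[i]: (Ri, Ti) to the reference frame."""
    fused_pts, fused_col = [], []
    for pts, col, (R, T) in zip(clouds, colors, poses):
        fused_pts.append(pts @ R.T + T.ravel())        # S46: local coords -> reference frame
        fused_col.append(col)
    pts = np.vstack(fused_pts)                          # S47: fuse all point cloud data
    col = np.vstack(fused_col)

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    pcd.colors = o3d.utility.Vector3dVector(col / 255.0)
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)   # S48: denoise

    pts = np.asarray(pcd.points)
    col = (np.asarray(pcd.colors) * 255).astype(np.uint8)
    keep = (pts[:, 2] >= z_min) & (pts[:, 2] <= z_max)  # S49: keep only the depth range
    return pts[keep], col[keep]
```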
Possible holes in the stitched picture are generally ignored, because the content they would show is not within the observed depth range. For aesthetic reasons they can also be interpolated from the surrounding color pixel values.
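A minimal sketch of such interpolation using OpenCV inpainting; the patent does not prescribe a specific interpolation method, so this is only one possible choice.

```python
# Minimal sketch of the optional hole filling: interpolate unfilled pixels from the
# surrounding colour values with OpenCV inpainting.
import cv2
import numpy as np

def fill_holes(stitched_bgr):
    # Pixels that were never filled by any projected point are still zero.
    mask = np.all(stitched_bgr == 0, axis=2).astype(np.uint8)
    return cv2.inpaint(stitched_bgr, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
```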
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A video stitching method based on a depth map and a color map, characterized by comprising the following steps:
preparing a plurality of groups of color cameras and depth cameras;
extracting frames from the captured color video stream and depth video stream to obtain the color image and depth image corresponding to a given moment;
calibrating by adopting a checkerboard method to acquire the poses of a plurality of groups of color cameras and depth cameras;
obtaining, from the depth map, the three-dimensional coordinates of the current scene relative to the depth camera coordinate system, generating point cloud data from the depth map, and deleting the point cloud data that lie outside the depth range together with the corresponding color-map pixels;
stitching the images according to the filtered point cloud data;
and repeating the above operations for subsequent frames to obtain the stitched video stream.
2. The method for video stitching based on depth map and color map as claimed in claim 1, comprising the steps of:
acquiring the color image M and the depth image D at a given moment from each of the plurality of groups of color cameras and depth cameras;
forming multiple sets of matched color and depth map pairs (M1, D1), (M2, D2), …, (Mn, Dn).
3. The method for video stitching based on depth map and color map as claimed in claim 2, comprising the steps of:
setting a plane perpendicular to the Z axis at the average depth as the projection plane;
projecting the point cloud data obtained from the depth map onto the projection plane to obtain a set of two-dimensional points;
filling each two-dimensional point on the plane with the pixel value obtained by projecting the corresponding three-dimensional point into the color map;
carrying out hole-filling optimization on the color image of the projection plane;
and obtaining a complete stitched image frame.
4. The video stitching method based on the depth map and the color map as claimed in claim 1, wherein a checkerboard method is adopted for calibration to obtain the poses of a plurality of groups of color cameras and depth cameras, and the method comprises the following steps:
the multiple groups of color cameras and the depth cameras observe the same checkerboard or multiple checkerboards;
taking one of the color cameras as the reference to obtain the poses of all the color cameras;
calculating the relative poses of the plurality of groups of color cameras;
and, since the poses of the depth camera and the color camera on each device are fixed relative to each other, obtaining the poses of all the depth cameras.
5. The method for video stitching based on the depth map and the color map as claimed in claim 1, wherein the step of obtaining the three-dimensional coordinates of the current shooting scene relative to the depth camera coordinate system from the depth map comprises the following steps:
converting the local three-dimensional coordinates into a three-dimensional point cloud according to the calculated relative poses;
fusing all three-dimensional point cloud data;
processing the three-dimensional point cloud data, including denoising and reprojection error optimization;
deleting the point cloud data which are not in the depth range;
and obtaining the information of the observation scene.
CN202011288939.4A 2020-11-17 2020-11-17 Video stitching method based on depth map and color map Active CN112422848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011288939.4A CN112422848B (en) 2020-11-17 2020-11-17 Video stitching method based on depth map and color map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011288939.4A CN112422848B (en) 2020-11-17 2020-11-17 Video stitching method based on depth map and color map

Publications (2)

Publication Number Publication Date
CN112422848A (en) 2021-02-26
CN112422848B CN112422848B (en) 2024-03-29

Family

ID=74831990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011288939.4A Active CN112422848B (en) 2020-11-17 2020-11-17 Video stitching method based on depth map and color map

Country Status (1)

Country Link
CN (1) CN112422848B (en)

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110133677A (en) * 2010-06-07 2011-12-14 삼성전자주식회사 Method and apparatus for processing 3d image
US20140320491A1 (en) * 2013-04-26 2014-10-30 Tsinghua University Method And System For Three-Dimensionally Reconstructing Non-Rigid Body Based On Multi-Depth-Map
CN104008569A (en) * 2014-02-24 2014-08-27 惠州学院 3D scene generation method based on depth video
US20160189358A1 (en) * 2014-12-29 2016-06-30 Dassault Systemes Method for calibrating a depth camera
KR20170005312A (en) * 2015-07-03 2017-01-12 전자부품연구원 System and method for concurrent calibration of camera and depth sensor
CN105678683A (en) * 2016-01-29 2016-06-15 杭州电子科技大学 Two-dimensional storage method of three-dimensional model
WO2018113082A1 (en) * 2016-12-21 2018-06-28 深圳市掌网科技股份有限公司 3d panoramic photographing system and method
WO2018125369A1 (en) * 2016-12-30 2018-07-05 Google Llc Multi-view scene flow stitching
CN107154014A (en) * 2017-04-27 2017-09-12 上海大学 A kind of real-time color and depth Panorama Mosaic method
CN108389157A (en) * 2018-01-11 2018-08-10 江苏四点灵机器人有限公司 A kind of quick joining method of three-dimensional panoramic image
WO2019229293A1 (en) * 2018-05-31 2019-12-05 Nokia Technologies Oy An apparatus, a method and a computer program for volumetric video
US20190323843A1 (en) * 2018-07-04 2019-10-24 Baidu Online Network Technology (Beijing) Co., Ltd. Method for generating a high precision map, apparatus and storage medium
CN109087388A (en) * 2018-07-12 2018-12-25 南京邮电大学 Object dimensional modeling method based on depth transducer
CN109961506A (en) * 2019-03-13 2019-07-02 东南大学 A kind of fusion improves the local scene three-dimensional reconstruction method of Census figure
CN110243390A (en) * 2019-07-10 2019-09-17 北京华捷艾米科技有限公司 The determination method, apparatus and odometer of pose

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972019A (en) * 2021-04-14 2022-08-30 华东师范大学 Depth image splicing method and device based on TOF camera and computer equipment

Also Published As

Publication number Publication date
CN112422848B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
US8798387B2 (en) Image processing device, image processing method, and program for image processing
KR101121034B1 (en) System and method for obtaining camera parameters from multiple images and computer program products thereof
CN108833785B (en) Fusion method and device of multi-view images, computer equipment and storage medium
WO2012114639A1 (en) Object display device, object display method, and object display program
JP2010009417A (en) Image processing apparatus, image processing method, program and recording medium
WO2013186056A1 (en) Method and apparatus for fusion of images
CN107545586B (en) Depth obtaining method and system based on light field polar line plane image local part
CN111009030A (en) Multi-view high-resolution texture image and binocular three-dimensional point cloud mapping method
JP2016171463A (en) Image processing system, image processing method, and program
Ding et al. Fusing structure from motion and lidar for dense accurate depth map estimation
JP2015073185A (en) Image processing device, image processing method and program
JP2023546739A (en) Methods, apparatus, and systems for generating three-dimensional models of scenes
CN111105347A (en) Method, device and storage medium for generating panoramic image with depth information
CN113643342A (en) Image processing method and device, electronic equipment and storage medium
DK3189493T3 (en) PERSPECTIVE CORRECTION OF DIGITAL PHOTOS USING DEPTH MAP
JP7326442B2 (en) Parallax estimation from wide-angle images
CN110276714B (en) Method and device for synthesizing rapid scanning panoramic image
CN112422848B (en) Video stitching method based on depth map and color map
CN113763544A (en) Image determination method, image determination device, electronic equipment and computer-readable storage medium
CN109785390B (en) Method and device for image correction
JP2005141655A (en) Three-dimensional modeling apparatus and three-dimensional modeling method
KR20110133677A (en) Method and apparatus for processing 3d image
JP2005063041A (en) Three-dimensional modeling apparatus, method, and program
CN109785225B (en) Method and device for correcting image
JP5906165B2 (en) Virtual viewpoint image composition device, virtual viewpoint image composition method, and virtual viewpoint image composition program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant