CN110675346B - Image acquisition and depth map enhancement method and device suitable for Kinect - Google Patents

Publication number: CN110675346B
Authority: CN (China)
Prior art keywords: image, repaired, depth, region, depth image
Legal status: Active
Application number: CN201910917492.3A
Other languages: Chinese (zh)
Other versions: CN110675346A
Inventors: 吴怀宇, 洪运志, 丁元浩, 刘家乐, 李琳, 陈思文
Current assignee: Wuhan University of Science and Engineering (WUSE)
Original assignee: Wuhan University of Science and Engineering (WUSE)
Application filed by Wuhan University of Science and Engineering (WUSE)
Priority: CN201910917492.3A (published as CN110675346A, granted as CN110675346B); also priority to AU2020101832A4

Classifications

    • G06T5/70
    • G06T5/75
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20032 Median filtering

Abstract

The invention discloses an image acquisition and depth map enhancement method and device suitable for Kinect, belonging to the field of image processing. The method comprises the following steps: depth camera data acquisition, depth map hole repair based on an improved FMM algorithm, and depth map edge noise smoothing. The depth camera data acquisition comprises acquiring preliminary raw data of the color and depth maps, then converting the format of that preliminary raw data to obtain the raw color and depth images. For the obtained depth map data and the holes in the depth map, a repair mask generation method based on inverse binary thresholding is adopted to repair large-area holes. For the depth image obtained by hole repair, an image edge noise smoothing method based on median filtering is adopted to perform further image enhancement processing. By this method, the holes and noise in the depth map can be noticeably removed, further enhancing the applicability and reliability of Kinect in computer vision.

Description

Image acquisition and depth map enhancement method and device suitable for Kinect
Technical Field
The invention belongs to the field of image processing, relates to a data preprocessing method based on a depth image, and particularly relates to an image acquisition and depth image enhancement method and device suitable for Kinect.
Background
Depth information plays an important role in many computer vision applications, such as augmented reality, scene reconstruction and 3D television auxiliary sensors. Advances in sensor technology have brought many depth-sensing cameras to the market. A representative low-cost, real-time depth sensor is the Kinect manufactured by Microsoft, a structured-light camera sensor. Structured-light measurement is subject to multiple reflections, abnormal reflections, and blocking of the projected and reflected light, which cause many problems in the raw Kinect depth map: some areas lack depth data entirely. These errors are an important issue affecting the use of Kinect data and the development of applications on top of it.
Therefore, before image applications use Kinect-type depth cameras, it is necessary to preprocess their data, filling hole pixels and smoothing image noise, so as to improve reliability in computer vision applications. Based on this, how to effectively preprocess depth-image data is a technical problem that currently needs to be solved.
Disclosure of Invention
In view of the defects or improvement demands of the prior art, the invention provides an image acquisition and depth map enhancement method and device suitable for Kinect, which solve the technical problem of how to effectively preprocess depth image data so as to fill hole pixels and smooth image noise.
To achieve the above object, according to one aspect of the present invention, there is provided an image acquisition and depth map enhancement method suitable for Kinect, comprising:
(1) Acquiring an original depth image, and performing format conversion on the original depth image to obtain a depth image in a target format;
(2) Determining a target mask generation mode according to whether a region to be repaired in the depth image in the target format is positioned at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
(3) Combining the mask of the region to be repaired with a fast marching algorithm to fill the holes of the region to be repaired in the depth image in the target format, so as to obtain a repaired depth image;
(4) Performing median filtering on the depth image after hole repair so as to remove image edge noise and obtain the depth image after image enhancement processing.
Preferably, step (2) comprises:
if the region to be repaired in the depth image in the target format is not located at the edge of the image, a mouse callback function is added to the interface function cvInpaint of OpenCV, and then a color threshold range is set according to the region to be repaired in the depth image in the target format, so that a mask of the region to be repaired is obtained.
Preferably, step (2) comprises:
if the region to be repaired in the depth image in the target format is positioned at the edge of the image, the mask of the region to be repaired is determined by the inverse binary thresholding function

mask(x, y) = 255, if I(x, y) ≤ threshold; 0, otherwise, for (x, y) ∈ Ω,

so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at the pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
Preferably, step (3) comprises:
from the following components
Figure BDA0002216522130000022
Hole filling is carried out on the region to be repaired in the depth image in the target format, wherein the p point is a pixel needing to be repaired, and D p Representing depth value at point p, B ε (p) represents the neighborhood of p-point, q is B ε One point in (p), w (p, q) is used to measure the similarity of p point and the neighborhood pixel q, D q Represents the depth value at q-point, +.D q The luminance gradient value representing the q point, (p-q) represents the geometric distance between pixel p and pixel q.
Preferably, step (4) comprises:
median filtering is carried out on the depth image after hole repair by g(x, y) = med{ f(x − k, y − l), (k, l) ∈ w } to obtain the depth image after image enhancement processing, wherein g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents a two-dimensional median filtering template, and k and l take values in w.
According to another aspect of the present invention, there is provided an image acquisition and depth map enhancement apparatus adapted for Kinect, comprising:
the image format conversion module is used for obtaining an original depth image, and carrying out format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is positioned at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the restoration module is used for filling holes in the area to be restored in the depth image in the target format by combining the mask of the area to be restored with a fast-marching algorithm to obtain a restored depth image;
and the filtering module is used for carrying out median filtering on the depth image subjected to hole repair so as to remove image edge noise and obtain the depth image subjected to image enhancement processing.
Preferably, the mask image generating module is configured to add a mouse callback function to the interface function cvInpaint of OpenCV when the region to be repaired in the depth image in the target format is not located at an edge of the image, and then set a color threshold range according to the region to be repaired in the depth image in the target format, so as to obtain a mask of the region to be repaired.
Preferably, the mask image generating module is configured to, when the region to be repaired in the depth image in the target format is located at an edge of the image, determine the mask of the region to be repaired by the inverse binary thresholding function

mask(x, y) = 255, if I(x, y) ≤ threshold; 0, otherwise, for (x, y) ∈ Ω,

so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at the pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
Preferably, the repair module is configured to carry out hole filling on the region to be repaired in the depth image in the target format by

D_p = Σ_{q∈B_ε(p)} w(p, q) [ D_q + ∇D_q · (p − q) ] / Σ_{q∈B_ε(p)} w(p, q),

wherein p is the pixel needing to be repaired, D_p represents the depth value at point p, B_ε(p) represents the neighborhood of point p, q is a point in B_ε(p), w(p, q) is used to measure the similarity of point p and the neighborhood pixel q, D_q represents the depth value at point q, ∇D_q represents the luminance gradient value at point q, and (p − q) represents the geometric distance between pixel p and pixel q.
Preferably, the filtering module is configured to median filter the depth image after hole repair by g(x, y) = med{ f(x − k, y − l), (k, l) ∈ w } to obtain a depth image after image enhancement processing, where g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents a two-dimensional median filtering template, and k and l take values in w.
In general, compared with the prior art, the above technical solutions conceived by the present invention achieve the following beneficial effects: the method can obtain the raw color and depth images of the Kinect depth camera, and the obtained data can be applied to three-dimensional scene reconstruction, robot V-SLAM and the like; meanwhile, for the ubiquitous holes and noise introduced into the depth map by the sensor and the environment, the hole repair method based on the FMM (fast marching method) algorithm and the noise smoothing method based on median filtering noticeably remove the holes and noise in the depth map, further enhancing the applicability and reliability of Kinect in computer vision research.
Drawings
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of noise in a Kinect acquired image according to an embodiment of the present invention;
FIG. 3 is a flow chart of depth map restoration and denoising according to an embodiment of the present invention;
fig. 4 is a schematic diagram of depth map hole repair of an improved FMM algorithm according to an embodiment of the present invention, where (a) is an FMM algorithm repair principle, and (b) is a depth map repaired by the improved FMM algorithm;
FIG. 5 is a median filter noise smoothing schematic diagram provided in an embodiment of the present invention;
fig. 6 is a schematic diagram of a device structure according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
The invention provides an image acquisition and depth map enhancement method and device applicable to Kinect, and aims to design a data acquisition and preprocessing pipeline suitable for depth-camera V-SLAM. Raw color and depth map data are acquired with a Kinect_SDK-based method in a Windows environment; the FMM algorithm is improved to repair the depth map holes; finally, a noise removal method based on median filtering is designed to smooth the edge noise of the depth map.
Fig. 1 is a schematic flow chart of an image acquisition and depth map enhancement method suitable for Kinect according to an embodiment of the present invention, where the method shown in fig. 1 includes the following steps:
s1: collecting and preprocessing color depth images;
because the color depth map is acquired under the ROS platform and is image data generated through rqt or screen capturing, the acquired depth map is not the original data (sixteen-bit single channel) of the depth map, and therefore, in the embodiment of the invention, an image acquisition method based on Kinect_SDK under Windows environment is adopted. And obtaining preliminary original data of the color depth map by means of a development kit and driving of Kinect, and then performing data processing on the preliminary original data by using an image matrix processing tool Matlab to obtain the required original data of the color depth map.
In the embodiment of the present invention, step S1 may be implemented by the following steps:
S11: Data collection: start the PC and the Kinect depth camera, open the Kinect camera driver package Kinect_SDK, and test whether image information is transmitted normally; then open KinectExplorer-D2D.exe under the KinectSaver folder.
S12: Selecting the stored file type: on the basis of step S11, files of the color and depth maps in the official format need to be generated. The Kinect_SDK has a default image storage format when opened; in the embodiment of the invention, the color image is to be stored as a three-channel 8-bit png image and the depth image as a single-channel 16-bit png image. After the KinectExplorer-D2D.exe interface appears, select recording in the lower-right menu bar, select the image format for the color map, and select the binary format for the depth map.
S13: Customizing folders and paths, and storing the color and depth maps separately: the Kinect v1 depth camera has an RGB color camera, an infrared CMOS camera and an infrared emitter. The color map and depth map information are delivered independently by the Kinect_SDK. After the preliminary raw data of the color and depth maps are stored in two separate folders on the Windows platform, a series of data streams is obtained. In the embodiment of the invention, the color data stream is a series of bmp-format images at different times, while the depth data stream is binary-format data at the corresponding times, which cannot directly display image information. Therefore, the preliminary raw data must be processed to obtain the raw color and depth images.
S14: Converting the raw data format with Matlab: after the bmp- and binary-format data are obtained in step S13, the color images are converted into three-channel 8-bit png images; for the depth data, a Matlab script is written to split the binary data stream and convert its format, yielding single-channel 16-bit png images.
S2: improving depth map hole repair of an FMM algorithm;
As shown in fig. 2, Kinect uses structured-light imaging to obtain depth information of the field of view, and the resulting depth images often have missing values. When the surface of an object is smooth, or its color is dark, specular reflection or light absorption weakens the light reflected into the camera, detection fails, and black holes with missing pixel values appear in the corresponding depth image. To ensure the accuracy of feature positioning in V-SLAM, this black-hole noise must first be filled, i.e., the depth image must be repaired, which is of great significance for the practical application of depth images.
In the embodiment of the present invention, step S2 may be implemented by:
S21: Representation of images in a computer:
In the conventional definition of the pixel coordinate system, the origin is at the upper-left corner of the image, the X axis points right and the Y axis points down. In a grayscale image, each pixel position (x, y) corresponds to a gray value I, so an image of width w and height h can be written mathematically as a matrix, as shown in formula (1).
I(x, y) ∈ R^{w×h} (1)
In the depth map of a depth camera, the distance of each pixel from the camera is also recorded, i.e., d in (u, v, d). This distance is typically in millimeters; for example, the Kinect range is typically on the order of tens of meters, which exceeds the maximum range 0-255 of an 8-bit image. Therefore, the original depth image is generally represented in the computer by sixteen-bit integers, and in a specific application the first ten bits of the sixteen-bit data must be extracted to obtain the real depth information of the image.
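The extraction of the depth-carrying bits from a sixteen-bit sample can be illustrated as follows. Note that the 3-bit right shift used here follows the Kinect v1 SDK convention (the low three bits hold a player index), which is an assumption of this sketch; the patent itself speaks of extracting the first ten bits:

```python
def raw_to_depth_mm(raw16: int) -> int:
    """Recover the depth value from a 16-bit Kinect v1 depth sample.
    Assumption: the low 3 bits carry the player index (a Kinect SDK
    convention), so the millimetre depth occupies the upper bits."""
    return raw16 >> 3

# A raw sample of 16000 shifted right by 3 bits gives 2000 mm.
depth = raw_to_depth_mm(16000)
```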
S22: FMM algorithm image restoration principle:
FIG. 4 (a) shows the depth map hole repair principle, where Ω is the region to be repaired and δΩ is the boundary of Ω. To repair a pixel in Ω, a new pixel value must be calculated to replace the original value. The point p is the pixel that needs repair. A small neighborhood B_ε(p) centered on p is selected; the pixel values of the points in this neighborhood are known, so the pixel value of p can be estimated from the valid pixels in its neighborhood. Here ε is the parameter inpaintRadius in the OpenCV function, and q is a point in B_ε(p). The gray value of p estimated from the point q is calculated by formula (2), where ∇D(q) is the gray gradient at q:
D(p) = D(q) + ∇D(q) · (p − q) (2)
Then the new gray value of p must be calculated using all the points in the neighborhood B_ε(p). Since each point generally contributes differently to the new pixel value, a weight function is introduced to judge which points have the most pronounced effect on determining the new gray value. This process is shown in formula (3):
D(p) = Σ_{q∈B_ε(p)} w(p, q) [ D(q) + ∇D(q) · (p − q) ] / Σ_{q∈B_ε(p)} w(p, q) (3)
where D(q) represents the depth value at point q, ∇D(q) represents the luminance gradient value at point q, (p − q) represents the geometric distance between pixel p and pixel q, and w(p, q) is a weight function used to measure the similarity of point p and the neighborhood pixel q and to determine the influence of each pixel in the neighborhood on the gray value being calculated. The weight function is defined as in formula (4):
w(p,q)=dir(p,q)·dst(p,q)·lev(p,q) (4)
wherein each quantity in formula (4) is given by formulas (5)-(7):
dir(p, q) = ⟨ (p − q) / ‖p − q‖ , N(p) ⟩ (5)
dst(p, q) = d_0² / ‖p − q‖² (6)
lev(p, q) = T_0 / ( 1 + |T(p) − T(q)| ) (7)
where d_0 and T_0 denote the distance parameter and the level-set parameter respectively, both generally taken as 1. The direction parameter dir(p, q) ensures that the pixels closest to the normal direction N = ∇T have the largest influence on p; the geometric distance factor dst(p, q) ensures that the closer a pixel is to p, the greater its contribution to p; the level-set distance parameter lev(p, q) ensures that known pixels closer to the contour of the region to be repaired passing through p have a greater influence on p. N(p) represents the size of the neighborhood window of the repaired pixel, and T(p) and T(q) represent the distances from p and q to the neighborhood boundary δΩ.
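As a sketch of formulas (3)-(7), the following stdlib-only Python snippet computes the weight w(p, q) = dir·dst·lev and the normalized first-order estimate of D(p) over a small set of known neighbours. All function and variable names are illustrative, and the front-propagation bookkeeping of a full FMM implementation is omitted:

```python
import math

def dir_term(p, q, normal, eps=1e-9):
    # Direction factor (5): alignment of (p - q) with the normal N(p).
    dx, dy = p[0] - q[0], p[1] - q[1]
    norm = math.hypot(dx, dy) + eps
    return abs((dx * normal[0] + dy * normal[1]) / norm)

def dst_term(p, q, d0=1.0):
    # Geometric distance factor (6): closer pixels contribute more.
    dx, dy = p[0] - q[0], p[1] - q[1]
    return d0 * d0 / (dx * dx + dy * dy)

def lev_term(Tp, Tq, T0=1.0):
    # Level-set factor (7): pixels on a nearby front contribute more.
    return T0 / (1.0 + abs(Tp - Tq))

def estimate_pixel(p, neighbours, normal, Tp):
    """Formula (3): weighted average of the first-order estimates
    D(q) + grad_D(q) . (p - q) over the known neighbours of p."""
    num = den = 0.0
    for q, Dq, gradDq, Tq in neighbours:
        w = dir_term(p, q, normal) * dst_term(p, q) * lev_term(Tp, Tq)
        first_order = Dq + gradDq[0] * (p[0] - q[0]) + gradDq[1] * (p[1] - q[1])
        num += w * first_order
        den += w
    return num / den

# p surrounded by two known pixels of constant depth 1000 mm with zero
# gradient: the estimate must reproduce that depth.
p = (5, 5)
neigh = [((4, 5), 1000.0, (0.0, 0.0), 1.0),
         ((6, 5), 1000.0, (0.0, 0.0), 1.0)]
est = estimate_pixel(p, neigh, normal=(1.0, 0.0), Tp=0.0)
```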
The OpenCV interface function cvInpaint provides a preliminary implementation of the above image repair; the prototype of the cvInpaint function is as follows:
void inpaint(InputArray src, InputArray inpaintMask,
             OutputArray dst, double inpaintRadius, int flags);
Parameter src: the input single-channel or three-channel image. Parameter inpaintMask: the same size as the original image; all pixels of the inpaintMask image other than the part that needs repair are 0. Parameter dst: the output repaired image. Parameter inpaintRadius: the neighborhood radius used by the repair algorithm to calculate the value of the current pixel. Parameter flags: the constant parameter selecting the repair algorithm.
However, since the cvInpaint function is aimed at repairing color images, the invention improves the original FMM algorithm for the grayscale characteristics of the depth image, in particular the repair-mask generation part of the original FMM algorithm, and designs two mask (inpaintMask) generation methods.
S23: generating a mask for image restoration;
firstly, after the original data acquired by Kinect are converted and preprocessed, an original image of a color depth map is obtained. As shown in fig. 3, it is determined whether or not the hole of the depth map is located at the edge position of the image, and thus, it is decided to use manual mask generation (selective mask generation). The manual repair mask generation is based on an FMM algorithm, a mouse event and threshold processing are added, and the automatic mask generation method is a mask generation method obtained after performing inverse binary thresholding on mask generation.
S231: generating a selective mask;
when the hole noise of the depth map is located in a non-edge area of the image, the manual mask generation (selective mask generation) method is adopted. Specifically, a mouse callback function ("void MouseCallBack") is added to the original cvInpaint function, and then a threshold range (between 0 and 255) is set according to the hole of the depth map. The two representative color thresholds in the embodiment of the invention are set as: white (235-255) and black (0-35).
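The thresholding that follows the manual selection can be sketched as below, in pure Python with the two representative bands from the text; the interactive mouse-callback part is not reproduced here:

```python
def band_mask(img, bands=((0, 35), (235, 255))):
    """Mark pixels whose gray value falls inside any band as region
    to be repaired (255); all other pixels become 0.  The default
    bands follow the embodiment's black (0-35) and white (235-255)
    color thresholds."""
    return [[255 if any(lo <= v <= hi for lo, hi in bands) else 0
             for v in row]
            for row in img]

img = [[0, 120, 240],
       [34, 36, 255]]
mask = band_mask(img)
```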
S232: generating a thresholding mask based on the inverse binary;
when the hole noise of the depth map is located in an image edge area, the automatic mask generation (mask generation based on inverse binary thresholding) method is adopted to repair the hole; the edges of the depth map are then better protected. dΩ, which represents the repair mask of the function in OpenCV, is determined by the inverse binary thresholding function for the mask of the region to be repaired, as in formula (8):

dΩ(x, y) = 255, if I(x, y) ≤ threshold; 0, otherwise, for (x, y) ∈ Ω (8)
In formula (8), the depth image is binarized according to the threshold range of its hole noise: pixels whose gray value is not larger than threshold are set to 255 and the remaining pixels are set to 0, yielding the mask of the region of the image to be repaired. The image is then repaired by combining this mask with the FMM algorithm.
In the embodiment of the invention, the threshold is set to 35, and the specific numerical value can be determined according to actual needs, so that the embodiment of the invention is not limited in uniqueness.
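A minimal sketch of the inverse binary thresholding of formula (8), using the embodiment's threshold of 35, on a list-of-rows image in pure Python; a real implementation would likely use OpenCV's threshold with the THRESH_BINARY_INV flag instead:

```python
def inverse_binary_mask(img, threshold=35):
    """Inverse binary thresholding per formula (8): gray values not
    larger than threshold become 255 (to be repaired), others 0."""
    return [[255 if v <= threshold else 0 for v in row] for row in img]

img = [[10, 35, 36],
       [200, 0, 99]]
mask = inverse_binary_mask(img)
```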
S24: the image restoration mask is combined with an FMM algorithm to restore the cavity;
after the image mask is generated for the depth map to be repaired, hole filling can be performed by combining it with the fast marching algorithm; the hole repair effect of the embodiment of the invention is shown in fig. 4 (b).
S3: median filtering is carried out to remove edge noise of the depth map;
the median filtering is a nonlinear filtering, and is convenient because the statistical characteristics of the image are not needed in the actual operation process. Median filtering is first applied in one-dimensional signal processing techniques and later in two-dimensional image signal processing techniques. Under certain conditions, the method can overcome the blurring of image details caused by a linear filter (such as neighborhood average operation), and is most effective in filtering pulse interference and image scanning noise. But for some images with more detail, especially with more points, lines and peaks, the median filtering method is not suitable. The basic principle of median filtering is to replace the value of a point in a digital image or digital sequence with the "median" of the values of points in a neighborhood of that point. "median" means that gray values in a neighborhood are arranged in a sequence from large to small (or vice versa), and the number in the middle is the median of the sequence.
In image processing, the noise removal achieved by filtering an image is generally expressed by formula (9):

Î(x, y) = (1 / W_p) Σ_{(i,j)∈Ω} w(i, j) I(i, j) (9)

In formula (9), Î is the denoised image; Ω is the neighborhood of the pixel (x, y), generally a rectangular area centered on (x, y); w(i, j) is the weight of the filter at (i, j); W_p is the normalizing constant; and I is the image containing noise. Here:

W_p = Σ_{(i,j)∈Ω} w(i, j) (10)
s31: median filtering of the image;
median smoothing is a neighborhood operation, assuming an image I as a two-dimensional sequence { x } i,j The filter window is also two-dimensional when median filtering, but the two-dimensional window can have various shapes, such as a line shape, a square shape, a round shape, a cross shape, a circular ring shape and the like. The median filtering of the two-dimensional data is represented by formula (11):
Figure BDA0002216522130000111
assuming that a certain image has a height H and a width W, as shown in the formula (11), for any position (x, y) in the image, x is more than or equal to 0 and less than or equal to t; y is more than or equal to 0 and less than or equal to t; and taking a neighborhood with (x, Y) as a center, the width of the neighborhood is i, the height of the neighborhood is j, wherein i and j are both odd numbers, sequencing pixel points in the neighborhood, and then mutually taking a median value as a pixel value of the output image Y at the (x, Y).
Specifically, in the embodiment of the invention, median filtering is performed on the depth image after hole repair by the formula (12) to obtain the depth image after image enhancement processing;
g(x, y) = med{ f(x − k, y − l), (k, l) ∈ w } (12)
wherein g (x, y) represents the median filtered image, f (x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take the values in w.
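Formula (12) can be sketched as a direct neighborhood operation in pure Python; handling the border by clamping the window to the image, as in step S33 below, is an implementation choice of this sketch:

```python
def median_filter(img, win=3):
    """Median filtering per formula (12): replace each pixel with the
    median of its win x win neighbourhood.  At the borders the window
    is clamped to the image, so it shrinks near the edges."""
    h, w = len(img), len(img[0])
    half = win // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = sorted(
                img[j][i]
                for j in range(max(0, y - half), min(h, y + half + 1))
                for i in range(max(0, x - half), min(w, x + half + 1)))
            out[y][x] = vals[len(vals) // 2]
    return out

# A single impulse ("salt" pixel) in a flat region is removed.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
clean = median_filter(noisy, win=3)
```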
S32: selecting a filter window size;
further, the filtering parameters are adjusted to be suitable for smoothing of the depth image, and a plurality of groups of experiments are conducted to compare, analyze and select the filtering parameters with good effects. The proper adjustment of the filter window parameter winsize can improve the black hole filling effect, but at the same time, the fact that when the winsize is increased to a certain range, the edge of the depth image becomes blurred and the image detail is easy to lose is noticed, the two factors are comprehensively considered, and the median filter with winsize=13 is finally determined to be selected as a denoising and small-range black hole restoration method of the depth image.
It should be noted that in practical use the window size preferably does not exceed the size of the smallest effective object in the image; typically the window is grown gradually, first 3 × 3 and then 5 × 5, until the filtering effect is satisfactory. For images with long, slowly varying contours, square or circular windows are preferable; for images containing sharp-cornered objects, cross-shaped windows are preferable.
S33: calculating a median value of gray scales by neighborhood operation;
s331: judging the image boundary: firstly, the filter window width is set to be winH, winW (the filter window width is introduced in step S31 and is odd), and half of the filter window width is respectively: halfwinh= (winH-1)/2, halfwinw= (winW-1)/2. Calculating a boundary standard value x-halfWinH <0 or more than or equal to 0; y-halfWinW <0 or is greater than or equal to 0; the image width is denoted as h, w, and it is calculated whether x+halfwinH > h-1 and y+halfwinW > w-1 are true.
S332: taking the neighborhood: on the basis of step S331, the gray neighborhood to be compared is denoted as R,
then there are:
R=[max(x-halfWinH,0):min(h-1,x+halfWinH),max(y-halfWinW,0):min(w-1,y+halfWinW)]
wherein max and min are the big and small functions after comparison, respectively.
S333: calculating a median: combining the neighborhood in step S332, and calculating to obtain the median of the gray values according to the formula (9) in step S31.
For the hole-filled depth map, image median filtering is used to smooth and enhance the edge noise and tiny burr holes of the depth map; the effect on the noise repair map acquired by Kinect in this embodiment is as shown in fig. 5.
In the embodiment of the invention, data acquisition and processing are performed on a PC; the system platform is based on the Windows operating system; the depth camera sensor is the Kinect v1 produced by Microsoft, used with the matching software development interface Kinect_SDK for Windows; the compiling environment for color and depth map data processing is Visual Studio (C++11), with the OpenCV data processing library, version 2.4.9, and the matrix processing software Matlab, version 2013a.
The invention discloses an image acquisition and depth map enhancement method and device suitable for Kinect, relating to the fields of image processing and robot vision. The method mainly comprises three parts: depth camera data acquisition based on the Kinect SDK, improved depth map hole repair based on the FMM (fast marching method) algorithm, and depth map edge noise smoothing. The first part comprises preliminary acquisition of the raw color and depth data, followed by format conversion to obtain the original color and depth images. The second part addresses the holes in the depth map obtained in the first part: it improves the original FMM algorithm and provides a repair-mask generation method based on inverse threshold binarization for restoring large-area holes in the depth map. The third part applies further image enhancement to the hole-repaired depth image using an edge noise smoothing method based on median filtering. The method can obtain the color and depth original images of the Kinect depth camera, and the resulting data can be applied to three-dimensional scene reconstruction, robot V-SLAM, and the like. Meanwhile, for the ubiquitous holes and noise introduced into the depth map by the sensor and the environment, the improved FMM-based hole repair method and the median-filtering-based noise smoothing method can noticeably remove holes and noise, further enhancing the applicability and reliability of Kinect in computer vision research.
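The repair-mask generation based on inverse threshold binarization described above can be sketched as a minimal Python/NumPy illustration. The threshold and max_val defaults are assumptions for illustration (Kinect depth holes read as 0, so low values mark the region to repair); in practice the same effect is obtained with OpenCV's cv2.threshold using the THRESH_BINARY_INV flag, and the resulting mask can be passed to cv2.inpaint:

```python
import numpy as np

def inpaint_mask_inv_threshold(depth, threshold=10, max_val=255):
    """Inverse binary thresholding: pixels whose value is <= threshold
    (depth holes read as 0) are marked max_val in the mask; all other
    pixels are 0. threshold and max_val are illustrative defaults."""
    return np.where(depth <= threshold, max_val, 0).astype(np.uint8)
```

The white (max_val) pixels of the mask form the region Ω to be repaired, which the FMM-based fill then processes.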
As shown in fig. 6, a schematic structural diagram of an apparatus according to the present invention includes:
the image format conversion module is used for acquiring an original depth image, and performing format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is positioned at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the restoration module is used for filling holes in the region to be restored in the depth image in the target format by combining the mask of the region to be restored with a fast-marching algorithm to obtain a restored depth image;
and the filtering module is used for carrying out median filtering on the depth image after hole repair so as to remove image edge noise and obtain the depth image after image enhancement processing.
Wherein, the specific implementation manner of each module can refer to the description in the method embodiment, and the embodiment of the invention will not be repeated.
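As a rough illustration of how the mask and the fast-marching-style fill rule combine in the repair module, the following Python/NumPy sketch applies the weighted-average fill formula D_p = Σ w(p,q)[D_q + ∇D_q·(p−q)] / Σ w(p,q) in repeated raster sweeps rather than in the true fast-marching (distance-ordered) traversal; the distance weight w(p,q) = 1/|p−q|² is an assumption, not the patent's exact weight:

```python
import numpy as np

def fill_holes_weighted(depth, mask, eps=2):
    """Simplified sketch of the weighted fill rule: each hole pixel p is
    estimated from the known pixels q in its (2*eps+1)^2 neighbourhood
    B_eps(p), using D_q extrapolated along the local gradient.  Pixels are
    swept in raster order until the hole is filled (the true FMM instead
    marches outward from the hole boundary)."""
    D = depth.astype(np.float64)
    known = mask == 0                       # mask > 0 marks the region to repair
    h, w = D.shape
    todo = [tuple(p) for p in np.argwhere(~known)]
    while todo:
        # gradient of the current image, zeroed next to holes so that
        # unfilled pixels do not pollute the extrapolation term
        grad_x, grad_y = np.gradient(D)
        near_hole = ~known
        for shift in (np.roll(~known, 1, 0), np.roll(~known, -1, 0),
                      np.roll(~known, 1, 1), np.roll(~known, -1, 1)):
            near_hole = near_hole | shift
        grad_x = np.where(near_hole, 0.0, grad_x)
        grad_y = np.where(near_hole, 0.0, grad_y)
        remaining = []
        for (x, y) in todo:
            num = den = 0.0
            for qx in range(max(x - eps, 0), min(x + eps, h - 1) + 1):
                for qy in range(max(y - eps, 0), min(y + eps, w - 1) + 1):
                    if not known[qx, qy]:
                        continue
                    dx, dy = x - qx, y - qy
                    wgt = 1.0 / (dx * dx + dy * dy)      # similarity weight w(p, q)
                    est = D[qx, qy] + grad_x[qx, qy] * dx + grad_y[qx, qy] * dy
                    num += wgt * est
                    den += wgt
            if den > 0:
                D[x, y] = num / den
                known[x, y] = True
            else:
                remaining.append((x, y))
        if len(remaining) == len(todo):     # no known neighbour anywhere: stop
            break
        todo = remaining
    return D
```

In practice OpenCV's cv2.inpaint with the INPAINT_TELEA flag implements the full fast-marching variant that this sketch approximates.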
In another embodiment of the present invention, there is further provided a computer readable storage medium having stored thereon program instructions which, when executed by a processor, implement the image acquisition and depth map enhancement method applicable to Kinect as described in any of the above.
It should be noted that each step/component described in the present application may be split into more steps/components, or two or more steps/components or part of the operations of the steps/components may be combined into new steps/components, as needed for implementation, to achieve the object of the present invention.
The above-described method according to the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium and downloaded through a network for storage in a local recording medium, so that the method described herein can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware such as an ASIC or FPGA. It is understood that a computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the processing methods described herein. Further, when a general-purpose computer accesses code for implementing the processes shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for executing those processes.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (6)

1. An image acquisition and depth map enhancement method suitable for Kinect, comprising:
(1) Acquiring an original depth image, and performing format conversion on the original depth image to obtain a depth image in a target format;
(2) Determining a target mask generation mode according to whether a region to be repaired in the depth image in the target format is positioned at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
(3) Combining the mask of the region to be repaired with a fast marching algorithm to fill the holes of the region to be repaired in the depth image in the target format, so as to obtain a repaired depth image;
(4) Performing median filtering on the depth image subjected to hole repair to remove image edge noise and obtain a depth image subjected to image enhancement processing;
the step (2) comprises:
if the region to be repaired in the depth image in the target format is not located at the edge of the image, a mouse callback function is added to the OpenCV interface function cvInpaint, and a color threshold range is then set according to the region to be repaired in the depth image in the target format to obtain the mask of the region to be repaired;
the step (2) further comprises:
if the region to be repaired in the depth image in the target format is located at the edge of the image, the mask of the region to be repaired is determined by the inverse binary thresholding function
mask(x, y) = 255 if I(x, y) ≤ threshold, i.e. (x, y) ∈ Ω; mask(x, y) = 0 otherwise,
so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
2. The method of claim 1, wherein step (3) comprises:
from the following components
Figure FDA0004068989710000021
Hole filling is carried out on the region to be repaired in the depth image in the target format, wherein the p point is a pixel needing to be repaired, and D p Representing depth value at point p, B ε (p) represents the neighborhood of p-point, q is B ε One point in (p), w (p, q) is used to measure the similarity of p point and the neighborhood pixel q, D q Represents the depth value at point q, +.>
Figure FDA0004068989710000022
The luminance gradient value representing the q point, (p-q) represents the geometric distance between pixel p and pixel q.
3. The method of claim 2, wherein step (4) comprises:
and carrying out median filtering on the depth image after hole restoration by g(x, y) = med{ f(x−k, y−l), (k, l) ∈ w } to obtain a depth image after image enhancement processing, wherein g(x, y) represents the image after median filtering, f(x, y) represents the depth image after hole restoration, w represents a two-dimensional median filtering template, and k and l take values in w.
4. An image acquisition and depth map enhancement apparatus adapted for Kinect, comprising:
the image format conversion module is used for obtaining an original depth image, and carrying out format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is positioned at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the restoration module is used for filling holes in the area to be restored in the depth image in the target format by combining the mask of the area to be restored with a fast-marching algorithm to obtain a restored depth image;
the filtering module is used for carrying out median filtering on the depth image subjected to hole repair so as to remove image edge noise and obtain a depth image subjected to image enhancement processing;
the mask image generation module is used for adding a mouse callback function to the OpenCV interface function cvInpaint when the region to be repaired in the depth image in the target format is not located at the edge of the image, and then setting a color threshold range according to the region to be repaired in the depth image in the target format to obtain the mask of the region to be repaired;
the mask image generation module is further used for, when the region to be repaired in the depth image in the target format is located at the edge of the image, determining the mask of the region to be repaired by the inverse binary thresholding function
mask(x, y) = 255 if I(x, y) ≤ threshold, i.e. (x, y) ∈ Ω; mask(x, y) = 0 otherwise,
so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
5. The apparatus of claim 4, wherein the repair module is configured to carry out hole filling on the region to be repaired in the depth image in the target format by
D_p = ( Σ_{q ∈ B_ε(p)} w(p, q) [ D_q + ∇D_q · (p − q) ] ) / ( Σ_{q ∈ B_ε(p)} w(p, q) ),
wherein p is a pixel to be repaired, D_p represents the depth value at point p, B_ε(p) represents the neighborhood of point p, q is a point in B_ε(p), w(p, q) is used to measure the similarity between point p and the neighborhood pixel q, D_q represents the depth value at point q, ∇D_q represents the luminance gradient value at point q, and (p − q) represents the geometric distance between pixels p and q.
6. The apparatus of claim 5, wherein the filtering module is configured to median-filter the hole-repaired depth image by g(x, y) = med{ f(x−k, y−l), (k, l) ∈ w } to obtain an image-enhanced depth image, where g(x, y) represents the median-filtered image, f(x, y) represents the hole-repaired depth image, w represents a two-dimensional median filtering template, and k and l take values in w.
CN201910917492.3A 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect Active CN110675346B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910917492.3A CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect
AU2020101832A AU2020101832A4 (en) 2019-09-26 2020-08-14 Image collection and depth image enhancement method and apparatus for kinect

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910917492.3A CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect

Publications (2)

Publication Number Publication Date
CN110675346A CN110675346A (en) 2020-01-10
CN110675346B true CN110675346B (en) 2023-05-30

Family

ID=69079266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910917492.3A Active CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect

Country Status (2)

Country Link
CN (1) CN110675346B (en)
AU (1) AU2020101832A4 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368675B (en) * 2020-02-26 2023-06-20 深圳市瑞立视多媒体科技有限公司 Gesture depth information processing method, device, equipment and storage medium
CN111815532A (en) * 2020-07-09 2020-10-23 浙江大华技术股份有限公司 Depth map repairing method and related device thereof
CN112070689A (en) * 2020-08-24 2020-12-11 中国人民解放军陆军装甲兵学院 Data enhancement method based on depth image
CN112200848B (en) * 2020-10-30 2023-02-17 中国科学院自动化研究所 Depth camera vision enhancement method and system under low-illumination weak-contrast complex environment
CN112991193B (en) * 2020-11-16 2022-09-23 武汉科技大学 Depth image restoration method, device and computer-readable storage medium
CN112488942A (en) * 2020-12-02 2021-03-12 北京字跳网络技术有限公司 Method, device, equipment and computer readable medium for repairing image
CN113034385B (en) * 2021-03-01 2023-03-28 嘉兴丰鸟科技有限公司 Grid generating and rendering method based on blocks
CN112991504B (en) * 2021-04-09 2023-01-10 同济大学 Improved hole filling method based on TOF camera three-dimensional reconstruction
CN113379780B (en) * 2021-05-19 2024-04-16 昆山丘钛微电子科技股份有限公司 Frame grabbing image optimization method and device
CN113554721B (en) * 2021-07-23 2023-11-14 北京百度网讯科技有限公司 Image data format conversion method and device
CN113570524B (en) * 2021-08-06 2023-09-22 山西大学 Repairing method for high reflection noise depth image
CN113781424B (en) * 2021-09-03 2024-02-27 苏州凌云光工业智能技术有限公司 Surface defect detection method, device and equipment
CN114066779B (en) * 2022-01-13 2022-05-06 杭州蓝芯科技有限公司 Depth map filtering method and device, electronic equipment and storage medium
CN117475157B (en) * 2023-12-25 2024-03-15 浙江大学山东(临沂)现代农业研究院 Agricultural planting enhancement monitoring method based on unmanned aerial vehicle remote sensing

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455984A (en) * 2013-09-02 2013-12-18 清华大学深圳研究生院 Method and device for acquiring Kinect depth image
CN103905813A (en) * 2014-04-15 2014-07-02 福州大学 DIBR hole filling method based on background extraction and partition recovery
CN105608678A (en) * 2016-01-11 2016-05-25 宁波大学 Sparse-distortion-model-representation-based depth image hole recovering and denoising method
CN105741265A (en) * 2016-01-21 2016-07-06 中国科学院深圳先进技术研究院 Depth image processing method and depth image processing device
CN109541630A (en) * 2018-11-22 2019-03-29 武汉科技大学 A method of it is surveyed and drawn suitable for Indoor environment plane 2D SLAM
CN109903321A (en) * 2018-10-16 2019-06-18 迈格威科技有限公司 Image processing method, image processing apparatus and storage medium
CN110264563A (en) * 2019-05-23 2019-09-20 武汉科技大学 A kind of Octree based on ORBSLAM2 builds drawing method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10424075B2 (en) * 2015-05-06 2019-09-24 Peking University Shenzhen Graduate School Depth/disparity map post-processing method and device
US20170161546A1 (en) * 2015-12-08 2017-06-08 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting and Tracking Objects and SLAM with Hierarchical Feature Grouping


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Research on an improved depth image inpainting algorithm; Peng Cheng et al.; Journal of Chongqing Technology and Business University (Natural Science Edition); 2016-02-25 (No. 01); pp. 68-72 *
Research on a depth image inpainting algorithm; Liu Tianjian et al.; Information Technology; 2017-06-25 (No. 06); pp. 115-119 *
Depth image enhancement algorithm based on the fast marching method; Meng Tian et al.; Computer Applications and Software; 2017-08-15; Vol. 34 (No. 8); pp. 225-230 *
Research on a Kinect depth image hole repair algorithm based on improved bilateral filtering; Li Yingbin et al.; Industrial Control Computer; 2013-11-25 (No. 11); pp. 108-109, 112 *

Also Published As

Publication number Publication date
AU2020101832A4 (en) 2020-09-24
CN110675346A (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN110675346B (en) Image acquisition and depth map enhancement method and device suitable for Kinect
JP6125188B2 (en) Video processing method and apparatus
KR20140027468A (en) Depth measurement quality enhancement
JP2016513320A (en) Method and apparatus for image enhancement and edge verification using at least one additional image
KR101906796B1 (en) Device and method for image analyzing based on deep learning
JP5534411B2 (en) Image processing device
KR20130072073A (en) Apparatus and method for extracting edge in image
WO2012029658A1 (en) Imaging device, image-processing device, image-processing method, and image-processing program
CN116503388A (en) Defect detection method, device and storage medium
JP2013002839A (en) Crack detection method
JP5772675B2 (en) Gray image edge extraction method, edge extraction device, and gray image edge extraction program
KR101799143B1 (en) System and method for estimating target size
CN113971669A (en) Three-dimensional detection system applied to pipeline damage identification
KR102327304B1 (en) A method of improving the quality of 3D images acquired from RGB-depth camera
KR20170047780A (en) Low-cost calculation apparatus using the adaptive window mask and method therefor
CN112669360B (en) Multi-source image registration method based on non-closed multi-dimensional contour feature sequence
JP5424694B2 (en) Image recognition apparatus and program
JP5157575B2 (en) Defect detection method
JP2018160024A (en) Image processing device, image processing method and program
CN111630569B (en) Binocular matching method, visual imaging device and device with storage function
CN112364693A (en) Barrier identification method, device and equipment based on binocular vision and storage medium
CN110852228B (en) Method and system for extracting dynamic background and detecting foreground object in monitoring video
CN111539970B (en) Checkerboard angular point detection method suitable for structured light three-dimensional reconstruction
JP4230962B2 (en) Image processing apparatus, image processing method, and image processing program
CN114092539A (en) Depth value obtaining method and system based on dynamic binarization algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant