CN112085675A - Depth image denoising method, foreground segmentation method and human motion monitoring method - Google Patents


Info

Publication number
CN112085675A
CN112085675A (application CN202010894752.2A; granted as CN112085675B)
Authority
CN
China
Prior art keywords
point
block
value
depth image
feature
Prior art date
Legal status
Granted
Application number
CN202010894752.2A
Other languages
Chinese (zh)
Other versions
CN112085675B
Inventor
李元媛
何飞
何凌
朱婷
熊熙
孟雨璇
周格屹
Current Assignee
Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202010894752.2A
Publication of CN112085675A
Application granted
Publication of CN112085675B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a depth image denoising method, a foreground segmentation method, and a human motion monitoring method. The denoising method first partitions the depth image into blocks and marks the first and second feature points of each block based on a base plane; it then calculates a first parameter carrying isolation and spatial-distribution information, clusters the blocks using attribute parameters representing each block as a feature set, filters out basin blocks, retains plateau blocks, and performs edge protection by judging the gray-scale difference between the first and second feature points of the nearest extension block of each plateau region, completing the denoising. For the denoised ROI, foreground segmentation is performed using multi-level contour lines. For the segmented human body, motion is monitored by plane-fitting each segmented part and calculating the included angle between the fitted plane and the base plane. The invention requires neither modeling nor a large number of iterative operations; its operation process is simple, its processing efficiency high, and its accuracy high.

Description

Depth image denoising method, foreground segmentation method and human motion monitoring method
Technical Field
The invention relates to the field of image processing, in particular to a depth image denoising method, a foreground segmentation method and a human motion monitoring method.
Background
The depth image can reflect the depth information of a research object, and has good data support for the research of a moving object.
A depth image acquisition tool (such as a Kinect sensor) usually computes the depth image by emitting infrared light into the space and receiving its reflection at an infrared sensor. If the measured object is too far from the sensor, or multiple reflections occur at the target, a large amount of non-uniform noise appears around the research object. In the depth image this noise also carries depth information and is irregular and uncertain, so it greatly affects subsequent analysis and calculation of the research object's depth information. Most directly, analyzing the research object usually requires extracting/segmenting its depth information from the depth image, and because the noise also carries depth information, it interferes with that extraction and segmentation.
In the prior art, there are schemes that denoise through the image entropy of the depth image, such as the image denoising method based on PCNN and anisotropic diffusion of image entropy disclosed in CN105005975A, the small-target infrared image processing method based on weighted local image entropy disclosed in CN104268844A, and the infrared image non-uniformity parameterized correction optimization method based on image entropy disclosed in CN111047521A. However, these methods all require a large amount of data-processing work such as modeling and iterative operations; they are complex and inefficient, and their denoising effect is not ideal.
Disclosure of Invention
The invention aims to address the existing problems by providing a depth image denoising method that is simple and efficient and denoises the depth image accurately.
The technical scheme adopted by the invention is as follows:
A depth image denoising method denoises a noise region of a depth image, the noise region being divided into a plurality of equally sized blocks. The method comprises the following steps:
Calculations A and B are performed for each block respectively:
A. Determine first feature points and second feature points in the block based on the base plane: a first feature point is a pixel point whose pixel value lies below the base plane, and a second feature point is a pixel point whose pixel value lies above the base plane. The purpose of this step is to mark the first and second feature points of each block.
B. Calculate the first parameter from the point weight of each second feature point and the minimum position match corresponding to each second feature point. The pixel values of the second feature points lie above the base plane and carry the more prominent features, which can be used to further describe the block.
C. Take the first parameter and the attribute parameters of the first and second feature points of each block as its feature set, and divide the blocks into three classes by clustering: a first class corresponding to a low gray-scale range, a second class whose gray values carry mutation (abrupt) values, and a third class with a relatively flat surface. Flatten the first-class blocks, retain the third-class blocks, and decide whether to retain or flatten each second-class block adjacent to a third-class block according to the gray-scale difference between its second and first feature points.
Low-gray-scale blocks belong to the background and carry no depth information of the research object, so they can be processed directly as background. Blocks whose gray values carry mutation values are most likely edge regions between the research object and the background and need further judgment based on their characteristics. Blocks with a relatively flat surface belong to the interior of the research object's main body, carry its depth information, and can be retained directly. Through this simple statistical analysis, noise can be removed accurately, and the whole operation process is simple and efficient.
Further, in step A, the value of the base plane is the average of the gray values of the pixels in the block. Since the base plane of each block is calculated from that block's own gray values, the more prominent features within each block can be highlighted.
Further, in the step B, the method for calculating the point weight of the second feature point includes:
describing the pixels of the first feature points and the second feature points in the block by logic 1 and logic 0 respectively, and calculating the point weight over the N×N area around each second feature point by the following formula:
w = Σ_i Σ_j a_ij (summed over the N×N neighborhood)
where w is the point weight of the second feature point, a_ij is the pixel logic value at position (i, j) in the N×N neighborhood, and N is an integer greater than or equal to 3. The area around the second feature point is the region centered on it. Since the second feature points are the prominent gray-value points in the block and every pixel's gray value has been replaced by a logic value, this method can evaluate the degree of isolation of each second feature point within the block and facilitates subsequent uniform logical operations.
Further, in the step B, the method for calculating the minimum position matching corresponding to the second feature point includes:
P_M = min{ |m - x_i| + |n - y_i| : i = 1, 2, …, N_o }
where P_M is the calculated minimum position match; (m, n) is the coordinate position of the current second feature point in the block; (x_i, y_i) are the coordinates of the other second feature points; min takes the minimum value; and N_o is the number of second feature points other than the current one. The minimum position match reflects the spatial distribution relationship among the second feature points in the block and complements the feature description given by the point weight.
Further, in step B, the first parameter is calculated as follows: compute, for each second feature point, the product of its point weight and its minimum position match, then take the mean of all the products.
Because the pixel gray values of the first and second feature points have been replaced by logic values, the point weight can participate in logical operations. Multiplying the point weight by the minimum position match describes both the degree of isolation of each second feature point and its spatial distribution relationship with the remaining second feature points, and averaging describes the corresponding characteristics of the block. Within the complete solution, noise can thus be described more accurately.
Further, in step C, the attribute parameters of the first feature point and the second feature point are respectively: the gray level mean value of the first characteristic point and the second characteristic point.
The invention also provides a foreground segmentation method for segmenting the foreground from an ROI (region of interest) carrying depth information, the ROI having been denoised by the depth image denoising method above. The foreground segmentation method comprises: according to the distribution of the ROI's pixels over gray levels, extracting the main-body contour of the ROI with contour lines at several levels, merging the contours extracted at each level, and segmenting the ROI with the merged contour to obtain the foreground.
The contour extraction method based on the contour line can improve the continuity and the integrity of the extracted edge.
Further, the method for segmenting the ROI by using the merged contour includes:
filling the merged contour region: filling the area within the contour edge with logic 1 and the area beyond the edge with logic 0 to obtain a filled image;
multiplying the ROI by the filled image.
After the region within the contour edge is logically filled, the filled region can participate in logical operations. Directly multiplying the logically filled region with the ROI (the operand) extracts the foreground region directly, without complex operations, with low computing-power consumption and high computational efficiency.
The invention also provides a human motion monitoring method for analyzing the human foreground of a human body image carrying depth information, the human foreground having been obtained by the foreground segmentation method above. The human motion monitoring method comprises the following steps:
A. segmenting the body parts;
B. performing plane fitting on each segmented body part and calculating the included angle between the fitted plane and the base plane;
C. locating changes of the included angle along the time axis and recording them as effective motions.
Depth information can reflect changes in the distance or angle of a changing area: in the depth image, when the distance or orientation of the observed object changes, the distribution of the depth points in three-dimensional space changes. By monitoring the change of the angle in three-dimensional space, the motion state of the monitored object, including its three-dimensional motion state, can be monitored.
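Step B can be sketched as a least-squares plane fit followed by the angle between normals. The sketch below assumes each body part's depth points are given as (x, y, depth) triples and that the base plane is horizontal with normal (0, 0, 1); both are our assumptions, not details fixed by the text:

```python
import numpy as np

def plane_angle(points, base_normal=(0.0, 0.0, 1.0)):
    """Least-squares-fit a plane z = ax + by + c to the depth points of one
    body part and return the angle (degrees) between the fitted plane and
    the base plane, via the angle between their normals."""
    pts = np.asarray(points, dtype=float)
    A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
    (a, b, c), *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    n = np.array([a, b, -1.0])                    # normal of the fitted plane
    n0 = np.asarray(base_normal, dtype=float)
    cosang = abs(n @ n0) / (np.linalg.norm(n) * np.linalg.norm(n0))
    return float(np.degrees(np.arccos(np.clip(cosang, 0.0, 1.0))))
```

For points lying on z = x the sketch returns 45 degrees; for a flat region it returns 0, so the angle series directly tracks the tilt of each body part over time.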
Further, the step C includes:
C1. Record the included angle frame by frame along the time axis.
C2. Keep the angle values larger than the mean angle as salient values and set the other angle values to 0.
C3. Record effective motions according to the following rules:
if a salient value appears within the K1 consecutive frames after a salient value, it is regarded as the continuation of the same action;
if the angle values of more than K2 frames after a salient value are all 0, the action is recorded as finished. K1 and K2 are positive integers with K1 < K2.
An action usually extends over multiple frames: when an action occurs, similar plane angles exist over consecutive frames, so counting the amount of motion requires distinguishing whether frames belong to the same continuing action. The thresholds can be set according to the monitored object (or from empirical values), so the amount of motion is counted accurately and efficiently.
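Rules C1-C3 can be sketched as a simple counter. The handling of gaps between K1 and K2 frames is our assumption (treated as continuation), since the rules above leave that range open:

```python
def count_actions(angles, k1=3, k2=10):
    """Count effective motions from a per-frame angle series: angle values
    above the mean are kept as salient values (rule C2); a salient value
    within k1 frames of the previous one continues the action, and more
    than k2 zero frames after a salient value end it (rule C3, k1 < k2).
    Gaps between k1 and k2 frames are treated here as continuation, an
    assumption where the stated rules are silent."""
    mean = sum(angles) / len(angles)
    salient = [i for i, a in enumerate(angles) if a > mean]
    if not salient:
        return 0
    actions = 1
    for prev, cur in zip(salient, salient[1:]):
        if cur - prev - 1 > k2:       # more than k2 all-zero frames between
            actions += 1              # the previous action ended; a new one starts
    return actions
```

With a series containing two salient bursts separated by ten zero frames, k2 = 5 yields two actions while k2 = 12 merges them into one, which is how the thresholds tune the statistics to the monitored object.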
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. The depth image denoising method adopts a brand-new denoising design concept based on the inherent characteristics of the depth image and filters noise according to the distribution characteristics of the noise points. It avoids the computation-heavy data processing of traditional methods, such as modeling and large iteration volumes, so computational efficiency is high; and because the denoising rules conform to the noise distribution characteristics, denoising is more targeted and its effect is better.
2. The foreground segmentation method provided by the invention adopts the multi-level contour line to extract the edges of the main body, so that the continuity and integrity of the contour are improved, and the noise can be further filtered. In addition, in the foreground segmentation process, a logic replacement method is adopted, so that direct logic operation can be performed on the ROI, the operation efficiency is high, and the fidelity is high.
3. The human motion monitoring method monitors the plane angle change of the main body according to the motion distribution characteristics of the three-dimensional space, has high monitoring accuracy on the motion state, and has accurate statistical result on the motion amount.
4. None of the methods designed by the invention requires prior modeling, saving a large amount of early-stage sample collection and model training. Nor is a large amount of iterative computation on the monitoring data needed, which greatly saves computing resources and time while improving the accuracy of the results and the reliability of the computation.
Drawings
The invention will now be described, by way of example, with reference to the accompanying drawings, in which:
fig. 1 is a schematic diagram of a ROI extraction process based on depth images and bone images.
Fig. 2 is a diagram of noise distribution characteristics.
Fig. 3 is a schematic flow chart of locating noise regions by entropy distribution.
Fig. 4 shows two examples of the base plane distinguishing the first feature points from the second feature points.
Fig. 5 is a schematic diagram of logic replacement of a first feature point and a second feature point in a block.
Fig. 6 is an embodiment of point weight calculation for the second feature point in the block.
Fig. 7 shows an example of first-parameter calculation for 4 blocks.
FIG. 8 is a block de-noising flow diagram with mean clustering.
FIG. 9 is a schematic diagram of different levels of contour extraction of a subject's contour.
Fig. 10 is a schematic view of a ROI region extraction flow.
Fig. 11 is a schematic diagram of a head and limbs segmentation process.
FIG. 12 is one embodiment of head region angle statistics.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
Any feature disclosed in this specification (including any accompanying claims, abstract) may be replaced by alternative features serving equivalent or similar purposes, unless expressly stated otherwise. That is, unless expressly stated otherwise, each feature is only an example of a generic series of equivalent or similar features.
The Kinect computes depth images by emitting infrared light into the space and receiving its reflection at an infrared sensor. It has a fixed field of view at a fixed location, so if the measured object is too far from the sensor or multiple reflections occur at the target, non-uniform noise appears around the subject; as shown in fig. 1(a), noise of varying degree appears around the person. To ensure accuracy in subsequent research on the object, this noise must be filtered out in advance, which first requires locating the noise area in the depth image. In view of this, the present embodiment provides a method for locating the noise region of a depth image.
The method includes extracting an ROI from the depth image; with a natural person as the research object, the ROI is defined as the region of the human body.
The depth image acquired by the Kinect shares the same coordinate space with the bone (skeleton) image, so, as shown in fig. 1, the person in both images can be located by the same coordinates. In fig. 1, (a) is the original depth image, (b) is the bone image, and (c) is the extracted ROI. By comparing with the bone image and performing row and column projection on the depth area where the object is located, the object can be positioned in the depth image, and the located area is extracted to obtain the ROI.
The distribution of noise in the depth image is irregular and uncertain; fig. 2 shows the distribution of noise in a depth image. The noise domain can be located based on these characteristics: since entropy reflects the uncertainty of the information amount and the distribution of data points in a region, the noise domain is located based on the distribution of the image entropy.
In the noise localization method, the ROI region is divided into blocks of 4 × 4 pixels (or another size, expressed as N × N; the same applies below), and each block is represented by its entropy value. The entropy is calculated by formula (1):
H(x) = -Σ_i P(a_i) · log2 P(a_i)   (1)
where H(x) is the entropy calculated over all pixel points in the current block, P(a_i) is the probability that an element with the specific gray value a_i occurs in the block, and a_i is the gray value of each pixel point.
As shown in fig. 3, the above calculation yields an entropy distribution map of the ROI region, from which it can be seen that noise-distribution regions produce higher entropy values. Performing a line projection on the entropy map, noise regions appear as distinct peaks in the projection curve, so the noise region can be determined from the peak intervals. For a human research object, given the characteristics of the human edge, the noise region is determined directly from the mean level: the mean entropy is calculated, and blocks whose entropy is above the mean are marked as lying in the noise region.
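For concreteness, the entropy-based localization can be sketched as below. This is a minimal illustration under our own naming, assuming an 8-bit single-channel ROI stored as a NumPy array; it is not code from the patent itself:

```python
import numpy as np

def block_entropy(block):
    """Shannon entropy of the gray values in one block (formula (1))."""
    values, counts = np.unique(block, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def locate_noise_blocks(roi, n=4):
    """Compute per-block entropy over an n x n tiling of the ROI and mark
    blocks whose entropy exceeds the mean entropy as noise-region blocks."""
    h, w = roi.shape
    ent = np.zeros((h // n, w // n))
    for bi in range(h // n):
        for bj in range(w // n):
            ent[bi, bj] = block_entropy(roi[bi*n:(bi+1)*n, bj*n:(bj+1)*n])
    return ent > ent.mean()   # boolean mask of noise blocks
```

A flat background block has entropy 0 and stays unmarked, while a block of varied gray values exceeds the mean and is flagged, matching the mean-threshold rule above.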
And performing further noise filtering processing on the calibrated noise region. Therefore, the present embodiment provides a depth image denoising method, which includes:
the following A-B calculations are performed separately for each block of the noisy region:
A. first feature points and second feature points are respectively determined in the block based on the base plane.
Noise in depth images exhibits geography-like distribution patterns such as "isolated peaks", "basins", and "plains". In the noise region, some "plains" lie between "isolated peaks", while the subject's segmentation region resembles a "plateau". Based on these characteristics, the key to extracting a complete object edge is to turn the "isolated peak" and "basin" blocks into background, so that the depth region of the object is highlighted. Since "isolated peaks" and "basins" differ in gray distribution, a base plane can be established from the gray distribution to distinguish them. The base plane is calculated as follows:
B_m = (1/N) Σ_{k=1}^{N} G_k   (2)
M_1 = (1/N_1) Σ G over the first feature points   (3)
M_2 = (1/N_2) Σ G over the second feature points   (4)
where B_m is the value of the base plane, N is the number of pixels in a selected area, G is the gray value of each pixel in the selected area, M_1 and M_2 are the gray means of the first and second feature points in the selected area, and N_1 and N_2 are the numbers of first and second feature points in the selected area. Two examples of base planes are given in fig. 4.
As shown in fig. 4, each pixel point in the 4 × 4 block is divided into a first feature point or a second feature point: a point whose pixel value lies above the base plane is a second feature point, and a point below it is a first feature point. Using the base plane as the interface, the first and second feature points in each block can be distinguished.
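A minimal sketch of formulas (2)-(4) and the base-plane split, assuming a block given as a NumPy array; the function name and return convention are our own:

```python
import numpy as np

def split_by_base_plane(block):
    """Split a block into first (at or below the base plane) and second
    (above the base plane) feature points; return the base plane value B_m,
    the gray means M1 and M2, and the boolean mask of second feature points
    (formulas (2)-(4))."""
    bm = block.mean()                  # base plane value B_m, formula (2)
    second = block > bm                # second feature points: above the plane
    first = ~second                    # first feature points: at or below it
    m1 = block[first].mean() if first.any() else 0.0    # formula (3)
    m2 = block[second].mean() if second.any() else 0.0  # formula (4)
    return bm, m1, m2, second
```

The returned mask feeds the logic-value replacement of the next step, and M1 and M2 later join the per-block feature set.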
B. A first parameter of the block is calculated.
The gray value of each first feature point in the block is represented by logic 1 and that of each second feature point by logic 0. Fig. 5 shows the gray value of each pixel point in a block replaced by its logic value, with first feature points marked 1 and second feature points marked 0. The calculation of the block's first parameter includes:
B1. and calculating the point weight of each second characteristic point in the block.
With the gray value of each pixel point in the divided blocks marked (as logic 1 or 0), the marking results can be used to calculate the degree of isolation of a second feature point within a 3×3 template centered on it (a larger square size could also be used; 3×3 is taken as an example here and also matches the resolution of the depth image well). The judgment counts the number of first feature points around the second feature point in the 3×3 template, calculated by expression (5):
w = Σ_{i=1}^{3} Σ_{j=1}^{3} a_ij   (5)
where w is the point weight of each second feature point and a_ij are the pixel logic values around the neighborhood of the second feature point (the center of the 3×3 region); pixels of the 8-neighborhood that fall outside the selected region are set to zero. Because all second feature points are 0, w also represents the number of valid first feature points around the second feature point. Fig. 6 shows an example of calculating the point weight of a second feature point in a block. The point weight reflects the degree of isolation of each second feature point, but not the spatial distribution relationship among all second feature points, which therefore also needs to be described.
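The point-weight count of expression (5) can be sketched as follows, assuming the block's logic marking (1 = first feature point, 0 = second feature point) is given as a NumPy array, with out-of-block pixels zero-padded as described above:

```python
import numpy as np

def point_weight(logic, i, j, n=3):
    """Point weight w of the second feature point at (i, j): the count of
    logic-1 (first feature) pixels in its n x n neighborhood, with pixels
    outside the block counted as zero (expression (5))."""
    r = n // 2
    padded = np.pad(logic, r, constant_values=0)
    window = padded[i:i + n, j:j + n]   # n x n window centered at (i, j)
    return int(window.sum())            # the center itself is 0 (second point)
```

A second feature point fully surrounded by first feature points gets w = 8, the maximum isolation; one at a block corner can reach at most 3.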
B2. And calculating the minimum position matching of each second feature point in the block and other second feature points.
Calculating the minimum position matching of each second feature point in each block by using the formula (6):
P_M = min{ |m - x_i| + |n - y_i| : i = 1, 2, …, N_o }   (6)
where P_M is the calculated minimum position match; (m, n) is the coordinate position of the current second feature point in the block; (x_i, y_i) are the coordinates of the other second feature points; min takes the minimum value; and N_o is the number of second feature points other than the current one.
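Expression (6) can be sketched directly, assuming the second feature points' coordinates have been collected in a list; this is an illustrative helper of our own, not the patent's code:

```python
def min_position_match(coords, idx):
    """P_M of the second feature point coords[idx]: the smallest Manhattan
    distance to any other second feature point in the block (expression (6))."""
    m, n = coords[idx]
    return min(abs(m - x) + abs(n - y)
               for k, (x, y) in enumerate(coords) if k != idx)
```

Tightly clustered second feature points yield small P_M values, while a stray noise point far from the others yields a large one, which is exactly the spatial-distribution information the point weight lacks.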
B3. The first parameter is calculated by matching the point weight of each second feature point with the minimum position corresponding to each second feature point.
The first parameter (denoted by MPD) is calculated by equation (7):
MPD = (1/N) Σ_{k=1}^{N} w_k · P_{M,k}   (7)
where N is the number of second feature points within one block (4 × 4 area), and w_k and P_{M,k} are the point weight and minimum position match of the k-th second feature point. Fig. 7 shows an example of calculating the first parameters for four blocks.
As shown in fig. 7, the first parameter reflects the position distribution relationship of the second feature points. It also has a great advantage in protecting edges, because blocks with edge-like characteristics have smaller first-parameter values, as shown in fig. 7(a) and (b). Using the first parameter together with the isolated-existence characteristic of noise points, whether the current region is a noise block can be judged in advance by evaluating the distribution characteristics of the pixels.
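Putting steps B1-B3 together, a self-contained sketch of the first parameter MPD, under the same assumptions as above (logic marking as a NumPy array, 3×3 point-weight template), might read:

```python
import numpy as np

def first_parameter(logic, n=3):
    """First parameter MPD of one block (formula (7)): the mean over all
    second feature points of point weight x minimum position match.
    `logic` marks first feature points as 1 and second feature points as 0."""
    coords = [(int(i), int(j)) for i, j in zip(*np.where(logic == 0))]
    if len(coords) < 2:
        return 0.0                        # P_M needs at least two second points
    r = n // 2
    padded = np.pad(logic, r, constant_values=0)
    total = 0.0
    for k, (i, j) in enumerate(coords):
        w = padded[i:i + n, j:j + n].sum()            # point weight (Eq. 5)
        pm = min(abs(i - x) + abs(j - y)              # min position match (Eq. 6)
                 for t, (x, y) in enumerate(coords) if t != k)
        total += w * pm
    return float(total / len(coords))
```

For a 4×4 block whose only second feature points sit at opposite corners, the sketch yields MPD = 18: large spacing and moderate isolation combine into a high value, the signature of scattered noise points rather than an edge.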
C. The noise in the noise region is filtered out using a clustering method.
Three features (MPD, M_1, and M_2) are used as a feature set, and each block is described by this feature set.
The blocks are divided into 3 groups: the feature sets of 3 blocks are randomly selected as initial cluster centers, the distance between every feature set and each cluster center is then calculated, and each feature set is assigned to its nearest cluster center. A cluster center together with the feature sets assigned to it represents one cluster. After the samples (feature sets) are assigned, each cluster center is recalculated from the objects currently in the cluster, and the process repeats until the cluster centers no longer change and the sum of squared errors is minimal, completing one clustering pass. The blocks involved in the calculation are thereby divided into three types: a "basin" of low gray-scale range, a "plain" carrying "isolated peaks", and a "plateau" with a relatively flat surface. Most of the depth-information area of the main body is "plateau"; for the other classes, whether a block is a possible edge block to be preserved or flattened is determined by extending from the main-body blocks.
As shown in fig. 8, by clustering on the extracted feature sets, blocks with different distribution characteristics can be distinguished. According to the block labels (clustering results), "basin" blocks are replaced by a flat background and detected "plateau" areas are retained; for each "plain" block carrying "isolated peaks" adjacent to a plateau area, whether the gray levels of its second and first feature points differ significantly (M_1 < M_2/2) is judged for edge protection: if M_1 < M_2/2, the block is retained, otherwise it is replaced by a flat background. Other noise-block interference around the observed object is then eliminated, completing the denoising of the ROI.
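The clustering pass described above is ordinary k-means with k = 3. A minimal sketch follows; for reproducibility it uses fixed initial centers in place of the random selection the text describes, which is our own substitution:

```python
import numpy as np

def cluster_blocks(features, iters=50):
    """Cluster per-block feature sets [MPD, M1, M2] into 3 groups with a
    minimal k-means loop; returns one label (0-2) per block. Initial
    centers are taken at fixed indices rather than at random."""
    feats = np.asarray(features, dtype=float)
    centers = feats[[0, len(feats) // 2, len(feats) - 1]]  # deterministic init
    for _ in range(iters):
        d = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                 # assign to nearest center
        new = np.array([feats[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k] for k in range(3)])
        if np.allclose(new, centers):             # centers stable: converged
            break
        centers = new
    return labels
```

Which numeric label corresponds to "basin", "plain", or "plateau" must still be decided afterwards, e.g. by ordering the cluster centers' M_1/M_2 components.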
After denoising the depth image, the depth information within the edge of the research object must be extracted (i.e., foreground segmentation) for further study, and edge extraction is an important link in segmenting the foreground object from the depth image. Because the Kinect measures distance by emitting and receiving infrared reflections, noise at the boundary between the foreground object and the background forms an edge with many breakpoints along the observed object's outline. Conventional edge extraction methods are not applicable here because they do not consider the continuity and integrity of the edge; the extracted edge may contain multiple breakpoints, which affects the segmentation of the object. To address this, the present embodiment provides a foreground segmentation method that uses contour lines to extract the edges of an object. In computing the contours, linear interpolation is used to maintain edge continuity, and using contour lines of different depth levels improves the continuity and completeness of the edges.
Based on the ROI in the above embodiment, the contour of the subject is extracted using contour lines of 4 levels, the values of which are 50, 100, 150, and 200, respectively. In other embodiments, the level and value of the contour may be adaptively adjusted according to the distribution of ROI pixels in gray scale. Fig. 9 shows contour information of the body region on contour lines at different levels.
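Libraries such as scikit-image provide sub-pixel contour extraction with linear interpolation (`measure.find_contours`); the sketch below instead marks level-crossing pixels directly with numpy so it stays self-contained. It illustrates the multi-level idea at the levels given above, not the patent's exact computation:

```python
import numpy as np

def level_edges(img, level):
    """Binary map of pixels where the image crosses the given gray level:
    a pixel is an edge pixel if it is >= level while a 4-neighbour is < level."""
    above = img >= level
    edge = np.zeros_like(above)
    # compare each pixel with its four neighbours
    edge[:-1, :] |= above[:-1, :] & ~above[1:, :]
    edge[1:, :]  |= above[1:, :]  & ~above[:-1, :]
    edge[:, :-1] |= above[:, :-1] & ~above[:, 1:]
    edge[:, 1:]  |= above[:, 1:]  & ~above[:, :-1]
    return edge

def merged_contours(img, levels=(50, 100, 150, 200)):
    """Project the contours of all levels onto one plane (logical OR),
    as in the merging step of fig. 10(b)."""
    out = np.zeros(img.shape, dtype=bool)
    for lv in levels:
        out |= level_edges(img, lv)
    return out
```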
As shown in fig. 10, which illustrates the process of extracting the ROI region, fig. 10(b) is the result of combining (projecting) the contours extracted at different levels onto the same plane. On this basis, the region between the edges is filled with the logical value 1 and the regions beyond the edges are marked with the logical value 0, giving a filled image, as shown in fig. 10(c). Finally, the depth information of the study object is extracted by multiplying the denoised ROI depth image by the filled image, as shown in fig. 10(d).
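The fill-and-multiply step can be sketched with a numpy-only flood fill: everything reachable from the image border without crossing the merged edge is background (logical 0), and the rest — the edge and its interior — is logical 1. The function names are illustrative; a library routine such as `scipy.ndimage.binary_fill_holes` would serve the same purpose:

```python
import numpy as np

def fill_inside(edge):
    """Fill the region enclosed by the merged edge map with logical 1."""
    h, w = edge.shape
    outside = np.zeros((h, w), dtype=bool)
    # seed the flood fill at all non-edge border pixels
    outside[0, :] = ~edge[0, :]
    outside[-1, :] = ~edge[-1, :]
    outside[:, 0] = ~edge[:, 0]
    outside[:, -1] = ~edge[:, -1]
    while True:
        grown = outside.copy()
        # grow the background by 4-connectivity
        grown[1:, :] |= outside[:-1, :]
        grown[:-1, :] |= outside[1:, :]
        grown[:, 1:] |= outside[:, :-1]
        grown[:, :-1] |= outside[:, 1:]
        grown &= ~edge            # the edge blocks the flood
        if np.array_equal(grown, outside):
            break
        outside = grown
    return ~outside               # inside region (edge included) -> logical 1

def segment_foreground(depth_roi, edge):
    """Multiply the denoised ROI depth image by the filled logical image."""
    return depth_roi * fill_inside(edge)
```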
Based on the extraction of the depth information within the research object's edge, this embodiment also discloses a human motion monitoring method. The method can be applied to the state analysis of patients with ADHD symptoms, providing clinicians with more accurate and objective information. The human motion monitoring method comprises the following steps:
A. Body part segmentation.
The moving parts of the human body include the head, torso and limbs; here the invention divides the body area into five parts:
Given the typical proportions of the human body, the head occupies a small share of the body area, so the head region can be quickly located in the logical image from the line projection of the torso, and the head is segmented from the depth image accordingly. The head region in the logical image is then set to zero, and the centroid of the remaining body is calculated. The body region is divided into 4 parts according to the centroid position (x, y): the pixels in the x-th row and y-th column are set to zero, splitting the body into four parts, as shown in fig. 11, and the depth information of each part is extracted correspondingly from the depth image.
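The segmentation above can be sketched as follows. The neck row is found here as the first row whose projection width reaches half the maximum body width — a stand-in for the organ-proportion rule, used only for illustration:

```python
import numpy as np

def split_body(mask):
    """Split a logical body image into a head part and four quadrant parts.

    mask: 2-D bool array (1 = body pixel).
    """
    rows = mask.sum(axis=1)                       # line projection
    body_rows = np.nonzero(rows)[0]
    neck = body_rows[0]
    for r in body_rows:
        if rows[r] >= rows[body_rows].max() / 2:  # torso begins here (assumed rule)
            neck = r
            break
    head = mask.copy()
    head[neck:, :] = False                        # segment the head
    trunk = mask.copy()
    trunk[:neck, :] = False                       # head region set to zero
    ys, xs = np.nonzero(trunk)
    cy, cx = int(ys.mean()), int(xs.mean())       # body centroid
    # zero the centroid row/column and keep the four quadrants
    quads = [(slice(neck, cy), slice(None, cx)),
             (slice(neck, cy), slice(cx + 1, None)),
             (slice(cy + 1, None), slice(None, cx)),
             (slice(cy + 1, None), slice(cx + 1, None))]
    parts = []
    for rs, cs in quads:
        p = np.zeros_like(mask)
        p[rs, cs] = trunk[rs, cs]
        parts.append(p)
    return head, parts
```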
B. Plane fitting and included-angle calculation.
The depth information may reflect changes in distance or angle of the changed region. In the depth image, when the distance or the orientation of the observed object changes, the distribution of the depth points in the three-dimensional space changes accordingly. Through a linear regression method, plane fitting can be carried out on the distribution of the depth points so as to reflect the motion change situation.
A base plane is randomly selected, the region plane is newly fitted by linear regression, and the included angle between the newly fitted plane and the base plane is calculated by formula (8).
θ = arccos( (a · b) / (|a| · |b|) )    (8)
Where θ is the calculated angle, a is the normal vector of the newly fitted plane, and b is the normal vector of the base plane. The change in the angle may reflect the amplitude of the motion of the object under investigation. An example of the course of the angle over time in the head region is shown in fig. 12.
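The fitting and angle computation can be sketched as below. The linear-regression plane fit uses ordinary least squares on z = a·x + b·y + c, which is one standard reading of the text; the exact regression setup in the patent is not specified:

```python
import numpy as np

def fit_plane(depth):
    """Least-squares fit z = a*x + b*y + c to the valid points of a depth
    patch (linear regression); returns the plane normal (a, b, -1)."""
    ys, xs = np.nonzero(depth > 0)               # valid depth points
    A = np.column_stack([xs, ys, np.ones(len(xs))])
    coef, *_ = np.linalg.lstsq(A, depth[ys, xs].astype(float), rcond=None)
    a, b, _c = coef
    return np.array([a, b, -1.0])

def plane_angle(a, b):
    """Formula (8): theta = arccos((a . b) / (|a| |b|)), in degrees,
    for normal vectors a and b."""
    cosang = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
```

For a region that stays still relative to the base plane the angle stays near a constant; when the region moves, the fitted normal swings and the angle trace shows the salient values discussed below.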
As shown in fig. 12, the line is the trajectory of the angles calculated from the head region over the time series. The results show that most angles have similar values, while some angles in the trace are salient. This means that, compared with the base plane, the distribution of the depth information has changed, so the angle between the base plane and the newly fitted plane changes significantly. The intensity of motion in the changed region is reflected by the motion count (the number of times motion occurs).
The motion count is based on the included-angle curve: the motion region is located from abrupt changes of the angle along the time axis, abrupt values are detected against the mean of all points on the curve, and any value above the mean is regarded as an abrupt change. Whether an action has occurred is judged as follows:
1) included-angle values larger than the mean level of the whole angle trace are kept as salient values, and values below the mean level are set to 0;
2) for the salient values: since one action usually produces similar plane angles over several consecutive frames, the trace is searched onward along the time axis from each salient value. When salient values appear in K1 consecutive frames (K1 < 10, K1 a positive integer), they are regarded as the continuation of the same action and counted as one motion; when K2 consecutive frames (K2 >= 10, K2 a positive integer) are 0, the current motion state is considered to have changed, this change is counted as one motion, and the judgment continues onward from the next detected salient value.
For example, for each salient value a search for further salient values is performed along the time axis; when salient values appear in 5 consecutive frames, they are considered consecutive parts of the same action. When the search to the right of a salient value finds values of 0 over more than 10 frames before the next salient value, two movements are recorded, since two abrupt changes of the angle have occurred. The thresholds can of course be adapted to the characteristics of the research object, as long as the basic requirements are met.
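The counting rule above can be sketched as follows. For simplicity this sketch folds the K1 continuation check into "any zero gap shorter than K2 stays within the same action", which matches the worked example but is an assumption about edge cases the text leaves open:

```python
def count_motions(angles, k2=10):
    """Count motions in an included-angle trace.

    Values above the trace mean are kept as salient values (the rest are
    set to 0); nearby salient frames continue one action, and K2 or more
    consecutive zero frames end it.
    """
    mean = sum(angles) / len(angles)
    salient = [a if a > mean else 0.0 for a in angles]
    motions, in_action, zeros = 0, False, 0
    for v in salient:
        if v > 0:
            if not in_action:
                motions += 1        # a new abrupt change starts one motion
                in_action = True
            zeros = 0
        else:
            zeros += 1
            if in_action and zeros >= k2:
                in_action = False   # K2 zero frames: current action ended
    return motions
```

A trace with one salient burst, then more than K2 zero frames, then another burst is counted as two motions, matching the example in the text.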
By this method, the occurrence of motion in each part (head and limbs) can be judged without detecting the posture of each part, which is more efficient and easier to implement.
The invention is not limited to the foregoing embodiments. The invention extends to any novel feature or any novel combination of features disclosed in this specification and any novel method or process steps or any novel combination of features disclosed.

Claims (10)

1. A depth image denoising method for denoising a noise region of a depth image, wherein the noise region is divided into a plurality of blocks of equal size, the depth image denoising method comprising the following steps:
the following calculations A-B are performed for each block, respectively:
A. respectively determining a first characteristic point and a second characteristic point in the block based on the base plane, wherein the first characteristic point is a pixel point with a pixel value below the base plane, and the second characteristic point is a pixel point with a pixel value above the base plane;
B. calculating a first parameter from the point weight of each second feature point and the minimum position match corresponding to each second feature point;
C. taking the first parameter of each block and the attribute parameters of the first feature points and second feature points as the feature set of each block, and dividing the blocks into three categories by a clustering method: a first class corresponding to a low gray-scale range, a second class carrying abrupt values, and a third class having a relatively flat surface; the first-class blocks are flattened, the third-class blocks are retained, and each second-class block reached by extending the third-class blocks is retained or flattened according to the gray-scale difference between its second feature points and first feature points.
2. The method for denoising the depth image according to claim 1, wherein in step a, the value of the base plane is a mean value of gray values of pixels in a block.
3. The method for denoising a depth image according to claim 1, wherein in step B, the method for calculating the point weight of the second feature point comprises:
describing the pixels of the first feature points and the second feature points in the block by logic values 1 and 0 respectively, and calculating the corresponding point weight over an N×N neighborhood around the second feature point by the following formula:
w = Σ_{i=1}^{N} Σ_{j=1}^{N} a_ij

where w is the point weight of the second feature point, a_ij is the pixel logic value at position (i, j) in the N×N neighborhood of the second feature point, and N is an integer greater than or equal to 3.
4. The method for denoising a depth image according to claim 1, wherein in the step B, the method for calculating the minimum position matching corresponding to the second feature point comprises:
P_M = min{ |m − x_i| + |n − y_i| | i = 1, 2, …, N_o }

where P_M is the calculated minimum position match; (m, n) represents the coordinate position of a second feature point in the block; (x_i, y_i) represents the coordinates of the second feature points other than the (m, n) point; min is the minimum-value function; and N_o is the number of second feature points other than the current second feature point.
5. The method for denoising the depth image according to claim 1, wherein in the step B, the first parameter is calculated by: computing the product of the point weight and the minimum position match corresponding to each second feature point, and then taking the mean of all the products.
6. The method for denoising the depth image according to claim 1, wherein in the step C, the attribute parameters of the first feature points and the second feature points are, respectively, the gray-level mean values of the first feature points and the second feature points.
7. A foreground segmentation method for segmenting a foreground from an ROI with depth information, wherein the ROI is denoised by applying the depth image denoising method according to any one of claims 1-6; the foreground segmentation method comprises the following steps: and according to the distribution of the pixels of the ROI on the gray level, respectively extracting the main body contour of the ROI by using a plurality of levels of contour lines, then combining the contours extracted by the contour lines of each level, and segmenting the ROI by using the combined contours to obtain the foreground.
8. The foreground segmentation method of claim 7 wherein the method of segmenting the ROI using the merged contour comprises:
filling the merged contour: filling logic 1 in the area between the edges of the outline, and filling logic 0 in the area beyond the edges to obtain a filled image;
multiplying the ROI with the filler image.
9. A human motion monitoring method is used for analyzing a human foreground of a human image with depth information, and is characterized in that the human foreground in the human image is obtained by segmentation according to the foreground segmentation method of claim 7 or 8; the human motion monitoring method comprises the following steps:
A. a step of body part segmentation;
B. performing plane fitting on each divided body part and calculating an included angle between a fitting plane and a basic plane;
C. and a step of locating a variation value of the angle on the time axis to record as the effective motion.
10. The human motion monitoring method of claim 9, wherein the step C comprises:
C1. computing the included-angle trace along the time axis;
C2. keeping the included-angle values larger than the mean of the angles as salient values, and setting the other angle values to 0;
C3. recording the effective motion according to the following rules:
when the salient value appears in the continuous K1 frame after one salient value, the action is regarded as the continuation of one action;
when the included-angle values of more than K2 frames after a salient value are all 0, the motion is recorded as finished; K1 and K2 are positive integers, and K2 is greater than K1.
CN202010894752.2A 2020-08-31 2020-08-31 Depth image denoising method, foreground segmentation method and human motion monitoring method Active CN112085675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010894752.2A CN112085675B (en) 2020-08-31 2020-08-31 Depth image denoising method, foreground segmentation method and human motion monitoring method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010894752.2A CN112085675B (en) 2020-08-31 2020-08-31 Depth image denoising method, foreground segmentation method and human motion monitoring method

Publications (2)

Publication Number Publication Date
CN112085675A true CN112085675A (en) 2020-12-15
CN112085675B CN112085675B (en) 2023-07-04

Family

ID=73731231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010894752.2A Active CN112085675B (en) 2020-08-31 2020-08-31 Depth image denoising method, foreground segmentation method and human motion monitoring method

Country Status (1)

Country Link
CN (1) CN112085675B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132803A1 (en) * 2015-05-11 2017-05-11 Boe Technology Group Co., Ltd. Apparatus and method for processing a depth image
CN111340824A (en) * 2020-02-26 2020-06-26 青海民族大学 Image feature segmentation method based on data mining
CN111539895A (en) * 2020-04-30 2020-08-14 广州市百果园信息技术有限公司 Video denoising method and device, mobile terminal and storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
POLIN LAI et al.: "Depth map processing with iterative joint multilateral filtering" *
YATONG XU et al.: "Spatial-temporal depth de-noising for Kinect based on texture edge-assisted depth classification" *
嵇晓强 (JI Xiaoqiang): "Research on fast image dehazing and clarity restoration technology", pages 138-62 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112887605A (en) * 2021-01-26 2021-06-01 维沃移动通信有限公司 Image anti-shake method and device and electronic equipment
CN112887605B (en) * 2021-01-26 2022-09-30 维沃移动通信有限公司 Image anti-shake method and device and electronic equipment
CN112907569A (en) * 2021-03-24 2021-06-04 北京房江湖科技有限公司 Head image area segmentation method and device, electronic equipment and storage medium
CN112907569B (en) * 2021-03-24 2024-03-15 贝壳找房(北京)科技有限公司 Head image region segmentation method, device, electronic equipment and storage medium
CN115861357A (en) * 2023-02-27 2023-03-28 常州微亿智造科技有限公司 Workpiece transition edge detection method based on K-means clustering and point location planning
CN115861357B (en) * 2023-02-27 2023-06-20 常州微亿智造科技有限公司 Workpiece transition edge detection method based on K-means clustering and point location planning
CN117393116A (en) * 2023-12-12 2024-01-12 中国人民解放军空军军医大学 Medical image data transmission system and method of portable DR equipment
CN117393116B (en) * 2023-12-12 2024-03-15 中国人民解放军空军军医大学 Medical image data transmission system and method of portable DR equipment

Also Published As

Publication number Publication date
CN112085675B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
CN112085675B (en) Depth image denoising method, foreground segmentation method and human motion monitoring method
CN110866924B (en) Line structured light center line extraction method and storage medium
CN109146948B (en) Crop growth phenotype parameter quantification and yield correlation analysis method based on vision
CN104008553B (en) Crack detection method with image gradient information and watershed method conflated
CN104299260B (en) Contact network three-dimensional reconstruction method based on SIFT and LBP point cloud registration
CN102722890B (en) Non-rigid heart image grading and registering method based on optical flow field model
CN109658515A (en) Point cloud gridding method, device, equipment and computer storage medium
CN110335234B (en) Three-dimensional change detection method based on antique LiDAR point cloud
CN107481287A (en) It is a kind of based on the object positioning and orientation method and system identified more
CN104715238A (en) Pedestrian detection method based on multi-feature fusion
CN110232389A (en) A kind of stereoscopic vision air navigation aid based on green crop feature extraction invariance
CN110148217A (en) A kind of real-time three-dimensional method for reconstructing, device and equipment
CN106991686B (en) A kind of level set contour tracing method based on super-pixel optical flow field
Cheng et al. Building boundary extraction from high resolution imagery and lidar data
CN113066064A (en) Cone beam CT image biological structure identification and three-dimensional reconstruction system based on artificial intelligence
CN116358449A (en) Aircraft rivet concave-convex amount measuring method based on binocular surface structured light
CN105488798B (en) SAR image method for measuring similarity based on point set contrast
CN111259788A (en) Method and device for detecting head and neck inflection point and computer equipment
CN106778491A (en) The acquisition methods and equipment of face 3D characteristic informations
CN108447038A (en) A kind of mesh denoising method based on non local full variation operator
Niessen et al. Error metrics for quantitative evaluation of medical image segmentation
CN106127147B (en) A kind of face depth texture restorative procedure based on three-dimensional data
Omidalizarandi et al. Segmentation and classification of point clouds from dense aerial image matching
CN116778559A (en) Face wrinkle three-dimensional evaluation method and system based on Gaussian process and random transformation
CN110717471B (en) B-ultrasonic image target detection method based on support vector machine model and B-ultrasonic scanner

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant