CN111652317A - Hyper-parameter image segmentation method based on Bayesian deep learning - Google Patents


Info

Publication number
CN111652317A
CN111652317A (application CN202010501892.9A)
Authority
CN
China
Prior art keywords: target, image, Bayesian, hyper, edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010501892.9A
Other languages
Chinese (zh)
Other versions
CN111652317B (en)
Inventor
齐仁龙
张庆辉
杨绪华
朱小会
李大海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Science and Technology
Original Assignee
Zhengzhou University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University of Science and Technology filed Critical Zhengzhou University of Science and Technology
Priority to CN202010501892.9A
Publication of CN111652317A
Application granted
Publication of CN111652317B
Legal status: Active

Classifications

    • G06F 18/24155 — Pattern recognition; classification techniques based on probabilistic models; Bayesian classification
    • G06F 18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/267 — Image preprocessing; segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 10/44 — Extraction of image or video features; local feature extraction by analysis of parts of the pattern, e.g. edges, contours, corners; connectivity analysis
    • Y02T 10/40 — Climate change mitigation technologies related to transportation; engine management systems


Abstract

The invention discloses a hyper-parameter image segmentation method based on Bayesian deep learning, which aims to solve the technical problems of heavy hyper-parameter extraction computation and low precision in conventional image segmentation. The method comprises the steps of: selecting an image training set; applying a Gaussian process to the image information; preprocessing the data set with an L2 regularization operator to obtain image contour edge features; constructing a target edge feature recognition training set; classifying the data set according to the Bayes theorem; setting image target edge segmentation labels based on semantic recognition; further extracting the target edge feature data set with the Gaussian process; and calculating the Gaussian hyper-parameter set of the target set edge features. The invention has the beneficial effects that the efficiency and the precision of target recognition are improved.

Description

Hyper-parameter image segmentation method based on Bayesian deep learning
Technical Field
The invention relates to the technical field of machine learning, and in particular to a hyper-parameter image segmentation method based on Bayesian deep learning.
Background
In the field of computer vision, image segmentation refers to the task of assigning a label to each pixel in an image, which can also be regarded as pixel-level classification. Unlike target detection with rectangular candidate boxes, image segmentation must be accurate to pixel-level positions, so it plays a very important role in tasks such as medical analysis, satellite image object detection, iris recognition, and autonomous driving.
Humans distinguish targets largely by experience, whereas deep learning identifies targets by constructing a convolutional neural network and extracting target features through training. The result of a traditional target recognition method is usually limited to objects of predefined categories, such as faces or vehicles. An image, however, contains far more than a set of mutually independent objects: it also carries information about many objects and their attributes, spatial relationships, and logical relationships, which cannot be described by class labels alone and instead require natural-language description. No single mathematical model satisfies all target recognition tasks, so many condition-specific classifiers have emerged, and the efficiency of cross-domain fusion recognition in deep learning remains low.
Pixel-level image segmentation is a research hotspot in artificial intelligence and a mathematical modeling problem spanning image processing, pattern recognition, visual perception, and psychological cognition. Through long-term evolution and learning, humans identify targets from experience with ease, but for a machine to automatically identify targets against a complex background requires complex mathematical modeling, so the choice of recognition model and the optimization of hyper-parameters are crucial. Deep learning hyper-parameters are hard to select and follow no regular pattern; different hyper-parameters interact unpredictably; tuning is very time-consuming; and evaluating each hyper-parameter combination requires a large amount of iterative computation. For such problems, classical optimization algorithms such as particle swarm optimization, simulated annealing, and local search are no longer applicable. Some researchers have proposed surrogate models that reduce evaluation cost by approximating the value of the objective function. However, whether the Monte Carlo algorithm or the adaptive simulation algorithms proposed in the reinforcement learning field are adopted, the learning process remains time-consuming and is limited to certain conditions or fields; precision is hard to guarantee and cross-domain fusion is hard to achieve.
Disclosure of Invention
The invention provides a hyper-parameter image segmentation method based on Bayesian deep learning, which aims to solve the technical problems of heavy hyper-parameter extraction computation and low precision in conventional image segmentation.
In order to solve the technical problems, the invention adopts the following technical scheme:
a hyper-parameter image segmentation method based on Bayesian deep learning is designed, comprising the following steps:
step 1: data preprocessing, namely regularizing data elements in the image to generate an image segmentation class data set;
step 2: extracting target edge characteristics of the image by using a Gaussian mask;
step 3: extracting a bounding box and a target mask of the image from the target edge features by using Bayesian estimation;
step 4: putting the bounding box and the target mask into a feature dictionary for comparison, thereby obtaining the category of each target in the image;
the construction method of the feature dictionary comprises the following steps:
(1) establishing a training set and a testing set of images;
(2) performing the operations of steps 1-3 on each image in the training set to obtain the bounding box and target mask of each image;
(3) collecting the bounding boxes and target masks from step (2) to obtain a feature dictionary formed by the bounding boxes and target masks;
(4) and (4) inputting the test set into the feature dictionary in the step (3), checking the accuracy of the obtained feature dictionary, and if the accuracy does not meet the requirement, adjusting the parameters of the model for retraining until the accuracy of the feature dictionary meets the requirement.
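As a sketch, the retraining loop of steps (1)-(4) can be written as a simple Python driver; the `extract`, `accuracy`, and `retrain` callables are hypothetical stand-ins for the Gaussian/Bayesian stages of steps 1-3 and the model-parameter adjustment, not part of the patent text:

```python
def build_feature_dictionary(train_set, test_set, extract, accuracy, retrain, target=0.9):
    """Steps (1)-(4): extract a (bounding box, mask) pair for every training
    image, pool them into the feature dictionary, and retrain until the
    test-set accuracy meets the target."""
    while True:
        dictionary = [extract(img) for img in train_set]
        if accuracy(dictionary, test_set) >= target:
            return dictionary
        retrain()  # adjust model parameters and try again

# toy usage: accuracy improves by 0.25 per retraining round
state = {"acc": 0.5}
dictionary = build_feature_dictionary(
    train_set=[1, 2, 3], test_set=[4],
    extract=lambda img: ("box", "mask"),
    accuracy=lambda d, t: state["acc"],
    retrain=lambda: state.update(acc=state["acc"] + 0.25),
)
```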
Further, in step 1, the image segmentation class data set comprises N object segmentation class attributes and M data attributes for each object class; when the probability of the N class attributes and of the M data attributes is maximal, a Bayesian classification matcher is adopted to select an object and segment the image. The software used is Python and the framework is TensorFlow.
Further, in step 2, the specific steps of extracting the target edge feature are as follows:
The first step: assuming that the edge probability of the image pixel f(x, y) satisfies a Gaussian distribution, the two-dimensional Gaussian function is:
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
The second step: the gradient functions are calculated for the x and y directions of the image:
Gx(x, y) = ∂G/∂x = −(x / σ²) · G(x, y)
Gy(x, y) = ∂G/∂y = −(y / σ²) · G(x, y)
The third step: the image data set is convolved with the masks:
fx(x, y) = f(x, y) * Gx(x, y)
fy(x, y) = f(x, y) * Gy(x, y)
The fourth step: the probability density distribution of the image target edge, i.e. the target edge feature, is calculated:
M(x, y) = √(fx(x, y)² + fy(x, y)²)
θ(x, y) = arctan(fy(x, y) / fx(x, y))
Further, in step 2, before extracting the object class label, the prior probability of the object appearing is calculated:
Pr(C_i) = N_i / N
where C_i is any element of the object class set {C_1, C_2, C_3, ..., C_n}, N_i represents the number of occurrences of the target, and N represents the total size of the target set.
Further, in step 2, in the process of extracting the object class label, the conditional probability of the object occurrence is calculated:
P(x_a) = (1 / (√(2π)·σ)) · exp(−(x_a − μ)² / (2σ²))
P(y_a) = (1 / (√(2π)·σ)) · exp(−(y_a − μ)² / (2σ²))
where x_a is the abscissa and y_a the ordinate of the target point, and P(x_a), P(y_a) represent the target edge feature probabilities.
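Under the assumption above that the coordinate probabilities are Gaussian, the prior and conditional probabilities can be sketched in a few lines of Python; the class labels and (μ, σ) values are hypothetical toy data, not values from the patent:

```python
import math
from collections import Counter

# hypothetical toy data: one class label per detected target
labels = ["car", "car", "face", "car", "face"]

def priors(labels):
    """Pr(C_i) = N_i / N: frequency of each class in the target set."""
    n = len(labels)
    return {c: k / n for c, k in Counter(labels).items()}

def gaussian_cond(v, mu, sigma):
    """P(coordinate) under the assumed Gaussian edge-feature density."""
    return math.exp(-(v - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

pr = priors(labels)                           # e.g. {'car': 0.6, 'face': 0.4}
p_xa = gaussian_cond(2.0, mu=2.0, sigma=1.0)  # density at the mean
```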
Further, in step 3, the specific steps of extracting the bounding box of the image and the target mask are as follows:
(1) obtaining a target area of the image and the classification weight of each pixel in the area by learning the extracted target boundary characteristics;
(2) after obtaining the target area of the image, combining the internal and external feature maps of each target area into two complete feature maps, and then synchronously performing image segmentation and image classification on two branch data sets D1 and D2;
(3) in image segmentation, classifying the internal and external feature maps of a target region by using a Bayesian classifier so as to distinguish the foreground and the background in an image and generate a mask;
(4) in image classification, the maximum value is taken according to pixel probability distribution in the two types of feature maps to obtain a new feature map, and then the class of objects in the target area is obtained by using a maximum likelihood estimation classifier.
Further, in step 4, the method of comparing the bounding box and the target mask with the feature dictionary is as follows: first, the similarity weights between the bounding box, the target mask, and the feature dictionary are calculated with the L2 regularization operator; then a similarity Gaussian process is carried out and the target edge feature data set is extracted; finally, the semantic segmentation result is obtained through Bayesian classification matching.
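A minimal sketch of this comparison step, assuming the feature dictionary stores one flattened mask vector per category and the L2 similarity weight is the squared difference; the dictionary entries here are hypothetical:

```python
import numpy as np

def l2_match(mask, dictionary):
    """Return the dictionary category whose entry has the smallest
    L2 (squared-difference) weight against the query mask."""
    def weight(entry):
        return float(np.sum((mask - entry) ** 2))
    return min(dictionary, key=lambda c: weight(dictionary[c]))

# hypothetical flattened masks for two categories
dictionary = {"circle": np.array([0.0, 1.0, 1.0, 0.0]),
              "bar":    np.array([1.0, 1.0, 0.0, 0.0])}
best = l2_match(np.array([0.0, 1.0, 0.9, 0.1]), dictionary)
```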
Further, before the semantic segmentation result is output, the edge Gaussian hyper-parameter function is calculated, and the target matching degree is then computed from the score; the better the hyper-parameter set, the higher the precision score of the resulting semantic segmentation.
Further, the training set of images includes the Open Images V4 test set, which contains 1.9 million pictures and 15.4 million bounding boxes covering 600 categories.
Compared with the prior art, the invention has the beneficial technical effects that:
the method mainly comprises the steps of utilizing a Bayesian formula, a python language and a tensoflow frame, carrying out pixel preprocessing on an image according to the principle that after an image edge feature Gaussian process, the edge feature projection is steep, carrying out the Gaussian process on the whole image, obtaining an edge feature data set, preprocessing the data set by utilizing an L2 criterion, utilizing a Bayesian estimation model for image target edge feature recognition aiming at the image target segmentation requirement based on semantic recognition, constructing an image target edge feature data dictionary based on semantic recognition by combining deep learning, applying the trained model to a complex target recognition system, and improving the efficiency and the precision of target recognition.
Drawings
FIG. 1 is a flow chart of a hyper-parametric image segmentation method based on Bayesian deep learning according to the present invention.
Detailed Description
The following examples are intended to illustrate the present invention in detail and should not be construed as limiting the scope of the present invention in any way.
The programs referred to or relied on in the following embodiments are all conventional programs or simple programs in the art, and those skilled in the art can make routine selection or adaptation according to specific application scenarios.
Example 1: a hyper-parameter image segmentation method based on Bayesian deep learning. Referring to FIG. 1, the overall steps are as follows: select an image training set; apply a Gaussian process to the image information; preprocess the data set with the L2 regularization operator to obtain image contour edge features; construct a target edge feature recognition training set; classify the data set according to the Bayes theorem; set image target edge segmentation labels based on semantic recognition; further extract the target edge feature data set with the Gaussian process; calculate the Gaussian hyper-parameter set of the target set edge features; compute the target edge posterior probability from the data set, taking the maximum posterior probability as the semantic-based target image segmentation and recognition probability; and match the Bayesian score. If the score is higher than 90, the recognition is judged correct; otherwise, deep learning and the 0.618 coefficient are used to adjust the Gaussian hyper-parameters and the Gaussian process training is run again until the optimal hyper-parameters are obtained. The final result of the invention is: an image and a target label are input into the model for image target segmentation, and the target can be separated from the corresponding background. That is, for a trained model, given an image and the target information to be queried, the corresponding target can be detected in the image.
The method for manufacturing the feature dictionary comprises the following steps:
(1) data pre-processing
The data elements in the image are regularized to generate an image segmentation class data set. The data set comprises N object segmentation class attributes and M data attributes representing each object class; when the probability of the N class attributes and of the M data attributes is maximal, the model segments the image along the object edge contour. The implementation uses the Python language under the TensorFlow artificial intelligence framework.
(2) Extracting target edge features
The features of the target edge hyper-parameters are mainly extracted through a Gaussian transformation; the main extracted features are:
1. the gray level of the edge pixel of the target image jumps;
2. the pixel boundary between different materials, textures, colors and brightness of the target generates jump;
3. the target contour line and the background have different reflection characteristics and can also form pixel value jump;
4. the object is illuminated to form shadows, which also form inter-pixel gray value transitions.
These attributes are calculated with a Gaussian mask edge feature extraction algorithm applied to the image f(x, y), characterized in that: according to the characteristics of image target edge data jumps and large probability density changes, a suitable Gaussian mask is selected, and the positions of high probability density in the data set — the edge pixel points — are obtained by computing the maximum value within each pixel neighborhood.
The specific process is as follows:
The first step: assuming that the edge probability of an image pixel f(x, y) satisfies a Gaussian distribution, the two-dimensional Gaussian function is:
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
The second step: the gradient functions for the x and y directions:
Gx(x, y) = ∂G/∂x = −(x / σ²) · G(x, y)
Gy(x, y) = ∂G/∂y = −(y / σ²) · G(x, y)
The third step: the image data set is convolved with the masks:
fx(x, y) = f(x, y) * Gx(x, y)
fy(x, y) = f(x, y) * Gy(x, y)
The fourth step: the probability density distribution of the image target edge, i.e. the target edge feature, is calculated:
M(x, y) = √(fx(x, y)² + fy(x, y)²)
θ(x, y) = arctan(fy(x, y) / fx(x, y))
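The four steps above can be sketched in Python with NumPy and SciPy; the kernel radius and σ below are illustrative choices, not values from the patent:

```python
import numpy as np
from scipy.ndimage import convolve

def gaussian_edge_features(image, sigma=1.0, radius=3):
    """Build the 2-D Gaussian G and its x/y derivative masks Gx, Gy,
    convolve them with the image (steps 1-3), and return the edge
    magnitude sqrt(fx^2 + fy^2) (step 4)."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    gx = -xx / sigma**2 * g                   # dG/dx mask
    gy = -yy / sigma**2 * g                   # dG/dy mask
    fx = convolve(image.astype(float), gx)    # f * Gx
    fy = convolve(image.astype(float), gy)    # f * Gy
    return np.hypot(fx, fy)                   # edge magnitude

# a vertical step edge yields a ridge of high magnitude at the boundary
img = np.zeros((16, 16))
img[:, 8:] = 1.0
edges = gaussian_edge_features(img)
```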
the method for calculating the Gaussian edge hyper-parameter feature extraction module is essentially to calculate a Gaussian kernel function, and then the Gaussian kernel function is used as a mask to be convolved with a target pixel, so that the target edge feature can be extracted, and the process is also a hyper-parameter estimation process.
After the edge features of the image pixels are obtained, the feature distribution parameters are calculated with a Gaussian process; the minimum (optimal) loss of the Gaussian feature function is then computed with the L2 regularization operator, and a penalty term is added to the algorithm to prevent the model from over-fitting. The L2 regularization operator is:
Loss = E_in + (λ / N) · Σ_j w_j²
where Loss is the loss function, E_in is the training sample error without the regularization term, and λ is the regularization parameter (penalty coefficient). To further constrain the model, the rule function is defined as:
Σ_j w_j² ≤ C
i.e. the sum of the squares of all weights w does not exceed the parameter C (threshold); this ensures that minimizing the training sample error E_in also minimizes the loss function value.
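A minimal numeric sketch of the regularized loss; treating E_in as a mean-squared training error is an assumption for illustration:

```python
import numpy as np

def l2_regularized_loss(errors, w, lam):
    """Loss = E_in + (lam / N) * sum(w_j^2): training error plus the
    L2 penalty that keeps the weight norm below the threshold C."""
    e_in = float(np.mean(np.square(errors)))           # assumed E_in
    return e_in + lam / len(w) * float(np.sum(np.square(w)))

loss_small = l2_regularized_loss([0.1, -0.1], np.array([0.5, 0.5]), lam=0.1)
loss_big   = l2_regularized_loss([0.1, -0.1], np.array([5.0, 5.0]), lam=0.1)
# larger weights incur a larger penalty, hence a larger loss
```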
(3) Extracting bounding boxes and target masks of an image
A Bayesian estimation algorithm is mainly adopted. The Bayesian estimation model recognizes the objective world with a prior-probability recognition algorithm. Its basic idea is that, on the basis of a large number of accumulated samples, the maximum probability estimate of a recognized object is computed from the prior probability and the conditional probability; the maximum probability is the recognition result. As the analyzed sample approaches the population, the probability of an event in the sample approaches its probability in the population, so minimum-error prediction can be achieved. The core of Bayesian estimation is hyper-parameter selection. To improve the local dominance involved in the image segmentation process, a hyper-parameter image data classification model based on a Gaussian distribution is constructed and combined with deep learning; the 0.618 estimation method (golden section method) is adopted to select hyper-parameters, effectively balancing hyper-parameter errors against weights, and realizing an efficient pixel segmentation method based on semantic understanding.
The Bayes classifier is a classification method based on statistical theory. For a sample set C containing M class samples, C = {C_1, C_2, C_3, ..., C_n}, the classifier first computes, for the N-dimensional feature vector X = [x_1, x_2, ..., x_n], the maximum likelihood estimate of the label for each category, ranks them, and takes the maximum value to obtain the category label C_i to which x belongs. The Bayesian equations are as follows:
Pr(c_i | x) = Pr(x | c_i) · Pr(c_i) / Pr(x)
Pr(x) = Σ_j Pr(x | c_j) · Pr(c_j)
Pr(x | c_i) = P(x_a) · P(y_a)
where Pr(c_i | x) is the posterior probability, Pr(x | c_i) the conditional probability, Pr(c_i) the prior probability, and P(x_a), P(y_a) represent the target edge feature probabilities. The classification problem thus reduces to finding the class C_i that maximizes the attribute probability of x:
C_i = argmax Pr(x | c_i) · Pr(c_i)    (11)
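Equation (11) can be sketched as a tiny naive-Bayes classifier with per-class Gaussian likelihoods; the class parameters (μ, σ, prior) here are hypothetical:

```python
import math

def classify(x, classes):
    """C_i = argmax_c Pr(x | c) * Pr(c), with a Gaussian likelihood
    per class; `classes` maps a label to (mu, sigma, prior)."""
    def score(c):
        mu, sigma, prior = classes[c]
        lik = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
        return lik * prior
    return max(classes, key=score)

# hypothetical foreground/background model for one edge-feature value
classes = {"foreground": (1.0, 0.5, 0.4), "background": (0.0, 0.5, 0.6)}
label = classify(0.9, classes)
```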
Experiments show that the naive Bayes classifier is considerably more accurate than other classifiers.
Owing to differences in image gray levels, obvious protruding edges generally exist at image boundaries, and this characteristic can be used to segment the image. The kernel operator is a filter with Gaussian hyper-parameters; it denoises and smooths the image and strengthens its edge feature attributes. The calculation proceeds in four steps: first, smooth the image with a Gaussian filter; second, calculate the magnitude and direction of the gradient with finite differences of the first-order partial derivatives; third, apply non-maximum suppression to the gradient magnitude; fourth, detect and connect edges with a double-threshold algorithm.
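The fourth step (double-threshold edge linking) can be sketched as follows; this simplified version connects weak edges only to directly adjacent strong edges, whereas a full implementation tracks connectivity transitively:

```python
import numpy as np

def double_threshold(magnitude, low, high):
    """Mark strong edges (> high) and keep weak edges (> low) only when
    they touch a strong edge in the 4-neighbourhood."""
    strong = magnitude > high
    weak = (magnitude > low) & ~strong
    connected = np.zeros_like(strong)
    connected[1:, :] |= strong[:-1, :]   # strong neighbour above
    connected[:-1, :] |= strong[1:, :]   # strong neighbour below
    connected[:, 1:] |= strong[:, :-1]   # strong neighbour to the left
    connected[:, :-1] |= strong[:, 1:]   # strong neighbour to the right
    return strong | (weak & connected)

m = np.array([[0.2, 0.6, 0.95],
              [0.1, 0.55, 0.2],
              [0.0, 0.0, 0.0]])
linked = double_threshold(m, low=0.5, high=0.9)
```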
The image segmentation process is as follows:
(1) obtaining a target area of the image and the classification weight of each pixel in the area by learning the extracted target boundary characteristics;
(2) after obtaining the target area of the image, combining the internal and external feature maps of each target area into two complete feature maps, and then synchronously performing image segmentation and image classification on two branch data sets D1 and D2;
(3) in image segmentation, classifying the internal and external feature maps of a target region by using a Bayesian classifier so as to distinguish the foreground and the background in an image and generate a mask;
(4) in image classification, the maximum value is taken according to pixel probability distribution in the two types of feature maps to obtain a new feature map, and then the class of objects in the target area is obtained by using a maximum likelihood estimation classifier.
After image segmentation is completed, an image target contour edge data set needs to be established. Specifically, the image target edge segmentation task comprises five basic subtasks: target candidate generation, candidate target edge feature extraction, candidate target Bayes classification, candidate target hyper-parameter correction, and candidate target edge feature dictionary construction. The candidate target data sets include the Open Images V4 test set, which contains 1.9 million pictures and 15.4 million bounding boxes covering 600 categories. The image data are preprocessed with the pixel L2 regularization operator to form an image classification set and classified image pixel subsets; the target edge feature kernel mask is calculated with a multi-dimensional Gaussian distribution probability model; the target edge feature data set is acquired by convolution; and the prior probability of the target edge feature classification set is obtained through Bayesian evidence learning.
(4) Implementation procedure
The device mainly comprises three parts: (1) image target pixel preprocessing: an L2 regularization algorithm sets a threshold function to generate the initial point set (X, Y) = {(x1, y1), (x2, y2), ..., (xt, yt)}; (2) a Gaussian kernel model constructs the data set D = {(x1, y1), ..., (xt, yt)}; (3) Bayesian maximum a posteriori probability estimation. The specific operations are as follows:
A. setting labels according to target classification;
B. counting image target classification;
C. calculating the prior probability of each type of target: Pr(C_i);
D. According to the target classification, selecting all data in a corresponding data set D, constructing a Gaussian process model, and extracting target edge points Xi and Yi sets;
E. calculating the Gaussian distribution hyper-parameter function (μ_i, σ_i);
F. using the acquisition function u(x | D) built from (μ_i, σ_i), calculating the next evaluation point x_i (i = 1 ... t) as x_i = argmax u(x | D), and computing the response y_i;
G. adding the new data point to the set D: D ← D ∪ {(x_i, y_i)}, i ← i + 1;
H. calculating the posterior probability using the Bayesian posterior estimation formula:
Pr(C_i | X) = Pr(X | C_i) · Pr(C_i) / Σ_j Pr(X | C_j) · Pr(C_j)
I. calculating the posterior probability distribution: C = argmax(Pr(C_1 | X), Pr(C_2 | X), ..., Pr(C_t | X));
J. checking the calculation results: the calculated posterior probability is matched against the target edge features; the higher the matching score, the closer the recognition is to the actual target. The model sets the passing score at above 90;
K. correcting: if the score is less than 90, deep learning is adopted to adjust the Gaussian hyper-parameters μ_i, σ_i with a correction coefficient of 0.618;
L. placing the target edge probability parameters whose scores are greater than 90 into the data dictionary to form the hyper-parameter data dictionary set.
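Steps E-K can be sketched as a correction loop in Python; `match_score` is a hypothetical stand-in for the posterior matching of steps H-J, and the 0.618 coefficient is the golden-section correction named in step K:

```python
import math

def match_score(mu, sigma):
    """Hypothetical matching score in [0, 100]: peaks at the (unknown)
    optimal hyper-parameters, here assumed to be mu=3.0, sigma=1.0."""
    return 100.0 * math.exp(-abs(mu - 3.0)) * math.exp(-abs(sigma - 1.0))

def tune_hyperparams(mu, sigma, threshold=90.0, max_iter=50):
    """Repeat the 0.618 correction of (mu, sigma) until the matching
    score clears the passing threshold (step K)."""
    for _ in range(max_iter):
        if match_score(mu, sigma) > threshold:
            break
        mu += 0.618 * (3.0 - mu)        # golden-section step toward optimum
        sigma += 0.618 * (1.0 - sigma)
    return mu, sigma

mu, sigma = tune_hyperparams(mu=0.0, sigma=3.0)
```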
(5) Training and testing
The 1.9 million pictures in the Open Images V4 detection set are divided into a training set and a test set; the model is trained with supervised learning, and the test set is input to verify whether the trained model meets the requirements. At this point the feature dictionary is complete.
(6) Image detection
The image to be detected is input into the detection model, its bounding box and target mask are extracted, and these are then compared against the feature dictionary to obtain the category of each target in the image. The method is as follows: first, the similarity weights between the bounding box, the target mask, and the feature dictionary are calculated with the L2 regularization operator; then the target edge feature data set is extracted with the similarity Gaussian process, and the semantic segmentation result is obtained through Bayesian classification matching. Before the semantic segmentation result is output, the edge Gaussian hyper-parameter function is calculated and the target matching degree is computed from the score; the better the hyper-parameter set, the higher the precision score of the resulting semantic segmentation.
Through the steps, the image segmentation method based on the Bayes deep learning hyper-parameter optimization can be completed, and the target edge segmentation hyper-parameter dictionary is obtained on the basis of the training of the image set. With the hyper-parameter dictionary, the recognition and the segmentation of the target can be realized, namely: after training, the system can realize the separation and recognition of the target and the background through model calculation in the environment of inputting images and voice. The method can effectively overcome the defects of long time consumption, large performance fluctuation, large occupied resources and the like of the traditional deep learning optimization algorithm, and the model can be applied to plug-ins in image semantic recognition software of the smart phone after being trained.
While the present invention has been described in detail with reference to the drawings and the embodiments, those skilled in the art will understand that various specific parameters in the above embodiments can be changed without departing from the spirit of the present invention, forming a plurality of specific embodiments within the common variation range of the invention, which are not described in detail here.

Claims (9)

1. A hyper-parametric image segmentation method based on Bayesian deep learning is characterized by comprising the following steps:
step 1: data preprocessing, namely regularizing data elements in the image to generate an image segmentation class data set;
step 2: extracting target edge characteristics of the image by using a Gaussian mask;
step 3: extracting a bounding box and a target mask of the image from the target edge features by using Bayesian estimation;
step 4: comparing the bounding box and the target mask against a feature dictionary, so as to obtain the category of each target in the image;
the construction method of the feature dictionary comprises the following steps:
(1) establishing a training set and a testing set of images;
(2) performing the operation in the step 1-3 on each image in the training set to obtain a boundary frame and a target mask of the image;
(3) collecting the boundary frame and the target mask in the step (2) to obtain a feature dictionary formed by the boundary frame and the target mask;
(4) inputting the test set into the feature dictionary obtained in step (3), checking the accuracy of the feature dictionary, and, if the accuracy does not meet the requirement, adjusting the model parameters and retraining until the accuracy of the feature dictionary meets the requirement.
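The dictionary-construction loop of steps (1)-(4) can be sketched as follows. This is a control-flow illustration only: `extract_features` stands in for steps 1-3 (label, bounding box and mask per image), `evaluate` stands in for the accuracy check, and the `sigma` hyper-parameter and its adjustment rule are hypothetical.

```python
def build_feature_dictionary(train_images, test_images, extract_features,
                             evaluate, target_accuracy=0.9, max_rounds=20):
    """Build the feature dictionary from the training set and retrain,
    adjusting a model parameter, until test-set accuracy suffices."""
    params = {"sigma": 1.0}  # hypothetical model hyper-parameter
    for _ in range(max_rounds):
        dictionary = {}
        for image in train_images:  # steps (1)-(3): collect boxes and masks
            label, bbox, mask = extract_features(image, params)
            dictionary.setdefault(label, []).append((bbox, mask))
        if evaluate(dictionary, test_images, params) >= target_accuracy:
            break  # step (4): accuracy requirement met
        params["sigma"] *= 0.9  # otherwise adjust parameters and retrain
    return dictionary, params

def extract(img, p):
    # Hypothetical stand-in: one label, box and mask per image.
    return "object", (0, 0, 4, 4), None

def score(d, test, p):
    # Hypothetical accuracy check: improves once sigma has been adjusted.
    return 0.95 if p["sigma"] < 1.0 else 0.5

dictionary, params = build_feature_dictionary([1, 2, 3], [4], extract, score)
```

Here the first round fails the accuracy check, the parameter is adjusted, and the second round's dictionary is accepted.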
2. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 1, wherein in step 1, the image segmentation class data set comprises N object segmentation class attributes and M data attributes of each object class, and when the probability of the N class attributes and the probability of the M data attributes are the maximum, a Bayesian classification matcher is adopted to select an object and segment the image.
3. The hyper-parametric image segmentation method based on Bayesian deep learning as recited in claim 1, wherein in step 2, the specific steps of extracting the target edge feature are as follows:
the first step: assuming that the edge probability of the image pixel f(x, y) satisfies a Gaussian distribution, the two-dimensional Gaussian function is:
G(x, y) = (1/(2πσ²)) · exp(−(x² + y²)/(2σ²))
the second step: calculating the gradient functions of the Gaussian in the x and y directions:
Gx(x, y) = ∂G/∂x = −(x/σ²) · G(x, y)
Gy(x, y) = ∂G/∂y = −(y/σ²) · G(x, y)
the third step: convolving the image data set with the two gradient functions:
fx(x, y) = f(x, y) * Gx(x, y)
fy(x, y) = f(x, y) * Gy(x, y)
the fourth step: calculating the probability density distribution of the image target edge, namely the target edge feature:
M(x, y) = √(fx(x, y)² + fy(x, y)²)
θ(x, y) = arctan(fy(x, y) / fx(x, y))
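The four steps of claim 3 can be sketched with NumPy. This is an illustrative implementation under stated assumptions: the kernel radius, the step edge used as input, and the "valid" convolution boundary handling are choices made here, not specified by the claim.

```python
import numpy as np

def gaussian_derivative_kernels(sigma=1.0, radius=2):
    """Steps 1-2: sample the 2-D Gaussian and its x/y derivatives
    Gx = -(x/sigma^2) G, Gy = -(y/sigma^2) G on a small grid."""
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return -x / sigma**2 * g, -y / sigma**2 * g

def convolve2d(image, kernel):
    """Step 3: minimal 'valid'-mode 2-D convolution, NumPy only."""
    kh, kw = kernel.shape
    k = kernel[::-1, ::-1]  # flip the kernel for true convolution
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

def edge_features(image, sigma=1.0):
    """Step 4: gradient magnitude and direction as the edge feature."""
    gx, gy = gaussian_derivative_kernels(sigma)
    fx, fy = convolve2d(image, gx), convolve2d(image, gy)
    return np.hypot(fx, fy), np.arctan2(fy, fx)

# A vertical step edge: the magnitude peaks at the step, not in flat regions.
img = np.zeros((9, 9))
img[:, 5:] = 1.0
mag, theta = edge_features(img)
```

On this synthetic step edge the magnitude map is zero in the flat left region and responds only near the boundary column, which is exactly the behaviour the Gaussian-derivative filters are meant to produce.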
4. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 3, wherein before obtaining the class label of the target in step 3, the prior probability of the target occurrence needs to be calculated:
P(Ci) = Ni / N
wherein Ci is any element of the object class set (C1, C2, C3, ..., Cn), Ni represents the number of occurrences of the target, and N represents the total size of the target set.
5. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 3, wherein in step 3, in the process of obtaining the class label of the target, the conditional probabilities of the target occurrence are calculated:
P(Ci | xa) = P(xa | Ci) · P(Ci) / P(xa)
P(Ci | ya) = P(ya | Ci) · P(Ci) / P(ya)
wherein xa represents the abscissa of the target point, ya represents the ordinate of the target point, and P(xa), P(ya) represent the probabilities of the target edge features.
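The prior of claim 4 and the Bayes-rule posterior of claim 5 can be computed in a few lines. This is a sketch only: the class labels and the likelihood/evidence values below are hypothetical, and the posterior is written in the general Bayes'-rule form the claims describe.

```python
from collections import Counter

def priors(labels):
    """Claim 4: prior probability P(Ci) = Ni / N, where Ni is the number
    of occurrences of class Ci among the N observed targets."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

def posterior(likelihood, prior, evidence):
    """Claim 5 as reconstructed: P(Ci | xa) = P(xa | Ci) * P(Ci) / P(xa)."""
    return likelihood * prior / evidence

# Hypothetical observed target labels.
p = priors(["cat", "cat", "dog", "cat"])
print(p)  # {'cat': 0.75, 'dog': 0.25}
```

With a likelihood of 0.5, the "cat" prior of 0.75 and an evidence term of 0.5, the posterior works out to 0.75.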
6. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 1, wherein in step 3, the specific steps of extracting the bounding box and the target mask of the image are as follows:
(1) obtaining a target area of the image and the classification weight of each pixel in the area by learning the extracted target boundary characteristics;
(2) after obtaining the target area of the image, combining the internal and external feature maps of each target area into two complete feature maps, and then synchronously performing image segmentation and image classification on two branch data sets D1 and D2;
(3) in image segmentation, classifying the internal and external feature maps of the target region by using a Bayesian classifier so as to distinguish the foreground and the background in the image and generate a mask;
(4) in image classification, the maximum value is taken according to pixel probability distribution in the two types of feature maps to obtain a new feature map, and then the class of objects in the target area is obtained by using a maximum likelihood estimation classifier.
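Step (3) of claim 6, the per-pixel Bayesian foreground/background decision that generates the mask, can be sketched as below. The per-pixel likelihood maps and the uniform prior are hypothetical; the point is only the decision rule, in which a pixel joins the target mask when its (unnormalized) foreground posterior exceeds the background one.

```python
import numpy as np

def bayes_mask(p_pixel_given_fg, p_pixel_given_bg, prior_fg=0.5):
    """Per-pixel Bayesian classification into foreground/background.
    Both inputs are likelihood maps of the same shape; the shared
    evidence term cancels, so the posteriors are compared unnormalized."""
    prior_bg = 1.0 - prior_fg
    return p_pixel_given_fg * prior_fg > p_pixel_given_bg * prior_bg

# Hypothetical 2x2 likelihood maps for a tiny target region.
fg = np.array([[0.9, 0.2], [0.7, 0.1]])
bg = np.array([[0.1, 0.8], [0.3, 0.9]])
mask = bayes_mask(fg, bg)
```

Pixels whose foreground likelihood dominates (the left column here) end up True in the mask, i.e. inside the segmented target.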
7. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 1, wherein in step 4, the method for comparing the bounding box and the target mask with the feature dictionary is as follows: first, the similarity weights between the bounding box, the target mask and the feature dictionary are computed with an L2 regularization operator; a target edge feature data set is then extracted by a similarity Gaussian process, and the semantic segmentation result is obtained through Bayesian classification matching.
8. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 7, wherein before the semantic segmentation result is output, an edge Gaussian hyper-parameter function is calculated and the target matching degree is then computed from its score; the better the hyper-parameter set, the higher the precision score of the semantic segmentation.
9. The Bayesian deep learning-based hyper-parametric image segmentation method as recited in claim 1, wherein the training set of images comprises the Open Images V4 detection set, which contains 1.9 million pictures and 15.4 million bounding boxes covering 600 classes.
CN202010501892.9A 2020-06-04 2020-06-04 Super-parameter image segmentation method based on Bayes deep learning Active CN111652317B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010501892.9A CN111652317B (en) 2020-06-04 2020-06-04 Super-parameter image segmentation method based on Bayes deep learning


Publications (2)

Publication Number Publication Date
CN111652317A true CN111652317A (en) 2020-09-11
CN111652317B CN111652317B (en) 2023-08-25

Family

ID=72347345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010501892.9A Active CN111652317B (en) 2020-06-04 2020-06-04 Super-parameter image segmentation method based on Bayes deep learning

Country Status (1)

Country Link
CN (1) CN111652317B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464948A (en) * 2020-11-11 2021-03-09 常州码库数据科技有限公司 Natural scene target contour extraction method and system based on bionics
CN112906704A (en) * 2021-03-09 2021-06-04 深圳海翼智新科技有限公司 Method and apparatus for cross-domain target detection
CN113111928A (en) * 2021-04-01 2021-07-13 中国地质大学(北京) Semi-supervised learning mineral resource quantitative prediction method based on geoscience database
CN113158960A (en) * 2021-05-06 2021-07-23 吴国军 Medical image recognition model construction and recognition method and device
CN113269782A (en) * 2021-04-21 2021-08-17 青岛小鸟看看科技有限公司 Data generation method and device and electronic equipment
CN113392933A (en) * 2021-07-06 2021-09-14 湖南大学 Self-adaptive cross-domain target detection method based on uncertainty guidance
CN113450268A (en) * 2021-05-24 2021-09-28 南京中医药大学 Image noise reduction method based on posterior probability
CN113468939A (en) * 2020-11-30 2021-10-01 电子科技大学 SAR target recognition method based on supervised minimization deep learning model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030147558A1 (en) * 2002-02-07 2003-08-07 Loui Alexander C. Method for image region classification using unsupervised and supervised learning
WO2010129711A1 (en) * 2009-05-05 2010-11-11 The Trustees Of Columbia University In The City Of New York Devices, systems, and methods for evaluating vision and diagnosing and compensating impairment of vision
US7979363B1 (en) * 2008-03-06 2011-07-12 Thomas Cecil Minter Priori probability and probability of error estimation for adaptive bayes pattern recognition
CN110245587A (en) * 2019-05-29 2019-09-17 西安交通大学 A kind of remote sensing image object detection method based on Bayes's transfer learning


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
包晓敏; 汪亚明: "Fabric image segmentation based on minimum-risk Bayesian decision" *
王生生; 王顺; 张航; 温长吉: "Weed recognition in soybean fields based on a lightweight sum-product network and UAV remote sensing images" *


Also Published As

Publication number Publication date
CN111652317B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN111652317B (en) Super-parameter image segmentation method based on Bayes deep learning
US10380759B2 (en) Posture estimating apparatus, posture estimating method and storing medium
TWI742382B (en) Neural network system for vehicle parts recognition executed by computer, method for vehicle part recognition through neural network system, device and computing equipment for vehicle part recognition
CN110135503B (en) Deep learning identification method for parts of assembly robot
CN107610087B (en) Tongue coating automatic segmentation method based on deep learning
US10216979B2 (en) Image processing apparatus, image processing method, and storage medium to detect parts of an object
KR100647322B1 (en) Apparatus and method of generating shape model of object and apparatus and method of automatically searching feature points of object employing the same
CN106991388B (en) Key point positioning method
Jia et al. Visual tracking via coarse and fine structural local sparse appearance models
CN106570480B (en) A kind of human action classification method based on gesture recognition
CN110909618B (en) Method and device for identifying identity of pet
CN109711283A (en) A kind of joint doubledictionary and error matrix block Expression Recognition algorithm
CN111967464B (en) Weak supervision target positioning method based on deep learning
CN114758288A (en) Power distribution network engineering safety control detection method and device
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN110853070A (en) Underwater sea cucumber image segmentation method based on significance and Grabcut
WO2018100668A1 (en) Image processing device, image processing method, and image processing program
CN113221956B (en) Target identification method and device based on improved multi-scale depth model
Yuan et al. Neighborloss: a loss function considering spatial correlation for semantic segmentation of remote sensing image
Ravi et al. Sign language recognition with multi feature fusion and ANN classifier
US11367206B2 (en) Edge-guided ranking loss for monocular depth prediction
Pramunendar et al. A Robust Image Enhancement Techniques for Underwater Fish Classification in Marine Environment.
Yang et al. An improved algorithm for the detection of fastening targets based on machine vision
Goyal et al. Fuzzy similarity measure based spectral clustering framework for noisy image segmentation
Jeevanantham et al. Deep Learning Based Plant Diseases Monitoring and Detection System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant