CN113723231A - Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device


Info

Publication number
CN113723231A
Authority
CN
China
Prior art keywords
semantic segmentation
low
image
loss function
illumination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110940177.XA
Other languages
Chinese (zh)
Inventor
王卓恒 (Wang Zhuoheng)
葛琦 (Ge Qi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110940177.XA priority Critical patent/CN113723231A/en
Publication of CN113723231A publication Critical patent/CN113723231A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a training method for a low-illumination semantic segmentation model, together with a semantic segmentation method and device. The training method comprises the following steps: acquiring an image data set under low illumination; enhancing each acquired image through an image enhancement model to obtain an enhanced image data set and an enhanced-image loss function; performing semantic segmentation on each enhanced image through a semantic segmentation model to obtain a semantically segmented image data set and a semantic segmentation loss function; obtaining a total loss function from the enhanced-image loss function and the semantic segmentation loss function; and adjusting the semantic segmentation model according to the total loss function to obtain the low-illumination semantic segmentation model. The method achieves accurate segmentation in low-illumination scenes.

Description

Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device
Technical Field
The invention relates to image processing and semantic segmentation technologies, in particular to a low-illumination semantic segmentation model training method, a semantic segmentation method and a semantic segmentation device.
Background
Semantic segmentation provides a vehicle's control system with semantic information about the driving environment and helps the driving system make driving decisions; it is important for applications such as autonomous driving and geological survey. At present, semantic segmentation algorithms for urban street scenes in autonomous driving work only in ideal, well-lit scenes; their robustness in other scenes is low, and in particular, when handling low-illumination scenes such as night-time, it is difficult to obtain accurate semantic segmentation results.
Disclosure of Invention
Purpose of the invention: in order to overcome the defects of the prior art, the invention provides a low-illumination semantic segmentation model training method, a semantic segmentation method, and a semantic segmentation device, so as to improve semantic segmentation accuracy in low-illumination scenes.
The technical scheme is as follows: in a first aspect, the present invention provides a training method for a low-light semantic segmentation model, where the method includes:
acquiring an image data set under low illumination; enhancing each acquired image through an image enhancement model to obtain an enhanced image data set and an enhanced-image loss function; performing semantic segmentation on each enhanced image through a semantic segmentation model to obtain a semantically segmented image data set and a semantic segmentation loss function; obtaining a total loss function from the enhanced-image loss function and the semantic segmentation loss function; and adjusting the semantic segmentation model according to the total loss function to obtain the low-illumination semantic segmentation model.
Optionally, the low-illumination image data set is obtained based on the Cityscapes data set.
Optionally, the function of the image enhancement model is expressed as:

$$\hat{I}_N = f(I_L; \theta) \qquad (1)$$

where $\hat{I}_N$ denotes the normal-light image output after image enhancement, $I_L$ denotes the low-illumination image, $f(\cdot)$ denotes the mapping function for image brightness adjustment, which adopts a Sigmoid activation function, and $\theta$ denotes the network parameters of the enhancement network.
Optionally, the enhanced-image loss function is expressed as:

$$L_D = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| + \beta\|\theta\|_2^2 \qquad (2)$$

where $N$ denotes the total number of pixels in the input image, $\hat{y}_i$ and $y_i$ denote the network's output pixel and the label pixel at the corresponding position, respectively, $\theta$ denotes the network parameters of the enhancement network, and $\beta$ is set to 0.1 during training.
Optionally, the semantic segmentation model is obtained by training based on the GhostNet neural network model.
Optionally, the semantic segmentation loss function is expressed as:

$$L_S = -\frac{1}{N}\sum_{n=1}^{N}\log\frac{e^{y_{c_n}}}{\sum_{i} e^{y_i}} \qquad (3)$$

where $y_i$ denotes the pre-softmax score for class $i$, $c_n$ denotes the ground-truth class of pixel $n$, $N$ denotes the number of pixels, and $e$ is a constant.
Optionally, the total loss function is obtained by performing cascade training on an enhanced image loss function and a semantic segmentation loss function.
Optionally, the total loss function is expressed as:

$$L = \beta_D L_D + \beta_S L_S \qquad (4)$$

where $L$ denotes the total loss function of the entire network, $L_D$ and $L_S$ denote the loss functions of the enhancement network and the segmentation network, respectively, and $\beta_D$ and $\beta_S$ denote the weights of the loss functions.
In a second aspect, a method for low-illumination semantic segmentation, the method comprising:
acquiring a source image, and performing image enhancement on the source image through an image enhancement model to obtain an enhanced source image; and performing semantic segmentation on the enhanced source image through a low-illumination semantic segmentation model to obtain a segmentation result, wherein the low-illumination semantic segmentation model is obtained by training with the method of any implementation of the first aspect.
In a third aspect, a low-illumination semantic segmentation apparatus, the apparatus comprising:
the enhancement unit is used for performing image enhancement on a source image through the image enhancement model to obtain an enhanced source image; and the segmentation unit is used for performing semantic segmentation on the enhanced source image through a low-illumination semantic segmentation model to obtain a segmentation result, wherein the low-illumination semantic segmentation model is obtained by training with the method of any implementation of the first aspect.
Beneficial effects: the method adjusts the semantic segmentation model according to a total loss function that incorporates the enhanced-image loss, improving segmentation accuracy; the semantic segmentation model is built on the GhostNet neural network model, whose Ghost modules reduce the total number of parameters and the computational complexity, improving segmentation efficiency; and the image enhancement model adopts a self-coding structure with skip connections, which raises image brightness while recovering more image detail.
Drawings
Fig. 1 is a schematic flow chart of a training method of a low-illumination semantic segmentation model according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Example 1:
as shown in FIG. 1, the invention relates to a low-illumination semantic segmentation model training method, which comprises the following steps:
the Cityscapes data set is selected as a main data set of an experiment, and the Gamma value of the image is adjusted firstly, namely the image is subjected to inverse Gamma transformation. Gamma transformation is an image operation for performing nonlinear adjustment on an image input gray value, and can make the gray value of an image before transformation and the gray value of the image after transformation have an exponential nonlinear relationship, as shown in formula (1):
Figure BDA0003214518720000041
wherein IinIoutEach represents an input/output image, β represents a Gamma value, and α represents a transform coefficient. After Gamma conversion, in order to obtain a better low-illumination effect of the image, a PS tool is also used for adjusting the brightness and the contrast of the city maps street view image after Gamma conversion, so that the finally manually adjusted low-illumination street view data is similar to the actual low-illumination dark scene as much as possible in vision to obtain the low-illumination street view image with good visual effect, thus on the basis of the original data set containing the street view image of normal weather and the corresponding semantic segmentation label, a new corresponding low-illumination image is generated, each group of data contains three city street view images with low illumination respectively, and the city street view image and the semantic segmentation label in normal day. The artwork and segmentation labels for these datasets are from the cityscaps dataset.
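For concreteness, here is a minimal Python sketch of the inverse-Gamma darkening step of formula (1); the library choice (OpenCV/NumPy), the alpha and beta values, and the file paths are illustrative assumptions, not the disclosed implementation.

```python
# Hypothetical sketch: synthesizing a low-illumination image from a
# normal-light Cityscapes frame via the gamma transform I_out = a * I_in^b.
import numpy as np
import cv2

def gamma_darken(image_bgr: np.ndarray, alpha: float = 0.9, beta: float = 3.0) -> np.ndarray:
    """Apply I_out = alpha * I_in**beta on intensities normalized to [0, 1].

    beta > 1 darkens the image, giving a simple low-light proxy.
    """
    normalized = image_bgr.astype(np.float32) / 255.0
    transformed = alpha * np.power(normalized, beta)
    return np.clip(transformed * 255.0, 0.0, 255.0).astype(np.uint8)

# Usage on one frame (the path is a placeholder); brightness and contrast
# would then be fine-tuned manually, as described above.
img = cv2.imread("cityscapes/train/aachen_000000_leftImg8bit.png")
low_light = gamma_darken(img)
cv2.imwrite("low_light/aachen_000000_leftImg8bit.png", low_light)
```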
The obtained low-illumination images are enhanced using a self-coding (encoder-decoder) structure with skip connections: the encoder's dimensionality reduction effectively reduces the number of parameters and extracts representative image features, while the decoder achieves the denoising effect, as shown in formula (2):

$$\hat{I}_N = f(I_L; \theta) \qquad (2)$$

where $\hat{I}_N$ denotes the normal-light image output after image enhancement, $I_L$ denotes the low-illumination image, $f(\cdot)$ denotes the mapping function for image brightness adjustment, which adopts a Sigmoid activation function, and $\theta$ denotes the network parameters of the enhancement network.
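A minimal PyTorch sketch of such a skip-connected self-coding enhancement network is shown below; only the skip connection and the Sigmoid output are taken from the description, while the depth, channel widths, and layer types are illustrative assumptions.

```python
# Hypothetical sketch of the enhancement network f(I_L; theta) of formula (2):
# an encoder-decoder with a skip connection and a Sigmoid output.
import torch
import torch.nn as nn

class EnhanceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),
                                  nn.ReLU(inplace=True))
        self.out = nn.Sequential(nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid())

    def forward(self, low_light: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(low_light)                    # full-resolution features
        e2 = self.enc2(e1)                           # downsampled bottleneck
        d1 = self.dec1(e2)                           # decoder upsamples back
        return self.out(torch.cat([d1, e1], dim=1))  # skip connection + Sigmoid

enhancer = EnhanceNet()
enhanced = enhancer(torch.rand(1, 3, 256, 512))      # I_hat_N = f(I_L; theta)
```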
Producing the normal image incurs a loss for the image-enhancement network structure. In training the enhancement structure, the Mean Absolute Error (MAE) and a quadratic norm ($L_2$ error) are used to compute the total error of the enhancement network, as shown in formula (3):

$$L_D = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| + \beta\|\theta\|_2^2 \qquad (3)$$

where $N$ denotes the total number of pixels in the input image, and $\hat{y}_i$ and $y_i$ denote the network's output pixel and the label pixel at the corresponding position, respectively. The $L_2$ regularization term prevents the network from over-fitting and improves the convergence rate of the network; $\beta$ is set to 0.1 during training.
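A direct implementation of this loss might read as follows; the only assumption beyond formula (3) is that the $L_2$ term is taken over all parameters of the enhancement network.

```python
# Hypothetical sketch of the enhancement loss of formula (3):
# pixel-wise MAE plus an L2 penalty on the network parameters (beta = 0.1).
import torch

def enhancement_loss(output: torch.Tensor, target: torch.Tensor,
                     model: torch.nn.Module, beta: float = 0.1) -> torch.Tensor:
    mae = torch.mean(torch.abs(output - target))              # MAE over all N pixels
    l2_reg = sum(p.pow(2).sum() for p in model.parameters())  # ||theta||_2^2
    return mae + beta * l2_reg
```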
The method selects GhostNet as the semantic segmentation network. GhostNet is a lightweight network that generates additional feature maps from a convolutional layer's output through a series of simple linear operations. Compared with an ordinary convolutional neural network, the Ghost module reduces the total number of parameters and the computational complexity required without changing the size of the output feature map; to improve efficiency, all convolutions in the module are replaced with pointwise convolutions, as sketched below.
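The following is a hypothetical sketch of a Ghost module along these lines: the primary convolution is pointwise, as the text indicates, while the cheap operation is shown as a depthwise convolution, a common choice in GhostNet-style designs; the intrinsic-to-ghost ratio and kernel size are illustrative assumptions.

```python
# Hypothetical Ghost module: a pointwise convolution produces a small set of
# intrinsic feature maps; cheap depthwise convolutions generate the remaining
# "ghost" maps, and the two sets are concatenated.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2, dw_kernel: int = 3):
        super().__init__()
        init_ch = out_ch // ratio                          # intrinsic maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),      # pointwise convolution
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(                        # cheap linear operation
            nn.Conv2d(init_ch, out_ch - init_ch, dw_kernel,
                      padding=dw_kernel // 2, groups=init_ch, bias=False),
            nn.BatchNorm2d(out_ch - init_ch), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        intrinsic = self.primary(x)
        ghosts = self.cheap(intrinsic)                     # ghost feature maps
        return torch.cat([intrinsic, ghosts], dim=1)

features = GhostModule(3, 16)(torch.rand(1, 3, 64, 64))    # -> (1, 16, 64, 64)
```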
The semantic segmentation loss adopts the softmax cross-entropy loss and involves two steps for each pixel in the image, expressed mathematically in formulas (4) and (5):

$$S_i = \frac{e^{y_i}}{\sum_j e^{y_j}} \qquad (4)$$

$$L = -\sum_i t_i \log S_i \qquad (5)$$

In formula (4), $y_i$ denotes the pre-softmax score for class $i$ and $e$ is a constant (the base of the natural logarithm); in formula (5), $t_i$ is the one-hot ground-truth label. First, softmax is computed on the last-layer output of the cascaded segmentation network to obtain the probability that each pixel belongs to each class; then the cross-entropy loss $L$ is computed on the predicted probability vector, treating each pixel as one sample, and the average over all pixels is taken as the final segmentation network loss $L_S$.
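In PyTorch terms, formulas (4) and (5) together correspond to the standard softmax cross-entropy, which `nn.CrossEntropyLoss` fuses into one call; the class count of 19 below is the usual Cityscapes setting, assumed here for illustration.

```python
# Hypothetical sketch of the segmentation loss L_S of formulas (4)-(5):
# softmax over pre-softmax scores y_i, then per-pixel cross entropy, averaged.
import torch
import torch.nn as nn

logits = torch.randn(1, 19, 256, 512)             # pre-softmax scores per class
labels = torch.randint(0, 19, (1, 256, 512))      # ground-truth class per pixel
seg_loss = nn.CrossEntropyLoss()(logits, labels)  # L_S, averaged over all pixels
```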
The losses of the two networks are trained jointly through the cascade network; the mathematical expression of the final total loss function is shown in formula (6):

$$L = \beta_D L_D + \beta_S L_S \qquad (6)$$

where $L$ denotes the total loss function of the entire network, $L_D$ and $L_S$ denote the loss functions of the enhancement network and the segmentation network, respectively, and $\beta_D$ and $\beta_S$ denote the weights of the loss functions.
The semantic segmentation model, built on the GhostNet neural network model, is then adjusted according to the total loss function to obtain the low-illumination semantic segmentation model; a sketch of one such joint training step follows.
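The sketch below reuses the hypothetical `EnhanceNet`, `enhancement_loss`, and a Ghost-module-based segmentation network `seg_net` from the sketches above; the optimizer and the weight values for $\beta_D$ and $\beta_S$ are illustrative assumptions, not the disclosed settings.

```python
# Hypothetical cascade training step: both networks are updated with the
# weighted total loss of formula (6), L = beta_D * L_D + beta_S * L_S.
import torch

def train_step(enhancer, seg_net, low_light, normal_gt, seg_labels,
               optimizer, beta_d: float = 1.0, beta_s: float = 1.0) -> float:
    optimizer.zero_grad()
    enhanced = enhancer(low_light)                         # enhancement stage
    l_d = enhancement_loss(enhanced, normal_gt, enhancer)  # formula (3)
    logits = seg_net(enhanced)                             # segmentation stage
    l_s = torch.nn.CrossEntropyLoss()(logits, seg_labels)  # formulas (4)-(5)
    total = beta_d * l_d + beta_s * l_s                    # formula (6)
    total.backward()                                       # joint update
    optimizer.step()
    return total.item()

# Usage: one optimizer over both networks realizes the cascade training, e.g.
# optimizer = torch.optim.Adam(list(enhancer.parameters()) + list(seg_net.parameters()))
```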
Example 2:
a method of low-light semantic segmentation, the method comprising:
acquiring a source image, and carrying out image enhancement on the source image through an image enhancement model to obtain an enhanced source image;
performing semantic segmentation on the enhanced source image through the low-illumination semantic segmentation model to obtain a segmentation result,
wherein the low-illumination semantic segmentation model is obtained by training with the method of Embodiment 1.
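At inference time the two stages simply run in sequence; a minimal sketch, reusing the hypothetical networks from Embodiment 1, where `argmax` over the class scores yields the per-pixel segmentation map:

```python
# Hypothetical inference sketch for Embodiment 2: enhance, then segment.
import torch

with torch.no_grad():
    source = torch.rand(1, 3, 256, 512)        # low-illumination source image
    enhanced = enhancer(source)                # image enhancement stage
    seg_map = seg_net(enhanced).argmax(dim=1)  # per-pixel class prediction
```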
Example 3:
a low-light semantic segmentation apparatus, the apparatus comprising:
the enhancement unit is used for carrying out image enhancement on the source image through the image enhancement model to obtain an enhanced source image;
a segmentation unit, used for performing semantic segmentation on the enhanced source image through the low-illumination semantic segmentation model to obtain a segmentation result,
wherein the low-illumination semantic segmentation model is obtained by training with the method of Embodiment 1.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A training method of a low-illumination semantic segmentation model is characterized by comprising the following steps:
acquiring an image data set under low illumination;
enhancing each acquired image through an image enhancement model to obtain an enhanced image data set and an enhanced-image loss function;
performing semantic segmentation on each enhanced image through a semantic segmentation model to obtain a semantically segmented image data set and a semantic segmentation loss function;
obtaining a total loss function according to the enhanced image loss function and the semantic segmentation loss function;
and adjusting the semantic segmentation model according to a total loss function to obtain a low-illumination semantic segmentation model.
2. The method for training the low-illumination semantic segmentation model according to claim 1, wherein the low-illumination image data set is obtained based on the Cityscapes data set.
3. The method for training the low-illumination semantic segmentation model according to claim 1, wherein the function of the image enhancement model is expressed as:

$$\hat{I}_N = f(I_L; \theta) \qquad (1)$$

wherein $\hat{I}_N$ denotes the normal-light image output after image enhancement, $I_L$ denotes the low-illumination image, $f(\cdot)$ denotes the mapping function for image brightness adjustment, which adopts a Sigmoid activation function, and $\theta$ denotes the network parameters of the enhancement network.
4. The method for training the low-illumination semantic segmentation model according to claim 3, wherein the enhanced-image loss function is expressed as:

$$L_D = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| + \beta\|\theta\|_2^2 \qquad (2)$$

wherein $N$ denotes the total number of pixels in the input image, $\hat{y}_i$ and $y_i$ denote the network's output pixel and the label pixel at the corresponding position, respectively, $\theta$ denotes the network parameters of the enhancement network, and $\beta$ is set to 0.1 during training.
5. The method for training the low-illumination semantic segmentation model according to claim 1, wherein the semantic segmentation model is obtained by training based on the GhostNet neural network model.
6. The method for training the low-illumination semantic segmentation model according to claim 5, wherein the semantic segmentation loss function is expressed as:

$$L_S = -\frac{1}{N}\sum_{n=1}^{N}\log\frac{e^{y_{c_n}}}{\sum_{i} e^{y_i}} \qquad (3)$$

wherein $y_i$ denotes the pre-softmax score for class $i$, $c_n$ denotes the ground-truth class of pixel $n$, $N$ denotes the number of pixels, and $e$ is a constant.
7. The method for training the low-illumination semantic segmentation model according to claim 1, wherein the total loss function is obtained by training an enhanced image loss function and a semantic segmentation loss function in a cascade manner.
8. The method for training the low-illumination semantic segmentation model according to claim 7, wherein the total loss function is expressed as:

$$L = \beta_D L_D + \beta_S L_S \qquad (4)$$

wherein $L$ denotes the total loss function of the entire network, $L_D$ and $L_S$ denote the loss functions of the enhancement network and the segmentation network, respectively, and $\beta_D$ and $\beta_S$ denote the weights of the loss functions.
9. A low-light semantic segmentation method, comprising:
acquiring a source image, and carrying out image enhancement on the source image through an image enhancement model to obtain an enhanced source image;
performing semantic segmentation on the enhanced source image through a low-illumination semantic segmentation model to obtain a segmentation result,
wherein the low-light semantic segmentation model is trained by the method of any one of claims 1-8.
10. An apparatus for low-light semantic segmentation, the apparatus comprising:
the enhancement unit is used for carrying out image enhancement on the source image through the image enhancement model to obtain an enhanced source image;
a segmentation unit, used for performing semantic segmentation on the enhanced source image through a low-illumination semantic segmentation model to obtain a segmentation result,
wherein the low-light semantic segmentation model is trained by the method of any one of claims 1-8.
CN202110940177.XA 2021-08-17 2021-08-17 Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device Pending CN113723231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110940177.XA CN113723231A (en) 2021-08-17 2021-08-17 Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110940177.XA CN113723231A (en) 2021-08-17 2021-08-17 Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device

Publications (1)

Publication Number Publication Date
CN113723231A true CN113723231A (en) 2021-11-30

Family

ID=78676694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110940177.XA Pending CN113723231A (en) 2021-08-17 2021-08-17 Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device

Country Status (1)

Country Link
CN (1) CN113723231A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200218961A1 (en) * 2017-09-27 2020-07-09 Google Llc End to End Network Model for High Resolution Image Segmentation
CN109410129A (en) * 2018-09-28 2019-03-01 大连理工大学 A kind of method of low light image scene understanding
CN111507343A (en) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 Training of semantic segmentation network and image processing method and device thereof
US20210241107A1 (en) * 2019-03-26 2021-08-05 Tencent Technology (Shenzhen) Company Limited Method, apparatus, device, and storage medium for training image semantic segmentation network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Similar Documents

Publication Publication Date Title
CN111882002B (en) MSF-AM-based low-illumination target detection method
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
CN112184577B (en) Single image defogging method based on multiscale self-attention generation countermeasure network
CN110400275B (en) Color correction method based on full convolution neural network and characteristic pyramid
CN110363770B (en) Training method and device for edge-guided infrared semantic segmentation model
CN104217404A (en) Video image sharpness processing method in fog and haze day and device thereof
CN109903250B (en) Underwater image sharpening processing method based on multi-scale gradient domain contrast stretching
CN113066025B (en) Image defogging method based on incremental learning and feature and attention transfer
CN111709888B (en) Aerial image defogging method based on improved generation countermeasure network
CN107292830A (en) Low-light (level) image enhaucament and evaluation method
CN110807744B (en) Image defogging method based on convolutional neural network
CN113420794B (en) Binaryzation Faster R-CNN citrus disease and pest identification method based on deep learning
CN116402679A (en) Lightweight infrared super-resolution self-adaptive reconstruction method
CN115035171A (en) Self-supervision monocular depth estimation method based on self-attention-guidance feature fusion
CN114677560A (en) Deep learning algorithm-based lane line detection method and computer system
CN112686804A (en) Image super-resolution reconstruction method and device for mine low-illumination environment
CN113723231A (en) Low-illumination semantic segmentation model training method, semantic segmentation method and semantic segmentation device
CN116452450A (en) Polarized image defogging method based on 3D convolution
CN116542865A (en) Multi-scale real-time defogging method and device based on structural re-parameterization
CN115496764A (en) Dense feature fusion-based foggy image semantic segmentation method
CN115631108A (en) RGBD-based image defogging method and related equipment
CN114862724A (en) Contrast type image defogging method based on exponential moving average knowledge distillation
Wang et al. Low-light traffic objects detection for automated vehicles
CN113763261A (en) Real-time detection method for far and small targets under sea fog meteorological condition
CN111986109A (en) Remote sensing image defogging method based on full convolution network

Legal Events

PB01 - Publication
SE01 - Entry into force of request for substantive examination