CN110827284A

CN110827284A - Codec network for optimizing component analysis model and rapid semantic segmentation method

Info

Publication number: CN110827284A
Application number: CN201911065888.6A
Authority: CN
Inventors: 刘桂雄; 黄坚
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2019-11-04
Filing date: 2019-11-04
Publication date: 2020-02-21
Anticipated expiration: 2039-11-04
Also published as: CN110827284B

Abstract

The invention discloses an optimized component analysis model codec network and a rapid semantic segmentation method. The method comprises: constructing a component analysis model codec architecture network and training the network to identify the object-level semantics in an image with the object response value p_Object-j, and the component-level semantics with the component response value p_Part-k; weighting the component response value p_Part-k by the object response value p_Object-j to obtain a weighted component response value, which improves the relevant-semantic recognition capability of the component analysis model; exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, which improves the irrelevant-semantic exclusion capability of the component analysis model; constructing an optimized component analysis model codec architecture network to achieve complex-background semantic segmentation; and adjusting the depth of the backbone network and retraining the optimized codec architecture network to achieve rapid complex-background semantic segmentation.

Description

Codec network for optimizing component analysis model and rapid semantic segmentation method

Technical Field

The invention relates to the technical field of image segmentation, in particular to fast complex background image semantic segmentation for deep learning.

Background

A growing number of application scenarios, such as autonomous driving, indoor navigation, virtual reality and augmented reality, require accurate and efficient image segmentation. This demand coincides with the development of deep learning across vision-related fields and applications, in particular deep-learning-based semantic segmentation. Among semantic segmentation networks, atrous (dilated) convolution architectures remove some of the pooling layers of the backbone network and therefore retain higher spatial resolution; pruning and optimizing the backbone network and the ASPP (atrous spatial pyramid pooling) module merits in-depth study and can be applied to lightweight semantic segmentation CNNs. Encoder-decoder architectures retain more components of the classification network and can be used for feature extraction under complex backgrounds. The fully convolutional network (FCN) leaves the convolutional and pooling layer structure of the backbone network unchanged, can perform object detection and semantic segmentation simultaneously, and reduces computational complexity and data storage.

Starting from the application requirements of machine vision inspection, the invention studies complex-background semantic segmentation with a codec architecture network, focusing on the relationship between object and component semantic response values and on semantic mutual exclusion. The detection algorithm is required to recognize only the relevant components, ignore other components, and classify semantics irrelevant to the recognition task as background under different imaging conditions.

Disclosure of Invention

To solve the above problems and shortcomings, the invention provides an optimized component analysis model codec architecture network that improves both the relevant-semantic recognition capability and the irrelevant-semantic exclusion capability of the component analysis model.

The purpose of the invention is realized by the following technical scheme:

An optimized component analysis model codec network and a fast semantic segmentation method, the method comprising constructing and training an optimized codec architecture network, optimizing the component analysis model, and achieving fast complex-background semantic segmentation. The method specifically comprises the following steps:

A. constructing a component analysis model codec architecture network, and training the network to recognize the object-level semantics in an image with the object response value p_Object-j, and the component-level semantics with the component response value p_Part-k;

B. using the object response value p_Object-j to weight the component response value p_Part-k to obtain a weighted component response value, which improves the relevant-semantic recognition capability of the component analysis model;

C. exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, which improves the irrelevant-semantic exclusion capability of the component analysis model;

D. constructing an optimized component analysis model codec architecture network to achieve complex-background semantic segmentation;

E. adjusting the depth of the backbone network and retraining the optimized codec architecture network to achieve fast complex-background semantic segmentation.

One or more embodiments of the present invention may have the following advantages over the prior art:

With the network segmentation time T_seg kept essentially unchanged, applying the component analysis optimization module in the optimized component analysis model codec architecture network improves the pixel recognition accuracy. When the requirement on T_seg is strict, the backbone network depth d_main can be reduced to obtain a network structure whose pixel recognition accuracy remains essentially unchanged while T_seg is shortened.

Drawings

FIG. 1 is a flow diagram of an optimized component analysis model codec network and a fast semantic segmentation method;

FIGS. 2a and 2b are schematic diagrams comparing an optimized component analysis model and a UperNet component analysis model according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.

The invention relates to an optimized component analysis model codec network and a rapid semantic segmentation method. The method specifically comprises the following steps:

Step 10: construct a component analysis model codec architecture network, and train the network to recognize the object-level semantics in an image with the object response value p_Object-j, and the component-level semantics with the component response value p_Part-k.

The component analysis model codec architecture network comprises an object semantic segmentation network, a component semantic segmentation network, and a component analysis model. The object semantic segmentation network identifies and outputs the object-level semantics in the image and the object response value p_Object-j, and is trained on object-level labels. The component semantic segmentation network identifies and outputs the component-level semantics in the image and the component response value p_Part-k, and is trained on part-level labels. The component analysis model identifies the effective component semantics.
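As a minimal sketch of what the two segmentation heads output for a single pixel, the following Python example turns raw classifier scores into response values with a softmax and picks the most probable semantics. The label sets, the score values and the helper names `softmax` and `most_probable` are illustrative assumptions, not part of the patent.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into response values that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def most_probable(labels, responses):
    """Return the label with the highest response value and that value."""
    best = max(range(len(responses)), key=lambda i: responses[i])
    return labels[best], responses[best]

# Hypothetical label sets and scores for one pixel (illustrative only).
OBJECT_LABELS = ["background", "car", "person"]
PART_LABELS = ["wheel", "door", "head", "arm"]

p_object = softmax([0.2, 2.5, 0.4])      # plays the role of p_Object-j
p_part = softmax([1.8, 0.9, 1.6, 0.1])   # plays the role of p_Part-k

object_label, object_response = most_probable(OBJECT_LABELS, p_object)
part_label, part_response = most_probable(PART_LABELS, p_part)
print(object_label, part_label)
```

At this stage the two heads are independent of each other; the later steps describe how the object response is used to refine the component decision.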

Step 20: use p_Object-j to weight p_Part-k, and use the weighted component response value in place of the original one, which improves the relevant-semantic recognition capability of the component analysis model.

The object-level classifier in the UperNet network performs recognition of N_object object semantics: it regresses an object probability vector, and the semantics with the maximum probability is identified as the object semantics.

Similarly, the component-level classifier performs recognition of N_Part component semantics: it regresses a component probability vector, and the semantics with the maximum probability is identified as the component semantics.

Combining formulas (1), (2) and (3) yields formula (4).

In formula (4), p_Part-k is weighted by p_Object-j (with coefficient α_Part-k), and the weighted component response value replaces the original one, which yields formula (5).

In this way, the relevant-semantic recognition capability of the component analysis model is improved.
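The weighting in Step 20 can be illustrated with a small Python sketch. The exact form of the weighting formula is not reproduced in this text, so the example assumes a multiplicative form, weighted p_Part-k = α_Part-k × p_Object-j × p_Part-k, where j is the object that owns component k. The `PART_OWNER` mapping and all numeric values are hypothetical.

```python
# Hypothetical part-to-object ownership (not taken from the patent).
PART_OWNER = {"wheel": "car", "door": "car", "head": "person", "arm": "person"}

def weight_part_responses(p_object, p_part, alpha):
    """Weight each component response by the response of its owning object.

    Assumed form: alpha_Part-k * p_Object-j * p_Part-k.
    """
    return {
        part: alpha[part] * p_object[PART_OWNER[part]] * response
        for part, response in p_part.items()
    }

# Illustrative response values for one pixel.
p_object = {"car": 0.80, "person": 0.20}
p_part = {"wheel": 0.30, "door": 0.15, "head": 0.35, "arm": 0.20}
alpha = {part: 1.0 for part in p_part}  # alpha_Part-k, set to 1 for simplicity

weighted = weight_part_responses(p_object, p_part, alpha)
raw_best = max(p_part, key=p_part.get)
weighted_best = max(weighted, key=weighted.get)
print(raw_best, weighted_best)
```

In this example the unweighted component classifier would pick a component of the wrong object, while the weighted response favors a component of the object that the object-level classifier is confident about.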

Step 30: consider the mutually exclusive relationship between object-level semantics and component-level semantics.

In the conditions of formulas (1) and (4), if an object-level semantic and a component-level semantic are mutually exclusive, the components concerned are recorded and collected into a component set, which yields formula (6).

In this way, the irrelevant-semantic exclusion capability of the component analysis model is improved.
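One way to picture the mutual exclusion in Step 30 is a mask that removes every component that cannot belong to the recognized object. The sketch below is an assumption about how such an exclusion could be applied, not the patent's formula (6); the `PART_OWNER` mapping, the function names and the fallback to a `"background"` label are hypothetical.

```python
# Hypothetical part-to-object ownership (not taken from the patent).
PART_OWNER = {"wheel": "car", "door": "car", "head": "person", "arm": "person"}

def exclude_irrelevant_parts(object_label, p_part):
    """Zero the response of every part that is mutually exclusive with the object."""
    return {
        part: (response if PART_OWNER.get(part) == object_label else 0.0)
        for part, response in p_part.items()
    }

def recognize_part(object_label, p_part):
    """Pick the best remaining part, or background when nothing remains."""
    kept = exclude_irrelevant_parts(object_label, p_part)
    best = max(kept, key=kept.get)
    return best if kept[best] > 0.0 else "background"

p_part = {"wheel": 0.30, "door": 0.15, "head": 0.35, "arm": 0.20}
print(recognize_part("car", p_part))  # only car parts compete
print(recognize_part("sky", p_part))  # no part belongs to this object
```

With the mask in place, a high response for a component of an unrelated object can no longer win, which is the exclusion behavior the step describes.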

Step 40: construct an optimized component analysis model codec architecture network to achieve complex-background semantic segmentation.

Using the weighted component response value in place of the original one, and taking into account the mutually exclusive relationship between object-level and component-level semantics, combining formulas (5) and (6) yields the optimized component analysis model.

FIG. 2a and FIG. 2b show the UperNet component analysis module structure and the optimized component analysis module structure of the invention, respectively. The optimized structure differs from the UperNet structure in that its inputs are changed to p_Object-j and p_Part-k; with these added inputs, the component response is weighted before the effective component semantics are recognized.
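The combined behavior of the optimized module, weighting followed by exclusion, can be sketched per pixel over a tiny image. This is an illustrative assumption about how the two ideas fit together rather than the patent's combined formula: the multiplicative weighting, the `PART_OWNER` mapping, the `relevant_objects` set and all values are hypothetical.

```python
# Hypothetical part-to-object ownership (not taken from the patent).
PART_OWNER = {"wheel": "car", "door": "car", "head": "person", "arm": "person"}

def analyze_pixel(p_object, p_part, alpha, relevant_objects):
    """Return the component label for one pixel, or "background"."""
    # Object-level decision: the semantics with the maximum response value.
    obj = max(p_object, key=p_object.get)
    if obj not in relevant_objects:
        return "background"
    # Assumed multiplicative weighting, restricted to parts owned by the
    # recognized object (parts of other objects are excluded).
    scores = {
        part: alpha.get(part, 1.0) * p_object[obj] * response
        for part, response in p_part.items()
        if PART_OWNER.get(part) == obj
    }
    if not scores:
        return "background"
    return max(scores, key=scores.get)

def segment(pixels, alpha, relevant_objects):
    """Apply the per-pixel analysis to a 2D grid of (p_object, p_part) pairs."""
    return [
        [analyze_pixel(po, pp, alpha, relevant_objects) for po, pp in row]
        for row in pixels
    ]

p_part = {"wheel": 0.30, "door": 0.15, "head": 0.35, "arm": 0.20}
pixels = [[
    ({"car": 0.8, "person": 0.2}, p_part),
    ({"car": 0.1, "person": 0.9}, p_part),
]]
result = segment(pixels, alpha={}, relevant_objects={"car"})
print(result)
```

Here only components of the relevant object are labeled, and a pixel dominated by an object outside the relevant set is mapped to background.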

UperNet without a component analysis module was trained on the ADE20K dataset. The segmentation time T_seg was measured on a GeForce GTX 1080 Ti GPU, with a single-precision floating-point performance of 11.34 TFLOPS, 11 GB of memory, and a memory speed of 11 Gbps. During training, the learning rate decays with the iteration count along a poly curve, with an initial learning rate of 0.02 and a power of 0.9; training runs for 50 rounds of 5000 iterations.
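The poly learning-rate schedule mentioned above is commonly written as lr = base_lr × (1 − iteration / max_iterations)^power. The Python sketch below uses the stated initial rate 0.02 and power 0.9; treating the schedule length as 50 × 5000 iterations is an assumption about how the rounds and iterations combine.

```python
def poly_learning_rate(iteration, max_iterations, base_lr=0.02, power=0.9):
    """Poly decay: base_lr * (1 - iteration / max_iterations) ** power."""
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1.0 - iteration / max_iterations) ** power

ROUNDS = 50
ITERATIONS_PER_ROUND = 5000  # assumption: 5000 iterations in each round
MAX_ITERATIONS = ROUNDS * ITERATIONS_PER_ROUND

lr_start = poly_learning_rate(0, MAX_ITERATIONS)
lr_mid = poly_learning_rate(MAX_ITERATIONS // 2, MAX_ITERATIONS)
lr_end = poly_learning_rate(MAX_ITERATIONS, MAX_ITERATIONS)
print(lr_start, lr_mid, lr_end)
```

The rate starts at the initial value, falls slightly more slowly than a linear ramp because the power is below 1, and reaches zero at the final iteration.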

[Application example 1] All backbone networks use ResNet101.

With ResNet101 as the backbone of all three networks, the complex semantic segmentation capability after applying the analysis module is evaluated on the ADE20K validation set. Table 1 lists ΔPA_Part for the component analysis modules on the ADE20K validation set.

The component classifier alone achieves PA_Part = 48.3% on component semantics. With the component analysis module, ΔPA_Part1 = 55.5% - 48.3% = 7.2%; with the component analysis optimization module, ΔPA_Part2 = 56.9% - 48.3% = 8.6%. Thus ΔPA_Part2 - ΔPA_Part1 = 1.4%: the component analysis optimization module improves on the component analysis module by 1.4 percentage points, which strengthens the network's complex semantic segmentation capability across scenes.

The average segmentation time increases only slightly. Measured over the 2000 images of the ADE20K validation set, the average segmentation times of the component classifier alone, the component analysis module, and the component analysis optimization module are close to one another, so the network computing overhead and segmentation speed remain essentially unchanged.
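The accuracy gains quoted in this example can be re-derived with a few lines of Python. The three PA_Part values are the ones stated in the text; the variable names are illustrative.

```python
# PA_Part values (percent) quoted for the ADE20K validation set, ResNet101.
PA_PART_CLASSIFIER = 48.3  # component classifier alone
PA_PART_ANALYSIS = 55.5    # with the component analysis module
PA_PART_OPTIMIZED = 56.9   # with the component analysis optimization module

# Rounding to one decimal removes binary floating-point noise.
delta_1 = round(PA_PART_ANALYSIS - PA_PART_CLASSIFIER, 1)   # ΔPA_Part1
delta_2 = round(PA_PART_OPTIMIZED - PA_PART_CLASSIFIER, 1)  # ΔPA_Part2
gain = round(delta_2 - delta_1, 1)                          # ΔPA_Part2 - ΔPA_Part1

print(f"dPA_Part1 = {delta_1}%, dPA_Part2 = {delta_2}%, gain = {gain}%")
```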

[Application example 2] The backbone network depth of the optimized component analysis model codec architecture network is adjusted to ResNet50.

The specific metrics are given in Table 2, which lists ΔPA_Part on the ADE20K validation set for the component analysis module and for the component analysis optimization module.

The segmentation time is shortened by 108 ms.

This embodiment studies complex-background semantic segmentation with a codec architecture network, focusing on the relationship between object and component semantic response values and on semantic mutual exclusion. The detection algorithm is required to recognize only the relevant components, ignore other components, and classify semantics irrelevant to the recognition task as background under different imaging conditions. With the network segmentation time T_seg kept essentially unchanged, applying the component analysis optimization module improves the pixel recognition accuracy. When the requirement on T_seg is strict, the backbone network depth d_main can be reduced to obtain a network structure whose pixel recognition accuracy remains essentially unchanged while T_seg is shortened.

Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

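1. An optimized component analysis model codec network and fast semantic segmentation method, characterized in that the method comprises constructing and training an optimized codec architecture network, optimizing a component analysis model, and achieving fast complex-background semantic segmentation; the method specifically comprises the following steps:

A. constructing a component analysis model codec architecture network, and training the network to recognize the object-level semantics in an image with the object response value p_Object-j, and the component-level semantics with the component response value p_Part-k;

B. using the object response value p_Object-j to weight the component response value p_Part-k to obtain a weighted component response value, improving the relevant-semantic recognition capability of the component analysis model;

C. exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, improving the irrelevant-semantic exclusion capability of the component analysis model;

D. constructing an optimized component analysis model codec architecture network to achieve complex-background semantic segmentation;

E. adjusting the depth of the backbone network and retraining the optimized codec architecture network to achieve fast complex-background semantic segmentation.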

2. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, characterized in that the component analysis model codec architecture network in step A comprises an object semantic segmentation network, a component semantic segmentation network, and a component analysis model;

the object semantic segmentation network identifies and outputs the object-level semantics in the image and the object response value p_Object-j, and is trained on object-level labels;

the component semantic segmentation network identifies and outputs the component-level semantics in the image and the component response value p_Part-k, and is trained on part-level labels;

the component analysis model identifies the effective component semantics.

3. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, characterized in that p_Part-k is weighted by p_Object-j to obtain a weighted component response value, and after the weighted value replaces the original one, the component analysis model is obtained as follows:

the object-level classifier performs recognition of N_object object semantics by regressing an object probability vector, and the semantics with the maximum probability is identified as the object semantics;

similarly, the component-level classifier performs recognition of N_Part component semantics by regressing a component probability vector, and the semantics with the maximum probability is identified as the component semantics;

combining formulas (1), (2) and (3) yields formula (4);

in formula (4), the weighted component response value replaces the original component response value.

4. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, characterized in that, in step C: due to the fact that