CN110827284A - Codec network for optimizing component analysis model and rapid semantic segmentation method - Google Patents

Info

Publication number
CN110827284A
Authority
CN
China
Prior art keywords
analysis model
component
component analysis
network
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911065888.6A
Other languages
Chinese (zh)
Other versions
CN110827284B (en)
Inventor
刘桂雄
黄坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201911065888.6A
Publication of CN110827284A
Application granted
Publication of CN110827284B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention discloses an optimized component analysis model codec network and a rapid semantic segmentation method. The method comprises: constructing a component analysis model codec architecture network and training it to identify, in an image, the object-level semantics [formula image], the object response value $p_{Object-j}$, the component-level semantics [formula image], and the component response value $p_{Part-k}$; weighting the component response value $p_{Part-k}$ by the object response value $p_{Object-j}$ to obtain [formula image], which improves the relevant-semantic recognition capability of the component analysis model; exploiting the mutually exclusive relationship between the object-level semantics [formula image] and the component-level semantics [formula image], i.e. [formula image], which improves the irrelevant-semantic exclusion capability of the component analysis model; constructing an optimized component analysis model codec architecture network to realize complex-background semantic segmentation; and adjusting the depth of the backbone network and retraining the optimized codec architecture network to realize rapid complex-background semantic segmentation.

Description

Codec network for optimizing component analysis model and rapid semantic segmentation method
Technical Field
The invention relates to the technical field of image segmentation, and in particular to fast semantic segmentation of complex-background images based on deep learning.
Background
A growing number of application scenarios, such as autonomous driving, indoor navigation, and even virtual and augmented reality, require accurate and efficient image segmentation. This demand coincides with the development of deep learning across vision-related fields and applications, and in particular of deep-learning-based semantic segmentation. Among semantic segmentation networks, those with an atrous (dilated) convolution architecture remove part of the pooling layers of the backbone network and therefore retain higher spatial resolution; pruning and optimizing the backbone network and the ASPP (atrous spatial pyramid pooling) module deserves in-depth study and can be applied to lightweight semantic segmentation CNNs. Encoder-decoder semantic segmentation networks retain more components of the classification network and can be used for feature extraction under complex backgrounds. The fully convolutional network (FCN) leaves the convolutional and pooling layers of the backbone network unchanged, can perform object detection and semantic segmentation at the same time, and reduces computational complexity and data storage.
Starting from the application requirements of machine vision inspection, the invention studies complex-background semantic segmentation with a codec (encoder-decoder) architecture network, focusing on the relationship between object and component semantic response values and on semantic mutual exclusion. That is, under different imaging conditions the detection algorithm is required to identify only the relevant components, ignore other components, and classify semantics irrelevant to the identification task as background.
Disclosure of Invention
To address the above problems and shortcomings, the invention provides an optimized component analysis model codec architecture network that improves both the relevant-semantic recognition capability and the irrelevant-semantic exclusion capability of the component analysis model.
The purpose of the invention is achieved by the following technical scheme:
An optimized component analysis model codec network and a fast semantic segmentation method are disclosed. The method comprises constructing and training an optimized codec architecture network, optimizing the component analysis model, and realizing rapid complex-background semantic segmentation. It specifically comprises the following steps:
A. Construct a component analysis model codec architecture network, and train it to recognize, in an image, the object-level semantics [formula image], the object response value $p_{Object-j}$, the component-level semantics [formula image], and the component response value $p_{Part-k}$.
B. Weight the component response value $p_{Part-k}$ by the object response value $p_{Object-j}$ to obtain [formula image], which improves the relevant-semantic recognition capability of the component analysis model.
C. Exploit the mutually exclusive relationship between the object-level semantics [formula image] and the component-level semantics [formula image], i.e. [formula image], which improves the irrelevant-semantic exclusion capability of the component analysis model.
D. Construct an optimized component analysis model codec architecture network to realize complex-background semantic segmentation.
E. Adjust the depth of the backbone network and retrain the optimized codec architecture network to realize rapid complex-background semantic segmentation.
One or more embodiments of the present invention may have the following advantages over the prior art:
With the optimized component analysis model codec architecture network, applying the component analysis optimization module can ensure the pixel recognition accuracy [formula image] while the segmentation time $T_{seg}$ of the network remains essentially unchanged. When the requirements on $T_{seg}$ are strict, the backbone depth $d_{main}$ can be reduced to obtain a network structure in which the pixel recognition accuracy [formula image] remains essentially unchanged while $T_{seg}$ is shortened.
Drawings
FIG. 1 is a flow diagram of an optimized component analysis model codec network and a fast semantic segmentation method;
FIGS. 2a and 2b are schematic diagrams comparing an optimized component analysis model and a UperNet component analysis model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
The invention relates to an optimized component analysis model codec network and a rapid semantic segmentation method. The method specifically comprises the following steps:
Step 10: construct a component analysis model codec architecture network, and train it to recognize, in an image, the object-level semantics [formula image], the object response value $p_{Object-j}$, the component-level semantics [formula image], and the component response value $p_{Part-k}$.
The component analysis model codec architecture network comprises an object semantic segmentation network, a component semantic segmentation network, and a component analysis model. The object semantic segmentation network identifies and outputs the object-level semantics [formula image] and the object response value $p_{Object-j}$ in the image, and is trained on object-level labels. The component semantic segmentation network identifies and outputs the component-level semantics [formula image] and the component response value $p_{Part-k}$ in the image, and is trained on part-level labels. The component analysis model identifies the effective component semantics [formula image]:
[formula image]
Step 20: weight $p_{Part-k}$ by $p_{Object-j}$ to obtain [formula image], and use [formula image] in place of [formula image] (i.e. [formula image]), which improves the relevant-semantic recognition capability of the component analysis model.
Let the object-level classifier in the UperNet network recognize $N_{object}$ object semantics by regressing the object probability vector [formula image]; the semantics with the maximum probability, [formula image], is identified as the object semantics, expressed as:
[formula image]
Similarly, let the component-level classifier recognize $N_{Part}$ component semantics by regressing the component probability vector [formula image]; the semantics with the maximum probability, [formula image], is identified as the component semantics, expressed as:
[formula image]
Combining formulas (1), (2) and (3) gives:
[formula image]
In formula (4), weighting $p_{Part-k}$ by $p_{Object-j}$ (with coefficient $\alpha_{Part-k}$) gives [formula image], which replaces [formula image]; that is:
[formula image]
This improves the relevant-semantic recognition capability of the component analysis model.
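The weighting idea in Step 20 can be illustrated with a minimal NumPy sketch. The formula images are not available here, so this is only one plausible reading: each part response is multiplied by the response of the object that part belongs to, scaled by a per-part coefficient. The function names, the `part_to_object` lookup table, and the default coefficient of 1 are assumptions made for illustration and do not come from the patent.

```python
import numpy as np


def weighted_part_response(p_object, p_part, part_to_object, alpha=None):
    """Weight each part response by the response of its parent object.

    p_object: (N_object, H, W) per-pixel object response values.
    p_part: (N_part, H, W) per-pixel part response values.
    part_to_object: length-N_part index of the parent object of each part
        (hypothetical lookup table, not given in the source).
    alpha: optional length-N_part coefficients; defaults to 1 (assumption).
    """
    p_object = np.asarray(p_object, dtype=float)
    p_part = np.asarray(p_part, dtype=float)
    part_to_object = np.asarray(part_to_object)
    if alpha is None:
        alpha = np.ones(p_part.shape[0])
    alpha = np.asarray(alpha, dtype=float)
    # Response of the parent object for every part channel: (N_part, H, W).
    parent_response = p_object[part_to_object]
    return alpha[:, None, None] * parent_response * p_part


def predict_part(p_object, p_part, part_to_object, alpha=None):
    """Pick, per pixel, the part with the largest weighted response."""
    weighted = weighted_part_response(p_object, p_part, part_to_object, alpha)
    return weighted.argmax(axis=0)
```

In this sketch a part with a high raw response can still lose to another part if its parent object is unlikely at that pixel, which is the intended effect of letting the object response inform part recognition.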
Step 30: consider the mutually exclusive relationship between [formula image] and [formula image], i.e. [formula image], which improves the irrelevant-semantic exclusion capability of the component analysis model.
In the condition [formula image] of formulas (1) and (4), if [formula image] and [formula image] are mutually exclusive, then [formula image]. Denote the set of all components of [formula image] by [formula image], with [formula image] [formula image]; that is:
[formula image]
This improves the irrelevant-semantic exclusion capability of the component analysis model.
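The mutual-exclusion rule of Step 30 can be sketched as a per-pixel mask. This is an illustrative interpretation, not the patent's formula: a predicted part is kept only when it belongs to the object predicted at the same pixel, and is otherwise relabelled as background. The `BACKGROUND` label value and the `part_to_object` table are hypothetical.

```python
import numpy as np

BACKGROUND = -1  # hypothetical label for excluded (irrelevant) semantics


def exclude_irrelevant_parts(p_object, p_part, part_to_object):
    """Keep a part label only if it is compatible with the predicted object.

    p_object: (N_object, H, W) object response values.
    p_part: (N_part, H, W) part response values.
    part_to_object: length-N_part index of each part's parent object
        (assumed lookup table).
    Returns an (H, W) integer map of part labels, with BACKGROUND where the
    predicted part and the predicted object are mutually exclusive.
    """
    p_object = np.asarray(p_object, dtype=float)
    p_part = np.asarray(p_part, dtype=float)
    part_to_object = np.asarray(part_to_object)
    object_label = p_object.argmax(axis=0)     # (H, W)
    part_label = p_part.argmax(axis=0)         # (H, W)
    parent_of_part = part_to_object[part_label]  # (H, W)
    return np.where(parent_of_part == object_label, part_label, BACKGROUND)
```

For example, with two objects and three parts where part 2 belongs to object 1, a pixel classified as object 0 but with part 2 as its strongest part response would be mapped to background.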
Step 40: construct an optimized component analysis model codec architecture network to realize complex-background semantic segmentation.
Using [formula image] in place of [formula image], and taking into account the mutually exclusive relationship between [formula image] and [formula image], the component analysis model obtained by combining formulas (5) and (6) is:
[formula image]
FIG. 2a and FIG. 2b show the UperNet component analysis module structure and the optimized component analysis module structure of the invention, respectively. The optimized structure differs from the UperNet structure in that the input [formula image] is changed to $p_{Object-j}$ and $p_{Part-k}$, an input [formula image] is added, and [formula image] is recognized after weighting [formula image].
UperNet without a component analysis module is trained on the ADE20K dataset. The segmentation time $T_{seg}$ is measured on a GeForce GTX 1080 Ti GPU (single-precision floating-point throughput 11.34 TFLOPS, 11 GB of memory, memory speed 11 Gbps). During training the learning rate decays with the iteration count along a poly curve, with an initial learning rate of 0.02 and an exponent of 0.9; training runs for 50 rounds of 5000 iterations each.
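The poly learning-rate decay mentioned above is commonly written as lr = base_lr * (1 - iter / max_iter) ** power. The sketch below uses that common form with the stated initial rate 0.02 and exponent 0.9; the total of 250000 iterations assumes the reading "50 rounds of 5000 iterations each", and the function name is illustrative.

```python
def poly_lr(iteration, max_iter, base_lr=0.02, power=0.9):
    """Poly learning-rate schedule: decays from base_lr at 0 to 0 at max_iter."""
    if max_iter <= 0:
        raise ValueError("max_iter must be positive")
    if not 0 <= iteration <= max_iter:
        raise ValueError("iteration must lie in [0, max_iter]")
    return base_lr * (1.0 - iteration / max_iter) ** power


# Assumed total: 50 rounds x 5000 iterations per round.
MAX_ITER = 50 * 5000
schedule_samples = [poly_lr(i, MAX_ITER) for i in (0, MAX_ITER // 2, MAX_ITER)]
```

With an exponent below 1 the curve stays closer to the initial rate for most of training and drops faster near the end than a linear decay would.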
[Application example 1] All backbone networks use ResNet101
With ResNet101 as the backbone of all three networks, the ADE20K validation set is used to analyze the complex semantic segmentation capability after the component analysis module is applied. The [formula image] and $\Delta PA_{Part}$ values for the ADE20K validation set are listed in Table 1.
Table 1: [formula image] and $\Delta PA_{Part}$ on the ADE20K validation set
[table images not recovered]
The component classifier reaches a component-semantics pixel accuracy of $PA_{Part}$ = 48.3%. With the component analysis module, [formula image] and $\Delta PA_{Part1}$ = 55.5% - 48.3% = 7.2%; with the component analysis optimization module, [formula image] and $\Delta PA_{Part2}$ = 56.9% - 48.3% = 8.6%. Thus $\Delta PA_{Part2} - \Delta PA_{Part1}$ = 1.4%: the component analysis optimization module improves [formula image] by 1.4 percentage points over the component analysis module, and the network's complex semantic segmentation capability across scenes is enhanced. The average segmentation time [formula image] increases only slightly. Over the 2000 images of the ADE20K validation set, the average segmentation time is [formula image] for the component classifier, [formula image] with the component analysis module, and [formula image] [formula image] with the component analysis optimization module, so the computational overhead and segmentation speed of the network remain essentially unchanged.
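The accuracy gains quoted in this example can be checked with a few lines of Python. The `pixel_accuracy` helper uses the usual definition (correctly labelled pixels divided by labelled pixels) and an `ignore_label` convention; both are assumptions, since the source does not spell out how $PA_{Part}$ is computed. The three percentages are the ones stated in the text.

```python
import numpy as np


def pixel_accuracy(pred, target, ignore_label=-1):
    """Fraction of labelled pixels whose predicted label matches the target."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    valid = target != ignore_label
    if not valid.any():
        return 0.0
    return float((pred[valid] == target[valid]).mean())


# Part-level pixel accuracies (percent) stated in application example 1.
pa_classifier = 48.3   # component classifier alone
pa_module = 55.5       # with the component analysis module
pa_optimized = 56.9    # with the component analysis optimization module

delta_1 = round(pa_module - pa_classifier, 1)     # gain of the original module
delta_2 = round(pa_optimized - pa_classifier, 1)  # gain of the optimized module
extra_gain = round(delta_2 - delta_1, 1)          # optimized vs. original module
```

The rounding to one decimal place only removes floating-point noise; the differences are in percentage points.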
[Application example 2] Adjusting the backbone depth of the optimized component analysis model codec architecture network to ResNet50
The specific metrics are shown in Table 2.
Table 2: [formula image] and $\Delta PA_{Part}$ on the ADE20K validation set
[table image not recovered]
With the component analysis module, [formula image]; with the component analysis optimization module, [formula image]. The segmentation time is shortened by 108 ms.
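An average per-image segmentation time such as the one compared here can be measured along the following lines. This is a generic timing sketch, not the patent's measurement procedure: `segment_fn` stands for any callable that segments one image (hypothetical), and on a GPU the callable would additionally need to wait for the device to finish before returning for the timing to be meaningful.

```python
import time


def average_segmentation_time_ms(segment_fn, images, warmup=1):
    """Average wall-clock time per image, in milliseconds.

    segment_fn: callable taking one image (hypothetical placeholder for a
        segmentation network's inference call).
    images: iterable of inputs; all of them are timed.
    warmup: number of leading images run once, untimed, before measuring.
    """
    images = list(images)
    if not images:
        raise ValueError("images must not be empty")
    for image in images[:warmup]:
        segment_fn(image)  # untimed warm-up runs
    start = time.perf_counter()
    for image in images:
        segment_fn(image)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(images)
```

Running the same function for two backbone depths on the same image set gives two averages whose difference corresponds to the kind of time saving reported in this example.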
This embodiment studies complex-background semantic segmentation with the codec architecture network, focusing on the relationship between object and component semantic response values and on semantic mutual exclusion. Under different imaging conditions, the detection algorithm is required to identify only the relevant components, ignore other components, and classify semantics irrelevant to the identification task as background. While the segmentation time $T_{seg}$ of the network remains essentially unchanged, applying the component analysis optimization module can ensure the pixel recognition accuracy [formula image]. When the requirements on $T_{seg}$ are strict, the backbone depth $d_{main}$ can be reduced to obtain a network structure in which the pixel recognition accuracy [formula image] remains essentially unchanged while $T_{seg}$ is shortened.
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. An optimized component analysis model codec network and a fast semantic segmentation method, characterized in that the method comprises constructing and training an optimized codec architecture network, optimizing a component analysis model, and realizing rapid complex-background semantic segmentation; the method specifically comprises the following steps:
A. constructing a component analysis model codec architecture network, and training it to recognize, in an image, the object-level semantics [formula image], the object response value $p_{Object-j}$, the component-level semantics [formula image], and the component response value $p_{Part-k}$;
B. weighting the component response value $p_{Part-k}$ by the object response value $p_{Object-j}$ to obtain [formula image], which improves the relevant-semantic recognition capability of the component analysis model;
C. exploiting the mutually exclusive relationship between the object-level semantics [formula image] and the component-level semantics [formula image], i.e. [formula image], which improves the irrelevant-semantic exclusion capability of the component analysis model;
D. constructing an optimized component analysis model codec architecture network to realize complex-background semantic segmentation;
E. adjusting the depth of the backbone network and retraining the optimized codec architecture network to realize rapid complex-background semantic segmentation.
2. The optimized component analysis model codec network and the fast semantic segmentation method according to claim 1, wherein the component analysis model codec architecture network in step A includes an object semantic segmentation network, a component semantic segmentation network, and a component analysis model;
the object semantic segmentation network identifies and outputs the object-level semantics [formula image] and the object response value $p_{Object-j}$ in the image, and is trained on object-level labels;
the component semantic segmentation network identifies and outputs the component-level semantics [formula image] and the component response value $p_{Part-k}$ in the image, and is trained on part-level labels;
the component analysis model identifies the effective component semantics [formula image]:
[formula image]
3. The optimized component analysis model codec network and the fast semantic segmentation method of claim 1, wherein $p_{Part-k}$ is weighted by $p_{Object-j}$ to obtain [formula image], and after [formula image] is used in place of [formula image] (i.e. [formula image]), the component analysis model becomes:
let the object-level classifier in the UperNet network recognize $N_{object}$ object semantics by regressing the object probability vector [formula image]; the semantics with the maximum probability, [formula image], is identified as the object semantics, expressed as:
[formula image]
similarly, let the component-level classifier recognize $N_{Part}$ component semantics by regressing the component probability vector [formula image]; the semantics with the maximum probability, [formula image], is identified as the component semantics, expressed as:
[formula image]
combining formulas (1), (2) and (3) gives:
[formula image]
in formula (4), weighting $p_{Part-k}$ by $p_{Object-j}$ (with coefficient $\alpha_{Part-k}$) gives [formula image] [formula image], which replaces [formula image]; that is:
[formula image]
which improves the relevant-semantic recognition capability of the component analysis model.
4. The optimized component analysis model codec network and the fast semantic segmentation method of claim 1, characterized in that in step C, using the mutually exclusive relationship between [formula image] and [formula image], i.e. [formula image], to improve the irrelevant-semantic exclusion capability of the component analysis model specifically includes:
in the condition [formula image] of formulas (1) and (4), if [formula image] and [formula image] are mutually exclusive, then [formula image]; denote the set of all components of [formula image] by [formula image], with [formula image]; that is:
[formula image]
which improves the irrelevant-semantic exclusion capability of the component analysis model.
5. The optimized component analysis model codec network and the fast semantic segmentation method according to claim 1, wherein the optimized component analysis model in step D is:
using [formula image] in place of [formula image], and taking into account the mutually exclusive relationship between [formula image] and [formula image], the component analysis model obtained by combining formulas (5) and (6) is:
[formula image]
CN201911065888.6A 2019-11-04 2019-11-04 Optimizing component analysis model coder-decoder network and quick semantic segmentation method Active CN110827284B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911065888.6A CN110827284B (en) 2019-11-04 2019-11-04 Optimizing component analysis model coder-decoder network and quick semantic segmentation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911065888.6A CN110827284B (en) 2019-11-04 2019-11-04 Optimizing component analysis model coder-decoder network and quick semantic segmentation method

Publications (2)

Publication Number Publication Date
CN110827284A true CN110827284A (en) 2020-02-21
CN110827284B CN110827284B (en) 2023-10-10

Family

ID=69552588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911065888.6A Active CN110827284B (en) 2019-11-04 2019-11-04 Optimizing component analysis model coder-decoder network and quick semantic segmentation method

Country Status (1)

Country Link
CN (1) CN110827284B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170262735A1 (en) * 2016-03-11 2017-09-14 Kabushiki Kaisha Toshiba Training constrained deconvolutional networks for road scene semantic segmentation
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110276402A (en) * 2019-06-25 2019-09-24 北京工业大学 A kind of salt body recognition methods based on the enhancing of deep learning semanteme boundary

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张晓明;尹鸿峰;: "基于卷积神经网络和语义信息的场景分类", 软件, no. 01, pages 37 - 42 *

Also Published As

Publication number Publication date
CN110827284B (en) 2023-10-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant