CN110827284A - Codec network for optimizing component analysis model and rapid semantic segmentation method - Google Patents
- Publication number
- CN110827284A (application number CN201911065888.6A)
- Authority
- CN
- China
- Prior art keywords
- analysis model
- component
- component analysis
- network
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses an optimized component analysis model codec network and a rapid semantic segmentation method. The method comprises: constructing a component analysis model codec (encoder-decoder) architecture network and training it to identify, in an image, the object-level semantics with object response values p_Object-j and the component-level semantics with component response values p_Part-k; weighting the component response value p_Part-k by the object response value p_Object-j, which improves the relevant-semantics recognition capability of the component analysis model; exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, which improves the irrelevant-semantics exclusion capability of the component analysis model; constructing the optimized component analysis model codec architecture network to realize complex-background semantic segmentation; and adjusting the depth of the backbone network and retraining the optimized codec architecture network to realize rapid complex-background semantic segmentation.
Description
Technical Field
The invention relates to the technical field of image segmentation, and in particular to fast, deep-learning-based semantic segmentation of images with complex backgrounds.
Background
A growing number of application scenarios require accurate and efficient image segmentation, such as autonomous driving, indoor navigation, and even virtual and augmented reality. This demand coincides with the development of deep learning across vision-related fields and applications, in particular deep-learning-based semantic segmentation. Among semantic segmentation networks: networks with an atrous (dilated) convolution architecture remove some pooling layers of the backbone network and therefore retain higher spatial resolution, and slimming the backbone network and the ASPP (atrous spatial pyramid pooling) module is worth in-depth study and can be applied to lightweight semantic segmentation CNNs; encoder-decoder architecture networks retain more components of the classification network and can be used for feature extraction under complex backgrounds; the fully convolutional network (FCN) does not change the convolutional and pooling layer structure of the backbone network, can perform object detection and semantic segmentation simultaneously, and reduces computational complexity and data storage.
Starting from the application requirements of machine vision inspection, the invention studies complex-background semantic segmentation with codec architecture networks, focusing on the response values of object and component semantics and on their mutual exclusion relationship. That is, under different imaging conditions the detection algorithm is required to recognize only the relevant components, ignore other components, and treat irrelevant semantics as background.
Disclosure of Invention
To address the above problems and shortcomings, the invention provides an optimized component analysis model codec architecture network that improves both the relevant-semantics recognition capability and the irrelevant-semantics exclusion capability of the component analysis model.
The purpose of the invention is achieved by the following technical solution:
An optimized component analysis model codec network and a fast semantic segmentation method, the method comprising constructing and training an optimized codec architecture network, optimizing a component analysis model, and realizing fast complex-background semantic segmentation. The method specifically comprises the following steps:
A. constructing a component analysis model codec architecture network, and training the network to recognize, in an image, the object-level semantics with object response values p_Object-j and the component-level semantics with component response values p_Part-k;
B. weighting the component response value p_Part-k by the object response value p_Object-j, thereby improving the relevant-semantics recognition capability of the component analysis model;
C. exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, thereby improving the irrelevant-semantics exclusion capability of the component analysis model;
D. constructing the optimized component analysis model codec architecture network to realize complex-background semantic segmentation;
E. adjusting the depth of the backbone network and retraining the optimized codec architecture network to realize fast complex-background semantic segmentation.
One or more embodiments of the present invention may have the following advantages over the prior art:
With the optimized component analysis model codec architecture network, when the network segmentation time T_seg is kept substantially unchanged, applying the component analysis optimization module improves the pixel identification accuracy. When the requirement on T_seg is strict, the backbone network depth d_main can be reduced to obtain a network structure whose pixel identification accuracy remains substantially unchanged while T_seg is shortened.
Drawings
FIG. 1 is a flow diagram of an optimized component analysis model codec network and a fast semantic segmentation method;
FIGS. 2a and 2b are schematic diagrams comparing an optimized component analysis model and a UperNet component analysis model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
The invention relates to an optimized component analysis model codec network and a rapid semantic segmentation method. The method specifically comprises the following steps:
The component analysis model codec architecture network comprises an object semantic segmentation network, a component semantic segmentation network and a component analysis model. The object semantic segmentation network identifies and outputs the object-level semantics in the image together with the object response values p_Object-j, and is trained on object-level labels. The component semantic segmentation network identifies and outputs the component-level semantics in the image together with the component response values p_Part-k, and is trained on part-level labels. The component analysis model identifies the effective component semantics, as expressed in formula (1).
Let the object-level classifier in the UperNet network recognize N_Object object semantics. The classifier regresses an object probability vector, and the semantic with the maximum probability is identified as the object semantic; this is expressed as formula (2).
Similarly, let the component-level classifier recognize N_Part component semantics. The classifier regresses a component probability vector, and the semantic with the maximum probability is identified as the component semantic; this is expressed as formula (3).
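As an illustration of the maximum-probability rule, the following minimal Python sketch picks the most probable class from a per-pixel probability vector. The function name `most_probable_class` and the example vectors are hypothetical and not taken from the patent; the sketch only assumes that each classifier outputs one probability per semantic class.

```python
from typing import Sequence


def most_probable_class(probabilities: Sequence[float]) -> int:
    """Return the index of the largest entry of a per-pixel probability vector."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Hypothetical responses for one pixel (values are illustrative only).
p_object = [0.1, 0.7, 0.2]  # p_Object-j for j = 0..2
p_part = [0.5, 0.3, 0.2]    # p_Part-k for k = 0..2

object_index = most_probable_class(p_object)  # index of the recognized object semantic
part_index = most_probable_class(p_part)      # index of the recognized component semantic
```

The same rule is applied independently to the object-level and the component-level classifier outputs.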
Combining formulas (1), (2) and (3) gives formula (4).
In formula (4), p_Part-k is weighted by p_Object-j (with coefficient α_Part-k), and the weighted response value is used in place of the original one, giving formula (5).
This improves the relevant-semantics recognition capability of the component analysis model.
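One possible reading of the weighting step is sketched below in Python. It assumes, purely for illustration, that each component k belongs to one object and that its response is multiplied by that object's response and by the coefficient α_Part-k; the exact form of the patent's formula may differ. The names `weight_part_responses`, `part_to_object` and all numbers are hypothetical.

```python
from typing import List, Sequence


def weight_part_responses(
    p_object: Sequence[float],
    p_part: Sequence[float],
    part_to_object: Sequence[int],
    alpha: Sequence[float],
) -> List[float]:
    """Weight each component response by the response of its parent object.

    Assumed form (not from the patent):
        weighted[k] = alpha[k] * p_object[j] * p_part[k],
    where j = part_to_object[k] is the object that component k belongs to.
    """
    return [
        alpha[k] * p_object[part_to_object[k]] * p_part[k]
        for k in range(len(p_part))
    ]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical pixel: object 0 is strongly detected, object 1 is not.
p_object = [0.9, 0.1]
# Component 0 belongs to object 0, component 1 belongs to object 1.
part_to_object = [0, 1]
p_part = [0.4, 0.5]
alpha = [1.0, 1.0]

weighted = weight_part_responses(p_object, p_part, part_to_object, alpha)
unweighted_choice = argmax(p_part)  # component of the weakly detected object
weighted_choice = argmax(weighted)  # component of the strongly detected object
```

In this toy case the unweighted rule selects a component of an object that is barely present, while the weighted rule prefers the component of the detected object, which is the intended effect of improving relevant-semantics recognition.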
In the condition of formulas (1) and (4), consider the case where the object semantic and the component semantic are mutually exclusive. Denoting all components of an object as a set, the mutual exclusion relationship is expressed as formula (6).
This improves the irrelevant-semantics exclusion capability of the component analysis model.
Using the weighted response value in place of the original one, and taking the mutual exclusion relationship between object semantics and component semantics into account, combining formulas (5) and (6) gives the optimized component analysis model, formula (7).
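The combination of weighting and mutual exclusion can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patent's formula: components that do not belong to the recognized object are treated as mutually exclusive with it and skipped, the remaining responses are weighted multiplicatively, and a pixel with no admissible component is labeled as background. The names `analyse_components`, `part_to_object` and `BACKGROUND` are hypothetical.

```python
from typing import Sequence, Tuple

BACKGROUND = -1  # hypothetical label meaning "no valid component for this pixel"


def analyse_components(
    p_object: Sequence[float],
    p_part: Sequence[float],
    part_to_object: Sequence[int],
    alpha: Sequence[float],
) -> Tuple[int, int]:
    """Return (object index, component index or BACKGROUND) for one pixel."""
    # Object semantic: class with the maximum object response.
    obj = max(range(len(p_object)), key=lambda j: p_object[j])

    best_part, best_score = BACKGROUND, 0.0
    for k, response in enumerate(p_part):
        # Assumed mutual exclusion: a component of another object is ignored.
        if part_to_object[k] != obj:
            continue
        # Assumed multiplicative weighting by the object response.
        score = alpha[k] * p_object[obj] * response
        if score > best_score:
            best_part, best_score = k, score
    return obj, best_part


# Object 1 is recognized; only component 2 belongs to it.
result_with_part = analyse_components(
    [0.2, 0.8], [0.6, 0.3, 0.1], [0, 0, 1], [1.0, 1.0, 1.0]
)
# Object 1 is recognized but no component belongs to it: background.
result_background = analyse_components([0.2, 0.8], [0.6, 0.4], [0, 0], [1.0, 1.0])
```

Note that in the first example component 0 has the highest raw response, yet it is excluded because it belongs to an object that was not recognized.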
FIG. 2a and FIG. 2b show the UperNet component analysis module structure and the optimized component analysis module structure of the invention, respectively. The optimized structure differs from the UperNet structure in that its inputs are changed to p_Object-j and p_Part-k, an additional input is added, and the weighting is performed before the component semantics are identified.
UperNet without a component analysis module was trained on the ADE20K dataset. The segmentation time T_seg was measured in a test hardware environment with a GeForce GTX 1080 Ti GPU, which has a single-precision floating-point throughput of 11.34 TFLOPS, 11 GB of memory, and a memory speed of 11 Gbps. During training the learning rate decays with the iteration count along a poly curve, with an initial learning rate of 0.02 and an exponent of 0.9; training runs for 50 rounds with 5000 iterations.
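The poly learning-rate decay mentioned above is commonly written as lr = base_lr * (1 - iteration / max_iterations) ** power. The Python sketch below uses the stated initial learning rate 0.02 and exponent 0.9; the function name and the total iteration count (50 * 5000, one possible reading of the training description) are assumptions for illustration.

```python
def poly_learning_rate(
    iteration: int,
    max_iterations: int,
    base_lr: float = 0.02,
    power: float = 0.9,
) -> float:
    """Poly decay: the rate falls from base_lr at iteration 0 to 0 at max_iterations."""
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1.0 - iteration / max_iterations) ** power


# Assumed total: 50 rounds of 5000 iterations each.
MAX_ITERATIONS = 50 * 5000

lr_start = poly_learning_rate(0, MAX_ITERATIONS)
lr_middle = poly_learning_rate(MAX_ITERATIONS // 2, MAX_ITERATIONS)
lr_end = poly_learning_rate(MAX_ITERATIONS, MAX_ITERATIONS)
```

With an exponent below 1, the rate decays slightly slower than linearly for most of training and drops quickly near the end.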
[Application example 1] All backbone networks use ResNet101
With ResNet101 as the backbone of all three networks, the results on the ADE20K validation set are used to analyze the complex semantic segmentation capability after the component analysis module is applied.
Table 1. ΔPA_Part of the component analysis modules on the ADE20K validation set.
The pixel accuracy of the component classifier on component semantics is PA_Part = 48.3%. With the component analysis module, ΔPA_Part1 = 55.5% - 48.3% = 7.2%; with the component analysis optimization module, ΔPA_Part2 = 56.9% - 48.3% = 8.6%. Hence ΔPA_Part2 - ΔPA_Part1 = 1.4%: the component analysis optimization module improves on the component analysis module by 1.4 percentage points, and the network's complex semantic segmentation capability across scenes is enhanced. The average segmentation time increases slightly: measured over the 2000 images of the ADE20K validation set for the part classifier, the component analysis module and the component analysis optimization module, the network computing overhead and the segmentation speed remain substantially unchanged.
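The accuracy gains quoted above follow from simple subtraction, reproduced here as a short Python check. The variable names are illustrative; the three percentages are the ones reported in the text, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Pixel accuracies reported for the ADE20K validation set (percent).
pa_part_classifier = Decimal("48.3")   # component classifier alone
pa_with_module = Decimal("55.5")       # with the component analysis module
pa_with_optimized = Decimal("56.9")    # with the component analysis optimization module

delta_pa_part1 = pa_with_module - pa_part_classifier      # gain of the original module
delta_pa_part2 = pa_with_optimized - pa_part_classifier   # gain of the optimized module
extra_gain = delta_pa_part2 - delta_pa_part1              # optimized vs. original module
```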
[Application example 2] The backbone network depth of the optimized component analysis model codec architecture network is adjusted to ResNet50
The specific indexes are shown in Table 2.
Between the configuration using the component analysis module and the configuration using the component analysis optimization module, the segmentation time is shortened by 108 ms.
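An average segmentation time of the kind compared here can be measured with a simple wall-clock loop. The Python sketch below is a generic illustration, not the patent's measurement procedure: `segment` stands for any callable that segments one image, and both it and the function name are hypothetical.

```python
import time
from typing import Any, Callable, Iterable


def mean_segmentation_time_ms(
    segment: Callable[[Any], Any], images: Iterable[Any]
) -> float:
    """Average wall-clock time per image, in milliseconds."""
    image_list = list(images)
    if not image_list:
        raise ValueError("at least one image is required")
    start = time.perf_counter()
    for image in image_list:
        segment(image)  # result is discarded; only the elapsed time matters
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / len(image_list)


# Stand-in "segmentation" on toy data, just to exercise the timer.
mean_ms = mean_segmentation_time_ms(lambda image: sum(image), [[1, 2], [3, 4]])
```

When the model runs on a GPU, the device work is typically asynchronous, so a real measurement would need to wait for the device to finish before reading the clock.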
This embodiment studies complex-background semantic segmentation with codec architecture networks, focusing on the response values of object and component semantics and on their mutual exclusion relationship. Under different imaging conditions, the detection algorithm is required to recognize only the relevant components, ignore other components, and treat irrelevant semantics as background. When the network segmentation time T_seg is kept substantially unchanged, applying the component analysis optimization module improves the pixel identification accuracy. When the requirement on T_seg is strict, the backbone network depth d_main can be reduced to obtain a network structure whose pixel identification accuracy remains substantially unchanged while T_seg is shortened.
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (5)
1. An optimized component analysis model codec network and fast semantic segmentation method, characterized in that the method comprises constructing and training an optimized codec architecture network, optimizing a component analysis model, and realizing fast complex-background semantic segmentation; the method specifically comprises the following steps:
A. constructing a component analysis model codec architecture network, and training the network to recognize, in an image, the object-level semantics with object response values p_Object-j and the component-level semantics with component response values p_Part-k;
B. weighting the component response value p_Part-k by the object response value p_Object-j, thereby improving the relevant-semantics recognition capability of the component analysis model;
C. exploiting the mutually exclusive relationship between object-level semantics and component-level semantics, thereby improving the irrelevant-semantics exclusion capability of the component analysis model;
D. constructing the optimized component analysis model codec architecture network to realize complex-background semantic segmentation;
E. adjusting the depth of the backbone network and retraining the optimized codec architecture network to realize fast complex-background semantic segmentation.
2. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, wherein the component analysis model codec architecture network in step A comprises an object semantic segmentation network, a component semantic segmentation network and a component analysis model;
the object semantic segmentation network identifies and outputs the object-level semantics in the image together with the object response values p_Object-j, and is trained on object-level labels;
the component semantic segmentation network identifies and outputs the component-level semantics in the image together with the component response values p_Part-k, and is trained on part-level labels.
3. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, wherein p_Part-k is weighted by p_Object-j and the weighted response value is used in place of the original one, after which the component analysis model is obtained as follows:
let the object-level classifier in the UperNet network recognize N_Object object semantics; the classifier regresses an object probability vector, and the semantic with the maximum probability is identified as the object semantic, as expressed in formula (2);
similarly, let the component-level classifier recognize N_Part component semantics; the classifier regresses a component probability vector, and the semantic with the maximum probability is identified as the component semantic, as expressed in formula (3);
combining formulas (1), (2) and (3) gives formula (4);
in formula (4), p_Part-k is weighted by p_Object-j (with coefficient α_Part-k), and the weighted response value is used in place of the original one, giving formula (5);
thereby the relevant-semantics recognition capability of the component analysis model is improved.
4. The optimized component analysis model codec network and fast semantic segmentation method according to claim 1, wherein in step C, improving the irrelevant-semantics exclusion capability of the component analysis model through the mutually exclusive relationship between object-level semantics and component-level semantics specifically comprises:
in the condition of formulas (1) and (4), considering the case where the object semantic and the component semantic are mutually exclusive, and denoting all components of an object as a set, the mutual exclusion relationship is expressed as formula (6);
thereby the irrelevant-semantics exclusion capability of the component analysis model is improved.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911065888.6A CN110827284B (en) | 2019-11-04 | 2019-11-04 | Optimizing component analysis model coder-decoder network and quick semantic segmentation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911065888.6A CN110827284B (en) | 2019-11-04 | 2019-11-04 | Optimizing component analysis model coder-decoder network and quick semantic segmentation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110827284A (en) | 2020-02-21 |
CN110827284B (en) | 2023-10-10 |
Family
ID=69552588
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911065888.6A Active CN110827284B (en) | 2019-11-04 | 2019-11-04 | Optimizing component analysis model coder-decoder network and quick semantic segmentation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110827284B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170262735A1 (en) * | 2016-03-11 | 2017-09-14 | Kabushiki Kaisha Toshiba | Training constrained deconvolutional networks for road scene semantic segmentation |
CN110276402A (en) * | 2019-06-25 | 2019-09-24 | 北京工业大学 | A kind of salt body recognition methods based on the enhancing of deep learning semanteme boundary |
US10430946B1 (en) * | 2019-03-14 | 2019-10-01 | Inception Institute of Artificial Intelligence, Ltd. | Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques |
Non-Patent Citations (1)
Title |
---|
Zhang Xiaoming; Yin Hongfeng: "Scene classification based on convolutional neural networks and semantic information", Software (软件), no. 01, pages 37-42 * |
Also Published As
Publication number | Publication date |
---|---|
CN110827284B (en) | 2023-10-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021196873A1 (en) | License plate character recognition method and apparatus, electronic device, and storage medium | |
CN112465008B (en) | Voice and visual relevance enhancement method based on self-supervision course learning | |
CN110570458A (en) | Target tracking method based on internal cutting and multi-layer characteristic information fusion | |
US11244157B2 (en) | Image detection method, apparatus, device and storage medium | |
CN109948542A (en) | Gesture identification method, device, electronic equipment and storage medium | |
CN111968123B (en) | Semi-supervised video target segmentation method | |
CN111178161B (en) | Vehicle tracking method and system based on FCOS | |
CN111507215B (en) | Video target segmentation method based on space-time convolution cyclic neural network and cavity convolution | |
CN107564007B (en) | Scene segmentation correction method and system fusing global information | |
CN111144376A (en) | Video target detection feature extraction method | |
US20140098988A1 (en) | Fitting Contours to Features | |
CN109087337B (en) | Long-time target tracking method and system based on hierarchical convolution characteristics | |
CN112308128A (en) | Image matching method based on attention mechanism neural network | |
CN113537119B (en) | Transmission line connecting part detection method based on improved Yolov4-tiny | |
CN107564013B (en) | Scene segmentation correction method and system fusing local information | |
CN116912924B (en) | Target image recognition method and device | |
CN110827284A (en) | Codec network for optimizing component analysis model and rapid semantic segmentation method | |
WO2023206964A1 (en) | Pedestrian re-identification method, system and device, and computer-readable storage medium | |
CN113538509B (en) | Visual tracking method and device based on adaptive correlation filtering feature fusion learning | |
JP2023029236A (en) | Method for training object detection model and object detection method | |
CN112381056B (en) | Cross-domain pedestrian re-identification method and system fusing multiple source domains | |
CN114596609A (en) | Audio-visual counterfeit detection method and device | |
Zhang et al. | Research on lane identification based on deep learning | |
CN115147434A (en) | Image processing method, device, terminal equipment and computer readable storage medium | |
Wang et al. | INSPIRATION: A reinforcement learning-based human visual perception-driven image enhancement paradigm for underwater scenes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||