CN116188778A - Double-sided semantic segmentation method based on super resolution - Google Patents

Double-sided semantic segmentation method based on super resolution

Info

Publication number
CN116188778A
CN116188778A (application CN202310159918.XA)
Authority
CN
China
Prior art keywords
multiplied
resolution
feature map
super
semantic segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310159918.XA
Other languages
Chinese (zh)
Inventor
刘顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202310159918.XA priority Critical patent/CN116188778A/en
Publication of CN116188778A publication Critical patent/CN116188778A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a bilateral semantic segmentation method based on super resolution. Built on super-resolution and attention-mechanism techniques, it adopts a dual-branch semantic segmentation design, offers high flexibility, and improves image segmentation accuracy at low resolution without adding extra computation. The backbone network of the main branch can be swapped for a more advanced semantic segmentation method, while image channels and pixels are associated and fused to obtain a high-accuracy segmentation result at high resolution; the segmentation result is then sent to a fusion module to guide the segmentation learning of the main branch.

Description

Double-sided semantic segmentation method based on super resolution
Technical Field
The invention relates to the field of image semantic segmentation, in particular to a bilateral semantic segmentation method based on super resolution.
Background
Over the past decade, deep-learning-based machine learning techniques have attracted widespread public attention. For example, the autonomous driving technology of new-energy vehicles, developed rapidly in recent years, has gradually reached the hands of the general public. The foundation that makes this technology possible is image segmentation, which gives a machine the ability to identify roads, pedestrians, traffic lights, and ground markings. Throughout the deep learning process, a suitable algorithm and a sufficient number of raw road-surface pictures for machine learning are required, and image semantic segmentation is used in this process.
At present, image semantic segmentation is mainly applied to fields such as land-cover segmentation, autonomous driving, face segmentation, clothing classification, and precision agriculture. Each field still has unsolved problems; for example, land-cover segmentation for monitoring deforestation, urban growth, and urban planning requires large-scale publicly available datasets. In autonomous driving, the system must perceive, plan, and execute the corresponding commands in a constantly changing environment, where safety assurance is a link that cannot be neglected and the task must be performed with the highest accuracy. In this task, semantic segmentation provides free-space information on roads and detects ground markings and traffic signs. However, the trade-off between real-time performance and segmentation accuracy remains a challenge.
In recent years, face recognition has been applied quite widely, and it often needs to run on smaller devices. The problem that follows is the need for fast, high-precision segmentation of pictures with relatively few pixels. In current semantic segmentation techniques, the best accuracy comes with high-resolution input, which means a sharp increase in computation that is difficult or even impossible on mobile devices or low-end terminals, while reducing the resolution of the image input to the semantic segmentation model causes a sharp drop in segmentation accuracy. According to experiments with the popular semantic segmentation model DeepLabv3+, accuracy drops from 70% with 512×1024 input to 63.2% with 448×896 input, and to only 56.5% with 256×512 input. The drop in accuracy is quite obvious and clearly fails to meet practical requirements in scenarios that demand accurate recognition, so a segmentation method that computes quickly at resolutions as low as possible while improving accuracy is necessary.
Disclosure of Invention
This section is intended to outline some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section, in the abstract, and in the title of the application to avoid obscuring their purpose; these simplifications and omissions should not be used to limit the scope of the invention.
The present invention has been made in view of the above problems in the prior art. The invention therefore provides a bilateral semantic segmentation method based on super resolution, which addresses the large amount of extra computation and low computation speed of image semantic segmentation in practice, as well as the fact that, when high-resolution input is pursued, some mobile devices and low-end terminals cannot perform the computation, so image semantic segmentation accuracy drops greatly.
In order to solve the technical problems, the invention provides the following technical scheme:
the invention provides a bilateral semantic segmentation method based on super resolution, which comprises the following steps:
inputting the acquired image into the segmentation network of the main branch to obtain a corresponding feature map;
feeding the feature map into the secondary branch to obtain a new feature map;
computing over the image channels and the pixels respectively via two sub-branches within the secondary branch;
associating and fusing the image channels and the pixels to obtain a high-accuracy segmentation result at high resolution;
and feeding the segmentation result into a fusion module, and guiding the main branch to perform segmentation learning in combination with the fused result.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the image is input into the segmentation network of the main branch to obtain a corresponding feature map, the steps comprising:
through the backbone network, the input image passes through a convolution layer and then a pooling layer, which reduces the parameters in the parameter matrix and the number of parameters in subsequent convolution layers while alleviating model overfitting; the image is then put through the same convolution-and-pooling operation 3 more times to obtain the corresponding feature map.
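The repeated convolution-and-pooling stages can be sketched roughly as follows (the convolution layers themselves are omitted and all shapes are illustrative, not taken from the patent): each non-overlapping 2×2 max-pooling halves the spatial size, so four stages reduce a 224×224 input to a 14×14 feature map.

```python
import numpy as np

def max_pool2x2(x):
    # x: (H, W, C); non-overlapping 2x2 max pooling halves H and W,
    # reducing the number of activations fed to subsequent layers
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

x = np.random.rand(224, 224, 3)   # stand-in for the input image
for _ in range(4):                # conv layer + pooling, repeated 4 times
    x = max_pool2x2(x)            # (convolutions omitted in this sketch)
```

In a real backbone the channel count would grow at each stage through the omitted convolutions; only the spatial reduction is shown here.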
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the feature map is fed into the secondary branch and reconstructed to obtain a new feature map, the steps comprising: reconstructing a high-resolution picture in the secondary branch, and inputting the high-resolution picture into the sub-branches to obtain a new feature map.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the constructing of a high-resolution picture in the secondary branch comprises:
the secondary branch adopts a sub-pixel-convolution-based approach and, from the feature map produced by the backbone network, effectively reconstructs the fine-grained structural information of the high-resolution input; this is the single-image super-resolution module.
as a preferable scheme of the super-resolution-based bilateral semantic segmentation method, the invention comprises the following steps: by two sub-branches of the slave branch, comprising: the sub-branches are divided into an inter-channel attention module and an inter-pixel attention module.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the inter-channel attention module computes over the image channels, the steps comprising:
performing a reshape operation on the feature map A of size H×W×C to obtain a feature map B of size C×N, where N=H×W, then multiplying B by its transpose and performing a softmax operation;
denoting the C×C feature map obtained after the softmax operation as X, multiplying the transpose of X by B, reshaping the result back to H×W×C, and multiplying by a coefficient β to obtain a feature map denoted D;
adding the obtained feature map D to the feature map A to obtain the final result;
the coefficient β is initialized to 0, and its optimal value is obtained through deep-learning training.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the inter-pixel attention module computes over the pixels, the steps comprising:
passing the feature map A of size H×W×C through a convolution layer to obtain a new feature map B of size C×H×W; then performing a reshape operation so that its size becomes C×N, where N=H×W; multiplying the transpose of the reshaped map by the map itself and applying a softmax operation yields a new feature map S of size N×N; each row of S sums to 1, and S_ij can be understood as the weight of the pixel at position j with respect to the pixel at position i, i.e., the weights of all pixels j with respect to a fixed pixel i sum to 1;
transposing the obtained feature map S and multiplying it by the reshaped C×N feature map B to obtain a feature map of size C×N, which is then reshaped to size C×H×W;
multiplying the obtained C×H×W feature map by a coefficient α and adding it to the feature map A to obtain the final result;
the coefficient α is initialized to 0, and its optimal value is obtained through deep-learning training.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the associating and fusing of the image channels and the pixels to obtain a high-accuracy segmentation result at high resolution comprises: performing element-wise summation of the two convolution feature maps generated by the inter-pixel attention module and the inter-channel attention module, and feeding the resulting new feature map into a convolution layer to obtain the final segmentation result.
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the element-wise summation of the feature maps comprises: passing each of the two convolution feature maps through its own convolution layer, then summing the outputs of the two convolution layers.
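The fusion step above can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: each 1×1 convolution is modelled as a per-pixel matrix multiplication over the channel dimension, and all weights (Wp, Wq, Wout) and the class count are random stand-ins.

```python
import numpy as np

def conv1x1(x, W):
    # x: (H, W_, C_in), W: (C_in, C_out); a 1x1 convolution is a
    # per-pixel linear map over the channel dimension
    return x @ W

H, W_, C = 4, 4, 8
P = np.random.rand(H, W_, C)    # output of the inter-pixel attention module
Q = np.random.rand(H, W_, C)    # output of the inter-channel attention module
Wp, Wq = np.random.rand(C, C), np.random.rand(C, C)
Wout = np.random.rand(C, 19)    # final layer mapping to e.g. 19 classes

fused = conv1x1(P, Wp) + conv1x1(Q, Wq)   # element-wise summation
logits = conv1x1(fused, Wout)             # final segmentation result
```

Passing each branch through its own convolution before summing lets the network learn how strongly each attention path contributes at every channel.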
As a preferable scheme of the super-resolution-based bilateral semantic segmentation method of the invention, the segmentation result is fed into the fusion module and the main branch is guided to perform segmentation learning in combination with the fused result, the steps comprising:
performing feature fusion on the feature map obtained in claim 2 and the new feature map generated in claim 8, so that the low-resolution picture obtains additional picture-structure information;
searching for a locally optimal solution through the feature-fusion loss function, matching more suitable hyperparameters, guiding the semantic segmentation of the main branch and the segmentation learning of the module, and finally achieving the optimal semantic segmentation result;
the loss function for feature fusion is expressed as follows:

L = L_ce + w_1 · L_mse

L_ce = -(1/N) · Σ_{i=1}^{N} log(p_i)

L_mse = (1/N) · Σ_{i=1}^{N} (S(X_i) - X_i^HR)²

wherein y_i represents the classification probability; S(X_i) represents the super-resolution output picture obtained by the super-resolution module; X_i^HR represents the corresponding high-resolution reference picture; N represents the total number of pixel points of the current picture; p_i represents the probability that pixel point i, in the output of the main-branch segmentation network, is judged to belong to the target class y; Y_i represents the true classification of the current pixel point; w_1 is a hyperparameter that can be adjusted during the actual segmentation-learning process and is generally set to 0.1; L is the total loss function, composed of multiple terms; L_ce is the loss function commonly used for semantic segmentation, i.e., cross-entropy; L_mse is the mean-square error.
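One plausible reading of this loss in code, reconstructed from the symbol definitions above: the exact form of each term is an assumption (the original equation images are not recoverable), with the cross-entropy averaged over pixels and the mean-square error taken between the super-resolved picture and a high-resolution reference.

```python
import numpy as np

def total_loss(p, sr, hr, w1=0.1):
    # p:  (N,) probability assigned by the main branch to each pixel's true class
    # sr: super-resolved output picture S(X); hr: high-resolution reference
    # w1: hyperparameter weighting the SR term (0.1 in the text)
    l_ce = -np.mean(np.log(p))        # pixel-wise cross-entropy
    l_mse = np.mean((sr - hr) ** 2)   # mean-square error of the SR branch
    return l_ce + w1 * l_mse

p = np.full(16, 0.5)                  # every pixel's true class gets prob 0.5
sr = np.zeros((8, 8, 3))
hr = np.ones((8, 8, 3))
loss = total_loss(p, sr, hr)          # log(2) + 0.1 * 1
```

The variable names (`p`, `sr`, `hr`) are illustrative placeholders for the quantities defined in the text.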
Compared with the prior art, the invention has the following beneficial effects: computation speed is notably improved, and the precision of semantic segmentation is improved without additional computation; feature reconstruction of the feature map generated at low resolution via super resolution yields fine-grained information at high resolution, and the high-resolution image has clearer class boundaries; meanwhile, the serial attention-mechanism modules also improve segmentation accuracy; more importantly, the super-resolution module and attention-mechanism modules of the secondary branch can be removed at the actual inference stage, so the higher amount of computation is needed only during training; under the same basic network architecture, the architecture with the added secondary branch improves segmentation precision by 3-5% on low-resolution input.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and other drawings can be obtained from them by a person skilled in the art without inventive effort. Wherein:
FIG. 1 is a flow chart of a super-resolution based bilateral semantic segmentation method according to one embodiment of the present invention;
FIG. 2 is a schematic diagram of an inter-channel attention module in a super-resolution based bilateral semantic segmentation method according to one embodiment of the present invention;
FIG. 3 is a schematic diagram of an inter-pixel attention module in a super-resolution based bilateral semantic segmentation method according to one embodiment of the present invention;
FIG. 4 is a graph of segmentation accuracy results of a super-resolution based bilateral semantic segmentation method according to one embodiment of the present invention;
fig. 5 is a schematic diagram of a super-resolution module of a super-resolution-based bilateral semantic segmentation method according to an embodiment of the present invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
While the embodiments of the present invention have been illustrated and described in detail in the drawings, the cross-sectional view of the device structure is not to scale in the general sense for ease of illustration, and the drawings are merely exemplary and should not be construed as limiting the scope of the invention. In addition, the three-dimensional dimensions of length, width and depth should be included in actual fabrication.
Also in the description of the present invention, it should be noted that the orientation or positional relationship indicated by the terms "upper, lower, inner and outer", etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first, second, or third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected, and coupled" should be construed broadly in this disclosure unless otherwise specifically indicated and defined, such as: can be fixed connection, detachable connection or integral connection; it may also be a mechanical connection, an electrical connection, or a direct connection, or may be indirectly connected through an intermediate medium, or may be a communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1
Referring to fig. 1, a first embodiment of the present invention provides a method for bilateral semantic segmentation based on super resolution, including:
s1, inputting an image into a segmentation grid of a main branch to obtain a corresponding feature map; taking VGG16 as an example, removing the last full connection layer to obtain a feature map with the specification of 14 multiplied by 512;
s2, dividing the obtained feature image into two parts, and sending the first part into a Decoder module corresponding to the VGG16, wherein the semantic segmentation result is obtained from the bilinear difference value to the size of the input image; the second part is sent to a super resolution module;
s3, calculating an image channel and pixels respectively through two sub-branches under super-resolution, an inter-channel attention module and an inter-pixel attention module;
s4, the image channels and the pixels are associated and fused to obtain a high-accuracy segmentation result under high resolution;
s5, sending the segmentation result into a fusion module, guiding the semantic segmentation of the main branch and the segmentation learning of the guiding module by combining the fusion result, and finally achieving the optimal semantic segmentation result;
by the semantic segmentation method of the double branches, the computing speed of the system can be effectively improved on the premise of not improving the computing amount of the system.
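The S1-S5 flow can be sketched end-to-end with placeholder components. Every function below is an illustrative stub (not the actual networks): the "backbone" is a crude 16× downsampler, the "decoder" a nearest-neighbour upsampler, and the secondary branch an identity. The point of interest is structural: the secondary branch participates only during training and can be deleted at inference.

```python
import numpy as np

def backbone(img):
    # stub for the main-branch feature extractor (e.g. a VGG16-style network):
    # crude 16x spatial downsampling stands in for the conv/pool stages
    return img[::16, ::16]

def decoder(feat, out_hw):
    # stub decoder: nearest-neighbour upsampling back to the input size
    # (the text uses bilinear interpolation)
    reps = (out_hw[0] // feat.shape[0], out_hw[1] // feat.shape[1])
    return np.kron(feat, np.ones(reps))

def secondary_branch(feat):
    # stub for super-resolution + attention; only consulted during training
    return feat

def segment(img, training=False):
    feat = backbone(img)                      # S1
    pred = decoder(feat, img.shape)           # S2, first part
    if training:                              # S2-S5, second part:
        guidance = secondary_branch(feat)     # guides learning, then is removed
    return pred

img = np.random.rand(64, 64)
pred = segment(img)   # inference: the secondary branch is never executed
```

Because the `training` flag gates the secondary branch entirely, its cost disappears at inference time, which is the design choice the description emphasises.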
Example 2
Referring to fig. 2 and 3, a second embodiment of the present invention provides a super-resolution-based bilateral semantic segmentation method, which includes: a block diagram of an inter-channel attention module and a block diagram of an inter-pixel attention module;
the inter-channel attention module of fig. 2 is refined as follows:
performing a reshape operation on the feature map A of size H×W×C to obtain a feature map B of size C×N, where N=H×W, then multiplying B by its transpose and performing a softmax operation;
denoting the C×C feature map obtained after the softmax operation as X, multiplying the transpose of X by B, reshaping the result back to H×W×C, and multiplying by a coefficient β to obtain a feature map denoted D;
adding the obtained feature map D to the feature map A to obtain the final result;
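The inter-channel steps above can be sketched in NumPy as follows (a minimal illustration; the shapes and names are assumptions, and with β at its initial value 0 the module reduces to the identity):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(A, beta=0.0):
    # A: (H, W, C) feature map; B is A reshaped to C x N with N = H*W
    H, W, C = A.shape
    B = A.reshape(H * W, C).T           # C x N
    X = softmax(B @ B.T, axis=-1)       # C x C channel-affinity map
    D = (X.T @ B).T.reshape(H, W, C)    # X^T B, reshaped back to H x W x C
    return beta * D + A                 # beta is learned, initialised to 0

A = np.random.rand(4, 4, 8)
out = channel_attention(A)              # beta = 0: output equals A
```

Initialising β to 0 means the module starts as a residual identity and only gradually mixes in the channel-attention term as training adjusts β.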
the inter-pixel attention module of fig. 3 is refined as follows:
passing the feature map A of size H×W×C through a convolution layer to obtain a new feature map B of size C×H×W; then performing a reshape operation so that its size becomes C×N, where N=H×W; multiplying the transpose of the reshaped map by the map itself and applying a softmax operation yields a new feature map S of size N×N; each row of S sums to 1, and S_ij can be understood as the weight of the pixel at position j with respect to the pixel at position i, i.e., the weights of all pixels j with respect to a fixed pixel i sum to 1;
transposing the obtained feature map S and multiplying it by the reshaped C×N feature map B to obtain a feature map of size C×N, which is then reshaped to size C×H×W;
multiplying the obtained C×H×W feature map by a coefficient α and adding it to the feature map A to obtain the final result;
wherein the coefficients β and α are initialized to 0, and their optimal values are obtained through deep-learning training;
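A companion sketch of the inter-pixel module (again a minimal NumPy illustration under assumed shapes; the convolution layer producing B is replaced by the identity for brevity, and with α = 0 the module is the identity):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pixel_attention(A, alpha=0.0):
    # A: (H, W, C); the convolution layer producing B is omitted here,
    # so B is simply A reshaped to C x N with N = H*W
    H, W, C = A.shape
    B = A.reshape(H * W, C).T           # C x N
    S = softmax(B.T @ B, axis=-1)       # N x N; each row sums to 1,
                                        # S[i, j] weights pixel j w.r.t. pixel i
    E = (B @ S.T).T.reshape(H, W, C)    # B S^T, reshaped back to H x W x C
    return alpha * E + A, S             # alpha is learned, initialised to 0

A = np.random.rand(4, 4, 8)
out, S = pixel_attention(A)
```

The row-stochastic property of S (each row summing to 1) is exactly the normalisation claimed in the text.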
according to the invention, after a series of operations of the inter-channel attention module and the inter-pixel attention module on the feature map, the precision of image segmentation can be effectively improved.
Example 3
Referring to fig. 4, a third embodiment of the present invention provides a super-resolution-based bilateral semantic segmentation method, which includes:
2000 pictures from Cityscapes are randomly selected as the dataset, the resolution is adjusted to 256×512, 320×640, 384×768, 448×896, and 512×1024 respectively, and each is input to the semantic segmentation method adopted by the invention, for example with a VGG16 backbone, to obtain the MIoU, i.e., the mean intersection-over-union.
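MIoU can be computed as follows; this is the standard definition of mean intersection-over-union, independent of the patent's particular networks:

```python
import numpy as np

def miou(pred, gt, num_classes):
    # mean intersection-over-union over classes present in pred or gt;
    # pred and gt are integer class maps of the same shape
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:               # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0], [1, 1]])
gt   = np.array([[0, 1], [1, 1]])
score = miou(pred, gt, num_classes=2)   # (1/2 + 2/3) / 2
```

Running the same metric at each of the listed resolutions yields the accuracy-versus-resolution curve plotted in fig. 4.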
Example 4
Referring to fig. 5, a fourth embodiment of the present invention provides a bilateral semantic segmentation method based on super resolution, including:
taking VGG16 as an example, a 14×14×512 feature map is obtained; in order to achieve super resolution to 448×448×3, a 112×112×12 feature map is obtained after two convolution layers; the 12 channels of each pixel point are rearranged into a 2×2×3 patch, and after the remaining pixels undergo the same operation, the generated patches are spliced together to obtain the reconstructed super-resolution image of 448×448×3. The formula is expressed as:
I_SR = f_L(I_LR)

wherein f_L is the combination of a convolution layer, which convolves the H×W×C image into H×W×r²C, and a sub-pixel convolution layer; I_LR is the low-resolution picture and I_SR is the super-resolution picture;
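The sub-pixel (pixel-shuffle) rearrangement inside f_L can be sketched as follows. The shapes here are generic illustrations; note that a single r=2 rearrangement doubles the spatial size, turning each pixel's r²C channels into an r×r×C patch.

```python
import numpy as np

def pixel_shuffle(x, r):
    # x: (H, W, r*r*C) -> (r*H, r*W, C): each pixel's r*r*C channels are
    # rearranged into an r x r x C patch, and the patches are spliced together
    H, W, rrC = x.shape
    C = rrC // (r * r)
    x = x.reshape(H, W, r, r, C)
    x = x.transpose(0, 2, 1, 3, 4)      # interleave: (H, r, W, r, C)
    return x.reshape(H * r, W * r, C)

x = np.random.rand(112, 112, 12)        # feature map after the conv layers
y = pixel_shuffle(x, r=2)               # 12 channels -> 2 x 2 x 3 patches
```

Because the rearrangement is a pure memory reshuffle, all the upsampling capacity is learned by the preceding convolutions, which is what makes sub-pixel convolution cheap.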
according to the invention, the characteristic reconstruction is carried out on the characteristic map generated under the low resolution by utilizing the mode based on super resolution, so that fine granularity information under the high resolution is obtained, and the image can be clearer.
It should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it; although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently substituted without departing from its spirit and scope, and such modifications are intended to be covered by the claims of the present invention.

Claims (10)

1. A bilateral semantic segmentation method based on super resolution, characterized by comprising the following steps:
inputting the acquired image into the segmentation network of the main branch to obtain a corresponding feature map;
feeding the feature map into the secondary branch to obtain a new feature map;
computing over the image channels and the pixels respectively via two sub-branches within the secondary branch;
associating and fusing the image channels and the pixels to obtain a high-accuracy segmentation result at high resolution;
and feeding the segmentation result into a fusion module, and guiding the main branch to perform segmentation learning in combination with the fused result.
2. The super-resolution-based bilateral semantic segmentation method as in claim 1, wherein the image is input into the segmentation network of the main branch to obtain a corresponding feature map, the steps comprising:
through the backbone network, the input image passes through a convolution layer and then a pooling layer, which reduces the parameters in the parameter matrix and the number of parameters in subsequent convolution layers while alleviating model overfitting; the image is then put through the same convolution-and-pooling operation 3 more times to obtain the corresponding feature map.
3. The super-resolution-based bilateral semantic segmentation method as in claim 1 or 2, wherein the feature map is fed into the secondary branch and reconstructed to obtain a new feature map, comprising:
reconstructing a high-resolution picture in the secondary branch, and inputting the high-resolution picture into the sub-branches to obtain a new feature map.
4. The super-resolution-based bilateral semantic segmentation method as in claim 3, wherein the constructing of a high-resolution picture in the secondary branch comprises:
the secondary branch adopts a sub-pixel-convolution-based approach and, from the feature map produced by the backbone network, effectively reconstructs the fine-grained structural information of the high-resolution input; this is the single-image super-resolution module.
5. The super-resolution-based bilateral semantic segmentation method as in claim 4, wherein the sub-branches are divided into an inter-channel attention module and an inter-pixel attention module.
6. The super-resolution-based bilateral semantic segmentation method as in claim 4 or 5, wherein the inter-channel attention module calculates the image channels, the step comprising:
the feature map A with specification H×W×C is reshaped into a feature map B with specification C×N, where N=H×W; B is multiplied by its transpose, and a softmax operation is applied to the result;
the feature map with specification C×C obtained after the softmax operation is denoted X; X is transposed and multiplied by B, the product is reshaped to H×W×C and then multiplied by a coefficient β to obtain the feature map denoted D;
the obtained feature map D is added to the feature map A to obtain the final result;
the coefficient β is initialized to 0, and its optimal value is learned during training.
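The arithmetic of the inter-channel attention step above can be sketched in NumPy; the names `channel_attention` and `softmax` are illustrative, and a random array stands in for the network's feature map:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(A, beta=0.0):
    """A: feature map of shape (H, W, C); returns beta*D + A."""
    H, W, C = A.shape
    B = A.reshape(-1, C).T            # C×N, with N = H*W
    X = softmax(B @ B.T, axis=-1)     # C×C channel-affinity map
    D = (X.T @ B).T.reshape(H, W, C)  # back to H×W×C
    return beta * D + A

A = np.random.rand(4, 4, 8)
out = channel_attention(A, beta=0.0)  # beta starts at 0: output equals A
```

Because β is initialized to 0, the module starts as an identity mapping and the attention term is blended in gradually as β is learned.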
7. The super-resolution-based bilateral semantic segmentation method as in claim 6, wherein the inter-pixel attention module calculates the pixels, comprising:
the feature map A with specification H×W×C is passed through a convolution layer to obtain a new feature map B with specification C×H×W; B is then reshaped so that its size becomes C×N, where N=H×W; B is multiplied by its transpose and, after a softmax operation, a new feature map S with specification N×N is obtained; each row of S sums to 1, and S_ij can be understood as the weight of the pixel at position j with respect to the pixel at position i, i.e. for a fixed pixel i the weights of all pixels j sum to 1;
the obtained feature map S is transposed and multiplied by the feature map B with specification C×N to obtain a feature map with specification C×N, which is reshaped to obtain a feature map with specification C×H×W;
the feature map with specification C×H×W obtained in the previous step is multiplied by a coefficient α and then added to the feature map A to obtain the final result;
the coefficient α is initialized to 0, and its optimal value is learned during training.
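The inter-pixel attention arithmetic can likewise be sketched in NumPy; as a simplifying assumption the convolution producing B is omitted and B is taken as a plain reshape of A, and the names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pixel_attention(A, alpha=0.0):
    """A: feature map of shape (H, W, C); returns alpha*out + A."""
    H, W, C = A.shape
    B = A.reshape(-1, C).T              # C×N, with N = H*W
    S = softmax(B.T @ B, axis=-1)       # N×N; each row sums to 1
    out = (B @ S.T).T.reshape(H, W, C)  # reweighted pixels, back to H×W×C
    return alpha * out + A

A = np.random.rand(4, 4, 8)
out = pixel_attention(A, alpha=0.0)  # alpha starts at 0: output equals A
```

The row-stochastic S plays the role described in the claim: row i holds the weights of every pixel j contributing to pixel i.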
8. The super-resolution-based bilateral semantic segmentation method as in claim 7, wherein associating and fusing the image channels and pixels to obtain a high-accuracy segmentation result at high resolution comprises:
performing element-wise summation on the two convolution feature maps generated by the inter-pixel attention module and the inter-channel attention module, and then sending the resulting new feature map into a convolution layer to obtain the final segmentation result.
9. The super-resolution-based bilateral semantic segmentation method as in claim 7 or 8, wherein the element-wise summation of the feature maps comprises:
passing each of the two convolution feature maps through its own convolution layer, and then summing the two outputs element-wise.
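A sketch of this fusion step, modelling each per-branch convolution as a 1×1 convolution, i.e. a per-pixel channel-mixing matmul; the weight matrices `Wp` and `Wq` are illustrative placeholders for the learned layers:

```python
import numpy as np

def fuse(P, Q, Wp, Wq):
    """Pass each (H, W, C) feature map through its own 1×1 conv
    (a channel-mixing matmul), then sum element-wise."""
    return P @ Wp + Q @ Wq

H, W, C = 4, 4, 8
P = np.random.rand(H, W, C)   # inter-pixel attention output
Q = np.random.rand(H, W, C)   # inter-channel attention output
I = np.eye(C)
fused = fuse(P, Q, I, I)      # with identity convs this is just P + Q
```

The per-branch convolutions let the network rescale each branch's channels before the sum, rather than summing the raw attention outputs directly.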
10. The super-resolution-based bilateral semantic segmentation method as in claim 9, wherein sending the segmentation result into the fusion module and using the fusion result to guide the main branch in segmentation learning comprises:
performing feature fusion on the feature map obtained in claim 2 and the new feature map generated in claim 8, so that the low-resolution picture obtains additional picture structure information;
searching for a local optimum through the feature-fusion loss function, matching more suitable hyperparameters to guide the semantic segmentation of the main branch and the segmentation learning of the module, and finally achieving the optimal semantic segmentation result.
CN202310159918.XA 2023-02-23 2023-02-23 Double-sided semantic segmentation method based on super resolution Pending CN116188778A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310159918.XA CN116188778A (en) 2023-02-23 2023-02-23 Double-sided semantic segmentation method based on super resolution


Publications (1)

Publication Number Publication Date
CN116188778A true CN116188778A (en) 2023-05-30

Family

ID=86434242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310159918.XA Pending CN116188778A (en) 2023-02-23 2023-02-23 Double-sided semantic segmentation method based on super resolution

Country Status (1)

Country Link
CN (1) CN116188778A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117409208A (en) * 2023-12-14 2024-01-16 武汉纺织大学 Real-time clothing image semantic segmentation method and system
CN117409208B (en) * 2023-12-14 2024-03-08 武汉纺织大学 Real-time clothing image semantic segmentation method and system

Similar Documents

Publication Publication Date Title
CN108537192B (en) Remote sensing image earth surface coverage classification method based on full convolution network
CN110232394A (en) A kind of multi-scale image semantic segmentation method
CN109101975A (en) Image, semantic dividing method based on full convolutional neural networks
CN109509149A (en) A kind of super resolution ratio reconstruction method based on binary channels convolutional network Fusion Features
Zhao et al. Pyramid global context network for image dehazing
CN113516124B (en) Electric energy meter electricity consumption identification algorithm based on computer vision technology
CN110349087B (en) RGB-D image high-quality grid generation method based on adaptive convolution
CN109191511A (en) A kind of binocular solid matching process based on convolutional neural networks
CN116188778A (en) Double-sided semantic segmentation method based on super resolution
CN114549555A (en) Human ear image planning and division method based on semantic division network
CN116469100A (en) Dual-band image semantic segmentation method based on Transformer
CN116486074A (en) Medical image segmentation method based on local and global context information coding
CN115830575A (en) Transformer and cross-dimension attention-based traffic sign detection method
CN116343053A (en) Automatic solid waste extraction method based on fusion of optical remote sensing image and SAR remote sensing image
CN117576402B (en) Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method
CN113610707A (en) Video super-resolution method based on time attention and cyclic feedback network
CN103226818B (en) Based on the single-frame image super-resolution reconstruction method of stream shape canonical sparse support regression
CN115578260B (en) Attention method and system for directional decoupling of image super-resolution
CN111080533A (en) Digital zooming method based on self-supervision residual error perception network
CN109740551A (en) A kind of night Lane detection method and system based on computer vision
Park et al. Image super-resolution using dilated window transformer
CN111667443B (en) Context fusion-based silk pattern image restoration method
CN111553921B (en) Real-time semantic segmentation method based on channel information sharing residual error module
CN112164065B (en) Real-time image semantic segmentation method based on lightweight convolutional neural network
CN115170921A (en) Binocular stereo matching method based on bilateral grid learning and edge loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination