CN112164065A - Real-time image semantic segmentation method based on lightweight convolutional neural network - Google Patents

Real-time image semantic segmentation method based on lightweight convolutional neural network

Info

Publication number
CN112164065A
Authority
CN
China
Prior art keywords
neural network
convolutional neural
layer
convolution
lightweight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011036023.XA
Other languages
Chinese (zh)
Other versions
CN112164065B (en)
Inventor
刘发贵
唐泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202011036023.XA
Publication of CN112164065A
Application granted
Publication of CN112164065B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time image semantic segmentation method based on a lightweight convolutional neural network. The method comprises the following steps: constructing a lightweight convolutional neural network; training the constructed network; and performing semantic segmentation on images of a given scene using the trained network. The constructed network incorporates a multi-path processing mechanism that effectively encodes the multi-scale spatial features of pixels, addressing the difficulty of distinguishing targets at multiple scales. The invention also greatly reduces the number of model parameters by using depth-wise convolution: the constructed network has only about 900,000 parameters, far fewer than existing methods, which achieves the goal of a lightweight model and meets real-time processing requirements. In addition, the network is based on a fully convolutional network, so training and inference are end-to-end, which greatly simplifies training and deployment of the model.

Description

Real-time image semantic segmentation method based on lightweight convolutional neural network
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a real-time image semantic segmentation method based on a lightweight convolutional neural network.
Background
The purpose of image semantic segmentation is to assign a semantic category label to each pixel in an image; it is a pixel-level dense classification task. Overall, semantic segmentation is one of the basic tasks that pave the way for comprehensive scene understanding, and a growing number of applications derive knowledge from image data, including autonomous driving, human-computer interaction, indoor navigation, image editing, augmented reality, and virtual reality.
Image semantic segmentation methods fall into two categories. The first comprises traditional methods, such as threshold-based, edge-based, region-based, graph-theory-based, and energy-functional-based segmentation. The second comprises deep-learning-based methods. In recent years, with the development of deep neural networks, deep learning has shown growing advantages in computer vision. Deep convolutional networks are particularly effective on image data: they efficiently extract pixel features from an image, overcome the heavy dependence of traditional methods on manual feature selection, and achieve better segmentation results.
In the paper "Fully Convolutional Networks for Semantic Segmentation", Jonathan Long et al. proposed using fully convolutional networks (FCN) for semantic segmentation, which greatly advanced deep-learning-based semantic segmentation in recent years. Various FCN-based models have significantly improved segmentation accuracy, but these models usually have millions of parameters, so their inference efficiency is low, which seriously hinders practical application. Fields such as autonomous driving, indoor navigation, augmented reality, and virtual reality need accurate and efficient semantic segmentation to achieve low-latency processing.
Disclosure of Invention
To achieve accurate and efficient semantic segmentation of various scenes and to cope with large variations in target scale within a scene, the invention provides an image semantic segmentation method based on a lightweight convolutional neural network. The constructed lightweight convolutional neural network extracts the multi-scale features of pixels and strengthens the discriminative power of pixel features, thereby achieving accurate and efficient semantic segmentation.
The purpose of the invention is realized by at least one of the following technical solutions.
A real-time image semantic segmentation method based on a lightweight convolutional neural network comprises the following steps:
S1, constructing a lightweight convolutional neural network;
S2, training the constructed lightweight convolutional neural network;
S3, performing semantic segmentation on an image of a given scene using the trained lightweight convolutional neural network.
Further, step S1 includes the following steps:
S1.1, constructing a multi-scale processing unit for acquiring the multi-scale features of pixels;
S1.2, replacing the first standard 3 × 3 convolution of the residual network basic block (Basic Block of ResNet) with the constructed multi-scale processing unit to obtain a pyramid representation module;
S1.3, constructing the lightweight convolutional neural network according to the network structure and parameter settings; the first layer is a standard 3 × 3 convolution that serves as the initial layer and expands the pixel feature dimension to 16; 8 pyramid representation modules are then applied in succession to effectively encode the multi-scale features of pixels, capture long-range pixel dependencies, strengthen the discriminative power of pixel features, and improve segmentation performance on multi-scale targets;
S1.4, restoring the resolution of the segmentation result to that of the input image, using bilinear interpolation as the up-sampling operator.
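The up-sampling step in S1.4 can be illustrated with a minimal pure-Python sketch of bilinear interpolation on a single score map. The function name `bilinear_upsample`, the corner-aligned sampling convention, and the toy 2 × 2 input are assumptions made for illustration; the patent does not state which alignment convention is used, and a practical implementation would rely on a deep learning framework's built-in interpolation.

```python
def bilinear_upsample(grid, scale):
    """Upsample a 2-D list of floats by an integer factor (corner-aligned sketch)."""
    h, w = len(grid), len(grid[0])
    out_h, out_w = h * scale, w * scale
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column back to a fractional source column.
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Toy example: a 2x2 score map upsampled by the network's down-sampling factor of 8.
coarse = [[0.0, 1.0], [2.0, 3.0]]
fine = bilinear_upsample(coarse, 8)
```

In the method described here, the same operation would be applied to each class's score map so that the 1/8-resolution output regains the input resolution.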
Further, the multi-scale processing unit comprises 4 parallel convolutional branches: one standard 1 × 1 convolution and 3 dilated convolutions with dilation rates {r1, r2, r3}; the dilated convolutions are also depth-wise convolutions; the multi-scale processing unit concatenates the outputs of the 4 parallel branches along the channel dimension and produces its output through a standard 1 × 1 convolution mapping; the multi-scale processing unit has 2 convolutional layers in total (the parallel branches count as one layer).
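To see why making the dilated branches depth-wise keeps the unit small, the following sketch compares weight counts for a standard convolution and a depth-wise convolution. The 3 × 3 kernel size, the 64-channel width, and the omission of bias terms are assumptions chosen for illustration; the text above does not fix the kernel size of the dilated branches. Dilation itself adds no parameters, since it only spreads the same kernel taps further apart.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_conv_params(channels, k):
    """Weights of a depth-wise k x k convolution: one k x k filter per channel."""
    return channels * k * k


channels = 64  # illustrative width, not taken from the patent
std = standard_conv_params(channels, channels, 3)
dw = depthwise_conv_params(channels, 3)
reduction = std // dw  # equals the channel count for equal in/out widths
```

With equal input and output widths, the depth-wise variant needs fewer weights by a factor equal to the number of channels.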
Further, the pyramid representation module is obtained by replacing the first standard 3 × 3 convolution of the basic block (Basic Block) of the residual network (ResNet18) with a multi-scale processing unit; the pyramid representation module comprises 3 convolutional layers in total; the lightweight convolutional neural network uses the parametric rectified linear unit (PReLU) as its activation function.
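For reference, PReLU differs from ReLU only in how it treats negative inputs: it scales them by a learnable slope instead of zeroing them. The scalar sketch below uses a fixed slope of 0.25 purely as an illustrative assumption; in a trained network the slope is a learned parameter, and the patent does not give its value.

```python
def relu(x):
    """Rectified linear unit: negative inputs become zero."""
    return x if x >= 0 else 0.0


def prelu(x, slope=0.25):
    """Parametric ReLU: negative inputs are scaled by a (normally learned) slope."""
    return x if x >= 0 else slope * x


samples = [-2.0, -0.5, 0.0, 1.5]
relu_out = [relu(v) for v in samples]
prelu_out = [prelu(v) for v in samples]
```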
Further, the convolutional neural network has 27 convolutional layers in total, and the network structure and parameter settings are as follows:
layer 1 is a standard 3 × 3 convolution with stride 2 and 16 output channels; layers 2 to 4 comprise one pyramid representation module with stride 1 and 32 output channels; layers 5 to 7 comprise one pyramid representation module with stride 2 and 32 output channels; layers 8 to 16 comprise three pyramid representation modules with stride 1 and 64 output channels; layers 17 to 19 comprise one pyramid representation module with stride 2 and 64 output channels; layers 20 to 25 comprise two pyramid representation modules with stride 1 and 128 output channels; layers 26 and 27 are classification layers, consisting of a standard 3 × 3 convolution and a 1 × 1 convolution respectively; the down-sampling factor of the network is 8, i.e. the resolution of the output feature map is 1/8 that of the input image.
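The layer count and down-sampling factor stated above can be checked with a short bookkeeping sketch. The `stages` list restates the layer ranges, strides, and channel counts from the text; treating the classification layers as stride 1 is an assumption (their stride is not stated, but it is implied by the overall factor of 8), and `feature_map_size` is a hypothetical helper that assumes input dimensions divisible by 8.

```python
# Each entry: (description, convolutional layers, stride, output channels).
# A pyramid representation module counts as 3 convolutional layers.
stages = [
    ("layer 1: standard 3x3 convolution", 1, 2, 16),
    ("layers 2-4: 1 module", 3, 1, 32),
    ("layers 5-7: 1 module", 3, 2, 32),
    ("layers 8-16: 3 modules", 9, 1, 64),
    ("layers 17-19: 1 module", 3, 2, 64),
    ("layers 20-25: 2 modules", 6, 1, 128),
    ("layers 26-27: classification", 2, 1, None),  # stride 1 is assumed
]

total_layers = sum(layers for _, layers, _, _ in stages)

output_stride = 1
for _, _, stride, _ in stages:
    output_stride *= stride


def feature_map_size(height, width):
    """Spatial size of the output feature map for a given input size."""
    return height // output_stride, width // output_stride


size_1024 = feature_map_size(1024, 1024)
```

The three stride-2 stages (layer 1, layers 5 to 7, layers 17 to 19) account for the factor of 8, and 1 + 8 × 3 + 2 gives the 27 layers.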
Furthermore, the dilation rates of the pyramid representation modules in layers 2 to 7 are {1, 2, 4}; those in layers 8 to 19 are {3, 6, 9}; those in layers 20 to 22 are {7, 13, 19}; and those in layers 23 to 25 are {13, 25, 37}.
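The effect of these dilation (hole) rates can be made concrete by computing the spatial extent each dilated kernel covers, using the standard relation k_eff = k + (k - 1)(r - 1). The 3 × 3 kernel size is an assumption for illustration, as the text does not state the kernel size of the dilated branches; the rate sets are taken from the paragraph above.

```python
def effective_kernel(k, rate):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (rate - 1)


stage_rates = {
    "layers 2-7": (1, 2, 4),
    "layers 8-19": (3, 6, 9),
    "layers 20-22": (7, 13, 19),
    "layers 23-25": (13, 25, 37),
}

# Assumed 3x3 kernels: extent is 2 * rate + 1 pixels on the feature map.
extents = {
    name: [effective_kernel(3, r) for r in rates]
    for name, rates in stage_rates.items()
}
```

Under this assumption, the deepest module's branches span 27, 51, and 75 positions on a feature map that is already down-sampled by 8, which is how the network can capture long-range context without adding parameters.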
Further, step S2 includes the following steps:
S2.1, inputting a training image and the corresponding semantic segmentation label;
S2.2, training the parameters of the lightweight convolutional neural network using a cross-entropy loss function, defined as:
$$L = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$
where $N$ is the number of semantic categories; $y_i$ is the pixel class label, with $y_i = 1$ if the pixel belongs to class $i$ and $y_i = 0$ otherwise; and $\hat{y}_i$ is the prediction output of the lightweight convolutional neural network, i.e. the predicted probability that the pixel belongs to class $i$;
S2.3, training the lightweight convolutional neural network to convergence using gradient descent.
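The per-pixel cross-entropy loss in S2.2 can be sketched as follows. Producing the class probabilities with a softmax over raw scores is an assumption (the patent only says the network outputs a probability per class), and the function names and toy scores are illustrative.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stabilised)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy for one pixel with a one-hot label over N classes."""
    one_hot = [1.0 if i == label else 0.0 for i in range(len(probs))]
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))


scores = [2.0, 0.5, -1.0]          # toy scores for N = 3 classes
probs = softmax(scores)
loss_correct = cross_entropy(probs, 0)  # label matches the top-scoring class
loss_wrong = cross_entropy(probs, 2)    # label is the lowest-scoring class
```

Because the label is one-hot, only the term for the true class survives, so the loss reduces to the negative log-probability assigned to that class. For a full image the loss would typically be averaged over all pixels.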
Further, step S3 includes the following steps:
S3.1, inputting an image to be segmented;
S3.2, performing forward propagation through the lightweight convolutional neural network to obtain the predicted class probability distribution for each pixel;
S3.3, selecting the class with the highest probability as the class predicted by the lightweight convolutional neural network.
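Step S3.3 amounts to a per-pixel argmax over the class probabilities. A minimal sketch on a toy 2 × 2 probability map follows; the function name `predict_labels` and the nested-list layout (rows, columns, classes) are assumptions for illustration.

```python
def predict_labels(prob_map):
    """Pick the most probable class for every pixel.

    prob_map[row][col] is a list of class probabilities for that pixel.
    """
    return [
        [max(range(len(p)), key=p.__getitem__) for p in row]
        for row in prob_map
    ]


prob_map = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
]
labels = predict_labels(prob_map)
```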
Compared with the prior art, the method has the following advantages and effects:
The constructed convolutional neural network incorporates a multi-path processing mechanism that effectively encodes the multi-scale spatial features of pixels, addressing the difficulty of distinguishing targets at multiple scales. The invention also greatly reduces the number of model parameters by using depth-wise convolution: the constructed lightweight convolutional neural network has only about 900,000 parameters, far fewer than existing methods, which achieves the goal of a lightweight model and meets real-time processing requirements. In addition, the network is based on a fully convolutional network, so training and inference are end-to-end, which greatly simplifies training and deployment of the model.
Drawings
FIG. 1 is a schematic structural diagram of a multi-scale processing unit according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a residual network basic block in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a pyramid representation module according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions and advantages of the present invention more apparent, the following detailed description of the embodiments of the present invention is provided with reference to the accompanying drawings and examples, but the embodiments and protection of the present invention are not limited thereto.
First, the abbreviations used in the figures are explained:
Conv: convolutional layer (convolution).
BN: batch normalization layer (batch normalization).
Concat: concatenation of feature maps along the channel dimension (concatenation).
PReLU: parametric rectified linear unit.
ReLU: rectified linear unit.
DWC: depth-wise convolution.
ri: dilation rate of a dilated convolution.
Example:
A real-time image semantic segmentation method based on a lightweight convolutional neural network comprises the following steps:
S1, constructing a lightweight convolutional neural network, comprising the following steps:
S1.1, constructing a multi-scale processing unit for acquiring the multi-scale features of pixels;
As shown in FIG. 1, the multi-scale processing unit comprises 4 parallel convolutional branches: one standard 1 × 1 convolution and 3 dilated convolutions with dilation rates {r1, r2, r3}; the dilated convolutions are also depth-wise convolutions; the multi-scale processing unit concatenates the outputs of the 4 parallel branches along the channel dimension and produces its output through a standard 1 × 1 convolution mapping; the multi-scale processing unit has 2 convolutional layers in total (the parallel branches count as one layer).
S1.2, replacing the first standard 3 × 3 convolution of the residual network basic block (Basic Block of ResNet) with the constructed multi-scale processing unit to obtain a pyramid representation module;
The pyramid representation module shown in FIG. 3 is obtained by replacing the first standard 3 × 3 convolution of the basic block (Basic Block) of the residual network (ResNet18) shown in FIG. 2 with a multi-scale processing unit; the pyramid representation module has 3 convolutional layers in total.
S1.3, constructing the lightweight convolutional neural network according to the network structure and parameter settings shown in Table 1; the first layer is a standard 3 × 3 convolution that serves as the initial layer and expands the pixel feature dimension to 16; 8 pyramid representation modules are then applied in succession to effectively encode the multi-scale features of pixels, capture long-range pixel dependencies, strengthen the discriminative power of pixel features, and improve segmentation performance on multi-scale targets;
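Table 1. Network structure and parameter settings
Layers | Operation | Stride | Output channels | Dilation rates
1 | Standard 3 × 3 convolution | 2 | 16 | -
2-4 | 1 pyramid representation module | 1 | 32 | {1, 2, 4}
5-7 | 1 pyramid representation module | 2 | 32 | {1, 2, 4}
8-16 | 3 pyramid representation modules | 1 | 64 | {3, 6, 9}
17-19 | 1 pyramid representation module | 2 | 64 | {3, 6, 9}
20-22 | 1 pyramid representation module | 1 | 128 | {7, 13, 19}
23-25 | 1 pyramid representation module | 1 | 128 | {13, 25, 37}
26-27 | Classification layers: standard 3 × 3 convolution, then 1 × 1 convolution | - | - | -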
S1.4, restoring the resolution of the segmentation result to that of the input image, using bilinear interpolation as the up-sampling operator.
The lightweight convolutional neural network uses the parametric rectified linear unit (PReLU) as its activation function.
S2, training the constructed lightweight convolutional neural network, comprising the following steps:
S2.1, inputting a training image and the corresponding semantic segmentation label;
S2.2, training the parameters of the lightweight convolutional neural network using a cross-entropy loss function, defined as:
$$L = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$
where $N$ is the number of semantic categories; $y_i$ is the pixel class label, with $y_i = 1$ if the pixel belongs to class $i$ and $y_i = 0$ otherwise; and $\hat{y}_i$ is the prediction output of the lightweight convolutional neural network, i.e. the predicted probability that the pixel belongs to class $i$;
S2.3, training the lightweight convolutional neural network to convergence using gradient descent.
S3, performing semantic segmentation on an image of a given scene using the trained lightweight convolutional neural network, comprising the following steps:
S3.1, inputting an image to be segmented;
S3.2, performing forward propagation through the lightweight convolutional neural network to obtain the predicted class probability distribution for each pixel;
S3.3, selecting the class with the highest probability as the class predicted by the lightweight convolutional neural network.
In this embodiment, the lightweight convolutional neural network of the invention contains only about 900,000 model parameters and achieves a mean intersection over union (mIoU) of 73.9% on Cityscapes, a multi-target complex street-scene dataset. By comparison, the method of Genshun Dong et al. in the 2020 paper "Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes" uses 6.2 million model parameters to achieve an mIoU of 73.6% on Cityscapes; the invention matches that segmentation performance with only 14.5% as many parameters, which greatly improves computational efficiency. The method described by Yu Wang et al. in "LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation" has 940,000 model parameters and achieves an mIoU of 69.2% on Cityscapes; at a similar parameter scale, the invention improves mIoU by 4.7 percentage points. On a single NVIDIA RTX 2080Ti GPU with an input resolution of 1024 × 1024, the method segments at 42 frames per second (FPS), which meets real-time processing requirements.
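The comparison figures in this paragraph can be reproduced with simple arithmetic. The dictionary keys below are illustrative names; the numbers are the ones reported in the paragraph, and the per-frame latency is derived from the stated 42 FPS rather than reported separately.

```python
ours = {"params": 900_000, "miou": 73.9, "fps": 42}
dong_2020 = {"params": 6_200_000, "miou": 73.6}
lednet = {"params": 940_000, "miou": 69.2}

# Parameter count relative to the method of Dong et al.
param_ratio = ours["params"] / dong_2020["params"]

# mIoU gain over LEDNet at a similar parameter scale, in percentage points.
miou_gain = ours["miou"] - lednet["miou"]

# Per-frame time budget implied by 42 frames per second, in milliseconds.
frame_time_ms = 1000 / ours["fps"]
```

The ratio works out to about 14.5%, the gain to about 4.7 percentage points, and the implied latency to roughly 24 ms per 1024 × 1024 frame.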

Claims (8)

1. A real-time image semantic segmentation method based on a lightweight convolutional neural network, characterized by comprising the following steps:
S1, constructing a lightweight convolutional neural network;
S2, training the constructed lightweight convolutional neural network;
S3, performing semantic segmentation on an image of a given scene using the trained lightweight convolutional neural network.
2. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 1, wherein step S1 comprises the following steps:
S1.1, constructing a multi-scale processing unit for acquiring the multi-scale features of pixels;
S1.2, replacing the first standard 3 × 3 convolution of the residual network basic block (Basic Block of ResNet) with the constructed multi-scale processing unit to obtain a pyramid representation module;
S1.3, constructing the lightweight convolutional neural network according to the network structure and parameter settings; the first layer is a standard 3 × 3 convolution that serves as the initial layer and expands the pixel feature dimension to 16; 8 pyramid representation modules are then applied in succession to effectively encode the multi-scale features of pixels, capture long-range pixel dependencies, strengthen the discriminative power of pixel features, and improve segmentation performance on multi-scale targets;
S1.4, restoring the resolution of the segmentation result to that of the input image, using bilinear interpolation as the up-sampling operator.
3. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 2, wherein the multi-scale processing unit comprises 4 parallel convolutional branches: one standard 1 × 1 convolution and 3 dilated convolutions with dilation rates {r1, r2, r3}; the dilated convolutions are also depth-wise convolutions; the multi-scale processing unit concatenates the outputs of the 4 parallel branches along the channel dimension and produces its output through a standard 1 × 1 convolution mapping; the multi-scale processing unit has 2 convolutional layers in total.
4. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 3, wherein the pyramid representation module is obtained by replacing the first standard 3 × 3 convolution of the basic block (Basic Block) of the residual network (ResNet18) with a multi-scale processing unit; the pyramid representation module comprises 3 convolutional layers in total; the lightweight convolutional neural network uses the parametric rectified linear unit (PReLU) as its activation function.
5. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 4, wherein the convolutional neural network has 27 convolutional layers in total, and the network structure and parameter settings are as follows:
layer 1 is a standard 3 × 3 convolution with stride 2 and 16 output channels; layers 2 to 4 comprise one pyramid representation module with stride 1 and 32 output channels; layers 5 to 7 comprise one pyramid representation module with stride 2 and 32 output channels; layers 8 to 16 comprise three pyramid representation modules with stride 1 and 64 output channels; layers 17 to 19 comprise one pyramid representation module with stride 2 and 64 output channels; layers 20 to 25 comprise two pyramid representation modules with stride 1 and 128 output channels; layers 26 and 27 are classification layers, consisting of a standard 3 × 3 convolution and a 1 × 1 convolution respectively; the down-sampling factor of the network is 8, i.e. the resolution of the output feature map is 1/8 that of the input image.
6. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 5, wherein the dilation rates of the pyramid representation modules in layers 2 to 7 are {1, 2, 4}; those in layers 8 to 19 are {3, 6, 9}; those in layers 20 to 22 are {7, 13, 19}; and those in layers 23 to 25 are {13, 25, 37}.
7. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 6, wherein step S2 comprises the following steps:
S2.1, inputting a training image and the corresponding semantic segmentation label;
S2.2, training the parameters of the lightweight convolutional neural network using a cross-entropy loss function, defined as:
$$L = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$
where $N$ is the number of semantic categories; $y_i$ is the pixel class label, with $y_i = 1$ if the pixel belongs to class $i$ and $y_i = 0$ otherwise; and $\hat{y}_i$ is the prediction output of the lightweight convolutional neural network, i.e. the predicted probability that the pixel belongs to class $i$;
S2.3, training the lightweight convolutional neural network to convergence using gradient descent.
8. The real-time image semantic segmentation method based on a lightweight convolutional neural network as claimed in claim 7, wherein step S3 comprises the following steps:
S3.1, inputting an image to be segmented;
S3.2, performing forward propagation through the lightweight convolutional neural network to obtain the predicted class probability distribution for each pixel;
S3.3, selecting the class with the highest probability as the class predicted by the lightweight convolutional neural network.
CN202011036023.XA 2020-09-27 2020-09-27 Real-time image semantic segmentation method based on lightweight convolutional neural network Active CN112164065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011036023.XA CN112164065B (en) 2020-09-27 2020-09-27 Real-time image semantic segmentation method based on lightweight convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011036023.XA CN112164065B (en) 2020-09-27 2020-09-27 Real-time image semantic segmentation method based on lightweight convolutional neural network

Publications (2)

Publication Number Publication Date
CN112164065A true CN112164065A (en) 2021-01-01
CN112164065B CN112164065B (en) 2023-10-13

Family

ID=73861275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011036023.XA Active CN112164065B (en) 2020-09-27 2020-09-27 Real-time image semantic segmentation method based on lightweight convolutional neural network

Country Status (1)

Country Link
CN (1) CN112164065B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107480726A (en) * 2017-08-25 2017-12-15 电子科技大学 A kind of Scene Semantics dividing method based on full convolution and shot and long term mnemon
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN110232394A (en) * 2018-03-06 2019-09-13 华南理工大学 A kind of multi-scale image semantic segmentation method
CN109215034A (en) * 2018-07-06 2019-01-15 成都图必优科技有限公司 A kind of Weakly supervised image, semantic dividing method for covering pond based on spatial pyramid
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
CN110188817A (en) * 2019-05-28 2019-08-30 厦门大学 A kind of real-time high-performance street view image semantic segmentation method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
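FAGUI LIU et al.: "FTPN: Scene Text Detection With Feature Pyramid Based Text Proposal Network", IEEE Access, pages 44219 - 44228 *
蔡烁; 胡航滔; 王威: "Semantic segmentation of high-resolution remote sensing images based on deep convolutional networks", Journal of Signal Processing (信号处理), no. 12, pages 84 - 90 *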

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651468A (en) * 2021-01-18 2021-04-13 佛山职业技术学院 Multi-scale lightweight image classification method and storage medium thereof
CN112651468B (en) * 2021-01-18 2024-06-04 佛山职业技术学院 Multi-scale lightweight image classification method and storage medium thereof
CN113989206A (en) * 2021-10-20 2022-01-28 杭州深睿博联科技有限公司 Lightweight model-based bone age prediction method and device
CN114781483A (en) * 2022-03-18 2022-07-22 华南理工大学 Volvariella volvacea growth state identification method based on convolutional neural network
CN114781483B (en) * 2022-03-18 2024-05-28 华南理工大学 Straw mushroom growth state identification method based on convolutional neural network
CN114937148A (en) * 2022-06-08 2022-08-23 华南理工大学 Small target feature enhanced image segmentation method and system
CN114937148B (en) * 2022-06-08 2024-09-06 华南理工大学 Small target feature enhanced image segmentation method and system

Also Published As

Publication number Publication date
CN112164065B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
CN110570371B (en) Image defogging method based on multi-scale residual error learning
CN112164065A (en) Real-time image semantic segmentation method based on lightweight convolutional neural network
US20210209395A1 (en) Method, electronic device, and storage medium for recognizing license plate
CN110197182A (en) Remote sensing image semantic segmentation method based on contextual information and attention mechanism
CN110659664B (en) SSD-based high-precision small object identification method
CN110738207A (en) character detection method for fusing character area edge information in character image
CN113870335A (en) Monocular depth estimation method based on multi-scale feature fusion
CN113052106B (en) Airplane take-off and landing runway identification method based on PSPNet network
CN113554032A (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
CN111008979A (en) Robust night image semantic segmentation method
CN113822383A (en) Unmanned aerial vehicle detection method and system based on multi-domain attention mechanism
CN113436198A (en) Remote sensing image semantic segmentation method for collaborative image super-resolution reconstruction
CN111414938B (en) Target detection method for bubbles in plate heat exchanger
CN111739037B (en) Semantic segmentation method for indoor scene RGB-D image
CN117975418A (en) Traffic sign detection method based on improved RT-DETR
CN113963333B (en) Traffic sign board detection method based on improved YOLOF model
CN113255675B (en) Image semantic segmentation network structure and method based on expanded convolution and residual path
CN110110775A (en) A kind of matching cost calculation method based on hyper linking network
CN113902904B (en) Lightweight network architecture system
Qian et al. A semantic segmentation method for remote sensing images based on DeepLab V3
CN115578436A (en) Monocular depth prediction method based on multi-level feature parallel interaction fusion
CN115424012A (en) Lightweight image semantic segmentation method based on context information
CN113192009B (en) Crowd counting method and system based on global context convolutional network
US20240046601A1 (en) Deep recognition model training method, electronic device and readable storage medium
Li et al. Easily deployable real-time detection method for small traffic signs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant