CN114266952A - Real-time semantic segmentation method based on deep supervision

Real-time semantic segmentation method based on deep supervision

Info

Publication number
CN114266952A
Authority
CN
China
Prior art keywords
image
training
semantic segmentation
loss
deep supervision
Prior art date
Legal status
Pending
Application number
CN202111600850.1A
Other languages
Chinese (zh)
Inventor
柯逍
蒋培龙
曾淦雄
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202111600850.1A
Publication of CN114266952A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a real-time semantic segmentation method based on deep supervision, which comprises the following steps: step S1, collecting scene image data for deep supervision for a specific application scene and constructing a scene image database; step S2, carrying out pixel-level annotation on the scene images in the database and exporting the annotation files in PASCAL VOC format so that they meet the training requirements of the semantic segmentation task; step S3, constructing a deep supervision-based real-time semantic segmentation network, CFSegNet; step S4, training the CFSegNet neural network model with the annotated data set; step S5, preprocessing the image data collected in the application scene and then inputting it into the CFSegNet neural network model to obtain the image semantic segmentation result. The method offers high accuracy, good real-time performance and low demands on computing hardware, and is therefore suitable for deployment on terminal devices with limited performance.

Description

Real-time semantic segmentation method based on deep supervision
Technical Field
The invention relates to the technical field of pattern recognition and computer vision, in particular to a real-time semantic segmentation method based on deep supervision.
Background
In recent years, computer vision technologies have appeared in more and more fields, including autonomous driving and medical image segmentation, and computer vision is driving a new wave of research. Like a biological visual system, computer vision uses computers and other hardware to process images and videos and extract scene information, which helps people make decisions.
The main tasks of computer vision include object localization and detection, i.e. marking the positions of objects and identifying their categories.
Usually only certain objects or regions of an image are of interest, and separating these regions of interest from the rest of the picture requires image segmentation. Image segmentation divides an image into regions according to certain rules (such as object edges or pixel-value boundaries), so that features within the same region are similar while features of different regions differ. In short, image segmentation partitions a picture into regions with different meanings: the important regions are called objects or foreground, and the remaining regions are called background, so that the foreground can be separated from the background and analyzed further, leading to a clearer understanding of the whole picture. Image semantic segmentation requires dividing different objects along their boundaries and assigning a pixel-level class label to each region. In scenarios such as autonomous driving, the semantic segmentation model is deployed on edge devices, which requires the model to run inference quickly while maintaining high accuracy; achieving a good trade-off between speed and accuracy is a very challenging problem.
Disclosure of Invention
The real-time semantic segmentation method based on deep supervision provided by the invention offers high accuracy, good real-time performance and low demands on device computing power, and is suitable for deployment on terminal devices with limited performance.
The real-time semantic segmentation method based on deep supervision comprises the following steps:
step S1, collecting scene image data for deep supervision for a specific application scene, and constructing a scene image database;
step S2, carrying out pixel-level annotation on the scene images in the database, and exporting the annotation files in PASCAL VOC format so that they meet the training requirements of the semantic segmentation task;
step S3, constructing a deep supervision-based real-time semantic segmentation network, CFSegNet;
step S4, training the CFSegNet neural network model with the annotated data set;
step S5, preprocessing the image data collected in the application scene, and then inputting the preprocessed image data into the CFSegNet neural network model to obtain the image semantic segmentation result.
The step S1 specifically includes the following steps:
step S11: analyzing the influence of various factors in the application scene, such as weather and illumination, on the image semantic segmentation result;
step S12: according to the analysis result of step S11, overcoming the adverse effects by sampling a large number of images, i.e., capturing as many application-scene images as possible so as to cover the various conditions likely to occur;
step S13: sorting the collected images and removing duplicate or erroneous images that are unsuitable for the training task, so as to obtain the corresponding scene image database.
The step S2 specifically includes the following steps;
step S21: analyzing, according to the application requirements and the collected image information, the semantic categories to be segmented in the application scene;
step S22: downloading and installing the image annotation software labelme, and configuring labelme according to the semantic categories obtained in step S21;
step S23: outlining the category boundaries in each image obtained in step S1 with the labelme annotation software, and saving the annotation information to a JSON file with the same name as the image;
step S24: converting the JSON files generated in step S23 into the PASCAL VOC format with the labelme2voc script provided with labelme, so as to meet the training requirements of the semantic segmentation task.
Step S3 specifically includes the following steps:
step S31: adopting ResNet-18 as the encoder of CFSegNet, where the bottleneck layer of ResNet-18 downsamples the input image by a factor of 4, and, except for the first stage, each of the three subsequent stages of ResNet-18 downsamples the feature map by a further factor of 2;
step S32: preserving the representations of the downsampling stages through dense connections in the first to third stages of ResNet-18, and introducing deep supervision modules to supervise the representations output by the encoder in the second to fourth stages, thereby reducing the loss of spatial information in the encoding stage;
step S33: feeding the output of the fourth stage of the encoder into a pyramid pooling module (PPM) to obtain a representation rich in multi-scale information;
step S34: feeding the representation obtained in step S33 into a cascaded upsampling path, and upsampling it by a factor of 2 three times with a channel fusion module (CFM) combined with the dense connections of step S32, to obtain a representation that fuses semantic and spatial information;
step S35: upsampling the representation obtained in step S34 by a factor of 8 with bilinear interpolation, and outputting the prediction result through a 1 × 1 convolution.
Step S4 specifically includes the following steps:
step S41: training the model constructed in step S3 with the following initial parameters:
learning rate: 0.01;
weight decay: 0.0005;
momentum: 0.9;
in the training stage, polynomial ("poly") decay is adopted as the learning-rate decay strategy, where the minimum learning rate is set to 0.0001 and the decay factor (power) is set to 0.9; the batch size is determined by the size of the images collected in the application scene and the GPU memory of the training server;
step S42, the final loss function of the model is:
$$\mathrm{Loss}_{final} = \mathrm{Loss}_{main} + \alpha\sum_{s=1}^{K}\mathrm{Loss}_{aux}^{(s)}$$
where Loss_final, Loss_main and Loss_aux denote the final loss, the main loss and the auxiliary loss of the model, respectively; α is the weight of the auxiliary loss and is set to 0.4; K is the number of deep supervision modules and is set to 3; and s is the index of a deep supervision module. Cross-entropy is adopted as the loss function, as given below:
$$\mathrm{Loss} = -\sum_{c=1}^{M} y_{c}\log(p_{c})$$
where Loss denotes the loss value, M denotes the number of semantic categories, c denotes the category index, y_c is a one-hot indicator taking only the values 0 and 1 (1 if c matches the ground-truth class of the sample, 0 otherwise), and p_c denotes the predicted probability that the pixel belongs to class c;
step S43: in the training stage, using stochastic gradient descent (SGD) as the optimizer to compute the updated weights and biases of the convolutional neural network;
step S44: applying a random affine transformation to part of the training samples, applying the same transformation to the corresponding label files, and adding the transformed pairs to the training set of the model;
step S45: cropping part of the training samples at random positions, cropping the corresponding regions of the label files, and adding the cropped pairs to the training set of the model;
step S46: stopping training after 160000 iterations and saving the trained model.
Step S5 specifically includes the following steps:
step S51: acquiring image data as input through a camera in the application scene;
step S52: resizing the input image to 2048 × 1024;
step S53: feeding the image obtained in step S52 into CFSegNet to obtain a prediction map;
step S54: rescaling the prediction map obtained in step S53 to the original input size with bilinear interpolation to obtain the final result image.
The method provided by the invention focuses on the implementation of semantic segmentation in real application scenes and has both innovative significance and practical value; it achieves high accuracy and good real-time performance, and is suitable for deployment on terminal devices with limited performance.
Compared with the prior art, the invention has the following beneficial effects:
1. The real-time semantic segmentation method based on deep supervision constructed by the invention can effectively perform semantic segmentation in different scenes and improves the image segmentation effect.
2. The invention provides a combined training loss function that speeds up training, converges better, and keeps the model size small.
3. Compared with traditional methods, the proposed method is comparatively faster when processing higher-resolution image data.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic flow diagram of the present invention;
fig. 2 is a schematic diagram of a network structure of the method of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in the figures, the real-time semantic segmentation method based on deep supervision comprises the following steps:
step S1, collecting scene image data for deep supervision for a specific application scene, and constructing a scene image database;
step S2, carrying out pixel-level annotation on the scene images in the database, and exporting the annotation files in PASCAL VOC format so that they meet the training requirements of the semantic segmentation task;
step S3, constructing a deep supervision-based real-time semantic segmentation network, CFSegNet;
step S4, training the CFSegNet neural network model with the annotated data set;
step S5, preprocessing the image data collected in the application scene, and then inputting the preprocessed image data into the CFSegNet neural network model to obtain the image semantic segmentation result.
The step S1 specifically includes the following steps:
step S11: analyzing the influence of various factors in the application scene, such as weather and illumination, on the image semantic segmentation result;
step S12: according to the analysis result of step S11, overcoming the adverse effects by sampling a large number of images, i.e., capturing as many application-scene images as possible so as to cover the various conditions likely to occur;
step S13: sorting the collected images and removing duplicate or erroneous images that are unsuitable for the training task, so as to obtain the corresponding scene image database.
The step S2 specifically includes the following steps;
step S21: analyzing, according to the application requirements and the collected image information, the semantic categories to be segmented in the application scene;
step S22: downloading and installing the image annotation software labelme, and configuring labelme according to the semantic categories obtained in step S21;
step S23: outlining the category boundaries in each image obtained in step S1 with the labelme annotation software, and saving the annotation information to a JSON file with the same name as the image;
step S24: converting the JSON files generated in step S23 into the PASCAL VOC format with the labelme2voc script provided with labelme, so as to meet the training requirements of the semantic segmentation task.
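The conversion in step S24 is performed with the labelme2voc script shipped with labelme; purely as an illustration of what that conversion produces, the sketch below rasterises the polygons of a single labelme JSON file into an indexed label PNG, assuming the standard labelme JSON fields. The category-to-index mapping, the file names and the helper name labelme_json_to_mask are assumptions for illustration, not part of the patented method.

```python
import json
from pathlib import Path

from PIL import Image, ImageDraw

# Assumed class-name -> index mapping matching the labelme configuration of step S21;
# index 0 is background and the actual categories are application specific.
LABEL_MAP = {"_background_": 0, "road": 1, "person": 2, "vehicle": 3}

def labelme_json_to_mask(json_path: str, out_png: str) -> None:
    """Rasterise the polygons of one labelme JSON file into an indexed label PNG."""
    ann = json.loads(Path(json_path).read_text(encoding="utf-8"))
    h, w = ann["imageHeight"], ann["imageWidth"]
    mask = Image.new("L", (w, h), 0)                 # background = 0
    draw = ImageDraw.Draw(mask)
    for shape in ann["shapes"]:                      # each polygon outlined in step S23
        cls = LABEL_MAP.get(shape["label"])
        if cls is None:
            continue                                 # skip categories outside the task
        polygon = [tuple(pt) for pt in shape["points"]]
        draw.polygon(polygon, outline=cls, fill=cls)
    mask.save(out_png)

# Example: labelme_json_to_mask("scene_0001.json", "scene_0001.png")
```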
Step S3 specifically includes the following steps:
step S31: adopting ResNet-18 as the encoder of CFSegNet, where the bottleneck layer of ResNet-18 downsamples the input image by a factor of 4, and, except for the first stage, each of the three subsequent stages of ResNet-18 downsamples the feature map by a further factor of 2;
step S32: preserving the representations of the downsampling stages through dense connections in the first to third stages of ResNet-18, and introducing deep supervision modules to supervise the representations output by the encoder in the second to fourth stages, thereby reducing the loss of spatial information in the encoding stage;
step S33: feeding the output of the fourth stage of the encoder into a pyramid pooling module (PPM) to obtain a representation rich in multi-scale information;
step S34: feeding the representation obtained in step S33 into a cascaded upsampling path, and upsampling it by a factor of 2 three times with a channel fusion module (CFM) combined with the dense connections of step S32, to obtain a representation that fuses semantic and spatial information;
step S35: upsampling the representation obtained in step S34 by a factor of 8 with bilinear interpolation, and outputting the prediction result through a 1 × 1 convolution.
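The following PyTorch sketch shows one possible reading of the structure described in steps S31–S35: a ResNet-18 encoder, a pyramid pooling module on the stage-4 output, three 2× channel-fusion upsamplings that reuse the encoder features, auxiliary heads on the stage-2 to stage-4 outputs for deep supervision, and a final 1 × 1 convolution head. The internals of the CFM, the channel widths, the PPM bin sizes and the restoration of the input resolution in a single interpolation are assumptions; the patent does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18


class PPM(nn.Module):
    """Pyramid pooling module: pool at several scales, project, upsample and fuse."""
    def __init__(self, in_ch, out_ch, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_ch, in_ch // len(bins), 1, bias=False),
                          nn.BatchNorm2d(in_ch // len(bins)), nn.ReLU(inplace=True))
            for b in bins)
        self.project = nn.Sequential(nn.Conv2d(in_ch * 2, out_ch, 3, padding=1, bias=False),
                                     nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        size = x.shape[2:]
        feats = [x] + [F.interpolate(s(x), size, mode="bilinear", align_corners=False)
                       for s in self.stages]
        return self.project(torch.cat(feats, dim=1))


class CFM(nn.Module):
    """Channel fusion module (internals assumed): fuse an upsampled decoder feature
    with the densely connected encoder feature of the same resolution."""
    def __init__(self, dec_ch, enc_ch, out_ch):
        super().__init__()
        self.fuse = nn.Sequential(nn.Conv2d(dec_ch + enc_ch, out_ch, 3, padding=1, bias=False),
                                  nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, dec, enc):
        dec = F.interpolate(dec, size=enc.shape[2:], mode="bilinear", align_corners=False)  # 2x up
        return self.fuse(torch.cat([dec, enc], dim=1))


class CFSegNet(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        r = resnet18(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)  # 1/4 resolution
        self.stage1, self.stage2, self.stage3, self.stage4 = r.layer1, r.layer2, r.layer3, r.layer4
        self.ppm = PPM(512, 256)
        self.cfm3 = CFM(256, 256, 128)   # 1/32 -> 1/16, fused with the stage-3 feature
        self.cfm2 = CFM(128, 128, 64)    # 1/16 -> 1/8,  fused with the stage-2 feature
        self.cfm1 = CFM(64, 64, 64)      # 1/8  -> 1/4,  fused with the stage-1 feature
        self.head = nn.Conv2d(64, num_classes, 1)
        # Deep supervision heads on the stage-2..4 encoder outputs (training only).
        self.aux_heads = nn.ModuleList(nn.Conv2d(c, num_classes, 1) for c in (128, 256, 512))

    def forward(self, x):
        size = x.shape[2:]
        f1 = self.stage1(self.stem(x))    # 1/4
        f2 = self.stage2(f1)              # 1/8
        f3 = self.stage3(f2)              # 1/16
        f4 = self.stage4(f3)              # 1/32
        d = self.ppm(f4)
        d = self.cfm1(self.cfm2(self.cfm3(d, f3), f2), f1)   # three 2x upsamplings
        main = F.interpolate(self.head(d), size, mode="bilinear", align_corners=False)
        if self.training:                 # auxiliary predictions for the deep supervision loss
            aux = [F.interpolate(h(f), size, mode="bilinear", align_corners=False)
                   for h, f in zip(self.aux_heads, (f2, f3, f4))]
            return main, aux
        return main
```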
Step S4 specifically includes the following steps:
step S41: training the model constructed in step S3 with the following initial parameters:
learning rate: 0.01;
weight decay: 0.0005;
momentum: 0.9;
in the training stage, polynomial ("poly") decay is adopted as the learning-rate decay strategy, where the minimum learning rate is set to 0.0001 and the decay factor (power) is set to 0.9; the batch size is determined by the size of the images collected in the application scene and the GPU memory of the training server;
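A minimal sketch of this training setup (together with the SGD optimizer named in step S43 below), assuming the CFSegNet class sketched earlier; the number of classes is application specific and chosen here only for illustration.

```python
import torch

# Hyper-parameters from step S41; "poly" decay with power 0.9 and a floor of 1e-4,
# and the 160000-iteration budget of step S46.
base_lr, min_lr, power, max_iters = 0.01, 0.0001, 0.9, 160000

model = CFSegNet(num_classes=19)             # class from the sketch above; 19 is illustrative
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr,
                            momentum=0.9, weight_decay=0.0005)

def poly_lr(cur_iter: int) -> float:
    """Polynomial learning-rate decay used in the training stage."""
    lr = base_lr * (1 - cur_iter / max_iters) ** power
    return max(lr, min_lr)

# Inside the training loop:
# for it in range(max_iters):
#     for g in optimizer.param_groups:
#         g["lr"] = poly_lr(it)
#     ...forward pass / loss / backward / optimizer.step()...
```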
step S42, the final loss function of the model is:
$$\mathrm{Loss}_{final} = \mathrm{Loss}_{main} + \alpha\sum_{s=1}^{K}\mathrm{Loss}_{aux}^{(s)}$$
where Loss_final, Loss_main and Loss_aux denote the final loss, the main loss and the auxiliary loss of the model, respectively; α is the weight of the auxiliary loss and is set to 0.4; K is the number of deep supervision modules and is set to 3; and s is the index of a deep supervision module. Cross-entropy is adopted as the loss function, as given below:
$$\mathrm{Loss} = -\sum_{c=1}^{M} y_{c}\log(p_{c})$$
where Loss denotes the loss value, M denotes the number of semantic categories, c denotes the category index, y_c is a one-hot indicator taking only the values 0 and 1 (1 if c matches the ground-truth class of the sample, 0 otherwise), and p_c denotes the predicted probability that the pixel belongs to class c;
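A short sketch of the combined loss of step S42, using PyTorch's cross-entropy; the ignore_index value is an assumption (the patent does not state how unlabeled pixels are handled).

```python
import torch.nn.functional as F

def cfseg_loss(main_logits, aux_logits_list, target, alpha=0.4, ignore_index=255):
    """Deep-supervision loss of step S42: cross-entropy on the main prediction plus
    an alpha-weighted sum of the K auxiliary cross-entropy losses."""
    loss_main = F.cross_entropy(main_logits, target, ignore_index=ignore_index)
    loss_aux = sum(F.cross_entropy(a, target, ignore_index=ignore_index)
                   for a in aux_logits_list)
    return loss_main + alpha * loss_aux

# Example with the network sketched above (training mode returns main + K=3 aux outputs):
# main, aux = model(images)
# loss = cfseg_loss(main, aux, labels)
# loss.backward()
```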
step S43: in the training stage, using stochastic gradient descent (SGD) as the optimizer to compute the updated weights and biases of the convolutional neural network;
step S44: applying a random affine transformation to part of the training samples, applying the same transformation to the corresponding label files, and adding the transformed pairs to the training set of the model;
step S45: cropping part of the training samples at random positions, cropping the corresponding regions of the label files, and adding the cropped pairs to the training set of the model (a sketch of the augmentations of steps S44 and S45 follows step S46);
step S46: stopping training after 160000 iterations and saving the trained model.
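Steps S44 and S45 apply the same geometric transformation to an image and to its label map; the sketch below shows one way to do this with torchvision, using nearest-neighbour resampling for the labels so that class indices stay valid. The transformation ranges and the crop size are assumptions, not values taken from the patent.

```python
import random

import torchvision.transforms.functional as TF
from torchvision.transforms import InterpolationMode

def random_affine_pair(image, label):
    """Step S44: one random affine transform applied identically to image and label."""
    angle = random.uniform(-10, 10)                 # ranges are assumptions, not from the patent
    translate = (random.randint(-20, 20), random.randint(-20, 20))
    scale = random.uniform(0.9, 1.1)
    shear = random.uniform(-5, 5)
    image = TF.affine(image, angle, translate, scale, shear,
                      interpolation=InterpolationMode.BILINEAR)
    label = TF.affine(label, angle, translate, scale, shear,
                      interpolation=InterpolationMode.NEAREST)
    return image, label

def random_crop_pair(image, label, crop_size=(512, 1024)):
    """Step S45: crop image and label map at the same random position."""
    w, h = image.size                                # PIL images report (width, height)
    ch, cw = crop_size
    top = random.randint(0, max(h - ch, 0))
    left = random.randint(0, max(w - cw, 0))
    return TF.crop(image, top, left, ch, cw), TF.crop(label, top, left, ch, cw)
```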
Step S5 specifically includes the following steps:
step S51: acquiring image data as input through a camera in the application scene;
step S52: resizing the input image to 2048 × 1024;
step S53: feeding the image obtained in step S52 into CFSegNet to obtain a prediction map;
step S54: rescaling the prediction map obtained in step S53 to the original input size with bilinear interpolation to obtain the final result image.
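Steps S51–S54 can be strung together as in the sketch below, assuming the CFSegNet model from the earlier sketch; input normalization matching the training pipeline is omitted here and would be needed in practice.

```python
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms.functional import to_tensor

@torch.no_grad()
def segment_frame(model, frame: Image.Image) -> Image.Image:
    """Steps S51-S54: resize the camera frame to 2048x1024, run CFSegNet, and rescale
    the prediction back to the original resolution (bilinear resampling is applied to
    the logits before taking the per-pixel argmax)."""
    model.eval()
    orig_w, orig_h = frame.size
    x = to_tensor(frame.resize((2048, 1024), Image.BILINEAR)).unsqueeze(0)   # step S52
    logits = model(x)                                                        # step S53
    logits = F.interpolate(logits, size=(orig_h, orig_w),                    # step S54
                           mode="bilinear", align_corners=False)
    pred = logits.argmax(dim=1).squeeze(0).byte().cpu().numpy()
    return Image.fromarray(pred)                                             # indexed class map

# Example (model as sketched above):
# result = segment_frame(model, Image.open("frame.jpg").convert("RGB"))
```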
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (6)

1. A real-time semantic segmentation method based on deep supervision, characterized by comprising the following steps:
step S1, collecting scene image data for deep supervision for a specific application scene, and constructing a scene image database;
step S2, carrying out pixel-level annotation on the scene images in the database, and exporting the annotation files in PASCAL VOC format so that they meet the training requirements of the semantic segmentation task;
step S3, constructing a deep supervision-based real-time semantic segmentation network, CFSegNet;
step S4, training the CFSegNet neural network model with the annotated data set;
step S5, preprocessing the image data collected in the application scene, and then inputting the preprocessed image data into the CFSegNet neural network model to obtain the image semantic segmentation result.
2. The deep supervision based real-time semantic segmentation method according to claim 1, characterized in that: the step S1 specifically includes the following steps:
step S11: analyzing the influence of various factors in the application scene, such as weather and illumination, on the image semantic segmentation result;
step S12: according to the analysis result of step S11, overcoming the adverse effects by sampling a large number of images, i.e., capturing as many application-scene images as possible so as to cover the various conditions likely to occur;
step S13: sorting the collected images and removing duplicate or erroneous images that are unsuitable for the training task, so as to obtain the corresponding scene image database.
3. The deep supervision based real-time semantic segmentation method according to claim 1, characterized in that: the step S2 specifically includes the following steps;
step S21: analyzing, according to the application requirements and the collected image information, the semantic categories to be segmented in the application scene;
step S22: downloading and installing the image annotation software labelme, and configuring labelme according to the semantic categories obtained in step S21;
step S23: outlining the category boundaries in each image obtained in step S1 with the labelme annotation software, and saving the annotation information to a JSON file with the same name as the image;
step S24: converting the JSON files generated in step S23 into the PASCAL VOC format with the labelme2voc script provided with labelme, so as to meet the training requirements of the semantic segmentation task.
4. The deep supervision based real-time semantic segmentation method according to claim 1, characterized in that: step S3 specifically includes the following steps:
step S31: adopting ResNet-18 as the encoder of CFSegNet, where the bottleneck layer of ResNet-18 downsamples the input image by a factor of 4, and, except for the first stage, each of the three subsequent stages of ResNet-18 downsamples the feature map by a further factor of 2;
step S32: preserving the representations of the downsampling stages through dense connections in the first to third stages of ResNet-18, and introducing deep supervision modules to supervise the representations output by the encoder in the second to fourth stages, thereby reducing the loss of spatial information in the encoding stage;
step S33: feeding the output of the fourth stage of the encoder into a pyramid pooling module (PPM) to obtain a representation rich in multi-scale information;
step S34: feeding the representation obtained in step S33 into a cascaded upsampling path, and upsampling it by a factor of 2 three times with a channel fusion module (CFM) combined with the dense connections of step S32, to obtain a representation that fuses semantic and spatial information;
step S35: upsampling the representation obtained in step S34 by a factor of 8 with bilinear interpolation, and outputting the prediction result through a 1 × 1 convolution.
5. The deep supervision based real-time semantic segmentation method according to claim 1, characterized in that: step S4 specifically includes the following steps:
step S41: training the model constructed in step S3 with the following initial parameters:
learning rate: 0.01;
weight decay: 0.0005;
momentum: 0.9;
in the training stage, polynomial ("poly") decay is adopted as the learning-rate decay strategy, where the minimum learning rate is set to 0.0001 and the decay factor (power) is set to 0.9; the batch size is determined by the size of the images collected in the application scene and the GPU memory of the training server;
step S42, the final loss function of the model is:
$$\mathrm{Loss}_{final} = \mathrm{Loss}_{main} + \alpha\sum_{s=1}^{K}\mathrm{Loss}_{aux}^{(s)}$$
where Loss_final, Loss_main and Loss_aux denote the final loss, the main loss and the auxiliary loss of the model, respectively; α is the weight of the auxiliary loss and is set to 0.4; K is the number of deep supervision modules and is set to 3; and s is the index of a deep supervision module. Cross-entropy is adopted as the loss function, as given below:
$$\mathrm{Loss} = -\sum_{c=1}^{M} y_{c}\log(p_{c})$$
where Loss denotes the loss value, M denotes the number of semantic categories, c denotes the category index, y_c is a one-hot indicator taking only the values 0 and 1 (1 if c matches the ground-truth class of the sample, 0 otherwise), and p_c denotes the predicted probability that the pixel belongs to class c;
step S43: in the training stage, using stochastic gradient descent (SGD) as the optimizer to compute the updated weights and biases of the convolutional neural network;
step S44: applying a random affine transformation to part of the training samples, applying the same transformation to the corresponding label files, and adding the transformed pairs to the training set of the model;
step S45: cropping part of the training samples at random positions, cropping the corresponding regions of the label files, and adding the cropped pairs to the training set of the model;
step S46: stopping training after 160000 iterations and saving the trained model.
6. The deep supervision based real-time semantic segmentation method according to claim 1, characterized in that: step S5 specifically includes the following steps:
step S51: acquiring image data as input through a camera in the application scene;
step S52: resizing the input image to 2048 × 1024;
step S53: feeding the image obtained in step S52 into CFSegNet to obtain a prediction map;
step S54: rescaling the prediction map obtained in step S53 to the original input size with bilinear interpolation to obtain the final result image.
CN202111600850.1A 2021-12-24 2021-12-24 Real-time semantic segmentation method based on deep supervision Pending CN114266952A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111600850.1A CN114266952A (en) 2021-12-24 2021-12-24 Real-time semantic segmentation method based on deep supervision

Publications (1)

Publication Number Publication Date
CN114266952A true CN114266952A (en) 2022-04-01

Family

ID=80829888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111600850.1A Pending CN114266952A (en) 2021-12-24 2021-12-24 Real-time semantic segmentation method based on deep supervision

Country Status (1)

Country Link
CN (1) CN114266952A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145983A (en) * 2018-08-21 2019-01-04 电子科技大学 A kind of real-time scene image, semantic dividing method based on lightweight network
WO2021243787A1 (en) * 2020-06-05 2021-12-09 中国科学院自动化研究所 Intra-class discriminator-based method for weakly supervised image semantic segmentation, system, and apparatus
CN112541916A (en) * 2020-12-11 2021-03-23 华南理工大学 Waste plastic image segmentation method based on dense connection
CN113011427A (en) * 2021-03-17 2021-06-22 中南大学 Remote sensing image semantic segmentation method based on self-supervision contrast learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114863223A (en) * 2022-06-30 2022-08-05 中国自然资源航空物探遥感中心 Hyperspectral weak supervision classification method combining denoising autoencoder and scene enhancement
CN117593528A (en) * 2024-01-18 2024-02-23 中数智科(杭州)科技有限公司 Rail vehicle bolt loosening detection method based on machine vision
CN117593528B (en) * 2024-01-18 2024-04-16 中数智科(杭州)科技有限公司 Rail vehicle bolt loosening detection method based on machine vision


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination