CN114187230A - Camouflage object detection method based on two-stage optimization network - Google Patents

Camouflage object detection method based on two-stage optimization network

Info

Publication number
CN114187230A
Authority
CN
China
Prior art keywords
stage
module
layer
network
decoder
Legal status
Pending
Application number
CN202111243490.4A
Other languages
Chinese (zh)
Inventor
姜璇
张亚杰
苏荔
李国荣
黄庆明
Current Assignee
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Application filed by University of Chinese Academy of Sciences
Priority to CN202111243490.4A
Publication of CN114187230A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of camouflaged object detection, and in particular to a camouflaged object detection method based on a two-stage optimization network. To address the insufficient detection accuracy of existing camouflaged object detection techniques, the invention provides a detection method based on multi-task learning, in which object boundary information serves as an auxiliary signal guiding the network to better learn the difference between the texture of a camouflaged object and the background texture at the boundary, so that the network can more accurately locate and segment the camouflaged object. The two-stage optimization network comprises two stages: the first stage follows an encoder-decoder structure, with ResNet50 as the feature-extraction backbone, and locates and identifies the camouflaged object to form a coarse map; the second stage uses a parallel decoder structure that takes object edges as boundary information, encouraging the network to attend to object edges and refining the map generated in the first stage.

Description

Camouflage object detection method based on two-stage optimization network
Technical Field
The invention relates to the technical field of camouflage object detection, in particular to a camouflage object detection method based on a two-stage optimization network.
Background
As people's demands on intelligent living diversify, object detection is applied ever more widely, and camouflaged object detection is one of its important branches. It focuses on the relationship between objects and their surroundings, aiming to detect and segment camouflaged objects that "blend into" the surrounding environment. Camouflage is ubiquitous in human life and in nature, and is especially common among animals: when hunting prey or evading predators, many animals reduce the difference and contrast between themselves and the surrounding environment by changing their body color, shape, or behavior, thereby improving their own chances of survival. These camouflage strategies generally succeed by confusing the observer's perceptual decision-making.
Biological studies have shown that the Human Visual System (HVS) is most sensitive to large areas and color features, and that it perceives objects mainly by observing the contrast between an object and its background. Consequently, the HVS may have difficulty identifying a camouflaged object because of its low contrast with the environment.
In some cases, however, identifying camouflaged objects is essential. Beyond the task of detecting animal camouflage itself, which can provide technical support for animal protection, many passive camouflage phenomena, in which objects and backgrounds are highly similar, exist in everyday life: in the medical field, slight changes in background tissue of extremely high similarity may indicate a lesion; in the military field, detecting camouflage on the battlefield can even reverse the course of an engagement. Developing this task is therefore of great importance.
In recent years, deep convolutional networks, with their strong feature-representation ability, have steadily advanced across computer vision tasks, and several existing camouflaged object detection methods are built on them. Fan et al. propose to stratify the extracted features, and then fuse and enhance features from different levels to help acquire localization and edge information, achieving accurate detection of camouflaged targets. Yan et al. split MirrorNet into an original-image segmentation stream and a mirror-image segmentation stream, exploiting the visual differences between the original and flipped images to better locate the camouflaged object.
Although these methods are designed around the attributes of camouflaged objects, there is still room for improvement in how edges are handled. The present invention therefore further exploits the boundary information of the camouflaged object, so that the model can better learn the difference between the camouflaged object and the environment at the boundary and can locate and segment the camouflaged object more accurately.
Disclosure of Invention
The invention aims to solve the problem of insufficient detection accuracy in existing camouflaged object detection technology, and provides a detection method based on multi-task learning that uses object boundary information as an auxiliary signal to guide the network to better learn the difference between the texture of the camouflaged object and the background texture at the boundary, so that the network can more accurately locate and segment the camouflaged object.
The two-stage optimization network is divided into two stages. The first stage follows an encoder-decoder structure, with ResNet50 as the feature-extraction backbone, and locates and identifies the camouflaged object to form a coarse map. The second stage uses a parallel decoder structure, takes object edges as boundary information, encourages the network to attend to object edges, and refines the map generated in the first stage.
Further, the first stage is a preliminary feature fusion stage, and ResNet50 is selected as the backbone network to ensure that deep features can be effectively extracted.
The purpose of this stage is to obtain a coarse map of the camouflaged object. Based on considerations of computational efficiency and detection accuracy, the following two modules are proposed:
(1) channel attention module:
applying a channel attention mechanism to the output of each layer of the encoder to retain useful information in the shallow features and reduce redundant information;
it aims to extract valid information and can be expressed as:
$$f_i^{CA} = \mathrm{Attention}(f_i^{E}), \quad i = 1, \dots, 4$$
wherein Attention denotes the channel attention module, $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{E}$ is the i-th encoding block of the encoding stage.
The channel attention module has four layers: the first convolution layer has size 1 × 1 and reduces the number of channels to 32; it is followed by two 3 × 3 convolutional layers, each followed by normalization, after which the feature map still has 32 channels and an unchanged spatial size; finally, the final features are obtained through a ReLU function;
(2) global feature and local feature fusion module:
This module is implemented in the decoder stage, whose structure is nearly symmetric to that of the encoder. Each decoder layer contains two 3 × 3 convolutional layers, each followed by normalization and a ReLU function. The module also introduces cSE and sSE blocks to obtain more accurate detection results; these blocks better model the dependencies between different channels and guide the network to attend to the features related to the camouflaged object. In addition, a pyramid pooling module is applied to the output of the last encoder layer to obtain global features. The input to each decoder layer is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer:
$$f_1^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_4^{CA},\ \mathrm{PPM}(f_4^{E}))\big)$$
$$f_i^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_{5-i}^{CA},\ \mathrm{Upsample}(f_{i-1}^{D}))\big), \quad i = 2, 3, 4$$
wherein GLFA denotes a decoder layer of the global and local feature fusion module, PPM the introduced pyramid pooling module, Cat the concatenation of feature maps, and Upsample the upsampling process; $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{D}$ is the output of the i-th layer of the global and local feature fusion module, counted upward from the deepest decoder layer.
In this way, the decoder can learn more comprehensive semantic information. A prediction module is constructed to obtain the final result; it contains a 3 × 3 convolutional layer, an ELU activation function, and a 1 × 1 convolutional layer, and can be expressed as:
$$pred_c = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{D})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{D}$ is the output of the fourth (topmost) decoder layer, so that the prediction map and the final ground-truth map have the same size.
Further, the second stage is an optimization stage, which aims to further distinguish the camouflaged object from the background by using the edge information of the object. An edge ground-truth map is introduced as supervision at this stage, so that the model focuses more on the differences at the object's edges. The method comprises the following steps:
The optimization module uses the same decoder structure as the global and local feature fusion module and forms a parallel correspondence with it. The input to each layer in the optimization module is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer. The optimization module can thus further exploit the features of the preceding feature fusion stage, constraining the feature extraction process while making feature reconstruction more comprehensive, thereby refining the final prediction map;
the final prediction at this stage can be expressed as:
$$pred_e = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{R})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{R}$ is the output of the fourth (topmost) layer of the optimization decoder at this stage, so that the prediction map and the final edge ground-truth map have the same size.
The loss of the two-stage optimization network is the sum of the prediction losses of the two decoders; binary cross-entropy is selected as the loss function, and the overall loss function is:
$$L_{total} = L_{BCE}(pred_c,\ GT) + L_{BCE}(pred_e,\ GT_{edge})$$
wherein $L_{total}$ represents the overall loss; $L_{BCE}(pred_c, GT)$ represents the loss of the preliminary fusion stage, where pred_c is the prediction result of the preliminary feature fusion module and GT is the ground-truth map; $L_{BCE}(pred_e, GT_{edge})$ represents the loss of the optimization module, where pred_e is the prediction result of the edge optimization module and GT_edge is the edge ground-truth map computed from the ground-truth map.
Compared with the prior art, the invention has the following beneficial effects:
(1) good performance: results on public camouflaged object detection datasets show that the method achieves the best results on four different evaluation metrics;
(2) high efficiency: in the adopted framework, only the extracted useful features are fed into the decoding process, which greatly reduces the number of convolution operations and gives the method greater practical value.
Drawings
FIG. 1 is a schematic diagram of a model framework;
FIG. 2 is a schematic view of a channel attention module.
Detailed Description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples. The following examples are intended to illustrate the invention but not to limit its scope.
The two-stage optimization network is divided into two stages. The first stage follows an encoder-decoder structure, with ResNet50 as the feature-extraction backbone, and locates and identifies the camouflaged object to form a coarse map. The second stage uses a parallel decoder structure, takes object edges as boundary information, encourages the network to attend to object edges, and refines the map generated in the first stage.
As a preferred scheme of the above embodiment, the first stage is a preliminary feature fusion stage, and ResNet50 is selected as the backbone network to ensure that deep features can be effectively extracted.
The purpose of this stage is to obtain a coarse map of the camouflaged object. Based on considerations of computational efficiency and detection accuracy, the following two modules are proposed:
(1) channel attention module:
In a CNN, different channels respond to different semantics, and features at different levels contain different degrees of detail and global information. Hence, in the ResNet50-based encoder's feature extraction, although the output of a deep convolution covers a larger receptive field of the original image, much detail information is lost; the output of a shallow layer retains detail information, but not all of it is useful. A channel attention mechanism is therefore applied to the output of each encoder layer to retain the useful information in the shallow features and reduce redundant information;
it aims to extract valid information and can be expressed as:
$$f_i^{CA} = \mathrm{Attention}(f_i^{E}), \quad i = 1, \dots, 4$$
wherein Attention denotes the channel attention module, $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{E}$ is the i-th encoding block of the encoding stage.
In addition, since the number of channels input to each decoder layer becomes 32 after the channel attention module, the number of model parameters is greatly reduced, which shrinks the model and speeds up training and inference. The channel attention module has four layers: the first convolution layer has size 1 × 1 and reduces the number of channels to 32; it is followed by two 3 × 3 convolutional layers, each followed by normalization, after which the feature map still has 32 channels and an unchanged spatial size; finally, the final features are obtained through a ReLU function;
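For illustration, the following is a minimal PyTorch sketch of one possible implementation of the channel attention module described above. The framework choice (PyTorch), the use of BatchNorm2d as the "normalization", and all identifier names are assumptions for illustration and are not specified in the disclosure:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch of the four-layer channel attention block described above."""

    def __init__(self, in_channels: int, out_channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            # 1x1 convolution reduces the channel count to 32
            nn.Conv2d(in_channels, out_channels, kernel_size=1),
            # two 3x3 convolutions, each followed by normalization;
            # channel count (32) and spatial size are unchanged
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            # final ReLU yields the attended features
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

# One attention module per encoder stage; the channel counts are those
# of the four ResNet50 stages.
attention_modules = nn.ModuleList(
    ChannelAttention(c) for c in (256, 512, 1024, 2048)
)
```

In this sketch, each encoder stage receives its own attention block, matching the per-layer application described above.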
(2) global feature and local feature fusion module:
This module is implemented in the decoder stage, whose structure is nearly symmetric to that of the encoder. Each decoder layer contains two 3 × 3 convolutional layers, each followed by normalization and a ReLU function. The module also introduces cSE and sSE blocks to obtain more accurate detection results; these blocks better model the dependencies between different channels and guide the network to attend to the features related to the camouflaged object. In addition, a pyramid pooling module is applied to the output of the last encoder layer to obtain global features. The input to each decoder layer is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer:
$$f_1^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_4^{CA},\ \mathrm{PPM}(f_4^{E}))\big)$$
$$f_i^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_{5-i}^{CA},\ \mathrm{Upsample}(f_{i-1}^{D}))\big), \quad i = 2, 3, 4$$
wherein GLFA denotes a decoder layer of the global and local feature fusion module, PPM the introduced pyramid pooling module, Cat the concatenation of feature maps, and Upsample the upsampling process; $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{D}$ is the output of the i-th layer of the global and local feature fusion module, counted upward from the deepest decoder layer.
In this way, the decoder can learn more comprehensive semantic information. A prediction module is constructed to obtain the final result; it contains a 3 × 3 convolutional layer, an ELU activation function, and a 1 × 1 convolutional layer, and can be expressed as:
$$pred_c = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{D})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{D}$ is the output of the fourth (topmost) decoder layer, so that the prediction map and the final ground-truth map have the same size.
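The following PyTorch sketch shows one possible form of a decoder layer of the global and local feature fusion module, together with the pyramid pooling module and the prediction head. The cSE/sSE blocks follow the common concurrent squeeze-and-excitation formulation; all hyperparameters here (reduction ratio, pooling bin sizes, bilinear upsampling) are illustrative assumptions rather than values stated in the disclosure:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SCSE(nn.Module):
    """Concurrent channel (cSE) and spatial (sSE) squeeze-and-excitation."""

    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        self.cse = nn.Sequential(                    # channel recalibration
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.cse(x) + x * self.sse(x)

class PPM(nn.Module):
    """Minimal PSPNet-style pyramid pooling over the deepest encoder features."""

    def __init__(self, in_channels: int, out_channels: int = 32,
                 bins=(1, 2, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_channels, out_channels, 1),
                          nn.ReLU(inplace=True))
            for b in bins)
        self.fuse = nn.Conv2d(in_channels + len(bins) * out_channels,
                              out_channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        size = x.shape[-2:]
        pooled = [F.interpolate(b(x), size=size, mode="bilinear",
                                align_corners=False) for b in self.branches]
        return self.fuse(torch.cat([x] + pooled, dim=1))

class GLFALayer(nn.Module):
    """One decoder layer: Cat(skip, Upsample(below)) -> two 3x3 conv blocks
    (each with normalization and ReLU) -> scSE recalibration."""

    def __init__(self, in_channels: int, out_channels: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True),
        )
        self.scse = SCSE(out_channels)

    def forward(self, skip: torch.Tensor, below: torch.Tensor) -> torch.Tensor:
        below = F.interpolate(below, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
        return self.scse(self.conv(torch.cat([skip, below], dim=1)))

class PredictionHead(nn.Module):
    """3x3 conv -> ELU -> 1x1 conv, then upsample to the ground-truth size."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv1 = nn.Conv2d(channels, 1, 1)

    def forward(self, x: torch.Tensor, out_size) -> torch.Tensor:
        x = self.conv1(F.elu(self.conv3(x)))
        return F.interpolate(x, size=out_size, mode="bilinear",
                             align_corners=False)
```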
As a preferred solution of the above embodiment, the second stage is an optimization stage. Because the high similarity between object and environment makes camouflaged object detection challenging, this stage aims to further distinguish the camouflaged object from the background by using the object's edge information. An edge ground-truth map is introduced as supervision at this stage, so that the model focuses more on the differences at the object's edges. The method comprises the following steps:
The optimization module uses the same decoder structure as the global and local feature fusion module and forms a parallel correspondence with it. The input to each layer in the optimization module is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer. The optimization module can thus further exploit the features of the preceding feature fusion stage, constraining the feature extraction process while making feature reconstruction more comprehensive, thereby refining the final prediction map;
the final prediction at this stage can be expressed as:
$$pred_e = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{R})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{R}$ is the output of the fourth (topmost) layer of the optimization decoder at this stage, so that the prediction map and the final edge ground-truth map have the same size.
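Putting the pieces together, the sketch below shows how the two stages might be assembled: a ResNet50 encoder with per-stage channel attention, and two parallel decoders that share the attended skip features, one producing the coarse camouflage map pred_c and the other the edge-supervised map pred_e. It assumes the ChannelAttention, PPM, GLFALayer, and PredictionHead sketches above are in scope, and a recent torchvision for the pretrained backbone; the class name and wiring are illustrative assumptions:

```python
import torch.nn as nn
from torchvision.models import resnet50

class TwoStageNet(nn.Module):
    """Illustrative assembly of the two-stage optimization network."""

    def __init__(self):
        super().__init__()
        r = resnet50(weights="IMAGENET1K_V1")   # pretrained backbone (assumed)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.stages = nn.ModuleList([r.layer1, r.layer2, r.layer3, r.layer4])
        self.ca = nn.ModuleList(
            ChannelAttention(c) for c in (256, 512, 1024, 2048))
        self.ppm = PPM(2048)                    # global context, 32 channels
        # two parallel decoders over the same attended skip features
        self.dec_c = nn.ModuleList(GLFALayer(64) for _ in range(4))
        self.dec_e = nn.ModuleList(GLFALayer(64) for _ in range(4))
        self.head_c = PredictionHead()          # coarse camouflage map logits
        self.head_e = PredictionHead()          # edge map logits

    def forward(self, x):
        size = x.shape[-2:]
        feats, f = [], self.stem(x)
        for stage in self.stages:               # four encoder blocks
            f = stage(f)
            feats.append(f)
        skips = [att(f) for att, f in zip(self.ca, feats)]  # 32-ch skips
        g = self.ppm(feats[-1])                 # PPM on the deepest features
        out_c = out_e = g
        # decode from the deepest skip upward, mirroring the encoder
        for skip, dc, de in zip(reversed(skips), self.dec_c, self.dec_e):
            out_c = dc(skip, out_c)
            out_e = de(skip, out_e)
        return self.head_c(out_c, size), self.head_e(out_e, size)
```

Keeping the two decoders parallel, rather than sequential, lets the edge branch constrain the shared 32-channel skip features while the coarse branch is trained, which is the refinement effect the text describes.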
The loss of the two-stage optimization network is the sum of the prediction losses of the two decoders; binary cross-entropy is selected as the loss function, and the overall loss function is:
$$L_{total} = L_{BCE}(pred_c,\ GT) + L_{BCE}(pred_e,\ GT_{edge})$$
wherein $L_{total}$ represents the overall loss; $L_{BCE}(pred_c, GT)$ represents the loss of the preliminary fusion stage, where pred_c is the prediction result of the preliminary feature fusion module and GT is the ground-truth map; $L_{BCE}(pred_e, GT_{edge})$ represents the loss of the optimization module, where pred_e is the prediction result of the edge optimization module and GT_edge is the edge ground-truth map computed from the ground-truth map.
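Finally, a sketch of the training loss. The disclosure states only that GT_edge is computed from the ground-truth map; deriving it with a morphological gradient (dilation minus erosion via max-pooling) is an assumption for illustration, as is the use of the logits-based BCE:

```python
import torch
import torch.nn.functional as F

def edge_ground_truth(gt: torch.Tensor, width: int = 3) -> torch.Tensor:
    """Derive an edge ground-truth map from a binary mask GT of shape
    (N, 1, H, W); a morphological gradient is one plausible construction."""
    pad = width // 2
    dilated = F.max_pool2d(gt, width, stride=1, padding=pad)
    eroded = -F.max_pool2d(-gt, width, stride=1, padding=pad)
    return (dilated - eroded).clamp(0, 1)

def total_loss(pred_c: torch.Tensor, pred_e: torch.Tensor,
               gt: torch.Tensor) -> torch.Tensor:
    """L_total = L_BCE(pred_c, GT) + L_BCE(pred_e, GT_edge); both
    predictions are taken as raw logits from the prediction heads."""
    gt_edge = edge_ground_truth(gt)
    loss_c = F.binary_cross_entropy_with_logits(pred_c, gt)
    loss_e = F.binary_cross_entropy_with_logits(pred_e, gt_edge)
    return loss_c + loss_e
```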
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (3)

1. A camouflaged object detection method based on a two-stage optimization network, characterized in that the two-stage optimization network is divided into two stages: the first stage follows an encoder-decoder structure, with ResNet50 as the feature-extraction backbone, and locates and identifies the camouflaged object to form a coarse map; the second stage uses a parallel decoder structure, takes object edges as boundary information, encourages the network to attend to object edges, and refines the map generated in the first stage.
2. The camouflaged object detection method based on a two-stage optimization network according to claim 1, wherein the first stage is a preliminary feature fusion stage, and ResNet50 is selected as the backbone network to ensure that deep features can be effectively extracted;
the purpose of this stage is to obtain a coarse map of the camouflaged object, and based on considerations of computational efficiency and detection accuracy, the following two modules are proposed:
(1) channel attention module:
applying a channel attention mechanism to the output of each layer of the encoder to retain useful information in the shallow features and reduce redundant information;
it aims to extract valid information and can be expressed as:
$$f_i^{CA} = \mathrm{Attention}(f_i^{E}), \quad i = 1, \dots, 4$$
wherein Attention denotes the channel attention module, $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{E}$ is the i-th encoding block of the encoding stage;
the channel attention module has four layers: the first convolution layer has size 1 × 1 and reduces the number of channels to 32; it is followed by two 3 × 3 convolutional layers, each followed by normalization, after which the feature map still has 32 channels and an unchanged spatial size; finally, the final features are obtained through a ReLU function;
(2) global feature and local feature fusion module:
the module is implemented in the decoder stage, with a structure nearly symmetric to that of the encoder; each decoder layer contains two 3 × 3 convolutional layers, each followed by normalization and a ReLU function; the module also introduces cSE and sSE blocks to obtain more accurate detection results, which better model the dependencies between different channels and guide the network to attend to the features related to the camouflaged object; in addition, a pyramid pooling module is applied to the output of the last encoder layer to obtain global features, and the input to each decoder layer is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer:
$$f_1^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_4^{CA},\ \mathrm{PPM}(f_4^{E}))\big)$$
$$f_i^{D} = \mathrm{GLFA}\big(\mathrm{Cat}(f_{5-i}^{CA},\ \mathrm{Upsample}(f_{i-1}^{D}))\big), \quad i = 2, 3, 4$$
wherein GLFA denotes a decoder layer of the global and local feature fusion module, PPM the introduced pyramid pooling module, Cat the concatenation of feature maps, and Upsample the upsampling process; $f_i^{CA}$ is the output of the i-th (bottom-up) channel attention module, and $f_i^{D}$ is the output of the i-th layer of the global and local feature fusion module, counted upward from the deepest decoder layer;
in this way, the decoder can learn more comprehensive semantic information, and a prediction module is constructed to obtain the final result, which contains a 3 × 3 convolutional layer, an ELU activation function, and a 1 × 1 convolutional layer, and can be expressed as:
$$pred_c = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{D})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{D}$ is the output of the fourth (topmost) decoder layer, so that the prediction map and the final ground-truth map have the same size.
3. The camouflaged object detection method based on a two-stage optimization network according to claim 2, wherein the second stage is an optimization stage, which aims to further distinguish the camouflaged object from the background by using the object's edge information; an edge ground-truth map is introduced as supervision at this stage, so that the model focuses more on the differences at the object's edges; the method comprises the following steps:
the optimization module uses the same decoder structure as the global and local feature fusion module and forms a parallel correspondence with it; the input to each layer in the optimization module is the concatenation of the output of the corresponding channel attention module and the upsampled output of the previous layer, so that the optimization module can further exploit the features of the preceding feature fusion stage, constraining the feature extraction process while making feature reconstruction more comprehensive, thereby refining the final prediction map;
the final prediction at this stage can be expressed as:
$$pred_e = \mathrm{Upsample}\big(\mathrm{Conv}_{1 \times 1}(\mathrm{ELU}(\mathrm{Conv}_{3 \times 3}(f_4^{R})))\big)$$
where ELU denotes the ELU activation function, $\mathrm{Conv}_{3 \times 3}$ and $\mathrm{Conv}_{1 \times 1}$ denote the two convolution layers applied here, and Upsample denotes upsampling; $f_4^{R}$ is the output of the fourth (topmost) layer of the optimization decoder at this stage, so that the prediction map and the final edge ground-truth map have the same size;
the loss of the two-stage optimization network is the sum of the prediction losses of the two decoders; binary cross-entropy is selected as the loss function, and the overall loss function is:
$$L_{total} = L_{BCE}(pred_c,\ GT) + L_{BCE}(pred_e,\ GT_{edge})$$
wherein $L_{total}$ represents the overall loss; $L_{BCE}(pred_c, GT)$ represents the loss of the preliminary fusion stage, where pred_c is the prediction result of the preliminary feature fusion module and GT is the ground-truth map; $L_{BCE}(pred_e, GT_{edge})$ represents the loss of the optimization module, where pred_e is the prediction result of the edge optimization module and GT_edge is the edge ground-truth map computed from the ground-truth map.
CN202111243490.4A 2021-10-25 2021-10-25 Camouflage object detection method based on two-stage optimization network Pending CN114187230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111243490.4A CN114187230A (en) 2021-10-25 2021-10-25 Camouflage object detection method based on two-stage optimization network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111243490.4A CN114187230A (en) 2021-10-25 2021-10-25 Camouflage object detection method based on two-stage optimization network

Publications (1)

Publication Number Publication Date
CN114187230A true CN114187230A (en) 2022-03-15

Family

ID=80601455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111243490.4A Pending CN114187230A (en) 2021-10-25 2021-10-25 Camouflage object detection method based on two-stage optimization network

Country Status (1)

Country Link
CN (1) CN114187230A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115223018A (en) * 2022-06-08 2022-10-21 东北石油大学 Cooperative detection method and device for disguised object, electronic device and storage medium
CN115631346A (en) * 2022-11-11 2023-01-20 南京航空航天大学 Disguised object detection method and system based on uncertainty modeling
CN115631346B (en) * 2022-11-11 2023-07-18 南京航空航天大学 Uncertainty modeling-based camouflage object detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination