CN114359554A - Image semantic segmentation method based on multi-receptive-field context semantic information - Google Patents

Image semantic segmentation method based on multi-receptive-field context semantic information Download PDF

Info

Publication number
CN114359554A
CN114359554A · Application CN202111413182.1A
Authority
CN
China
Prior art keywords
image
receptive
feature
semantic information
convolution
Prior art date
Legal status: Pending (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis.)
Application number
CN202111413182.1A
Other languages
Chinese (zh)
Inventor
刘亮亮
常靖
Current Assignee: Henan Agricultural University (the listed assignees may be inaccurate; Google has not performed a legal analysis.)
Original Assignee
Henan Agricultural University
Priority date
Filing date
Publication date
Application filed by Henan Agricultural University
Priority application: CN202111413182.1A
Publication: CN114359554A (pending)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image semantic segmentation method based on multi-receptive-field context semantic information, comprising the following steps: step 1, converting an input image into a pixel matrix through a convolution operation; step 2, converting the same pixel matrix into multiple feature maps carrying multi-receptive-field context semantic information, using dilated convolutions with different dilation rates; step 3, performing feature extraction and downsampling on these feature maps through transformer encoders in different subnetworks to obtain several downsampled feature maps with different receptive fields; step 4, upsampling the downsampled feature maps step by step through a decoder to obtain feature maps of the same size and dimension and generate the final feature-fusion map; and step 5, completing the image segmentation prediction from the feature-fusion map through a convolutional neural network. The method can be applied effectively to image semantic segmentation: deep low-resolution features and fine-grained features are not lost, memory consumption is low, the effect is marked, and the method is easy to popularize.

Description

Image semantic segmentation method based on multi-receptive-field context semantic information
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to an image semantic segmentation method based on multi-receptive-field context semantic information.
Background
Image semantic segmentation is the basis of image analysis and of many applications, such as object recognition in autonomous driving systems, unmanned-aerial-vehicle applications, and wearable devices. An image is composed of pixels, and semantic segmentation, as the name implies, groups or segments pixels according to the differences in semantic meaning they express. The goal of semantic segmentation is to label the class to which each pixel of the image belongs. Thus, semantic segmentation identifies target regions in an image at the pixel level, i.e., it annotates the object class of every pixel in the image.
Many segmentation methods have emerged during the development of image segmentation, such as simple pixel-level thresholding methods, pixel-clustering-based segmentation methods, and graph-partitioning segmentation methods, but these methods struggle to meet current demands for high-precision segmentation. With the successive proposal of convolutional-neural-network-based semantic segmentation methods, represented by the fully convolutional network (FCN), the architectures of advanced image semantic segmentation models are now almost all built on convolutional networks and generally follow one pattern: the network is divided into an encoder and a decoder; the encoder is usually an image classification network, also called the backbone, pre-trained on a large corpus (e.g., ImageNet); the decoder aggregates the features from the encoder and converts them into the final feature map for prediction. Previous work on segmentation architectures has generally focused on the decoder and its aggregation strategy, but in practice the size of the image features and the backbone architecture are critical to the whole model, since information lost in the encoder cannot easily be recovered in the decoder. In addition, existing models pay little attention to acquiring diverse feature information or to improving the extraction and screening of features in the encoder.
In the prior art, an encoder gradually downsamples the input image through operations such as convolution and pooling while extracting feature information; downsampling gradually enlarges the model's receptive field and abstracts low-level features into high-level features. However, downsampling has significant drawbacks, especially in pixel-level prediction tasks: low-resolution features and fine-grained features are lost in the deeper layers of the model, and such lost information is difficult to recover in the decoder. Although pixel feature resolution and granularity may matter little for tasks such as image classification, they are vital for pixel-based segmentation. Ideally, the model should minimize the loss of feature information during downsampling, i.e., keep the resolution of the output feature map equal or close to the resolution of the input image, with no apparent difference between input and output.
In addition, in the prior art, convolution and nonlinear modules together form the basic computing unit of an image analysis network. Convolution is a linear operator with a limited receptive field, and the limited receptive field and limited expressive power of a single convolution require stacking into a very deep architecture to obtain a sufficiently wide context and sufficiently strong representational capacity. However, this produces many intermediate representations and consumes a large amount of memory. To keep memory consumption at a level feasible on existing computer architectures, the intermediate representations have to be downsampled.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide an image semantic segmentation method based on multi-receptive-field context semantic information that overcomes the above deficiencies in the prior art. The method has simple steps, a reasonable design, and is convenient to implement, and can be applied effectively to image semantic segmentation. It adopts an encoder-decoder structure; uses multi-receptive-field dilated convolutions to obtain features of different resolutions from the same image; uses the transformer as the basic computational building block of the encoder, reassembling image-like feature representations at various resolutions; and uses a convolutional decoder to progressively combine these representations into the final pixel prediction, completing the identification and segmentation of the target regions. At the same time, the intermediate representations are reduced, deep low-resolution features and fine-grained features are not lost, memory consumption is low, the effect is marked, and the method is easy to popularize.
In order to solve the technical problems, the invention adopts the technical scheme that: an image semantic segmentation method based on multi-receptive field context semantic information comprises the following steps:
Step 1: convert an input image into a pixel matrix through a convolution operation;
Step 2: convert the same pixel matrix into multiple feature maps carrying multi-receptive-field context semantic information, using dilated convolutions with different dilation rates;
Step 3: perform feature extraction and downsampling on the feature maps through transformer encoders in different subnetworks, obtaining several downsampled feature maps with different receptive fields;
Step 4: upsample the downsampled feature maps step by step through a decoder to obtain feature maps of the same size and dimension, and generate the final feature-fusion map;
Step 5: complete the image segmentation prediction from the feature-fusion map through a convolutional neural network.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the output of the dilated convolution in step 2 is:

y_i = Σ_{j=0}^{m-1} w[j] · x_{i + r·j}

where y_i denotes the i-th output of the dilated convolution, the dilated-convolution kernel has size k × k with dilation rate r, x_i is the i-th input feature map of the dilated convolution placed before the transformer subnetwork, and m is the length of the filter matrix w[k] with kernel size k × k.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the transformer in step 3 comprises a multi-head self-attention model and a multilayer perceptron model, and the inputs of both models are normalized.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the output of the multi-head self-attention model is:

Y_out = concat[y_1, y_2, …, y_i, …, y_{h-1}]
y_i = Attention(qW_i^q, jW_i^j, dW_i^d)

where Y_out ∈ R^{q×j×d}, concat[·] denotes the concatenation operation, i ∈ [1, h-1], h is the number of self-attention heads, each head has its own set of learnable weight matrices (W_i^q, W_i^j, W_i^d), W ∈ R^{q×j×d} is the projection weight matrix, and q, j, d are the three dimensions of the first feature map.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, a reshaping layer is arranged at the output of the multilayer perceptron model; the reshaping layer is a reshape layer used to change the dimensions of the input data.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the step-by-step upsampling performed by the decoder in step 4 to obtain feature maps of the same size and dimension specifically comprises feature fusion between the encoder and the decoder, and fusion of the decoder outputs.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the feature fusion between the encoder and the decoder specifically comprises: fusing the feature maps between corresponding layers of the encoder and the decoder through a skip-connection operation; the skip connections reduce gradient vanishing and network degradation.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, the fusion of the decoder outputs specifically comprises: the decoder progressively upsamples the feature maps from the different encoders layer by layer, outputs features of the same size and dimension, and then fuses the output feature maps through a concat() operation.
In the above image semantic segmentation method based on multi-receptive-field context semantic information, in step 5 the convolutional neural network uses a softmax activation function to complete the image segmentation prediction.
Compared with the prior art, the invention has the following advantages:
1. The method has simple steps, a reasonable design, and is convenient to implement.
2. The invention adopts an encoder-decoder framework: multi-receptive-field dilated convolutions obtain features of different sizes and resolutions from the same image; the transformer serves as the basic computational building block of the encoder, reassembling image-like feature representations at various resolutions; and a convolutional decoder progressively combines these representations into the final pixel prediction to complete the identification and segmentation of the target regions, while reducing the intermediate representations.
3. After computing the initial image embedding, the transformer abandons explicit downsampling and keeps a representation of constant dimension through all processing stages, so that fine-grained pixel predictions remain consistent with the global prediction.
4. The invention has marked advantages in accuracy and training efficiency. Architecturally, it is a simple and extensible framework: the multi-receptive-field spatial features and the transformer encoders can be added or removed according to the actual situation. Compared with other semantic segmentation methods, the introduction of multiple receptive fields and the transformer improves the generalization of the model.
5. The method can be applied effectively to image semantic segmentation: deep low-resolution features and fine-grained features are not lost, memory consumption is low, the effect is marked, and the method is easy to popularize.
In conclusion, the method has simple steps, a reasonable design, and is convenient to implement; it can be applied effectively to image semantic segmentation, does not lose deep low-resolution or fine-grained features, consumes little memory, has a marked effect, and is easy to popularize.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
As shown in FIG. 1, the image semantic segmentation method based on multi-receptive-field context semantic information of the present invention comprises the following steps:
Step 1: convert an input image into a pixel matrix through a convolution operation;
Step 2: convert the same pixel matrix into multiple feature maps carrying multi-receptive-field context semantic information, using dilated convolutions with different dilation rates;
the output of the dilation convolution is:
Figure BDA0003375049520000051
wherein, yiRepresents the ith output of the expansion convolution, the convolution kernel of the expansion convolution has the size of k x k, the expansion rate is r, xiFor the ith input signature mapping of the dilated convolution before the converter subnetwork, m is the filter matrix w k with convolution kernel size k x k]Length of (d).
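As a minimal illustration (not the patent's implementation), the one-dimensional form of this formula can be sketched in NumPy; the input signal, filter taps, and dilation rates below are toy assumptions:

```python
import numpy as np

def dilated_conv1d(x, w, r):
    """Dilated 1-D convolution: y[i] = sum_j w[j] * x[i + r*j]."""
    m = len(w)                       # filter length m
    span = r * (m - 1) + 1           # pixels the dilated kernel spans
    n_out = len(x) - span + 1        # keep only valid output positions
    return np.array([sum(w[j] * x[i + r * j] for j in range(m))
                     for i in range(n_out)])

x = np.arange(10, dtype=float)       # toy input feature row: 0, 1, ..., 9
w = np.array([1.0, 1.0, 1.0])        # m = 3 taps
y1 = dilated_conv1d(x, w, r=1)       # ordinary convolution, spans 3 pixels
y2 = dilated_conv1d(x, w, r=2)       # dilation 2, spans 5 pixels
```

With the same three-tap filter, rate 2 sums every second input pixel, so the first output grows from x[0]+x[1]+x[2] = 3 to x[0]+x[2]+x[4] = 6 while the number of valid outputs shrinks.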
For example, in a target segmentation neural network, prediction generally has to be performed on the last feature map, and how many pixels of the original image can be mapped to one point on that feature map, i.e., the receptive field, determines the upper limit of the size the network can detect. Downsampling guarantees the extraction of image feature information, but it makes small targets hard to detect; yet every pixel feature in the image contributes to the segmentation result, and undetected objects degrade it. In neural networks, convolution kernels filter the context semantic information of pixels from the input image. If the parameters of a convolution kernel are fixed, the size of the extracted feature information is fixed and the obtained information is limited, which easily causes the model to focus only on segmentation targets of a certain class or size and thus biases its segmentation performance. In addition, the receptive field of a convolution kernel is limited and attends only to local features. To alleviate these problems of conventional convolution, the invention generates different receptive-field features from different kernel sizes: it adopts dilated convolution, whose dilation rate can be set, as the initialization convolution for multi-receptive-field feature acquisition, and uses multi-receptive-field convolutions to acquire feature information at different spatial scales from the same image, thereby providing richer features and better decision support for the segmentation model.
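The growth of the receptive field with the dilation rate, which motivates the multi-branch design above, follows the standard formula k + (k-1)(r-1). A small sketch (the rates 1, 2, 4 are illustrative assumptions, not values fixed by the patent):

```python
def effective_kernel(k, r):
    """Side length of the region a k x k kernel covers at dilation rate r."""
    return k + (k - 1) * (r - 1)

rates = [1, 2, 4]                              # one illustrative rate per branch
fields = {r: effective_kernel(3, r) for r in rates}
# the same 3x3 kernel covers 3, 5, and 9 pixels per side at rates 1, 2, 4,
# so each subnetwork branch sees the image at a different spatial scale
```

The parameter count stays that of a 3×3 kernel in every branch; only the spacing of the taps changes, which is why multiple receptive fields come essentially for free.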
Step 3: perform feature extraction and downsampling on the feature maps carrying multi-receptive-field context semantic information through transformer encoders in different subnetworks, obtaining several downsampled feature maps with different receptive fields;
the converter comprises a multi-head self-attention mechanism model and a multilayer perceptron model, and the input of the multi-head self-attention mechanism model and the input of the multilayer perceptron model are normalized;
the output of the multi-head self-attention mechanism model is as follows:
Yout=concat[y1,y2,...yi...yh-1]
yi=Attentation(qWi q,jWi j,dWi d)
wherein, Yout∈Rq*j*d,concat[]Denotes a join operation, i ∈ [1, h-1 ]]H is the self-attention block number, each block has its own set of learnable weight matrices (W)i q,Wi j,Wi d) W is the weight matrix of the projection, W is the Rq*j*dQ, j, d are the three-dimensional dimensions of the first feature map;
in the task of image analysis, the goal of the single-head self-attention mechanism is to extract the interaction relation among all pixels by performing global context feature coding on each pixel. However, the single-headed self-attention mechanism is limited in that it focuses only on one specific location. In the present invention, a multi-head attention Mechanism (MHA) is used as a component of the proposed converter. Multiple-head attention is one of the attention mechanisms, and multiple independent parallel attentions can be simultaneously performed on different important positions. In particular, in a multi-headed attention mechanism, different randomly initialized mapping matrices can map input vectors to different subspaces, which helps the model analyze the input sequence from different angles. In addition, the multi-head attention mechanism can connect multiple self-attentions, and the total computational consumption is reduced by reducing dimensionality.
A reshaping layer is arranged at the output of the multilayer perceptron model; the reshaping layer is a reshape layer used to change the dimensions of the input data.
So that the output feature vector of the global branch has the same dimension as the output feature vector of the bottleneck layer in the local branch, the invention adds a reshape layer behind the multilayer perceptron model; it changes the dimensions of the input data while keeping the content unchanged.
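The transformer block described above (normalized inputs to the self-attention and perceptron sublayers, followed by the reshape layer) can be wired up as below; the zero-valued stand-in sublayers and the 4×4×8 map size are assumptions used only to check the wiring, not the patent's configuration:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    sd = x.std(-1, keepdims=True)
    return (x - mu) / (sd + eps)

def transformer_block(tokens, attn, mlp):
    """Pre-norm block: inputs to both sublayers are normalized first."""
    x = tokens + attn(layer_norm(tokens))   # multi-head self-attention sublayer
    x = x + mlp(layer_norm(x))              # multilayer-perceptron sublayer
    return x

h = w = 4
tokens = np.random.default_rng(1).standard_normal((h * w, 8))
zero_sublayer = lambda z: np.zeros_like(z)  # no-op stand-ins for the sublayers
out = transformer_block(tokens, zero_sublayer, zero_sublayer)
fmap = out.reshape(h, w, 8)                 # reshape layer: token rows -> feature map
```

The final reshape changes only the layout, from a (16, 8) token matrix to a (4, 4, 8) image-like map, leaving every value untouched, which is exactly the "dimension changed, content unchanged" behavior described above.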
Step 4: upsample the downsampled feature maps step by step through a decoder to obtain feature maps of the same size and dimension, and generate the final feature-fusion map;
the method comprises the steps of feature fusion between an encoder and a decoder and decoder output feature fusion; the specific process of feature fusion between the encoder and the decoder comprises the following steps: fusing the characteristic diagrams between corresponding layers of the encoder and the decoder through skip-connection operation, and reducing gradient elimination and network degradation through skip-connection jumping connection; the specific process of the decoder output feature fusion comprises the following steps: the decoder performs layer-by-layer and continuous up-sampling on the feature maps from different encoders, outputs features with the same size and dimension, and then fuses the output feature maps through a concat () method.
Skip connections effectively reduce the problems of gradient vanishing and network degradation and make training easier: during backpropagation, deep gradients are transmitted back to the shallow layers more easily. Thanks to this structure, the number of layers of the neural network can be set more freely, feature reuse is improved, and the problem of feature loss during the encoder's downsampling is alleviated.
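A hedged sketch of one decoder step with a skip connection, followed by concat() fusion of the decoder outputs; the nearest-neighbour upsampling, additive skip fusion, and map sizes are illustrative assumptions, not details fixed by the patent:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def decode_step(deep, skip):
    """One decoder step: upsample deep features, add the encoder skip map."""
    return upsample2x(deep) + skip          # skip-connection fusion

deep = np.ones((4, 4, 3))                   # low-resolution decoder features
skip = np.full((8, 8, 3), 0.5)              # matching encoder features
fused = decode_step(deep, skip)             # (8, 8, 3), every value 1.5

# decoder-output fusion: concatenate same-sized maps from each subnetwork
branch_a, branch_b = fused, fused
fusion_map = np.concatenate([branch_a, branch_b], axis=-1)   # (8, 8, 6)
```

The skip path adds the encoder map at matching resolution (recovering detail lost in downsampling), while the final concatenation stacks the subnetwork outputs along the channel axis to form the feature-fusion map.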
Step 5: complete the image segmentation prediction from the feature-fusion map through a convolutional neural network;
The convolutional neural network uses a softmax activation function to complete the image segmentation prediction.
To verify the validity of the method, the invention is evaluated on the object segmentation task of the PASCAL VOC 2012 data set, whose main purpose is to establish a challenge for the recognition of visual objects in real scenes. The PASCAL VOC 2012 data set contains 20 object classes and 1 background class; 1464 images are used for training, 1449 for validation, and 1456 for testing, and the segmentation objects in each image of the data set are labeled. Some of the test results are shown in Table 1 (mIoU is the mean intersection over union, i.e., the average segmentation accuracy).
TABLE 1. PASCAL VOC 2012 data set test results

Model           bird   bottle  bus    car    chair  cow    mbike  plant  mIoU (%)
DSNA            76.8   53.9    80.6   67.7   21.0   70.2   74.8   45.2   60.1
SRA-CFM         69.5   65.6    81.0   69.2   30.0   68.7   71.7   50.4   61.8
SegNet          78.0   61.2    82.5   77.5   29.9   68.7   74.0   56.3   66.7
DGCM            79.8   69.8    86.7   81.2   30.8   70.5   78.0   56.7   69.8
GCRF            83.3   71.8    89.0   82.7   31.1   79.5   80.5   58.9   73.2
DPN             78.4   72.3    89.3   83.5   31.7   79.9   79.8   59.5   74.1
The invention   85.9   76.2    90.2   84.1   33.8   78.9   81.6   58.6   74.7
In addition, experiments were performed on the Cityscapes data set to further verify the generalization of the method. Cityscapes is a data set for semantic urban scene understanding with 5000 images taken in 50 urban environments and high-quality pixel-level labels for 19 semantic classes; the training set contains 2975 images, the validation set 500 images, and the test set 1525 images. Some of the experimental results are shown in Table 2.
TABLE 2. Cityscapes data set test results

Model           mIoU (%)
ICNet           67.7
BisNet          69.0
LRR             70.0
DeepLab         71.4
The invention   72.3
The test results show that the method effectively improves object segmentation accuracy. Compared with traditional single-scale segmentation methods, it has marked advantages in accuracy and training efficiency. Architecturally, it is a simple and extensible framework: the multi-receptive-field spatial features and the transformer encoders can be added or removed according to the actual situation. Compared with other semantic segmentation methods, the introduction of multiple receptive fields and the transformer improves the generalization of the model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (9)

1. An image semantic segmentation method based on multi-receptive field context semantic information is characterized by comprising the following steps:
Step 1: convert an input image into a pixel matrix through a convolution operation;
Step 2: convert the same pixel matrix into multiple feature maps carrying multi-receptive-field context semantic information, using dilated convolutions with different dilation rates;
Step 3: perform feature extraction and downsampling on the feature maps through transformer encoders in different subnetworks, obtaining several downsampled feature maps with different receptive fields;
Step 4: upsample the downsampled feature maps step by step through a decoder to obtain feature maps of the same size and dimension, and generate the final feature-fusion map;
Step 5: complete the image segmentation prediction from the feature-fusion map through a convolutional neural network.
2. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 1, wherein the output of the dilated convolution in step 2 is:

y_i = Σ_{j=0}^{m-1} w[j] · x_{i + r·j}

where y_i denotes the i-th output of the dilated convolution, the dilated-convolution kernel has size k × k with dilation rate r, x_i is the i-th input feature map of the dilated convolution placed before the transformer subnetwork, and m is the length of the filter matrix w[k] with kernel size k × k.
3. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 1, wherein the transformer in step 3 comprises a multi-head self-attention model and a multilayer perceptron model, and the inputs of both models are normalized.
4. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 3, wherein the output of the multi-head self-attention model is:

Y_out = concat[y_1, y_2, …, y_i, …, y_{h-1}]
y_i = Attention(qW_i^q, jW_i^j, dW_i^d)

where Y_out ∈ R^{q×j×d}, concat[·] denotes the concatenation operation, i ∈ [1, h-1], h is the number of self-attention heads, each head has its own set of learnable weight matrices (W_i^q, W_i^j, W_i^d), W ∈ R^{q×j×d} is the projection weight matrix, and q, j, d are the three dimensions of the first feature map.
5. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 3, wherein a reshaping layer is arranged at the output of the multilayer perceptron model, the reshaping layer being a reshape layer used to change the dimensions of the input data.
6. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 1, wherein the step-by-step upsampling performed by the decoder in step 4 to obtain feature maps of the same size and dimension specifically comprises feature fusion between the encoder and the decoder, and fusion of the decoder outputs.
7. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 6, wherein the feature fusion between the encoder and the decoder specifically comprises: fusing the feature maps between corresponding layers of the encoder and the decoder through a skip-connection operation, the skip connections reducing gradient vanishing and network degradation.
8. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 6, wherein the fusion of the decoder outputs specifically comprises: the decoder progressively upsamples the feature maps from the different encoders layer by layer, outputs features of the same size and dimension, and then fuses the output feature maps through a concat() operation.
9. The image semantic segmentation method based on multi-receptive-field context semantic information according to claim 1, wherein in step 5 the convolutional neural network uses a softmax activation function to complete the image segmentation prediction.
CN202111413182.1A 2021-11-25 2021-11-25 Image semantic segmentation method based on multi-receptive-field context semantic information Pending CN114359554A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111413182.1A CN114359554A (en) 2021-11-25 2021-11-25 Image semantic segmentation method based on multi-receptive-field context semantic information

Publications (1)

Publication Number Publication Date
CN114359554A true CN114359554A (en) 2022-04-15

Family

ID=81096445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111413182.1A Pending CN114359554A (en) 2021-11-25 2021-11-25 Image semantic segmentation method based on multi-receptive-field context semantic information

Country Status (1)

Country Link
CN (1) CN114359554A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116003126A (en) * 2023-03-22 2023-04-25 珠海和泽科技有限公司 Preparation method and system of electrostatic chuck surface ceramic material
WO2023236576A1 (en) * 2022-06-06 2023-12-14 京东科技控股股份有限公司 Image feature processing method and apparatus, product, medium, and device

Citations (2)

Publication number Priority date Publication date Assignee Title
CN112287940A (en) * 2020-10-30 2021-01-29 西安工程大学 Semantic segmentation method of attention mechanism based on deep learning
CN113688813A (en) * 2021-10-27 2021-11-23 长沙理工大学 Multi-scale feature fusion remote sensing image segmentation method, device, equipment and storage


Non-Patent Citations (1)

Title
LIANGLIANG LIU ET AL.: "Multi-Receptive-Field CNN for Semantic Segmentation of Medical Images", 《IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS》, vol. 24, no. 11, pages 3215 - 3225, XP011818004, DOI: 10.1109/JBHI.2020.3016306 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination