CN113283433A - Image semantic segmentation method, system, electronic device and storage medium - Google Patents


Info

Publication number
CN113283433A
CN113283433A
Authority
CN
China
Prior art keywords
image
semantic segmentation
model
layer
feature extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110394315.9A
Other languages
Chinese (zh)
Inventor
李建强
彭浩然
吕思锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110394315.9A priority Critical patent/CN113283433A/en
Publication of CN113283433A publication Critical patent/CN113283433A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features


Abstract

The embodiment of the invention provides an image semantic segmentation method, system, electronic device and storage medium, wherein the method comprises: determining an image to be semantically segmented; and inputting the image into an image semantic segmentation model to obtain the image semantic segmentation result output by the model. The image semantic segmentation model is trained on sample images and corresponding predetermined pixel class labels. The invention addresses the large-area mismatches, over-segmentation and under-segmentation that occur when segmenting ultrasound images, and can effectively improve the image segmentation effect.

Description

Image semantic segmentation method, system, electronic device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, a system, an electronic device, and a storage medium for semantic segmentation of an image.
Background
Medicine and deep learning are increasingly intertwined, and interdisciplinary projects combining the two continue to emerge; deep learning has already been shown to save a great deal of manpower and material resources in the treatment of many diseases.
Hydronephrosis is a common kidney disease, and ultrasound examination is the basic examination routinely performed on suspected hydronephrosis patients: it is convenient, fast, inexpensive, harmless and radiation-free. If deep learning methods could be used to diagnose and grade the disease at the ultrasound examination stage, a large amount of money, manpower and medical resources could be saved, helping the patients concerned.
Semantic segmentation is indispensable in ultrasound image classification. However, when segmenting renal ultrasound images, the traditional Unet model cannot outline the boundary of the segmented region well, and large-area mismatches, over-segmentation and under-segmentation frequently occur.
Disclosure of Invention
The embodiment of the invention provides an image semantic segmentation method, system, electronic device and storage medium to address the problems that the traditional Unet model cannot outline the boundary of the segmented region well, and that large-area mismatches, over-segmentation and under-segmentation frequently occur when segmenting ultrasound images.
In a first aspect, an embodiment of the present invention provides an image semantic segmentation method, including:
determining an image to be semantically segmented;
inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
Preferably, the image semantic segmentation model comprises a trunk feature extraction model, an enhanced feature extraction model, a classification model and a segmentation model;
inputting the image into the image semantic segmentation model to obtain the image semantic segmentation result output by the model comprises:
inputting the image into the trunk feature extraction model, and outputting image features of a plurality of effective feature layers;
inputting the image features of the effective feature layers into the enhanced feature extraction model, and outputting the image fusion feature of each effective feature layer;
inputting the image fusion characteristics of each effective characteristic layer into the classification model, and outputting the pixel classification result of the image;
and inputting the pixel classification result of the image into the segmentation model, and outputting the semantic segmentation result of the image.
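The four-stage pipeline above (backbone features, weighted fusion, per-pixel classification, segmentation map) can be sketched as follows. This is a minimal NumPy illustration in which random projections stand in for the learned layers; all shapes, channel counts and function names are assumptions for illustration, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(img, n_layers=5):
    """Stand-in for the trunk (VGG-like) extractor: each stage halves the
    resolution and doubles the channels, yielding the effective feature layers."""
    feats, x, c = [], img, 16
    for _ in range(n_layers):
        h, w = x.shape[0] // 2, x.shape[1] // 2
        x = rng.standard_normal((h, w, c))   # placeholder for conv + max-pool
        feats.append(x)
        c *= 2
    return feats

def fuse(feats):
    """Stand-in for the enhanced extractor: upsample every layer back to the
    finest layer's resolution and concatenate along the channel axis."""
    h, w, _ = feats[0].shape
    ups = [np.repeat(np.repeat(f, h // f.shape[0], axis=0), w // f.shape[1], axis=1)
           for f in feats]
    return np.concatenate(ups, axis=2)

def classify(fused, n_classes=2):
    """Stand-in for per-pixel classification: a random projection + softmax."""
    proj = rng.standard_normal((fused.shape[2], n_classes))
    logits = fused @ proj
    e = np.exp(logits - logits.max(axis=2, keepdims=True))
    return e / e.sum(axis=2, keepdims=True)

img = rng.standard_normal((64, 64, 1))    # grayscale ultrasound-like input
probs = classify(fuse(backbone(img)))     # per-pixel class probabilities
segmentation = probs.argmax(axis=2)       # final semantic segmentation map
```

The segmentation map is simply the argmax over the per-pixel class distribution, which is the "dense prediction" view of the task.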
Preferably, the sample image is selected from an image dataset;
the trunk feature extraction model is obtained by training the convolutional neural network VGG16 on labeled sample images selected from an image dataset and used as training samples;
the enhanced feature extraction model comprises a weight block;
inputting the image features of the plurality of effective feature layers into the enhanced feature extraction model and outputting the image fusion feature of each effective feature layer comprises:
weighting the image features of the effective feature layers with weight values respectively to obtain the image fusion feature of each effective feature layer, wherein the weight values are adjustable via the weight block.
Preferably, the image features of the effective feature layers are weighted to obtain the image fusion feature of each effective feature layer according to the formula:

$$X_n = \delta\big(\operatorname{concat}\big(P_n(X_1, \ldots, X_m),\; R_n(U_{n+1})\big)\big)$$

where $U_{n+1}$ is the upsampling result at layer $n+1$, $R_n$ is an adjustment function that makes the resolution of layer $n+1$ consistent with that of layer $n$, $P_n$ is a channel-number adjustment function based on the weights and the channel count of layer $n$, and $\delta$ is the feature extraction operation.
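The fusion rule can be sketched in NumPy under two assumptions: that $P_n$ allocates output channels to each encoder level in proportion to its weight, and that $R_n$ resizes by integer repetition or striding. All function names and shapes below are illustrative stand-ins for the learned layers, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def adjust_resolution(f, target_hw):
    """R_n: resize a layer to the target resolution (integer factors only):
    nearest-neighbour repetition to upsample, striding to downsample."""
    h, w = f.shape[:2]
    th, tw = target_hw
    if th >= h:
        return np.repeat(np.repeat(f, th // h, axis=0), tw // w, axis=1)
    return f[:: h // th, :: w // tw]

def adjust_channels(f, out_c):
    """P_n: a 1x1 random projection standing in for the learned channel mapper."""
    return f @ rng.standard_normal((f.shape[2], out_c))

def weighted_skip(encoder_feats, u_next, weights, total_c=512):
    """delta(concat(P_n(weighted encoder levels), R_n(U_{n+1}))): channels are
    allocated to each encoder level in proportion to its weight, then the
    result is concatenated with the resolution-matched upsampling output."""
    th, tw = u_next.shape[:2]
    per_level = [total_c * wgt // sum(weights) for wgt in weights]
    parts = [adjust_channels(adjust_resolution(f, (th, tw)), c)
             for f, c in zip(encoder_feats, per_level)]
    return np.concatenate(parts + [u_next], axis=2)

# Four encoder levels (fine to coarse) and an upsampled decoder feature U_{n+1}
feats = [rng.standard_normal((64 // 2**i, 64 // 2**i, 16 * 2**i)) for i in range(4)]
u_next = rng.standard_normal((64, 64, 512))
fused = weighted_skip(feats, u_next, weights=[1, 1, 1, 1])
```

With equal weights 1:1:1:1 each level contributes 128 of the 512 fused channels, matching the worked example in the description of Fig. 5.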
In a second aspect, an embodiment of the present invention provides an image semantic segmentation system, including an image determination module and an image semantic segmentation module:
the image determining module is used for determining an image to be subjected to semantic segmentation;
the image semantic segmentation module is used for inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
Preferably, the image semantic segmentation module comprises a trunk feature extraction module, an enhanced feature extraction module, a classification module and a segmentation module;
the trunk feature extraction module is used for obtaining image features of a plurality of effective feature layers based on the determined image;
the enhanced feature extraction module is used for obtaining image fusion features of each effective feature layer based on the image features of the effective feature layers;
the classification module is used for obtaining an image pixel classification result based on the image fusion characteristics of each effective characteristic layer;
and the segmentation module is used for obtaining an image semantic segmentation result based on the image pixel classification result.
Preferably, the sample image is selected from an image dataset;
the trunk feature extraction module comprises a trunk feature extraction model, which is obtained by training the convolutional neural network VGG16 on labeled sample images selected from an image dataset and used as training samples;
the enhanced feature extraction module comprises a weight block;
the weight block is used for weighting the image characteristics of the effective characteristic layers respectively to obtain the image fusion characteristics of each effective characteristic layer; wherein the weight value is adjustable by the weight block.
Preferably, the weight values are used to weight the image features of the effective feature layers respectively to obtain the image fusion feature of each effective feature layer, according to the formula:

$$X_n = \delta\big(\operatorname{concat}\big(P_n(X_1, \ldots, X_m),\; R_n(U_{n+1})\big)\big)$$

where $U_{n+1}$ is the upsampling result at layer $n+1$, $R_n$ is an adjustment function that makes the resolution of layer $n+1$ consistent with that of layer $n$, $P_n$ is a channel-number adjustment function based on the weights and the channel count of layer $n$, and $\delta$ is the feature extraction operation.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the image semantic segmentation method according to any one of the above-mentioned first aspects when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the image semantic segmentation method according to any one of the above-mentioned first aspects.
According to the image semantic segmentation method, system, electronic device and storage medium of the invention, a weight block is added to a novel Unet-based semantic segmentation network structure, and multiple layers can be combined under user-defined weights. This enlarges the receptive field, enables the network to extract context information better, and improves the effect of the semantic segmentation network.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a semantic segmentation method for an image according to the present invention;
FIG. 2 is a block diagram of an image semantic segmentation model provided by the present invention;
FIG. 3 is a diagram of a Unet model network architecture provided by the present invention;
FIG. 4 is an optimized view of the MwUnet network structure provided by the present invention;
FIG. 5 is a diagram of a weighted skip connection scheme provided by the present invention;
FIG. 6 is a schematic structural diagram of an image semantic segmentation system provided by the present invention;
FIG. 7 is a schematic structural diagram of an image semantic segmentation module provided by the present invention;
FIG. 8 is a schematic structural diagram of an electronic device provided by the present invention;
reference numerals:
1: down-sampling; 2: jump connection; 3: upsampling;
4: performing convolution operation; 5: and (6) weighting the weight.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An image semantic segmentation method, system, electronic device and storage medium provided by the present invention are described below with reference to fig. 1 to 8.
The embodiment of the invention provides an image semantic segmentation method. Fig. 1 is a schematic flow chart of an image semantic segmentation method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step 110, determining an image to be semantically segmented;
in particular, renal ultrasound images are used in modern medical image recognition for practical applications.
Step 120, inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
In particular, the goal of semantic segmentation of images is to label the class of each pixel in the image, and this task is often referred to as dense prediction because every pixel in the image needs to be predicted.
According to the method provided by the embodiment of the invention, an image semantic segmentation model is obtained by training on sample images; by inputting an image to be semantically segmented and classifying its pixels, the image segmentation effect can be effectively improved.
Based on any of the above embodiments, as shown in fig. 2, the image semantic segmentation model 200 includes a trunk feature extraction model 210, an enhanced feature extraction model 220, a classification model 230, and a segmentation model 240;
inputting the image into an image semantic segmentation model 200 to obtain an image semantic segmentation result output by the image semantic segmentation model 200, wherein the image semantic segmentation result comprises:
inputting the image into the trunk feature extraction model 210, and outputting image features of a plurality of effective feature layers;
inputting the image features of the plurality of effective feature layers into the enhanced feature extraction model 220, and outputting the image fusion feature of each effective feature layer;
inputting the image fusion features of each effective feature layer into the classification model 230, and outputting the pixel classification result of the image;
the pixel classification result of the image is input into the segmentation model 240, and the semantic segmentation result of the image is output.
Specifically, the image semantic segmentation method of the embodiment of the invention uses a novel Unet-based semantic segmentation network structure in which a weight block is added to Unet. The Unet model structure can be divided into three parts:
1. The first part is the trunk feature extraction part, which uses the backbone to acquire feature layers. The backbone feature extraction part of the network is similar to VGG: a stack of convolution and max-pooling layers. The five preliminary effective feature layers obtained in this step are used for feature fusion in the next step.
2. The second part is the enhanced feature extraction part. The five preliminary effective feature layers obtained in the first step are upsampled and fused to obtain a final effective feature layer that merges all the features.
3. The third part is the classification prediction part. The final effective feature layer is used to classify each feature point, i.e., each pixel.
The loss function used by Unet is cross-entropy (CE) loss, defined as follows:

$$\mathrm{CE} = -\sum_i p(x_i)\,\log q(x_i)$$

where $p(x_i)$ is the ground truth, i.e., the label information supplied to the segmentation network, and $q(x_i)$ is the information produced by the network's segmentation (the predicted distribution).
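A minimal numeric check of the cross-entropy definition above; the two-pixel labels and predictions below are made-up values for illustration.

```python
import numpy as np

def ce_loss(p, q, eps=1e-12):
    """Pixel-wise cross entropy: p is the one-hot ground truth, q the
    predicted class distribution; the loss is averaged over pixels."""
    return float(-(p * np.log(q + eps)).sum(axis=-1).mean())

# Two pixels, two classes (e.g. background vs. kidney region)
p = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # ground-truth labels
q = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # network predictions
loss = ce_loss(p, q)         # -(ln 0.9 + ln 0.8) / 2
```

A perfect prediction (q equal to p) drives the loss to zero, which is why minimizing CE pushes the predicted distribution toward the label map.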
In any of the above embodiments, the sample image is selected from an image dataset;
specifically, the image dataset of Imagenet is widely used for training data of an object recognition network in a deep learning network, and at present, 14197122 images are totally contained in the Imagenet, and the images are totally divided into 21841 categories (syncets), and the major categories include: animal, application, bird, coverage, device, fabric, fish, etc.
The trunk feature extraction model is obtained by training a convolutional neural network VGG16 based on a sample image selected from an image data set and serving as a training sample image after being labeled;
in particular, the Encoder feature extraction network part adopts VGG16 as a backbone to facilitate the transfer learning of pre-training network parameters downloaded from the official website in VGG 16. And the correctly labeled data is used as the basis for supervised learning of correct samples.
The enhanced feature extraction model comprises a weight block;
inputting the image features of the plurality of effective feature layers into the enhanced feature extraction model, and outputting the image fusion feature of each effective feature layer, wherein the method comprises the following steps:
weighting the image characteristics of the effective characteristic layers by weight values respectively to obtain the image fusion characteristics of each effective characteristic layer; wherein the weight value is adjustable by the weight block.
Specifically, the embodiment of the invention constructs a novel Unet-based semantic segmentation network structure that adds a weight block to Unet and can combine multiple layers under user-defined weights, thereby enlarging the receptive field and enabling the network to extract context information better. The receptive field is the size of the region on the input image that a pixel on the feature map output by each layer of the neural network maps back to. Put more plainly, a point on the feature map corresponds to a region on the input image, which is the region the network can attend to at that level. Fig. 3 is a simplified schematic of the Unet model network structure. In contrast to the Unet shown in Fig. 3, the MwUnet network structure is optimized as shown in Fig. 4.
The overall structure of MwUnet differs considerably from the original Unet. Unlike the original Unet, the resolution of MwUnet's network input map is the same as that of its final output map. Although a U-shaped network is still adopted, the layers are no longer treated identically.
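As a concrete illustration of how stacked convolution and pooling layers enlarge the receptive field described above, the standard recurrence r += (k - 1) * jump, jump *= stride can be computed for a VGG-like stack. The layer sizes below are illustrative, not taken from the patent.

```python
def receptive_field(layers):
    """Receptive field of a stack of conv/pool layers, each given as a
    (kernel_size, stride) pair, via r += (k - 1) * jump; jump *= stride."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump
        jump *= s
    return r

# Two 3x3 convs followed by a 2x2 max-pool, repeated twice (VGG-like):
vgg_block = [(3, 1), (3, 1), (2, 2)]
rf = receptive_field(vgg_block * 2)   # grows with every stacked block
```

Each additional block widens the region of the input image that one output pixel depends on, which is the motivation for fusing features from multiple levels.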
Based on any of the above embodiments, the image features of the effective feature layers are weighted to obtain the image fusion feature of each effective feature layer according to the formula:

$$X_n = \delta\big(\operatorname{concat}\big(P_n(X_1, \ldots, X_m),\; R_n(U_{n+1})\big)\big)$$

where $U_{n+1}$ is the upsampling result at layer $n+1$, $R_n$ is an adjustment function that makes the resolution of layer $n+1$ consistent with that of layer $n$, $P_n$ is a channel-number adjustment function based on the weights and the channel count of layer $n$, and $\delta$ is the feature extraction operation.
Specifically, as shown in Figs. 3 and 4, unlike the skip connection (2) of Unet, MwUnet does not connect same-level encoder information directly to the decoder; instead it adds a weight block (5), and different multi-level combinations are realized by manually adjusting the weights. The decoder at each level can receive semantic information extracted by the encoder after downsampling (1) at different levels; since different levels have different receptive fields, each decoder level receives semantic information extracted by the feature extraction network at different resolutions. This connection is called a weighted skip connection, and its operation is shown in Fig. 5. In a weighted skip connection, the encoder results are weighted by the weight block (5), and the four encoder levels each generate a feature matrix whose channel count is computed from the weights. For example, with the weights 1:1:1:1 in Fig. 5, all four levels generate 128 channels; concatenating the four 128-channel matrices combines them into a 512-channel matrix, which is then concatenated with the result of the previous layer's operation, i.e., with the result of the previous layer's upsampling (3). As shown in Fig. 5, the previous layer's result is itself a 512-channel matrix, and the concatenated tensor is then decoded.
The weight block allows the weight of each layer to be changed manually. For example, if a segmentation task needs to focus on information from the whole picture, the model should have a wider receptive field, and the weights can be left at the default 1:1:1:1 shown in Fig. 5. If a task needs to focus on local detail while de-emphasizing background context, the weights may be set to, for example, 1:1:1:9. Note that to keep the structure unchanged, the sum of the weights should be a multiple of 4.
Fig. 5 shows the computation of X3,1, to be read in conjunction with Figs. 3 and 4. The layers X0,0 through X3,0 all have the same weight, so each layer passes through max pooling and a 3x3 convolution (4) to become a feature map with resolution 64x64 (consistent with the resolution of this level's upsampling (3)) and 128 channels. The four feature maps are concatenated into a 512-channel total cube, which is then concatenated with the result of the upsampling (3) to obtain X3,1.
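The shape bookkeeping in the X3,1 example can be verified with a small NumPy sketch. Random projections stand in for max pooling and the 3x3 convolution, and the input resolutions are assumptions chosen so every level pools cleanly to 64x64.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_level(f, hw=64, channels=128):
    """Stride-based stand-in for max pooling down to hw x hw, then a random
    projection standing in for the 3x3 conv that maps to 128 channels."""
    step = f.shape[0] // hw
    pooled = f[::step, ::step]
    return pooled @ rng.standard_normal((pooled.shape[2], channels))

# X0,0 .. X3,0: four encoder outputs at decreasing resolution (illustrative)
levels = [rng.standard_normal((512 // 2**i, 512 // 2**i, 8 * 2**i)) for i in range(4)]
total = np.concatenate([to_level(f) for f in levels], axis=2)  # 64x64x512 cube
u3 = rng.standard_normal((64, 64, 512))                        # upsampling (3) result
x31 = np.concatenate([total, u3], axis=2)                      # X3,1
```

With equal weights, each of the four levels contributes a 64x64x128 map, the concatenation yields the 512-channel total cube, and joining it with the 512-channel upsampling result gives the tensor passed on for decoding.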
The following describes an image semantic segmentation system provided by the present invention, and the following description and the above-described image semantic segmentation method can be referred to with each other.
Fig. 6 is a schematic structural diagram of an image semantic segmentation system according to an embodiment of the present invention, as shown in fig. 6, the system includes an image determination module 610 and an image semantic segmentation module 620:
the image determining module 610 is configured to determine an image to be semantically segmented;
the image semantic segmentation module 620 is configured to input the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
The system provided by the embodiment of the invention obtains an image semantic segmentation model trained on sample images; by classifying the pixels of an input image to be semantically segmented, it can effectively improve the image segmentation effect.
Based on any of the above embodiments, as shown in fig. 7, the image semantic segmentation module includes a trunk feature extraction module 710, an enhanced feature extraction module 720, a classification module 730, and a segmentation module 740;
the trunk feature extraction module 710 is configured to obtain image features of a plurality of valid feature layers based on the determined image;
the enhanced feature extraction module 720 is configured to obtain an image fusion feature of each effective feature layer based on the image features of the plurality of effective feature layers;
the classification module 730 is configured to obtain an image pixel classification result based on the image fusion feature of each effective feature layer;
the segmentation module 740 is configured to obtain an image semantic segmentation result based on the image pixel classification result.
In any of the above embodiments, the sample image is selected from an image dataset;
the trunk feature extraction module comprises a trunk feature extraction model, which is obtained by training the convolutional neural network VGG16 on labeled sample images selected from an image dataset and used as training samples;
the enhanced feature extraction module comprises a weight block;
the weight block is used for weighting the image characteristics of the effective characteristic layers respectively to obtain the image fusion characteristics of each effective characteristic layer; wherein the weight value is adjustable by the weight block.
Based on any of the above embodiments, the weight block is configured to weight the image features of the effective feature layers respectively to obtain the image fusion feature of each effective feature layer, according to the formula:

$$X_n = \delta\big(\operatorname{concat}\big(P_n(X_1, \ldots, X_m),\; R_n(U_{n+1})\big)\big)$$

where $U_{n+1}$ is the upsampling result at layer $n+1$, $R_n$ is an adjustment function that makes the resolution of layer $n+1$ consistent with that of layer $n$, $P_n$ is a channel-number adjustment function based on the weights and the channel count of layer $n$, and $\delta$ is the feature extraction operation.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 8, the electronic device may include: a processor (processor)810, a communication Interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication Interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may call logic instructions in the memory 830 to perform an image semantic segmentation method comprising: determining an image to be semantically segmented; inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model; the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
In addition, the logic instructions in the memory 830 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, an embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer can execute the image semantic segmentation method provided by the above methods, where the method includes: determining an image to be semantically segmented; inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model; the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
In yet another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the image semantic segmentation method provided in the foregoing aspects, the method comprising: determining an image to be semantically segmented; and inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model; wherein the image semantic segmentation model is trained based on sample images and corresponding pixel class labels, and the pixel class labels are predetermined.
The apparatus embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or, of course, by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An image semantic segmentation method, comprising:
determining an image to be semantically segmented;
inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
2. The image semantic segmentation method according to claim 1, wherein the image semantic segmentation model comprises a trunk feature extraction model, an enhanced feature extraction model, a classification model and a segmentation model;
inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model, wherein the image semantic segmentation result comprises the following steps:
inputting the image into the trunk feature extraction model, and outputting image features of a plurality of effective feature layers;
inputting the image features of the effective feature layers into the enhanced feature extraction model, and outputting the image fusion feature of each effective feature layer;
inputting the image fusion features of each effective feature layer into the classification model, and outputting the pixel classification result of the image;
and inputting the pixel classification result of the image into the segmentation model, and outputting the semantic segmentation result of the image.
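The four-stage pipeline of claim 2 (trunk feature extraction, enhanced feature extraction, classification, segmentation) can be sketched end to end. Everything below is a hypothetical toy: the function names, the strided-slicing "backbone", the scalar weights, and the threshold classifier are stand-ins for the trained VGG16 backbone, weight block, and classifier the claims describe.

```python
import numpy as np

def trunk_features(image):
    # Trunk feature extraction (VGG16 in the patent): produce several
    # effective feature layers at decreasing resolutions.
    return [image, image[::2, ::2], image[::4, ::4]]

def enhance(layers, weights):
    # Enhanced feature extraction: the weight block scales each
    # effective feature layer by an adjustable weight before fusion.
    return [w * f for w, f in zip(weights, layers)]

def classify(fused_layers):
    # Classification model: per-pixel class decision on the finest
    # fused layer (threshold stand-in for a trained classifier).
    return (fused_layers[0] > 0.5).astype(int)

def segment(pixel_classes):
    # Segmentation model: maps the pixel classification result to the
    # final semantic segmentation result (identity stand-in).
    return pixel_classes

image = np.arange(16, dtype=float).reshape(4, 4) / 15.0
out = segment(classify(enhance(trunk_features(image), [1.0, 1.0, 1.0])))
```

The composition order mirrors the claim: image features of several effective layers, then weighted fusion, then pixel classification, then segmentation.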
3. The image semantic segmentation method according to claim 2, characterized in that the sample image is selected from an image dataset;
the trunk feature extraction model is obtained by training the convolutional neural network VGG16 on sample images that are selected from an image dataset and labeled to serve as training sample images;
the enhanced feature extraction model comprises a weight block;
inputting the image features of the plurality of effective feature layers into the enhanced feature extraction model, and outputting the image fusion feature of each effective feature layer, wherein the method comprises the following steps:
weighting the image features of the plurality of effective feature layers by weight values respectively to obtain the image fusion feature of each effective feature layer; wherein the weight values are adjustable by the weight block.
4. The image semantic segmentation method according to claim 3, wherein weighting the image features of the plurality of effective feature layers by the weight values respectively to obtain the image fusion feature of each effective feature layer is performed according to the following formula:
Un = δ(Pn(Rn(Un+1)))
wherein Un+1 is the result of sampling at the (n+1)-th layer, Rn is an adjustment function for adjusting the resolution of the (n+1)-th layer to be consistent with the resolution of the n-th layer, Pn is a channel-number adjustment function based on the weight and the number of channels of the n-th layer, and δ is a feature extraction operation function.
5. An image semantic segmentation system, characterized by comprising an image determination module and an image semantic segmentation module:
the image determining module is used for determining an image to be subjected to semantic segmentation;
the image semantic segmentation module is used for inputting the image into an image semantic segmentation model to obtain an image semantic segmentation result output by the image semantic segmentation model;
the image semantic segmentation model is obtained by training based on a sample image and corresponding pixel class labels, and the pixel class labels are predetermined.
6. The image semantic segmentation system according to claim 5, wherein the image semantic segmentation module comprises a trunk feature extraction module, an enhanced feature extraction module, a classification module, and a segmentation module;
the trunk feature extraction module is used for obtaining image features of a plurality of effective feature layers based on the determined image;
the enhanced feature extraction module is used for obtaining image fusion features of each effective feature layer based on the image features of the effective feature layers;
the classification module is used for obtaining an image pixel classification result based on the image fusion characteristics of each effective characteristic layer;
and the segmentation module is used for obtaining an image semantic segmentation result based on the image pixel classification result.
7. The image semantic segmentation system of claim 6 wherein the sample image is selected from an image dataset;
the trunk feature extraction module comprises a trunk feature extraction model, and the trunk feature extraction model is obtained by training the convolutional neural network VGG16 on sample images that are selected from an image dataset and labeled to serve as training sample images;
the enhanced feature extraction module comprises a weight block;
the weight block is used for weighting the image features of the plurality of effective feature layers by weight values respectively to obtain the image fusion feature of each effective feature layer; wherein the weight values are adjustable by the weight block.
8. The image semantic segmentation system according to claim 7, wherein the weight block weights the image features of the plurality of effective feature layers respectively to obtain the image fusion feature of each effective feature layer according to the following formula:
Un = δ(Pn(Rn(Un+1)))
wherein Un+1 is the result of sampling at the (n+1)-th layer, Rn is an adjustment function for adjusting the resolution of the (n+1)-th layer to be consistent with the resolution of the n-th layer, Pn is a channel-number adjustment function based on the weight and the number of channels of the n-th layer, and δ is a feature extraction operation function.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the image semantic segmentation method according to any one of claims 1 to 4 when executing the program.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the image semantic segmentation method according to any one of claims 1 to 4.
CN202110394315.9A 2021-04-13 2021-04-13 Image semantic segmentation method, system, electronic device and storage medium Pending CN113283433A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110394315.9A CN113283433A (en) 2021-04-13 2021-04-13 Image semantic segmentation method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110394315.9A CN113283433A (en) 2021-04-13 2021-04-13 Image semantic segmentation method, system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN113283433A 2021-08-20

Family

ID=77276611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110394315.9A Pending CN113283433A (en) 2021-04-13 2021-04-13 Image semantic segmentation method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113283433A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115713535A (en) * 2022-11-07 2023-02-24 阿里巴巴(中国)有限公司 Image segmentation model determination method and image segmentation method
CN115713535B (en) * 2022-11-07 2024-05-14 阿里巴巴(中国)有限公司 Image segmentation model determination method and image segmentation method


Similar Documents

Publication Publication Date Title
CN109711481B (en) Neural networks for drawing multi-label recognition, related methods, media and devices
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
CN110097554B (en) Retina blood vessel segmentation method based on dense convolution and depth separable convolution
US20190180154A1 (en) Text recognition using artificial intelligence
CN110210542B (en) Picture character recognition model training method and device and character recognition system
CN111145181B (en) Skeleton CT image three-dimensional segmentation method based on multi-view separation convolutional neural network
CN110188195B (en) Text intention recognition method, device and equipment based on deep learning
CN110706214B (en) Three-dimensional U-Net brain tumor segmentation method fusing condition randomness and residual error
CN106651887A (en) Image pixel classifying method based convolutional neural network
CN108629772A (en) Image processing method and device, computer equipment and computer storage media
CN112767417A (en) Multi-modal image segmentation method based on cascaded U-Net network
JP2022509030A (en) Image processing methods, devices, equipment and storage media
CN109034218B (en) Model training method, device, equipment and storage medium
CN111680755A (en) Medical image recognition model construction method, medical image recognition device, medical image recognition medium and medical image recognition terminal
CN112949654A (en) Image detection method and related device and equipment
CN115147862A (en) Benthonic animal automatic identification method, system, electronic device and readable storage medium
US20220301106A1 (en) Training method and apparatus for image processing model, and image processing method and apparatus
CN111429468A (en) Cell nucleus segmentation method, device, equipment and storage medium
CN115908363B (en) Tumor cell statistics method, device, equipment and storage medium
CN116486156A (en) Full-view digital slice image classification method integrating multi-scale feature context
CN113283433A (en) Image semantic segmentation method, system, electronic device and storage medium
Lin et al. Dilated generative adversarial networks for underwater image restoration
CN114943670A (en) Medical image recognition method and device, electronic equipment and storage medium
CN113139581A (en) Image classification method and system based on multi-image fusion
CN113223730B (en) Malaria classification method and device based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination