CN114549913A - Semantic segmentation method and device, computer equipment and storage medium - Google Patents


Publication number
CN114549913A
Authority
CN
China
Prior art keywords
classifier
target
image
probability distribution
pixel
Prior art date
Legal status
Granted
Application number
CN202210438646.2A
Other languages
Chinese (zh)
Other versions
CN114549913B (en)
Inventor
田倬韬
陈亦新
赖昕
蒋理
刘枢
吕江波
沈小勇
Current Assignee
Shenzhen Smartmore Technology Co Ltd
Original Assignee
Shenzhen Smartmore Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Smartmore Technology Co Ltd filed Critical Shenzhen Smartmore Technology Co Ltd
Priority to CN202210438646.2A priority Critical patent/CN114549913B/en
Publication of CN114549913A publication Critical patent/CN114549913A/en
Application granted granted Critical
Publication of CN114549913B publication Critical patent/CN114549913B/en
Priority to PCT/CN2022/120480 priority patent/WO2023206944A1/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Abstract

The application relates to an image semantic segmentation method, an image semantic segmentation device, a computer device, a storage medium and a computer program product. The method comprises the following steps: acquiring a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution; fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on a target classifier and a first classifier to obtain target probability distribution corresponding to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution. By adopting the method, the image semantic segmentation accuracy can be improved.

Description

Semantic segmentation method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a semantic segmentation method, apparatus, computer device, and storage medium.
Background
With the development of artificial intelligence technology, image recognition is applied more and more widely. For example, for images that need to be segmented, using an image semantic segmentation model greatly improves image segmentation efficiency.
In the related art, an image can be input into the feature extractor and classifier of an image semantic segmentation model to obtain the category of each pixel point. However, current image semantic segmentation models use a single general classifier for all samples, which cannot well characterize the feature distributions of different samples, resulting in low segmentation accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide an image semantic segmentation method, apparatus, computer device, computer-readable storage medium, and computer program product capable of improving segmentation accuracy.
In a first aspect, the present application provides a method for semantic segmentation of an image. The method comprises the following steps: acquiring a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on the target classifier and the first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution.
In one embodiment, the processing the image extraction features based on the target classifier and the first classifier to obtain a target probability distribution corresponding to each pixel class includes: acquiring a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier; weighting the target classifier based on the first weighting coefficient to obtain a weighted target classifier; weighting the general classifier based on the second weighting coefficient to obtain a weighted general classifier; and synthesizing the weighted general classifier and the weighted target classifier to obtain a comprehensive classifier, and processing the image extraction features by using the comprehensive classifier to obtain target probability distribution corresponding to each pixel class.
In one embodiment, the obtaining the first weighting coefficient corresponding to the target classifier and the second weighting coefficient corresponding to the generic classifier includes: obtaining classifier similarity between the general classifier and the target classifier, and obtaining the first weighting coefficient based on the classifier similarity, wherein the first weighting coefficient and the classifier similarity form a positive correlation; and obtaining the second weighting coefficient based on the first weighting coefficient, wherein the second weighting coefficient and the first weighting coefficient are in a negative correlation relationship.
In one embodiment, the deriving the first weighting factor based on the classifier similarity includes: acquiring a sensitivity coefficient; and multiplying the sensitivity coefficient by the similarity of the classifier to obtain the first weighting coefficient.
In one embodiment, the processing the image extraction features based on the first classifier to obtain an intermediate probability distribution includes: processing the image extraction features based on the general classifier in the image semantic segmentation model to obtain initial probability distribution, wherein the initial probability distribution comprises an initial probability matrix corresponding to each pixel class; shifting elements in the initial probability matrix to the same row or the same column to obtain a shifted probability matrix; and splicing the shift probability matrixes corresponding to each pixel category to obtain intermediate probability distribution.
In one embodiment, the calculation method corresponding to the target classifier includes: shifting and splicing the initial probability distribution obtained by processing the image extraction features, and activating the spliced initial probability distribution by using an activation function to obtain an intermediate probability distribution; and performing matrix multiplication operation on the intermediate probability distribution and the image extraction features to obtain the target classifier.
In one embodiment, the fusing the intermediate probability distribution with the image extraction features to obtain the target classifier corresponding to the image extraction features includes: obtaining a confidence threshold; comparing each probability in the intermediate probability distribution with the confidence coefficient threshold value, and obtaining a mask matrix based on the magnitude relation between the probability and the confidence coefficient threshold value; screening the intermediate probability distribution based on the mask matrix to obtain denoising probability distribution; and multiplying the denoising probability distribution and the image extraction characteristics to obtain the target classifier.
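The denoising embodiment above can be sketched minimally in numpy. This is an illustrative implementation under stated assumptions, not the patented code: the intermediate distribution is taken as an [n, hw] matrix with rows summing to 1, and the confidence threshold `tau = 0.05` is a hypothetical value chosen for these toy sizes.

```python
import numpy as np

n, hw, d = 3, 16, 8                       # toy sizes (hypothetical): classes, spatial positions, channels
rng = np.random.default_rng(0)
inter = rng.random((n, hw))
inter /= inter.sum(axis=1, keepdims=True)   # intermediate probability distribution, rows sum to 1
x_flat = rng.random((hw, d))                # reshaped image extraction features, size [hw, d]

tau = 0.05                                  # confidence threshold (hypothetical value)
mask = (inter > tau).astype(inter.dtype)    # mask matrix from comparing each probability with tau
denoised = inter * mask                     # screen the distribution: zero out low-confidence entries
G_c = denoised @ x_flat                     # multiply with the features -> target classifier, size [n, d]
```

Zeroing sub-threshold probabilities before the multiplication keeps low-confidence pixels from contributing noise to the per-class vectors of the target classifier.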
In a second aspect, the application further provides an image semantic segmentation device. The device comprises: the target image acquisition module is used for acquiring a target image; the image extraction feature extraction module is used for extracting features of the target image to obtain image extraction features corresponding to the target image; a middle probability distribution obtaining module, configured to process the image extraction features based on a first classifier to obtain a middle probability distribution, where the middle probability distribution includes a probability that each pixel point in the target image belongs to each pixel class; a target classifier obtaining module, configured to fuse the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features; a target probability distribution obtaining module, configured to process the image extraction features based on the target classifier and the first classifier to obtain a target probability distribution corresponding to each pixel class, where the target probability distribution includes a probability that each pixel point in the target image belongs to each pixel class; and the target pixel category obtaining module is used for obtaining the target pixel categories corresponding to the pixel points in the target image respectively based on the target probability distribution.
In one embodiment, the target classifier obtaining module is configured to: acquiring a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier; weighting the target classifier based on the first weighting coefficient to obtain a weighted target classifier; weighting the general classifier based on the second weighting coefficient to obtain a weighted general classifier; and synthesizing the weighted general classifier and the weighted target classifier to obtain a comprehensive classifier, and processing the image extraction features by using the comprehensive classifier to obtain target probability distribution corresponding to each pixel class.
In one embodiment, the target classifier obtaining module is configured to: obtaining classifier similarity between the general classifier and the target classifier, and obtaining the first weighting coefficient based on the classifier similarity, wherein the first weighting coefficient and the classifier similarity form a positive correlation; and obtaining the second weighting coefficient based on the first weighting coefficient, wherein the second weighting coefficient and the first weighting coefficient are in a negative correlation relationship.
In one embodiment, the target classifier obtaining module is configured to: acquire a sensitivity coefficient; and multiply the sensitivity coefficient by the classifier similarity to obtain the first weighting coefficient.
In one embodiment, the intermediate probability distribution obtaining module is configured to: processing the image extraction features based on the general classifier in the image semantic segmentation model to obtain initial probability distribution, wherein the initial probability distribution comprises an initial probability matrix corresponding to each pixel class; shifting elements in the initial probability matrix to the same row or the same column to obtain a shifted probability matrix; and splicing the shift probability matrixes corresponding to each pixel category to obtain intermediate probability distribution.
In one embodiment, the calculation method corresponding to the target classifier includes: shifting and splicing the initial probability distribution obtained by processing the image extraction features, and activating the spliced initial probability distribution by using an activation function to obtain an intermediate probability distribution; and performing matrix multiplication operation on the intermediate probability distribution and the image extraction features to obtain the target classifier.
In one embodiment, the target classifier obtaining module is configured to: obtaining a confidence threshold; comparing each probability in the intermediate probability distribution with the confidence coefficient threshold value, and obtaining a mask matrix based on the magnitude relation between the probability and the confidence coefficient threshold value; screening the intermediate probability distribution based on the mask matrix to obtain denoising probability distribution; and multiplying the denoising probability distribution and the image extraction characteristics to obtain the target classifier.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program: acquiring a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on the target classifier and the first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of: acquiring a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on the target classifier and the first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of: acquiring a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on the target classifier and the first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution.
The image semantic segmentation method, the image semantic segmentation device, the computer equipment, the storage medium and the computer program product are used for obtaining the target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution corresponding to each pixel category, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on a target classifier and a first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in a target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution. 
The feature extractor extracts the feature vector of the target image to obtain the image extraction features. The general classifier then processes the image extraction features to obtain the intermediate probability distribution used to construct the target classifier, and the intermediate probability distribution is fused with the image extraction features to obtain the target classifier. The target classifier therefore adapts to the image extraction features of the target image, and processing based on both the general classifier and the target classifier enhances the classifier's adaptive perception of different image contents, improving the accuracy of the target pixel category obtained for each pixel point and hence the accuracy of image semantic segmentation.
Drawings
FIG. 1 is a diagram of an embodiment of an application environment of a semantic segmentation method for an image;
FIG. 2 is a flow diagram illustrating a method for semantic segmentation of images according to one embodiment;
FIG. 3 is a graph showing comparison of simulation results of the PANet model in one embodiment;
FIG. 4 is a schematic diagram showing comparison of simulation results of the PFENet model in one embodiment;
FIG. 5 is a flow chart illustrating the image semantic segmentation step in one embodiment;
FIG. 6 is a flowchart illustrating the semantic segmentation step in another embodiment;
FIG. 7 is a flowchart illustrating the semantic segmentation step in yet another embodiment;
FIG. 8 is a flowchart illustrating the semantic segmentation step in yet another embodiment;
FIG. 9 is a flow chart illustrating the semantic segmentation step of the image in one embodiment;
FIG. 10 is a block diagram showing the structure of an image semantic segmentation apparatus according to an embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
The image semantic segmentation method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The terminal 102 responds to the receiving operation, sends an image semantic segmentation instruction to the server, and the server 104 acquires a target image; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution corresponding to each pixel category, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on a target classifier and a first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in a target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. 
The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, there is provided an image semantic segmentation method, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step S202, a target image is acquired.
The target image may be an image to be semantically segmented by an artificial intelligence model; the artificial intelligence model performs semantic segmentation on the image, and at the same time its parameters can change adaptively.
Specifically, the server may obtain one or more images that need to be segmented using an image semantic segmentation model.
In one embodiment, the camera takes a billboard photo, and semantic segmentation is performed on the photo by using an image semantic segmentation model, and the server acquires the photo from the camera terminal.
And step S204, performing feature extraction on the target image to obtain image extraction features corresponding to the target image.
The image semantic segmentation model can classify the image at a pixel level, and the model can sense the surrounding environment and the content of each pixel to judge the category of the single pixel; the feature extractor may be a function that performs feature extraction on the image, the feature extracted by the function being represented using a feature vector; the image extraction features may be vectors representing the image features obtained by the target image through a feature extractor, and the feature extractor may be based on a Convolutional Neural Network (CNN), which is a type of feed-forward Neural network that includes convolution calculation and has a depth structure. The image semantic segmentation model comprises a feature extractor and a universal classifier, wherein the feature extractor can extract feature vectors in an image, and the universal classifier classifies the feature vectors to obtain the probability of each pixel point in each category.
Specifically, after the feature extractor obtains the target image, the feature extractor performs feature extraction on the target image to obtain an image extraction feature used for representing the image feature.
In one embodiment, the target image includes a cat, a dog and a background, the feature extractor performs feature extraction on the cat, the dog and the background after obtaining the target image, and represents features of the cat, the dog and the background in a vector form after the feature extraction.
Step S206, processing the image extraction features based on the first classifier to obtain intermediate probability distribution corresponding to each pixel category, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category.
The first classifier may be a general classifier trained on many training images and therefore applicable to any image; its classification result is the probability that each pixel in the image belongs to each class. The intermediate probability distribution may be the vector obtained by transforming the image extraction features through the general classifier.
Specifically, let [h, w, d] denote the height, width and number of feature channels of the image extraction features, and let n denote the number of classes to be predicted. The initial probability distribution p then has size [h, w, n], giving the probability that each pixel point belongs to each of the n classes, and the general classifier has size [n, d], i.e. one d-dimensional vector per class. The initial probability distribution p of size [h, w, n] is adjusted by a reshape operation to a matrix of size [hw, n], and further to size [n, hw] by exchanging rows and columns. Softmax is a function that produces normalized classification probabilities: it takes a vector of arbitrary real numbers and outputs probability values. Applied along the second dimension (of size hw), it turns each class row into a weighting over the hw spatial positions (the weights over the hw positions sum to 1), yielding the intermediate probability distribution.
In one embodiment, the target image contains a dog. The general classifier in the image semantic segmentation model produces per-pixel probabilities for the dog, which are reshaped to a matrix of size [hw, n]; the matrix of size [n, hw] corresponding to the dog's pixel points is then obtained by exchanging rows and columns.
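The reshape, transpose and spatial softmax described above can be sketched in numpy. The sizes below are hypothetical toy values; the `softmax` helper stands in for the normalization the text describes (each class row summing to 1 over the hw positions).

```python
import numpy as np

def softmax(v, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(v - v.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

h, w, n = 4, 4, 3                # toy sizes (hypothetical): height, width, number of classes
rng = np.random.default_rng(0)
p = rng.random((h, w, n))        # initial probability distribution from the general classifier

p_flat = p.reshape(h * w, n)     # reshape [h, w, n] -> [hw, n]
p_t = p_flat.T                   # exchange rows and columns -> [n, hw]
inter = softmax(p_t, axis=1)     # spatial softmax: each class row sums to 1 over the hw positions
```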
And S208, fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features.
The target classifier may be a function obtained by multiplying the intermediate probability distribution by the image extraction features. It is understood that the generic classifier is a classifier generic to the target classifier, i.e. the generic classifier is trained based on different training images and is a classifier generic to different images, and the target classifier is derived based on the extracted image features of an image and the corresponding intermediate probability distribution and is a specific classifier corresponding to the target image.
Specifically, the intermediate probability distribution (size [n, hw]) is matrix-multiplied with the reshaped image extraction features (size [hw, d]) to obtain the target classifier Gc(x) (size [n, d]). The reshape, softmax and matrix multiplication steps can be expressed as:

Gc(x) = softmax(reshape(p)ᵀ) × reshape(x)

where x is the image extraction feature and p is the initial probability distribution. Gc(x) is the class information extracted from the image feature x, i.e. the target classifier, and has the same size as the general classifier.
In one embodiment, the probability distribution for the image of the dog is reshaped: [h, w, n] is reduced to a matrix of size [hw, n], which is transposed to [n, hw]; softmax acts on the second dimension so that each class row becomes a weighting over the hw spatial positions (the weights over the hw positions sum to 1); this intermediate distribution is then matrix-multiplied with the reshaped image extraction features of the dog.
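The matrix multiplication that produces the target classifier can be sketched as follows. The sizes are hypothetical toy values, and the row-normalized random matrix `inter` stands in for the softmax-normalized intermediate distribution from the previous step.

```python
import numpy as np

h, w, n, d = 4, 4, 3, 8          # toy sizes (hypothetical)
rng = np.random.default_rng(0)
x = rng.random((h, w, d))        # image extraction features

# stand-in for the intermediate distribution softmax(reshape(p).T): [n, hw], rows sum to 1
inter = rng.random((n, h * w))
inter /= inter.sum(axis=1, keepdims=True)

x_flat = x.reshape(h * w, d)     # reshape features [h, w, d] -> [hw, d]
G_c = inter @ x_flat             # matrix multiplication -> target classifier, size [n, d]
```

Because each row of `inter` sums to 1, each row of `G_c` is a probability-weighted average of pixel features, i.e. a per-image class prototype the same size as one row of the general classifier.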
Step S210, processing the image extraction features based on the target classifier and the first classifier to obtain a target probability distribution corresponding to each pixel class, where the target probability distribution includes a probability that each pixel point in the target image belongs to each pixel class.
The target probability distribution may be the probability distribution vector corresponding to each pixel class, obtained by classifying the target image with the target classifier and the general classifier (the first classifier); there may be many pixel classes.
Specifically, the classifier composed of the target classifier and the general classifier is the comprehensive classifier, which can automatically change its parameters according to changes in the image extraction features, giving the model better performance. The image extraction features are input into the target classifier and the general classifier, the two classify each pixel point of the image features respectively, and finally the target probability distribution corresponding to each pixel class is produced. The comprehensive classifier Cx is generated depending on the content of the data itself and can be expressed as:

Cx = α·Gc(x) + (1 − α)·C

where α is a weighting coefficient that adjusts the weights of the original classifier C and the newly introduced classifier features Gc(x), completing the construction of the adaptive classifier.
In one embodiment, the target classifier $G_c(x)$ and the general classifier $C$ together form the comprehensive classifier $C_x$; the feature vector of the target image is input into the comprehensive classifier $C_x$, and processing by $C_x$ yields the probability of each pixel point of the target image belonging to each category.
Step S212, target pixel categories corresponding to all pixel points in the target image are obtained based on the target probability distribution.
The target pixel category may be a category to which each pixel in the target image can be classified, one image may include pixels of different categories, and image segmentation may segment pixels of the same category into the same region.
Specifically, after passing through the target classifier and the general classifier, each pixel point has a probability corresponding to each category, the category with the highest probability is selected as the target pixel category corresponding to the pixel point, and after the categories are obtained, segmentation is performed to obtain a plurality of regions.
In one embodiment, the probability that pixel point B belongs to category 1 is 0.07, the probability that it belongs to category 2 is 0.1, and the probability that it belongs to category 3 is 0.83; since the probability for category 3 is the highest, pixel point B is output as category 3.
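The argmax selection in this example can be sketched as follows (the first row uses the probabilities of pixel point B from the example above; the other two rows are made-up pixels for illustration):

```python
import numpy as np

# Rows: pixel points; columns: categories 1..3.
probs = np.array([
    [0.07, 0.10, 0.83],   # pixel point B
    [0.60, 0.30, 0.10],   # hypothetical pixel
    [0.20, 0.55, 0.25],   # hypothetical pixel
])

# The target pixel category is the category with the highest probability.
labels = probs.argmax(axis=1) + 1   # +1 for 1-based category ids
print(labels)  # -> [3 1 2]: pixel point B is output as category 3
```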
In one embodiment, under general semantic segmentation, applying the method to a ResNet-50-based PSPNet segmentation model improves performance over the prior art. Under few-shot segmentation, the method is applied to the PANet and PFENet models, and performance on the datasets improves remarkably. As shown in FIGS. 3 and 4, Fold-0 to Fold-3 and Mean in the figures are dataset splits and Methods are models, where PANet and PFENet are existing models, PANet + Ours and PFENet + Ours are models obtained by applying the embodiment of the present application to PANet and PFENet, 1-shot is one-shot learning, and 5-shot is five-shot learning; the figures report the ratio of the intersection to the union of the ground-truth and predicted sets, in units of mIoU.
In the image semantic segmentation method, a target image is obtained; performing feature extraction on the target image to obtain image extraction features corresponding to the target image; processing the image extraction features based on a first classifier to obtain intermediate probability distribution corresponding to each pixel category, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel category; fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features; processing the image extraction features based on a target classifier and a first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in a target image belongs to each pixel class; and obtaining target pixel categories corresponding to all pixel points in the target image respectively based on the target probability distribution. The feature vector of the target image is extracted through the feature extractor to obtain image extraction features, then the image extraction features are processed through the general classifier to obtain intermediate probability distribution used for constructing a target classifier, then the intermediate probability distribution and the image extraction features are fused to obtain the target classifier after fusion, therefore, the target classifier is adaptively changed based on the image extraction features of the target image, the adaptive perception capability of the classifier on different image contents can be enhanced based on the processing of the general classifier and the target classifier, the accuracy of obtaining the target pixel category corresponding to the target pixel point is improved, and the accuracy of image semantic segmentation is improved. 
Namely, for different pictures, the obtained target classifier is adaptively changed according to the content of the pictures, so that the feature distribution of different images can be better described, and the image classification accuracy at the pixel level is improved.
In an embodiment, as shown in fig. 5, the first classifier includes a general classifier in the image semantic segmentation model, and the processing the image extraction features based on the target classifier and the first classifier to obtain the target probability distribution corresponding to each pixel class includes:
step S302, a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier are obtained.
The first weighting coefficient may be a weight occupied by the target classifier in the comprehensive classifier; the second weighting coefficient may be a weight occupied by the general classifier in the integrated classifier; the first weighting coefficient and the second weighting coefficient may be fixed values set in advance, or may be obtained adaptively based on the target classifier.
Specifically, the server obtains a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier, the first weighting coefficient and the second weighting coefficient are in a negative correlation relationship, and the sum of the first weighting coefficient and the second weighting coefficient is one.
In one embodiment, the server obtains the first weighting factor with a value of 0.7, and the server obtains the second weighting factor with a value of 0.3 because the sum of the first weighting factor and the second weighting factor is 1.
And step S304, weighting the target classifier based on the first weighting coefficient to obtain a weighted target classifier.
The weighted target classifier may be a target classifier with a weighting effect obtained by multiplying the target classifier by the first weighting coefficient.
Specifically, the first weighting coefficient is a weight, and the target classifier is a classifier containing the model parameters used for classification; the weighted target classifier is obtained by multiplying the first weighting coefficient and the target classifier, expressed by the calculation formula $\alpha \times G_c$.
In one embodiment, the first weighting coefficient is 0.7 and the target classifier is $G_c$; the weighted target classifier obtained by multiplying the two is $0.7G_c$.
And S306, weighting the general classifier based on the second weighting coefficient to obtain a weighted general classifier.
The weighted general classifier may be a general classifier with a weighting effect obtained by multiplying the general classifier by the second weighting coefficient.
Specifically, the second weighting coefficient is a weight and the general classifier is a classifier; the weighted general classifier is obtained by multiplying the two, expressed as $(1 - \alpha) \times C$.
In one embodiment, the second weighting factor is 0.3, the generic classifier is C, and the weighted generic classifier obtained by multiplying the two is 0.3C.
And S308, synthesizing the weighted general classifier and the weighted target classifier to obtain a comprehensive classifier, and processing the image extraction features by using the comprehensive classifier to obtain target probability distribution corresponding to each pixel class.
The comprehensive classifier can be a classifier obtained by adding a general classifier and a target classifier.
Specifically, the image extraction features are processed by the comprehensive classifier composed of the general classifier and the target classifier to obtain the target probability distribution for the different pixel classes. A specific expression is as follows:

$C_x = \alpha \cdot G_c(x) + (1 - \alpha) \cdot C$

wherein $\alpha$ is the first weighting coefficient, $1 - \alpha$ is the second weighting coefficient, and $C_x$ is the comprehensive classifier. Here $\alpha$ can be determined according to the cosine similarity between $G_c(x)$ and $C$, and is positively correlated with that similarity:

$\alpha = \lambda \cdot \mathrm{Cos}(C, G_c(x))$

For example, if the cosine similarity $\mathrm{Cos}(C, G_c(x))$ between $G_c(x)$ and $C$ is high, the confidence of $G_c(x)$ is high, and it can substitute for a larger proportion of $C$ to accurately predict the current sample feature $x$. Conversely, if the cosine similarity $\mathrm{Cos}(C, G_c(x))$ is low, the classification information introduced from the current sample differs greatly from the original classifier, and introducing it may damage the model's original generalization capability; therefore the weight of $G_c(x)$ needs to be correspondingly reduced. $\lambda$ is the sensitivity coefficient.
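A minimal sketch of this similarity-weighted blend, assuming both classifiers are weight matrices of shape [n, c] and taking the cosine similarity over the flattened matrices (the function name and these implementation details are assumptions, not the patent's code):

```python
import numpy as np

def blend_classifiers(C, Gc, lam=1.0):
    """Blend the general classifier C with the content-adaptive target
    classifier Gc, both of shape [n, c]. alpha = lam * Cos(C, Gc), where
    lam is the sensitivity coefficient; returns C_x = alpha*Gc + (1-alpha)*C
    together with alpha."""
    cos = (C * Gc).sum() / (np.linalg.norm(C) * np.linalg.norm(Gc))
    alpha = lam * cos
    return alpha * Gc + (1.0 - alpha) * C, alpha
```

When Gc equals C the cosine is 1 and, with lam = 1, the blend returns Gc unchanged; as the similarity drops, the blend falls back toward the original classifier C, matching the intuition described above.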
In one embodiment, an image extraction feature comprising a cat, a dog, and a background is processed by a general classifier and an object classifier to obtain a probability of 0.02 for a dog pixel in pixel class 1, 0.04 for pixel class 2, and 0.94 for pixel class 3.
In this embodiment, the image extraction features are processed by the general classifier and the target classifier, so that the target probability distribution after two times of processing is obtained, and the pixel points of the target image can be classified more accurately.
In one embodiment, as shown in fig. 6, the obtaining the first weighting coefficient corresponding to the target classifier and the second weighting coefficient corresponding to the generic classifier includes:
step S402, obtaining the classifier similarity between the general classifier and the target classifier, and obtaining a first weighting coefficient based on the classifier similarity, wherein the first weighting coefficient and the classifier similarity form a positive correlation relationship.
The classifier similarity may be a similarity between the general classifier and the target classifier, and if the similarity is higher, it indicates that the reliability of the target classifier is higher, the content of the general classifier may be replaced with a higher proportion.
Specifically, similarity calculation is performed between the general classifier and the target classifier, the obtained result is classifier similarity, and a first weighting coefficient can be obtained through the classifier similarity. The classifier similarity may be in a positive correlation with the first weighting factor, for example, the classifier similarity is directly used as the first weighting factor.
In one embodiment, the general classifier and the target classifier use cosine similarity to represent their similarity, and the cosine similarity obtained through calculation is 0.8, so that the first weighting coefficient is 0.8.
Step S404, a second weighting coefficient is obtained based on the first weighting coefficient, and the second weighting coefficient and the first weighting coefficient are in a negative correlation relationship.
Specifically, the first weighting factor and the second weighting factor are in a negative correlation, for example, the sum of the two is one, so that the second weighting factor can be obtained after the first weighting factor is obtained.
In one embodiment, the first weighting factor is 0.8 and the sum of the first weighting factor and the second weighting factor is one, thus resulting in a second weighting factor of 0.2.
In this embodiment, by introducing the classifier similarity and obtaining the relationship between the classifier similarity and the first weighting coefficient, the adjustment of the first weighting coefficient can have more bases, and a better weighting coefficient can be obtained more easily.
In one embodiment, as shown in fig. 7, deriving the first weighting factor based on the classifier similarity includes:
step S502, a sensitivity coefficient is obtained.
The sensitivity coefficient can be the sensitivity of the comprehensive classifier to the similarity of the classifier, and the adjustment of the sensitivity coefficient can change the magnitude of the first weighting coefficient and the second weighting coefficient.
Specifically, the server is preset with a sensitivity coefficient, and the server obtains the sensitivity coefficient, for example, the sensitivity coefficient may be 1.
In one embodiment, 0.75 is input to the server as the sensitivity factor of the comprehensive classifier.
Step S504, the sensitivity coefficient is multiplied by the similarity of the classifier to obtain a first weighting coefficient.
Specifically, the classifier similarity is a value representing the similarity, and the sensitivity coefficient and the classifier similarity are multiplied to obtain a first weighting coefficient.
In one embodiment, the sensitivity coefficient is 0.75, the classifier similarity is 0.8, and the two are multiplied to obtain a first weighting coefficient of 0.6.
In this embodiment, the first weighting coefficient is obtained by multiplying the sensitivity coefficient by the classifier similarity, which adjusts the proportion occupied by the first weighting coefficient, so that the proportion of the target classifier can be adjusted as needed when the classifier similarity is high. For example, a correspondence between the similarity and the sensitivity coefficient may be set, and when the similarity is smaller than a preset threshold, the value of the sensitivity coefficient is increased.
In one embodiment, as shown in fig. 8, the processing the image extraction features based on the first classifier to obtain the intermediate probability distribution includes:
step S602, processing the image extraction features based on a first classifier to obtain initial probability distribution, wherein the initial probability distribution comprises initial probability matrixes respectively corresponding to each pixel category.
The initial probability distribution may be the set of initial probability matrices corresponding to each pixel category obtained after processing the image extraction features; each initial probability matrix may contain the probability of each pixel point of the target image for one pixel category. For example, assuming there are 6 classes and the target image is an image of 200 × 300 pixels, 6 initial probability matrices of 200 rows by 300 columns can be obtained. Each initial probability matrix represents the probability that each of the 200 × 300 pixels belongs to one of the 6 classes.
Specifically, the image extraction features are classified by a general classifier to obtain an initial probability distribution corresponding to the target image, and the initial probability distribution includes the probability of each pixel point in each pixel category.
In one embodiment, the extracted features of a target image containing a cat, a dog and a background are classified by the general classifier; because the target image contains different pixel classes, each with many pixel points, an initial probability distribution of the target image is obtained, which includes the probability of each pixel point belonging to each pixel class.
Step S604, shifting the elements in the initial probability matrix to the same row or the same column to obtain a shifted probability matrix.
The shift probability matrix may be a matrix obtained by shifting elements in the initial probability matrix and shifting all elements to the same row or column.
Specifically, the initial probability matrix is a multidimensional matrix, and elements in the initial probability matrix are shifted to the same row or the same column by shifting, so as to obtain a shifted probability matrix.
In one embodiment, the initial probability matrix is a 200 × 300 matrix, and shifting results in a 1 × 60000 matrix.
Step S606, the shift probability matrixes corresponding to the pixel categories are spliced to obtain intermediate probability distribution.
The intermediate probability distribution may be obtained by splicing different shifted probability matrices, and the multidimensional matrix is the intermediate probability distribution.
Specifically, each pixel category has a corresponding shift probability matrix, the shift probability matrices of all pixel categories are spliced, and the result obtained by splicing obtains intermediate probability distribution.
In one embodiment, the pixel points of the target image fall into 6 categories, each initial probability matrix covers the 200 × 300 pixel points, and after reshape there are 6 shift probability matrices of 1 × 60000; these 6 matrices are spliced to obtain an intermediate probability distribution of 6 × 60000. In general, the intermediate probability distribution has n rows and h × w columns, and each row represents the probability that each pixel point in the target image belongs to the pixel category corresponding to that row.
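The shift-and-splice of this example can be sketched with NumPy (random values stand in for the actual probabilities):

```python
import numpy as np

# 6 categories, a 200 x 300 target image: six initial probability matrices.
n, h, w = 6, 200, 300
initial = np.random.default_rng(2).random((n, h, w))

# Shift each 200 x 300 matrix into a single row of 1 x 60000, then splice
# the six rows into the 6 x 60000 intermediate probability distribution.
shifted = [m.reshape(1, h * w) for m in initial]
intermediate = np.concatenate(shifted, axis=0)
print(intermediate.shape)  # -> (6, 60000)
```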
In this embodiment, the intermediate probability distribution can be obtained by shifting and splicing the image extraction features by the general classifier.
In one embodiment, the calculation method corresponding to the target classifier includes: shifting and splicing the initial probability distribution obtained by processing the image extraction features, and activating the spliced initial probability distribution with an activation function to obtain the intermediate probability distribution; and performing a matrix multiplication between the intermediate probability distribution and the image extraction features to obtain the target classifier. For example, the calculation formula corresponding to the target classifier can be expressed as follows:

$G_c(x) = \mathrm{softmax}(\mathrm{reshape}(p)) \cdot x$

where $G_c(x)$ is the target classifier, $p$ is the initial probability distribution obtained by processing the image extraction features with the general classifier, $\mathrm{reshape}(p)$ shifts and splices $p$, and $x$ is the image extraction features. Since $p$ contains, for each of the n pixel categories, the probability (obtained from $x$) that each pixel point belongs to that category, the above formula extracts from $x$ the key information of each of the n categories; that is, $G_c(x)$ extracts the key information of each of the n classes from $x$, so that the original general classifier $C$ adapts to the content of different inputs.
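A sketch of this formula under the shapes used in the text, with p already reshaped to [n, hw] and x as [hw, c] (the function name is illustrative):

```python
import numpy as np

def target_classifier(p, x):
    """Gc(x) = softmax(reshape(p)) . x: p is the reshaped initial
    probability distribution [n, hw] from the general classifier, x the
    image extraction feature [hw, c]. Each output row pools the key
    information of one of the n classes from x."""
    e = np.exp(p - p.max(axis=1, keepdims=True))
    w = e / e.sum(axis=1, keepdims=True)   # softmax over the hw positions
    return w @ x                           # [n, hw] @ [hw, c] -> [n, c]
```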
In one embodiment, as shown in fig. 9, fusing the intermediate probability distribution and the image extraction features to obtain a target classifier corresponding to the image extraction features includes:
step S702, a confidence threshold is obtained.
The confidence threshold may be a value that filters out the contribution of low confidence feature values to avoid introducing noise.
Specifically, given a confidence threshold, the server obtains the confidence threshold.
In one embodiment, the target classifier needs to filter out the intermediate probability distributions with confidence thresholds less than 0.2, so the server is preset with a confidence threshold of 0.2.
Step S704, comparing each probability in the intermediate probability distribution with a confidence threshold, and obtaining a mask matrix based on a magnitude relationship between the probabilities and the confidence thresholds.
The mask matrix may be a matrix for removing a probability that the intermediate probability distribution is smaller than the confidence threshold. The size of the mask matrix is the same as that of the intermediate probability distribution, the value of one position in the mask matrix is determined according to the size relation between the probability corresponding to the position in the intermediate probability distribution and the threshold, when the probability is greater than the threshold, the value of the position in the mask matrix is 1, otherwise, the value is 0.
Specifically, a confidence threshold is set, and each pixel point of each category in the intermediate probability distribution is compared with the confidence threshold to obtain a mask matrix for removing noise.
In one embodiment, the confidence threshold of the intermediate probability distribution of the target image is set to 0.5, and each probability in the intermediate probability distribution of the target image is compared with the confidence threshold and a mask matrix is generated.
And step S706, screening the intermediate probability distribution based on the mask matrix to obtain the denoising probability distribution.
The denoising probability distribution may be a probability distribution obtained by removing noise in the intermediate probability distribution through a mask matrix. For example, each value in the mask matrix may be multiplied by the probability of the corresponding position in the intermediate probability distribution to obtain the denoising probability distribution.
Specifically, the intermediate probability distribution is screened through the mask matrix, the probability smaller than the confidence threshold is removed, the probability larger than or equal to the confidence threshold is reserved, and the probability distribution formed by the reserved probabilities is the denoising probability distribution.
In one embodiment, a mask matrix formed with a confidence threshold of 0.5 is used to screen the intermediate probability distribution; the pixel points in the intermediate probability distribution have probabilities of 0.3, 0.4, 0.5, 0.6, 0.7 and 0.8; the two probabilities 0.3 and 0.4 are removed by the mask matrix, and the corresponding pixel points are not included in the denoising probability distribution.
Step S708, the denoising probability distribution is multiplied by the image extraction characteristics to obtain a target classifier.
Specifically, the denoising probability distribution obtained after denoising is multiplied by the image extraction features, and the result of the multiplication is the target classifier. A specific expression is as follows:

$G_c(x) = \dfrac{M}{\mathrm{sum}(M, -1)} \cdot x$

wherein $M$ is the denoising probability distribution, of size [n, hw] like the intermediate probability distribution; $\mathrm{sum}(M, -1)$ is the sum over the hw values within $M$; h and w are the height and width of the image feature; $x$ is the image extraction feature; and $G_c(x)$ is the target classifier.
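A sketch of this masked variant (shapes follow the text; the zero-division guard for classes whose probabilities all fall below the threshold is my own assumption):

```python
import numpy as np

def denoised_target_classifier(p, x, tau=0.5):
    """p: intermediate probability distribution [n, hw]; x: image
    extraction feature [hw, c]; tau: confidence threshold.
    M = mask * p keeps only probabilities above tau, and
    Gc(x) = (M / sum(M, -1)) . x."""
    mask = (p > tau).astype(p.dtype)          # mask matrix of 0s and 1s
    M = mask * p                              # denoising probability distribution
    denom = M.sum(axis=1, keepdims=True)
    denom = np.where(denom == 0, 1.0, denom)  # guard: class with nothing kept
    return (M / denom) @ x                    # [n, hw] @ [hw, c] -> [n, c]
```

A class row whose probabilities are all filtered out contributes an all-zero row, i.e. no feature information is pooled for that class.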
In the embodiment, the confidence threshold is set, so that the probability of insufficient confidence is removed, the noise of image semantic segmentation is reduced, the performance of the image semantic segmentation model can be improved, and the success rate of image semantic segmentation is higher.
It should be understood that the classifier of the semantic segmentation model is adaptively changed based on the image extraction features of the target image, and based on the processing of the general classifier and the target classifier, the adaptive perception capability of the classifier on different image contents can be enhanced, the accuracy of obtaining the target pixel category corresponding to the target pixel point is improved, and thus the accuracy of image semantic segmentation is improved.
It should be understood that, although the steps in the flowcharts related to the above embodiments are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least part of the steps in the flowcharts related to the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides an image semantic segmentation device for realizing the image semantic segmentation method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme recorded in the method, so that specific limitations in one or more embodiments of the image semantic segmentation device provided below can be referred to the limitations of the image semantic segmentation method in the foregoing, and details are not repeated herein.
In one embodiment, as shown in fig. 10, there is provided an image semantic segmentation apparatus including: the system comprises a target image acquisition module, an image extraction feature extraction module, an intermediate probability distribution acquisition module, a target classifier acquisition module, a target probability distribution acquisition module and a target pixel category acquisition module, wherein:
and a target image obtaining module 802, configured to obtain a target image.
An image extraction feature extraction module 804, configured to perform feature extraction on the target image to obtain an image extraction feature corresponding to the target image.
An intermediate probability distribution obtaining module 806, configured to process the image extraction features based on the first classifier to obtain an intermediate probability distribution, where the intermediate probability distribution includes a probability that each pixel point in the target image belongs to each pixel category.
And a target classifier obtaining module 808, configured to fuse the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features.
The target probability distribution obtaining module 810 is configured to process the image extraction features based on the target classifier and the first classifier to obtain a target probability distribution corresponding to each pixel class, where the target probability distribution includes a probability that each pixel point in the target image belongs to each pixel class.
A target pixel category obtaining module 812, configured to obtain, based on the target probability distribution, a target pixel category corresponding to each pixel point in the target image.
In one embodiment, the target classifier obtaining module is configured to: acquiring a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier; weighting the target classifier based on the first weighting coefficient to obtain a weighted target classifier; weighting the general classifier based on the second weighting coefficient to obtain a weighted general classifier; and synthesizing the weighted general classifier and the weighted target classifier to obtain a comprehensive classifier, and processing the image extraction features with the comprehensive classifier to obtain the target probability distribution corresponding to each pixel class.
In one embodiment, the target classifier obtaining module is configured to: obtaining the similarity of a classifier between a general classifier and a target classifier, and obtaining a first weighting coefficient based on the similarity of the classifier, wherein the first weighting coefficient and the similarity of the classifier form a positive correlation; and obtaining a second weighting coefficient based on the first weighting coefficient, wherein the second weighting coefficient and the first weighting coefficient are in a negative correlation relationship.
In one embodiment, the target classifier obtaining module is configured to: obtaining a sensitivity coefficient; and multiplying the sensitivity coefficient by the classifier similarity to obtain the first weighting coefficient.
In one embodiment, the intermediate probability distribution obtaining module is configured to: processing the image extraction features based on a first classifier to obtain initial probability distribution, wherein the initial probability distribution comprises an initial probability matrix corresponding to each pixel class; shifting elements in the initial probability matrix to the same row or the same column to obtain a shifted probability matrix; and splicing the shift probability matrixes corresponding to each pixel category to obtain intermediate probability distribution.
In one embodiment, the calculation method corresponding to the target classifier is as follows: shifting and splicing the initial probability distribution obtained by processing the image extraction features, and activating the spliced initial probability distribution by using an activation function to obtain an intermediate probability distribution; and performing matrix multiplication operation on the intermediate probability distribution and the image extraction features to obtain the target classifier.
In one embodiment, the target classifier obtaining module is configured to: obtaining a confidence threshold; comparing each probability in the intermediate probability distribution with a confidence coefficient threshold value, and obtaining a mask matrix based on the size relationship between the probabilities and the confidence coefficient threshold values; screening the intermediate probability distribution based on the mask matrix to obtain denoising probability distribution; and multiplying the denoising probability distribution and the image extraction characteristics to obtain the target classifier.
The modules in the image semantic segmentation device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing server data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of semantic segmentation of images.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps in the above method embodiments.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are all authorized by the user or fully authorized by the relevant parties.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, etc., without limitation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features have been described; nevertheless, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method for semantic segmentation of an image, the method comprising:
acquiring a target image;
performing feature extraction on the target image to obtain image extraction features corresponding to the target image;
processing the image extraction features based on a first classifier to obtain an intermediate probability distribution, wherein the intermediate probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class;
fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features;
processing the image extraction features based on the target classifier and the first classifier to obtain target probability distribution corresponding to each pixel class, wherein the target probability distribution comprises the probability that each pixel point in the target image belongs to each pixel class;
and obtaining target pixel classes respectively corresponding to the pixel points in the target image based on the target probability distribution.
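The steps of claim 1 can be sketched end to end as follows. This is a minimal NumPy sketch under stated assumptions: the feature extractor is omitted (the `(HW, D)` feature matrix is taken as input), the first classifier is a linear layer with softmax, the fusion is a plain matrix product, and the scalar `alpha` stands in for the weighting described in the later claims; none of these specifics are fixed by the claim itself.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def segment(features, base_classifier, alpha=0.5):
    """Sketch of the claimed pipeline after feature extraction.

    features:        (HW, D) image extraction features
    base_classifier: (C, D) weights of the first classifier
    alpha:           illustrative fusion weight for the two classifiers
    Returns: per-pixel class indices, shape (HW,)
    """
    # Step 1: first classifier -> intermediate probability distribution.
    intermediate = softmax(features @ base_classifier.T)          # (HW, C)
    # Step 2: fuse the distribution with the features to obtain a
    # target classifier adapted to this image.
    target_classifier = intermediate.T @ features                 # (C, D)
    # Step 3: combine both classifiers and re-score the features.
    combined = alpha * target_classifier + (1 - alpha) * base_classifier
    target_probs = softmax(features @ combined.T)                 # (HW, C)
    # Step 4: per-pixel target class = argmax over the distribution.
    return target_probs.argmax(axis=1)
```

The point of the two-pass design is that the second classifier is built from the current image's own statistics, so the final scores adapt to the image rather than relying on the fixed classifier alone.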
2. The method of claim 1, wherein the first classifier comprises a general classifier in an image semantic segmentation model, and the processing the image extraction features based on the target classifier and the first classifier to obtain the target probability distribution corresponding to each pixel class comprises:
acquiring a first weighting coefficient corresponding to the target classifier and a second weighting coefficient corresponding to the general classifier;
weighting the target classifier based on the first weighting coefficient to obtain a weighted target classifier;
weighting the general classifier based on the second weighting coefficient to obtain a weighted general classifier;
and synthesizing the weighted general classifier and the weighted target classifier to obtain a comprehensive classifier, and processing the image extraction features by using the comprehensive classifier to obtain target probability distribution corresponding to each pixel class.
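The weighting-and-synthesis of claim 2 can be sketched as below. The additive combination of the two weighted classifiers and the softmax activation are illustrative assumptions; the claim only requires weighting each classifier, synthesizing them into a comprehensive classifier, and scoring the features with it.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def comprehensive_predict(features, target_w, general_w, w1, w2):
    """Weight the target and general classifiers, synthesize them into
    a comprehensive classifier, and score the features with it.

    features:  (HW, D) image extraction features
    target_w:  (C, D) target classifier weights
    general_w: (C, D) general classifier weights
    w1, w2:    scalar weighting coefficients
    Returns the (HW, C) target probability distribution.
    """
    comprehensive = w1 * target_w + w2 * general_w   # (C, D)
    return softmax(features @ comprehensive.T)       # (HW, C)
```

Each row of the result is a probability distribution over the pixel classes for one pixel point.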
3. The method of claim 2, wherein the obtaining the first weighting coefficient corresponding to the target classifier and the second weighting coefficient corresponding to the general classifier comprises:
obtaining classifier similarity between the general classifier and the target classifier, and obtaining the first weighting coefficient based on the classifier similarity, wherein the first weighting coefficient and the classifier similarity form a positive correlation;
and obtaining the second weighting coefficient based on the first weighting coefficient, wherein the second weighting coefficient and the first weighting coefficient are in a negative correlation relationship.
4. The method of claim 3, wherein the deriving the first weighting coefficient based on the classifier similarity comprises:
acquiring a sensitivity coefficient;
and multiplying the sensitivity coefficient by the similarity of the classifier to obtain the first weighting coefficient.
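Claims 3 and 4 together can be sketched as follows. Cosine similarity and the complement `w2 = 1 - w1` are illustrative choices: the claims only require that the first coefficient grows with the classifier similarity (scaled by a sensitivity coefficient, per claim 4) and that the second coefficient falls as the first grows.

```python
import numpy as np

def weighting_coefficients(general_w, target_w, sensitivity=1.0):
    """Derive the two weighting coefficients from classifier similarity.

    general_w, target_w: (C, D) classifier weight matrices
    sensitivity:         scalar sensitivity coefficient
    Returns (w1, w2).
    """
    g, t = general_w.ravel(), target_w.ravel()
    # Cosine similarity between the flattened weight matrices
    # (an illustrative similarity measure).
    similarity = float(g @ t / (np.linalg.norm(g) * np.linalg.norm(t)))
    # Claim 4: first coefficient = sensitivity x similarity
    # (positively correlated with the similarity).
    w1 = sensitivity * similarity
    # Second coefficient negatively correlated with the first.
    w2 = 1.0 - w1
    return w1, w2
```

Intuitively, when the image-specific target classifier agrees with the general one it can be trusted with more weight; when they diverge, the general classifier dominates.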
5. The method of claim 1, wherein the first classifier comprises a general classifier in an image semantic segmentation model, and the processing the image extraction features based on the first classifier in the image semantic segmentation model to obtain an intermediate probability distribution comprises:
processing the image extraction features based on the general classifier in the image semantic segmentation model to obtain initial probability distribution, wherein the initial probability distribution comprises an initial probability matrix corresponding to each pixel class;
shifting elements in the initial probability matrix to the same row or the same column to obtain a shifted probability matrix;
and splicing the shifted probability matrices corresponding to each pixel class to obtain the intermediate probability distribution.
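The shift-and-splice of claim 5 can be read as flattening each per-class matrix into a single row and stacking the rows. A minimal NumPy sketch, assuming the initial distribution has shape `(C, H, W)` and the spliced result is laid out as `(HW, C)` (the claim does not fix these layouts):

```python
import numpy as np

def flatten_and_stack(initial_probs):
    """Shift each per-class H x W probability matrix into one row,
    then splice the rows into the intermediate distribution.

    initial_probs: (C, H, W) initial probability distribution
    Returns:       (HW, C) intermediate probability distribution
    """
    c, h, w = initial_probs.shape
    # Shift the elements of each H x W matrix into the same row ...
    shifted = initial_probs.reshape(c, h * w)   # (C, HW)
    # ... then splice the per-class rows along the class axis.
    return shifted.T                            # (HW, C)
```

After this rearrangement each row corresponds to one pixel point and each column to one pixel class, which is the layout the fusion step consumes.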
6. The method of claim 5, wherein the objective classifier is computed in a manner that includes:
shifting and splicing the initial probability distribution obtained by processing the image extraction features, and activating the spliced initial probability distribution by using an activation function to obtain an intermediate probability distribution;
and performing matrix multiplication operation on the intermediate probability distribution and the image extraction features to obtain the target classifier.
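The computation of claim 6 can be sketched as below. Softmax over the class axis is an illustrative choice of activation function, and the `(C, H, W)` / `(HW, D)` shapes are assumptions; the claim itself only specifies shift, splice, activation, and a matrix multiplication with the features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compute_target_classifier(initial_probs, features):
    """Shift/splice the initial distribution, activate it, and
    matrix-multiply it with the image extraction features.

    initial_probs: (C, H, W) initial probability distribution
    features:      (HW, D) image extraction features
    Returns:       (C, D) target classifier weights
    """
    c = initial_probs.shape[0]
    spliced = initial_probs.reshape(c, -1)      # shift + splice: (C, HW)
    intermediate = softmax(spliced, axis=0)     # activation over classes
    return intermediate @ features              # (C, D)
```

The multiplication aggregates the features of all pixels, weighted by how strongly each pixel is predicted to belong to each class, yielding one prototype weight vector per class.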
7. The method of claim 1, wherein fusing the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features comprises:
obtaining a confidence threshold;
comparing each probability in the intermediate probability distribution with the confidence threshold, and obtaining a mask matrix based on the magnitude relationship between each probability and the confidence threshold;
screening the intermediate probability distribution based on the mask matrix to obtain a denoised probability distribution;
and multiplying the denoised probability distribution by the image extraction features to obtain the target classifier.
8. An apparatus for semantic segmentation of an image, the apparatus comprising:
the target image acquisition module is used for acquiring a target image;
the image extraction feature extraction module is used for extracting features of the target image to obtain image extraction features corresponding to the target image;
an intermediate probability distribution obtaining module, configured to process the image extraction features based on a first classifier to obtain an intermediate probability distribution, where the intermediate probability distribution includes the probability that each pixel point in the target image belongs to each pixel class;
a target classifier obtaining module, configured to fuse the intermediate probability distribution with the image extraction features to obtain a target classifier corresponding to the image extraction features;
a target probability distribution obtaining module, configured to process the image extraction features based on the target classifier and the first classifier to obtain a target probability distribution corresponding to each pixel class, where the target probability distribution includes a probability that each pixel point in the target image belongs to each pixel class;
and the target pixel class obtaining module is used for obtaining the target pixel classes respectively corresponding to the pixel points in the target image based on the target probability distribution.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202210438646.2A 2022-04-25 2022-04-25 Semantic segmentation method and device, computer equipment and storage medium Active CN114549913B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210438646.2A CN114549913B (en) 2022-04-25 2022-04-25 Semantic segmentation method and device, computer equipment and storage medium
PCT/CN2022/120480 WO2023206944A1 (en) 2022-04-25 2022-09-22 Semantic segmentation method and apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210438646.2A CN114549913B (en) 2022-04-25 2022-04-25 Semantic segmentation method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114549913A true CN114549913A (en) 2022-05-27
CN114549913B CN114549913B (en) 2022-07-19

Family

ID=81667042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210438646.2A Active CN114549913B (en) 2022-04-25 2022-04-25 Semantic segmentation method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114549913B (en)
WO (1) WO2023206944A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115345895A (en) * 2022-10-19 2022-11-15 深圳市壹倍科技有限公司 Image segmentation method and device for visual detection, computer equipment and medium
CN115620013A (en) * 2022-12-14 2023-01-17 深圳思谋信息科技有限公司 Semantic segmentation method and device, computer equipment and computer readable storage medium
CN115761239A (en) * 2023-01-09 2023-03-07 深圳思谋信息科技有限公司 Semantic segmentation method and related device
WO2023206944A1 (en) * 2022-04-25 2023-11-02 深圳思谋信息科技有限公司 Semantic segmentation method and apparatus, computer device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN109872374A (en) * 2019-02-19 2019-06-11 江苏通佑视觉科技有限公司 A kind of optimization method, device, storage medium and the terminal of image, semantic segmentation
CN113902913A (en) * 2021-08-31 2022-01-07 际络科技(上海)有限公司 Image semantic segmentation method and device
CN114187311A (en) * 2021-12-14 2022-03-15 京东鲲鹏(江苏)科技有限公司 Image semantic segmentation method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110335277A (en) * 2019-05-07 2019-10-15 腾讯科技(深圳)有限公司 Image processing method, device, computer readable storage medium and computer equipment
CN114549913B (en) * 2022-04-25 2022-07-19 深圳思谋信息科技有限公司 Semantic segmentation method and device, computer equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈孝如 (Chen Xiaoru): "Image Semantic Segmentation Algorithm Based on Fully Convolutional Networks", 《电脑知识与技术》 (Computer Knowledge and Technology) *


Also Published As

Publication number Publication date
WO2023206944A1 (en) 2023-11-02
CN114549913B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
CN112052787B (en) Target detection method and device based on artificial intelligence and electronic equipment
CN111738244B (en) Image detection method, image detection device, computer equipment and storage medium
CN109829506B (en) Image processing method, image processing device, electronic equipment and computer storage medium
CN110110689B (en) Pedestrian re-identification method
CN114529825B (en) Target detection model, method and application for fire fighting access occupied target detection
CN111368672A (en) Construction method and device for genetic disease facial recognition model
US20220230282A1 (en) Image processing method, image processing apparatus, electronic device and computer-readable storage medium
WO2021042857A1 (en) Processing method and processing apparatus for image segmentation model
CN113298096B (en) Method, system, electronic device and storage medium for training zero sample classification model
CN114511576B (en) Image segmentation method and system of scale self-adaptive feature enhanced deep neural network
CN112926654A (en) Pre-labeling model training and certificate pre-labeling method, device, equipment and medium
CN111709415B (en) Target detection method, device, computer equipment and storage medium
CN111179270A (en) Image co-segmentation method and device based on attention mechanism
Rafique et al. Deep fake detection and classification using error-level analysis and deep learning
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN112183602A (en) Multi-layer feature fusion fine-grained image classification method with parallel rolling blocks
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN116012841A (en) Open set image scene matching method and device based on deep learning
CN113674383A (en) Method and device for generating text image
CN112084371A (en) Film multi-label classification method and device, electronic equipment and storage medium
CN112329925B (en) Model generation method, feature extraction method, device and electronic equipment
CN114049634B (en) Image recognition method and device, computer equipment and storage medium
CN115761239B (en) Semantic segmentation method and related device
CN117197086A (en) Image detection method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20220527

Assignee: Beijing simou Intelligent Technology Co.,Ltd.

Assignor: Shenzhen Si Mou Information Technology Co.,Ltd.

Contract record no.: X2024980002106

Denomination of invention: A semantic segmentation method, device, computer equipment, and storage medium

Granted publication date: 20220719

License type: Common License

Record date: 20240221

EE01 Entry into force of recordation of patent licensing contract