CN115760864A - Image segmentation method and device, electronic equipment and storage medium


Info

Publication number: CN115760864A
Application number: CN202211436512.3A
Authority: CN (China)
Legal status: Pending
Prior art keywords: image, model, segmentation, segmented, indication information
Other languages: Chinese (zh)
Inventors: 王珊珊, 韩华, 李程, 郑海荣
Assignee (current and original): Shenzhen Institute of Advanced Technology of CAS
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN202211436512.3A
Publication of CN115760864A

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image segmentation method and device, an electronic device and a storage medium. The method comprises: acquiring an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information indicates the category corresponding to the object to be segmented, and the image to be segmented comprises objects to be segmented of a plurality of categories; and obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model. The image segmentation model is obtained by training an initial segmentation model in a semi-supervised manner based on sample segmentation images and second indication information corresponding to the sample segmentation images, where the second indication information indicates the sample segmentation object of each sample segmentation image. A high-precision image segmentation model can thus be trained from less labeled data, the segmentation precision for images containing multiple objects to be segmented is improved, and the method is widely applicable.

Description

Image segmentation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer application technologies, and in particular, to an image segmentation method and apparatus, an electronic device, and a storage medium.
Background
At present, methods that segment images with trained neural network models are widely used. In the medical field, however, segmenting multi-organ images with a trained neural network model has certain limitations. For example, in medical images the segmentation label for a given organ may be missing, and the training samples are irregular in that not all organs appear in any one image.
In the related art, segmentation models trained with such neural networks therefore have poor precision, and the target segmentation images they produce are of low accuracy.
Disclosure of Invention
The invention provides an image segmentation method, an image segmentation device, electronic equipment and a storage medium, and aims to solve the technical problem that the accuracy of an obtained target segmentation image is low.
According to an aspect of the present invention, there is provided an image segmentation method, wherein the method comprises:
acquiring an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information is used for indicating an object to be segmented of the image to be segmented;
obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model;
the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning mode based on a sample segmentation image and second indication information corresponding to the sample segmentation image, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image.
According to another aspect of the present invention, there is provided an image segmentation apparatus, wherein the apparatus comprises:
the image acquisition module is used for acquiring an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information is used for indicating an object to be segmented of the image to be segmented;
the image segmentation module is used for obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model;
the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning mode based on a sample segmentation image and second indication information corresponding to the sample segmentation image, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, enables the at least one processor to perform the image segmentation method according to any of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the image segmentation method according to any one of the embodiments of the present invention when the computer instructions are executed.
According to the technical scheme of the embodiment of the invention, an image to be segmented and first indication information corresponding to the image to be segmented are obtained, wherein the first indication information is used for indicating the object to be segmented; in an image containing a plurality of objects to be segmented, the current object to be segmented can thus be indicated, which provides the basis for the model to segment that object accurately. A target segmentation image corresponding to the object to be segmented is then obtained based on the image to be segmented, the first indication information and a pre-trained image segmentation model. The image segmentation model is obtained by training an initial segmentation model in a semi-supervised manner based on sample segmentation images and second indication information corresponding to the sample segmentation images, where the second indication information indicates the sample segmentation object of each sample segmentation image; with this semi-supervised scheme, a highly accurate image segmentation model can be trained from less labeled data.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of an image segmentation method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image segmentation method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a discriminant model training process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the training process of the generative model according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an image segmentation apparatus according to a third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device implementing an image segmentation method according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a flowchart of an image segmentation method according to an embodiment of the present invention. This embodiment is applicable to image processing scenarios, and the method may be executed by an image segmentation apparatus, which may be implemented in the form of hardware and/or software and may be configured in a computer. As shown in fig. 1, the method includes:
s110, obtaining an image to be segmented and first indication information corresponding to the image to be segmented.
The first indication information is used for indicating the category corresponding to the object to be segmented, and the image to be segmented comprises objects to be segmented of a plurality of categories. The image to be segmented may be understood as the image on which segmentation is to be performed. In the embodiment of the present invention, the image to be segmented may be set according to scene requirements and is not specifically limited herein.
For example, in a medical scene, the image to be segmented may be a multi-organ image, a tumor image, or the like; in a traffic scene, it may be a multi-vehicle image; in a home scene, it may be a multi-furniture image. More concretely, in a medical scene the image to be segmented may be a head and neck organ image, an abdominal multi-organ image, a liver-pancreas-kidney multi-organ image, a tumor image, and the like. In a traffic scene, the image to be segmented may be a highway multi-vehicle image, an intersection multi-vehicle image, and the like. In a home scene, the image to be segmented may be a living-room, kitchen, or bedroom multi-furniture image, and the like.
Wherein the first indication information may be understood as information for distinguishing the plurality of objects to be segmented in the image to be segmented. In this embodiment of the present invention, the first indication information may be used to indicate the category corresponding to the object to be segmented, where the image to be segmented comprises objects to be segmented of a plurality of categories. Optionally, in a medical scenario, the first indication information may distinguish the category of an object to be segmented such as an eye, the brain, or the neck in a head and neck organ image, or the category of the liver, a kidney, or the pancreas in a liver-pancreas-kidney multi-organ image. In a traffic scene, the first indication information may distinguish the category of a motorcycle, a truck, or a car in a highway multi-vehicle image. In a home scenario, the first indication information may distinguish the category of bowls, chopsticks, or spoons in a kitchen multi-furniture image. It is to be understood that each piece of first indication information indicates one category of objects to be segmented, and that category may contain one or more objects to be segmented. For example, when the first indication information indicates the kidneys in a liver-pancreas-kidney multi-organ image, it may indicate two objects to be segmented.
Optionally, the first indication information is preset coding information corresponding to the object to be segmented, or the first indication information is information obtained by splicing the preset coding information corresponding to the object to be segmented and the image to be segmented. In the embodiment of the invention, the image to be segmented can be segmented by the image segmentation model based on the indication relationship between different first indication information and different objects to be segmented in the image to be segmented, so that the image segmentation accuracy and the applicability can be improved.
The preset encoding information may be understood as information obtained by encoding in a certain manner and used for distinguishing and classifying different objects to be segmented. In the embodiment of the present invention, the preset coding information may be preset according to a scene requirement, and is not specifically limited herein. Specifically, optionally, the preset encoding information includes encoding information generated based on a one-hot encoding manner.
Wherein one-hot encoding is a manner of encoding N states using an N-bit state register. In the embodiment of the invention, encoding information generated in a one-hot manner avoids the problem that a classifier cannot handle such attribute data well, and also serves to expand the features. Optionally, the preset coding information may be first coding information, second coding information, third coding information, and so on. For example, in a liver-pancreas-kidney multi-organ image of a medical scene, the first coding information may indicate the liver as the organ to be segmented, the second coding information may indicate the pancreas, the third coding information may indicate the kidneys, and so on. In a highway multi-vehicle image of a traffic scene, the first coding information may indicate objects to be segmented of the motorcycle class, the second coding information objects of the truck class, the third coding information objects of the sedan class, and so on.
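As an illustration only, the following Python sketch shows how such preset coding information could be generated in a one-hot manner and, for the splicing variant mentioned above, concatenated with the image to be segmented as extra channels. The helper names, the category ordering, and the channel-wise reading of "splicing" are assumptions for illustration, not taken from the patent.

```python
import torch

# Hypothetical category ordering for a liver-pancreas-kidney scene:
# 0 = liver, 1 = pancreas, 2 = kidney.
def one_hot_indication(category_index: int, num_categories: int) -> torch.Tensor:
    """Generate preset coding information in a one-hot manner."""
    code = torch.zeros(num_categories)
    code[category_index] = 1.0
    return code

def splice_indication(image: torch.Tensor, code: torch.Tensor) -> torch.Tensor:
    """Splice the one-hot code with the image to be segmented by
    broadcasting each code bit into an extra image channel."""
    h, w = image.shape[-2:]
    planes = code.view(-1, 1, 1).expand(code.numel(), h, w)
    return torch.cat([image, planes], dim=-3)

kidney_code = one_hot_indication(2, 3)           # tensor([0., 0., 1.])
image = torch.randn(1, 256, 256)                 # a single-channel slice
spliced = splice_indication(image, kidney_code)  # shape: (4, 256, 256)
```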
The object to be segmented may be understood as an object to be segmented in the image to be segmented, or may be understood as a region of interest to be segmented in the image to be segmented. In the embodiment of the present invention, the object to be segmented may be set according to a scene requirement, and is not specifically limited herein. Alternatively, the object to be segmented may be an organ or a tumor to be segmented in the image to be segmented, or the like. Illustratively, in the liver, pancreas, or kidney multi-organ image of the medical scene, the object to be segmented may be a liver, pancreas, or kidney, etc. in the liver, pancreas, or kidney multi-organ image. In the bedroom multi-furniture image of the home scene, the object to be segmented can be a bed, a wardrobe or a lamp in the bedroom multi-furniture image.
And S120, obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model.
Wherein, the image segmentation model can be understood as an artificial intelligence model for segmenting the image to be segmented. The target segmentation image may be understood as an image determined by the image segmentation model for the image to be segmented and the first indication information. Optionally, the target segmentation image may be an image of the to-be-segmented object segmented from the to-be-segmented image. It can be understood that the target segmentation image may be the image to be segmented in which the object to be segmented is marked, or an image obtained by segmenting an image region containing the object to be segmented in the image to be segmented, that is, a partial image of the image to be segmented.
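A minimal sketch of this inference step follows, assuming the pre-trained image segmentation model exposes a call that takes the image to be segmented together with the first indication information; the call signature is an assumption for illustration.

```python
import torch

@torch.no_grad()
def segment(model: torch.nn.Module,
            image: torch.Tensor,
            indication: torch.Tensor) -> torch.Tensor:
    """Obtain the target segmentation image for the indicated object.

    Depending on the embodiment, the returned image may mark the object
    in the full image or be a cropped partial image.
    """
    model.eval()
    return model(image.unsqueeze(0), indication.unsqueeze(0)).squeeze(0)
```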
Optionally, the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning manner based on the sample segmentation image and second indication information corresponding to the sample segmentation image.
Wherein the sample segmentation image can be understood as a sample image used for training the image segmentation model. In the embodiment of the present invention, the sample segmentation image may be set according to a scene requirement, and is not specifically limited herein. Optionally, the sample segmentation images may include training images of one or more sample segmentation objects.
Wherein the second indication information may be understood as information indicating the sample segmentation object of the sample segmentation image. In this embodiment of the present invention, the second indication information may be the same type of information as the first indication information, and the second indication information corresponding to a sample segmentation image can accurately indicate each sample segmentation object. Moreover, a sample segmentation image containing a plurality of sample segmentation objects can be combined with different second indication information to form multiple training samples, which is particularly suitable for training scenarios where samples are scarce: an image segmentation model with high precision can be obtained from fewer sample segmentation images.
Wherein the sample segmentation object may be understood as an object corresponding to the second indication information in the sample segmentation image. In the embodiment of the present invention, the indication relationship between the second indication information and the object to be segmented may be preset according to a scene requirement, and is not specifically limited herein.
Optionally, the initial segmentation model is a generative adversarial model, which includes a generative model and a discriminant model, and the image segmentation model is trained based on the following method:
obtaining a semi-supervised sample set, wherein the semi-supervised sample set comprises a first number of labeled sample segmented images and a second number of unlabeled sample segmented images;
determining second indication information corresponding to the sample segmentation image, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image, and the label is a desired segmentation image corresponding to the second indication information;
training the generative adversarial model based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, and taking the trained generative model as the image segmentation model.
The semi-supervised sample set can be understood as a sample set required for training an initial segmentation model by adopting a semi-supervised learning mode. Optionally, the semi-supervised sample set may comprise a first number of labeled sample segmented images and a second number of unlabeled sample segmented images.
Wherein the first number may be understood as the number of labeled sample segmentation images in the semi-supervised sample set, and the second number as the number of unlabeled sample segmentation images. Optionally, the first number and the second number may be on the order of thousands or tens of thousands; illustratively, the first number may be 1 thousand, 5 thousand, 10 thousand, etc. In the embodiment of the present invention, the first number and the second number may be preset according to scene requirements and are not specifically limited herein; they may be the same or different.
Wherein the label may be understood as the desired segmentation image corresponding to the second indication information. The desired segmentation image can be understood as the image that the initial segmentation model is expected to output for the second indication information corresponding to the sample segmentation image. Considering that in practical application scenarios, for example medical scenarios, sample labels may be difficult to obtain, the first number may optionally be smaller than the second number.
The generative adversarial model can be understood as a network model composed of a generative model and a discriminant model. The generative model may be understood as a model that generates a segmentation image corresponding to the sample segmentation image. The discriminant model may be understood as a model that determines whether an image produced by the generative model is a real sample segmentation image. In an embodiment of the present invention, the discriminant model may be trained based on the first model output image, the second indication information corresponding to the first model output image, and the desired segmentation image corresponding to the second model output image.
Specifically, by alternately iterating the generative model and the discriminant model of the generative adversarial model, the model parameters of the generative model are adjusted to obtain the image segmentation model.
Optionally, the training of the generative adversarial model based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, taking the trained generative model as the image segmentation model, includes:
inputting the labeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the generative model of the generative adversarial model to obtain a first model output image;
adjusting model parameters of the generative model according to the first model output image, the sample segmentation image and the desired segmentation image corresponding to the sample segmentation image to obtain an initial segmentation model;
inputting unlabeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the initial segmentation model to obtain a second model output image;
adjusting model parameters of the generative model based on the discrimination results of the trained discriminant model on the first model output image and the second model output image, wherein the discriminant model is trained based on the first model output image, the second indication information corresponding to the first model output image, and the desired segmentation image corresponding to the second model output image;
and, when it is detected that the training end condition is reached, taking the trained generative model as the image segmentation model.
The first model output image may be an image generated by the generative model based on a labeled sample segmentation image and the second indication information corresponding to that sample segmentation image. It is to be understood that the first model output image may be the image of the sample segmentation object corresponding to the second indication information that the generative model outputs after the labeled sample segmentation image is input into it. Similarly, the second model output image may be understood as an image generated by the initial segmentation model based on an unlabeled sample segmentation image and the second indication information corresponding to that sample segmentation image.
Wherein the discrimination result may be understood as the probability with which the discriminant model judges the first model output image or the second model output image to be real or fake. In other words, the discrimination result is the determination of whether the image input to the discriminant model is a model output image or the desired segmentation image.
Wherein the model parameters can be understood as the configuration parameters of the generative model; they generally include the layer-to-layer weights and bias values.
Wherein the training end condition may be understood as a condition under which training of the generative adversarial model can end. In the embodiment of the present invention, the training end condition may be preset according to scene requirements and is not specifically limited herein. Optionally, the training end condition may be that the model parameters of the generative model no longer change appreciably, that is, the model parameters converge; or that a preset number of iterations is reached; or that the error rate of the discriminant model on the output images of the generative model reaches a preset value; and so on.
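The check below is a minimal sketch of such a training end condition; how parameter convergence is measured and the concrete thresholds are assumptions.

```python
def training_finished(step: int, max_steps: int,
                      param_delta: float, delta_eps: float,
                      disc_error_rate: float, target_error: float) -> bool:
    converged = param_delta < delta_eps        # model parameters converge
    exhausted = step >= max_steps              # preset number of iterations reached
    fooled = disc_error_rate >= target_error   # discriminant model's error rate at preset value
    return converged or exhausted or fooled
```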
Specifically, the generative adversarial model may be trained as follows:
1. Train the discriminant model of the generative adversarial model. First, the discriminant model is trained based on the second indication information, the desired segmentation image corresponding to the second indication information, and a Gaussian noise image. Specifically, the second indication information is input into the first sub-model of the discriminant model of the generative adversarial model to obtain initial network parameters, and the network parameters of the second sub-model of the discriminant model are updated based on these initial network parameters.
2. Train the generative model of the generative adversarial model. Specifically, the second indication information is first input into the first sub-model of the generative model of the generative adversarial model to obtain initial network parameters, and the network parameters of the attention mechanism in the second sub-model of the generative model are set based on these initial network parameters; then, the sample segmentation image is input into the second sub-model of the generative model to obtain the model segmentation image output by the generative model.
Further, when the sample segmentation images are input into the second sub-model of the generative model, the labeled sample segmentation images are input first to obtain the first model output image; then, the unlabeled sample segmentation images are input into the generative model to obtain the second model output image.
3. Adjust the model parameters of the generative model based on the discrimination results. Specifically, the second model output image and the desired segmentation image corresponding to it may be input into the discriminant model to obtain the discrimination result for the second model output image, and the model parameters of the generative model are adjusted based on this discrimination result. The discriminant model is further trained based on the discrimination result to obtain a retrained discriminant model, and the model parameters of the generative model can then be further adjusted based on the discrimination results of the retrained discriminant model.
Finally, when it is detected that the model parameters have converged, the trained generative model may be used as the image segmentation model.
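The following sketch strings these stages together for single training steps, assuming PyTorch. The architectures, losses and optimizers are stand-ins chosen for brevity; the first/second sub-model structure, the attention mechanism and the Gaussian-noise pre-training of the discriminant model are not reproduced here.

```python
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Stand-in generative model: image + one-hot code -> segmentation map."""
    def __init__(self, num_categories: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + num_categories, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, image, code):
        planes = code[:, :, None, None].expand(-1, -1, *image.shape[-2:])
        return self.net(torch.cat([image, planes], dim=1))

class TinyDiscriminator(nn.Module):
    """Stand-in discriminant model: segmentation + code -> real/fake score."""
    def __init__(self, num_categories: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + num_categories, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, 1), nn.Sigmoid())

    def forward(self, seg, code):
        planes = code[:, :, None, None].expand(-1, -1, *seg.shape[-2:])
        return self.net(torch.cat([seg, planes], dim=1))

G, D = TinyGenerator(3), TinyDiscriminator(3)
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCELoss()

def supervised_step(image, code, desired_seg):
    """Adjust the generative model on a labeled sample against its
    desired segmentation image (the supervised part of the scheme)."""
    opt_g.zero_grad()
    loss = bce(G(image, code), desired_seg)
    loss.backward()
    opt_g.step()

def adversarial_step(unlabeled_image, code, desired_seg):
    """Discriminant step on real (desired segmentation image, from the
    labeled pool) vs. fake (generator output for an unlabeled sample),
    then a generator step against the discrimination result."""
    fake = G(unlabeled_image, code)
    real_t = torch.ones(desired_seg.size(0), 1)
    fake_t = torch.zeros(fake.size(0), 1)
    opt_d.zero_grad()
    d_loss = bce(D(desired_seg, code), real_t) + bce(D(fake.detach(), code), fake_t)
    d_loss.backward()
    opt_d.step()
    opt_g.zero_grad()
    g_loss = bce(D(fake, code), torch.ones(fake.size(0), 1))
    g_loss.backward()
    opt_g.step()

# One labeled and one unlabeled batch under the assumed 3-category ordering.
img = torch.randn(2, 1, 64, 64)
code = torch.eye(3)[[0, 2]]          # e.g. liver and kidney
desired = torch.rand(2, 1, 64, 64)
supervised_step(img, code, desired)
adversarial_step(torch.randn(2, 1, 64, 64), code, desired)
```

In line with the semi-supervised scheme, the desired segmentation images used in the adversarial step come from the labeled pool, while the generator also consumes unlabeled samples.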
In the embodiment of the invention, the generative adversarial model is trained based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, and the trained generative model is used as the image segmentation model. This alleviates the problem of insufficient labels in the sample set and, by training the generative adversarial model in a semi-supervised manner, improves the accuracy of the resulting image segmentation model.
According to the technical scheme of the embodiment of the invention, an image to be segmented and first indication information corresponding to the image to be segmented are obtained, wherein the first indication information is used for indicating the object to be segmented; in an image containing a plurality of objects to be segmented, the current object to be segmented can thus be indicated, which provides the basis for the model to segment that object accurately. A target segmentation image corresponding to the object to be segmented is then obtained based on the image to be segmented, the first indication information and a pre-trained image segmentation model. The image segmentation model is obtained by training an initial segmentation model in a semi-supervised manner based on sample segmentation images and second indication information corresponding to the sample segmentation images, where the second indication information indicates the sample segmentation object of each sample segmentation image; with this semi-supervised scheme, a highly accurate image segmentation model can be trained from less labeled data.
Example two
Fig. 2 is a flowchart of an image segmentation method according to a second embodiment of the present invention. In this embodiment, the step of obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information, and the pre-trained image segmentation model is refined.
As shown in fig. 2, the method includes:
s210, obtaining an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information is used for indicating an object to be segmented of the image to be segmented.
S220, inputting the first indication information into a first sub-model of a pre-trained image segmentation model to obtain initial network parameters of a second sub-model of the image segmentation model, wherein the initial network parameters at least comprise initial weights and bias values.
In an embodiment of the invention, the image segmentation model may comprise a first sub-model and a second sub-model.
Wherein the first sub-model may be understood as a model for obtaining initial network parameters of a second sub-model of the image segmentation model based on the first indication information.
Wherein the initial network parameters can be understood as the initial values of the weights and bias values of each node before the second sub-model is adapted. The initial weight can be understood as the initial value of a weight among the network parameters, and the bias value as the additive constant term applied at each node.
Optionally, the first submodel includes a plurality of convolutional layers, where at least two convolutional layers are connected based on a nonlinear activation function layer, and the second submodel includes an attention mechanism;
the updating of the network parameters of the second submodel based on the initial network parameters comprises:
updating a network parameter of the attention mechanism based on the initial network parameter.
Wherein the convolutional layer may be understood as a network layer that extracts different features from the input first indication information. The nonlinear activation function layer may be understood as a network layer that adds nonlinearity to the image segmentation model. The attention mechanism may be understood as a mechanism that focuses the second sub-model on the object to be segmented indicated by the input first indication information, i.e., that selects specific inputs. In the embodiment of the invention, the attention mechanism can allocate limited computing resources to the more important task, alleviating information overload, increasing computation speed, and improving the efficiency of obtaining the target segmentation image.
And S230, updating the network parameters of the second sub-model based on the initial network parameters, and inputting the image to be segmented into the second sub-model after the network parameters are updated to obtain a target segmented image corresponding to the object to be segmented.
The second sub-model can be understood as a model for obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented and the initial network parameters.
Specifically, the first indication information is input into the first sub-model of the pre-trained image segmentation model to obtain the initial weights and bias values of the second sub-model of the image segmentation model; the network parameters of the second sub-model are updated based on these initial weights and bias values, and the image to be segmented is input into the second sub-model with the updated network parameters to obtain the target segmentation image corresponding to the object to be segmented.
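A minimal sketch of this two-stage forward pass follows, assuming the first sub-model acts as a small hypernetwork that maps the first indication information to the weight and bias of a channel-attention gate in the second sub-model. The layer sizes and the exact form of the attention are invented for illustration.

```python
import torch
import torch.nn as nn

class FirstSubModel(nn.Module):
    """Maps the first indication information to initial network
    parameters (weight and bias) for the second sub-model's attention."""
    def __init__(self, num_categories: int, attn_channels: int):
        super().__init__()
        self.fc = nn.Linear(num_categories, 2 * attn_channels)

    def forward(self, code):
        weight, bias = self.fc(code).chunk(2, dim=-1)
        return weight, bias

class SecondSubModel(nn.Module):
    """Segments the image, with a channel-attention gate whose
    parameters come from the first sub-model."""
    def __init__(self, attn_channels: int):
        super().__init__()
        self.features = nn.Conv2d(1, attn_channels, 3, padding=1)
        self.head = nn.Conv2d(attn_channels, 1, 1)

    def forward(self, image, weight, bias):
        feats = torch.relu(self.features(image))
        pooled = feats.mean((-2, -1), keepdim=True)
        # Attention gate focusing the features on the indicated category.
        gate = torch.sigmoid(weight[:, :, None, None] * pooled + bias[:, :, None, None])
        return torch.sigmoid(self.head(feats * gate))

first, second = FirstSubModel(3, 8), SecondSubModel(8)
code = torch.tensor([[0., 0., 1.]])    # indicate the third category
image = torch.randn(1, 1, 256, 256)
w, b = first(code)                     # S220: initial network parameters
target_seg = second(image, w, b)       # S230: target segmentation image
```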
According to the technical scheme of this embodiment of the invention, the first indication information is input into the pre-trained first sub-model of the image segmentation model to obtain the initial network parameters of the second sub-model, where the initial network parameters at least comprise initial weights and bias values; the network parameters of the second sub-model are updated based on the initial network parameters, and the image to be segmented is input into the updated second sub-model to obtain the target segmentation image corresponding to the object to be segmented. The target segmentation image is thus accurately obtained by processing the image to be segmented together with its corresponding first indication information.
Optionally, fig. 3 is a schematic diagram of the training process of the discriminant model according to an embodiment of the present invention; as shown in fig. 3, the training process of the discriminant model may be:
1. and inputting the second indication information into the first sub-model of the discrimination model to obtain the initial network parameters of the second sub-model. Performing task coding on the input organ number based on a one-dimensional hot coding mode to generate m-dimensional one-dimensional hot coding, namely second indication information; and inputting the second indication information into the first sub-model of the discriminant model to obtain the weight and bias of the n convolutional layers, namely the initial network parameters.
2. And taking the obtained initial network parameters as the initial network parameters of the second submodel of the discriminant model.
3. The second sub-model is trained based on image labels (desired segmentation images) and a gaussian noise map (pseudo label images). And inputting the image label and the Gaussian noise into a second submodel of the discrimination model to obtain a discrimination result of 0 or 1, wherein 0 can represent that the discrimination result is a forged image, and 1 can represent that the discrimination result is a real image.
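A hedged sketch of this pre-training step is given below: desired segmentation images are treated as real (label 1) and Gaussian noise maps as pseudo-label images (label 0). The discriminator body is a stand-in, and conditioning its convolutional weights on the m-dimensional task code is omitted for brevity.

```python
import torch
import torch.nn as nn

disc = nn.Sequential(                      # stand-in second sub-model
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 1), nn.Sigmoid())
opt = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCELoss()

def discriminator_pretrain_step(desired_seg: torch.Tensor):
    """Train on image labels (real, 1) vs. Gaussian noise maps (fake, 0)."""
    noise = torch.randn_like(desired_seg)  # Gaussian noise pseudo-label image
    opt.zero_grad()
    loss = bce(disc(desired_seg), torch.ones(desired_seg.size(0), 1)) + \
           bce(disc(noise), torch.zeros(desired_seg.size(0), 1))
    loss.backward()
    opt.step()

discriminator_pretrain_step(torch.rand(2, 1, 64, 64))
```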
Optionally, fig. 4 is a schematic diagram of the training process of the generative model according to an embodiment of the present invention; as shown in fig. 4, the training process of the generative model may be:
1. and inputting the second indication information into the first sub-model of the generating model to obtain the initial network parameters of the second sub-model. And inputting the second indication information into the first sub-model of the generative model to obtain the weight and bias of the n convolution layers of the generative model, namely the initial network parameters.
2. And taking the obtained initial network parameters as the initial network parameters of the attention mechanism in the second sub-model of the generative model.
3. And training the second sub-model based on the labeled sample segmentation image, the unlabeled sample image and the discrimination model to obtain a model output image. And inputting the labeled sample segmentation image and the unlabeled sample image into a second sub-model of the generation model to obtain a model output image. In the process, the model parameters of the second submodel are adjusted by combining the discrimination result of the discrimination model to obtain a trained generation model, namely an image segmentation model.
4. And inputting the image to be segmented and the first indication information into the trained image segmentation model to obtain a target segmentation image.
The technical scheme provides a multi-task semi-supervised segmentation method based on target recognition, which can segment multiple classes of objects in both single slices and multiple slices. Compared with related multi-class segmentation techniques, the technical scheme of the invention neither requires labels for all objects to be segmented nor requires that all objects be labeled in any one picture. It thus handles the label-scarcity problem of multi-class segmentation more flexibly and is better suited to multi-class segmentation on large data sets.
Example three
Fig. 5 is a schematic structural diagram of an image segmentation apparatus according to a third embodiment of the present invention. As shown in fig. 5, the apparatus includes: an image acquisition module 310 and an image segmentation module 320.
The image obtaining module 310 is configured to obtain an image to be segmented and first indication information corresponding to the image to be segmented, where the first indication information is used to indicate an object to be segmented of the image to be segmented; an image segmentation module 320, configured to obtain a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information, and a pre-trained image segmentation model; the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning mode based on a sample segmentation image and second indication information corresponding to the sample segmentation image, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image.
According to the technical scheme of the embodiment of the invention, an image to be segmented and first indication information corresponding to the image to be segmented are obtained, wherein the first indication information is used for indicating the object to be segmented; in an image containing a plurality of objects to be segmented, the current object to be segmented can thus be indicated, which provides the basis for the model to segment that object accurately. A target segmentation image corresponding to the object to be segmented is then obtained based on the image to be segmented, the first indication information and a pre-trained image segmentation model. The image segmentation model is obtained by training an initial segmentation model in a semi-supervised manner based on sample segmentation images and second indication information corresponding to the sample segmentation images, where the second indication information indicates the sample segmentation object of each sample segmentation image; with this semi-supervised scheme, a highly accurate image segmentation model can be trained from less labeled data.
Optionally, the image segmentation model includes a first sub-model and a second sub-model;
an image segmentation module 320 to:
inputting the first indication information into a first sub-model of the pre-trained image segmentation model to obtain initial network parameters of a second sub-model of the image segmentation model, wherein the initial network parameters at least comprise initial weights and bias values;
updating the network parameters of the second sub-model based on the initial network parameters, and inputting the image to be segmented into the second sub-model after the network parameters are updated to obtain a target segmentation image corresponding to the object to be segmented.
Optionally, the first indication information is preset coding information corresponding to the object to be segmented, or the first indication information is information obtained by splicing the preset coding information corresponding to the object to be segmented and the image to be segmented.
Optionally, the preset encoding information includes encoding information generated based on a one-hot encoding manner.
Optionally, the first submodel includes a plurality of convolutional layers, wherein at least two convolutional layers are connected based on a nonlinear activation function layer.
Optionally, the initial segmentation model is a generative adversarial model, which includes a generative model and a discriminant model. The image segmentation model can be obtained by training based on a model training module, wherein the model training module comprises: a sample set acquisition sub-module, an indication information determination sub-module and a model training sub-module.
The sample set acquisition sub-module is configured to obtain a semi-supervised sample set, wherein the semi-supervised sample set comprises a first number of labeled sample segmentation images and a second number of unlabeled sample segmentation images;
the indication information determining submodule is configured to determine second indication information corresponding to the sample segmentation image, where the second indication information is used to indicate a sample segmentation object of the sample segmentation image, and the label is a desired segmentation image corresponding to the second indication information;
and the model training sub-module is configured to train the generative adversarial model based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, and to take the trained generative model as the image segmentation model.
Optionally, the model training sub-module is configured to:
inputting the labeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the generative model of the generative adversarial model to obtain a first model output image;
adjusting model parameters of the generative model according to the first model output image, the sample segmentation image and the desired segmentation image corresponding to the sample segmentation image to obtain an initial segmentation model;
inputting unlabeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the initial segmentation model to obtain a second model output image;
adjusting model parameters of the generative model based on the discrimination results of the trained discriminant model on the first model output image and the second model output image, wherein the discriminant model is trained based on the first model output image, the second indication information corresponding to the first model output image, and the desired segmentation image corresponding to the second model output image;
and, when it is detected that the training end condition is reached, taking the trained generative model as the image segmentation model.
The image segmentation device provided by the embodiment of the invention can execute the image segmentation method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 6 illustrates a schematic structural diagram of an electronic device 10 that may be used to implement an embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 6, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as an image segmentation method.
In some embodiments, the image segmentation method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the image segmentation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the image segmentation method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability in traditional physical hosts and VPS services.
It should be understood that various forms of the flows shown above, reordering, adding or deleting steps, may be used. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An image segmentation method, comprising:
acquiring an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information is used for indicating a category corresponding to an object to be segmented of the image to be segmented, and the image to be segmented comprises a plurality of categories of objects to be segmented;
obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model;
the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning manner based on a sample segmentation image and second indication information corresponding to the sample segmentation image, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image.
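For illustration only, the following is a minimal PyTorch sketch of the inference flow recited in claim 1: an image to be segmented and a one-hot first indication are passed to a pre-trained segmentation model, which returns the target segmentation image for the indicated category. StubSegmenter, the class count, and all tensor shapes are hypothetical stand-ins, not the architecture disclosed in the embodiments.

```python
import torch
import torch.nn as nn

class StubSegmenter(nn.Module):
    """Hypothetical stand-in for the pre-trained image segmentation model."""
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.body = nn.Conv2d(1, num_classes, kernel_size=3, padding=1)

    def forward(self, image: torch.Tensor, indication: torch.Tensor) -> torch.Tensor:
        logits = self.body(image)  # per-class logits, (1, num_classes, H, W)
        # keep only the class selected by the first indication information
        selected = (logits * indication.view(1, -1, 1, 1)).sum(dim=1, keepdim=True)
        return torch.sigmoid(selected)  # target segmentation image, (1, 1, H, W)

segmenter = StubSegmenter()
image = torch.randn(1, 1, 256, 256)          # image to be segmented (one slice)
indication = torch.tensor([0., 1., 0., 0.])  # category of the object to be segmented
target_mask = segmenter(image, indication)   # (1, 1, 256, 256)
```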
2. The method of claim 1, wherein the image segmentation model comprises a first sub-model and a second sub-model;
wherein the obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model comprises:
inputting the first indication information into a first sub-model of the pre-trained image segmentation model to obtain initial network parameters of a second sub-model of the image segmentation model, wherein the initial network parameters comprise at least initial weights and bias values;
updating the network parameters of the second sub-model based on the initial network parameters, and inputting the image to be segmented into the updated second sub-model to obtain a target segmentation image corresponding to the object to be segmented.
3. The method of claim 2, wherein the first sub-model comprises a plurality of convolutional layers, at least two of which are connected via a nonlinear activation function layer, and the second sub-model comprises an attention mechanism;
wherein the updating the network parameters of the second sub-model based on the initial network parameters comprises:
updating network parameters of the attention mechanism based on the initial network parameters.
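As a reading aid for claims 2 and 3, the sketch below shows one way (among many) that a first sub-model built from convolutional layers joined by a nonlinear activation layer could emit initial weights and bias values, which are then written into the parameters of an attention mechanism inside the second sub-model. All module names, dimensions, and the gating form are assumptions for illustration; the claims do not fix them.

```python
import torch
import torch.nn as nn

class IndicationEncoder(nn.Module):
    """First sub-model: convolutional layers connected via a nonlinear activation
    layer, mapping the first indication information to initial weights and bias
    values for the second sub-model."""
    def __init__(self, num_classes: int, attn_dim: int):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=1),
            nn.ReLU(),  # nonlinear activation layer between the conv layers
            nn.Conv1d(16, 2 * attn_dim, kernel_size=1),
        )
        self.attn_dim = attn_dim

    def forward(self, indication: torch.Tensor):
        # indication: (batch, num_classes) one-hot -> (batch, 1, num_classes)
        h = self.convs(indication.unsqueeze(1)).mean(dim=-1)  # (batch, 2*attn_dim)
        weight, bias = h.split(self.attn_dim, dim=-1)
        return weight, bias  # initial network parameters

class ChannelAttention(nn.Module):
    """Attention mechanism inside the second sub-model; its parameters are
    overwritten with the initial weights and biases from the first sub-model."""
    def __init__(self, channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(channels))
        self.shift = nn.Parameter(torch.zeros(channels))

    def load_initial_parameters(self, weight: torch.Tensor, bias: torch.Tensor):
        with torch.no_grad():  # update the attention parameters (claim 3)
            self.scale.copy_(weight.squeeze(0))  # assumes batch size 1
            self.shift.copy_(bias.squeeze(0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W); simple channel-wise gating
        gate = torch.sigmoid(self.scale + self.shift)
        return x * gate.view(1, -1, 1, 1)

# usage: derive parameters from the indication, install them, pass features through
encoder = IndicationEncoder(num_classes=4, attn_dim=64)
attention = ChannelAttention(channels=64)
w, b = encoder(torch.tensor([[0., 0., 1., 0.]]))
attention.load_initial_parameters(w, b)
out = attention(torch.randn(1, 64, 32, 32))  # (1, 64, 32, 32)
```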
4. The method according to claim 1, wherein the first indication information is preset encoding information corresponding to the object to be segmented, or the first indication information is obtained by concatenating the preset encoding information corresponding to the object to be segmented with the image to be segmented.
5. The method according to claim 4, wherein the preset encoding information comprises encoding information generated by one-hot encoding.
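To make claims 4 and 5 concrete, here is a small sketch of the two recited forms of the first indication information: a one-hot preset encoding on its own, and the same encoding concatenated channel-wise with the image to be segmented. The function names and tensor shapes are illustrative assumptions only.

```python
import torch

def encode_indication(class_index: int, num_classes: int) -> torch.Tensor:
    """Preset encoding information generated by one-hot encoding (claim 5)."""
    code = torch.zeros(num_classes)
    code[class_index] = 1.0
    return code

def splice_with_image(image: torch.Tensor, code: torch.Tensor) -> torch.Tensor:
    """Second form of claim 4: concatenate the encoding with the image by
    broadcasting each code entry to a full-resolution channel plane."""
    _, _, h, w = image.shape
    planes = code.view(1, -1, 1, 1).expand(1, code.numel(), h, w)
    return torch.cat([image, planes], dim=1)  # (1, C + num_classes, H, W)

# e.g., selecting the second of three classes for a 1-channel 256x256 slice:
spliced = splice_with_image(torch.randn(1, 1, 256, 256), encode_indication(1, 3))
assert spliced.shape == (1, 4, 256, 256)
```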
6. The method of claim 1, wherein the initial segmentation model is a generative adversarial model, the generative adversarial model comprises a generative model and a discriminative model, and the image segmentation model is trained based on:
obtaining a semi-supervised sample set, wherein the semi-supervised sample set comprises a first number of labeled sample segmentation images and a second number of unlabeled sample segmentation images;
determining second indication information corresponding to the sample segmentation images, wherein the second indication information is used for indicating a sample segmentation object of the sample segmentation image, and the label is an expected segmentation image corresponding to the second indication information;
training the generative adversarial model based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, and taking the trained generative model as the image segmentation model.
7. The method according to claim 6, wherein the training the generative adversarial model based on the sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images, and taking the trained generative model as the image segmentation model, comprises:
inputting the labeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the generative model of the generative adversarial model to obtain a first model output image;
adjusting model parameters of the generative model according to the first model output image, the sample segmentation image and the expected segmentation image corresponding to the sample segmentation image, to obtain an initial segmentation model;
inputting the unlabeled sample segmentation images in the semi-supervised sample set and the second indication information corresponding to the sample segmentation images into the initial segmentation model to obtain a second model output image;
adjusting the model parameters of the generative model based on a discrimination result of the trained discriminative model on the first model output image and the second model output image, wherein the discriminative model is obtained by training based on the first model output image, the second indication information corresponding to the first model output image, and an expected segmentation image corresponding to the second model output image;
and when it is detected that a training end condition is reached, taking the trained generative model as the image segmentation model.
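For orientation only, the sketch below mirrors the training order recited in claims 6 and 7: a supervised update of the generative model on labeled samples, a forward pass of unlabeled samples through the resulting initial segmentation model, a discriminator update that separates the two kinds of outputs, and an adversarial update of the generative model. The binary cross-entropy losses, the optimizers, and the function signature are assumptions, not the disclosed training procedure; generator and discriminator are assumed to emit values in (0, 1).

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt,
               labeled_img, labeled_code, expected_mask,
               unlabeled_img, unlabeled_code):
    # (1) labeled samples + second indication -> first model output image;
    #     supervised adjustment yields the initial segmentation model
    first_out = generator(labeled_img, labeled_code)
    sup_loss = F.binary_cross_entropy(first_out, expected_mask)
    g_opt.zero_grad()
    sup_loss.backward()
    g_opt.step()

    # (2) unlabeled samples through the initial segmentation model -> second output
    second_out = generator(unlabeled_img, unlabeled_code)

    # (3) train the discriminative model to separate supervised outputs
    #     from unsupervised ones
    real_score = discriminator(first_out.detach())
    fake_score = discriminator(second_out.detach())
    d_loss = (F.binary_cross_entropy(real_score, torch.ones_like(real_score)) +
              F.binary_cross_entropy(fake_score, torch.zeros_like(fake_score)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # (4) adjust the generative model so that unlabeled outputs fool the
    #     trained discriminative model
    adv_score = discriminator(generator(unlabeled_img, unlabeled_code))
    adv_loss = F.binary_cross_entropy(adv_score, torch.ones_like(adv_score))
    g_opt.zero_grad()
    adv_loss.backward()
    g_opt.step()

    return sup_loss.item(), d_loss.item(), adv_loss.item()
```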
8. An image segmentation apparatus, comprising:
the image acquisition module is used for acquiring an image to be segmented and first indication information corresponding to the image to be segmented, wherein the first indication information is used for indicating an object to be segmented of the image to be segmented;
the image segmentation module is used for obtaining a target segmentation image corresponding to the object to be segmented based on the image to be segmented, the first indication information and a pre-trained image segmentation model;
the image segmentation model is obtained by training an initial segmentation model in a semi-supervised learning mode based on a sample segmentation image and second indication information corresponding to the sample segmentation image, and the second indication information is used for indicating a sample segmentation object of the sample segmentation image.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image segmentation method of any one of claims 1-7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a processor to execute the image segmentation method according to any one of claims 1 to 7.
CN202211436512.3A (filed 2022-11-16; priority date 2022-11-16): Image segmentation method and device, electronic equipment and storage medium. Status: Pending. Publication: CN115760864A (en).

Priority Applications (1)

Application Number: CN202211436512.3A; Priority Date: 2022-11-16; Filing Date: 2022-11-16; Title: Image segmentation method and device, electronic equipment and storage medium

Publications (1)

Publication Number: CN115760864A (en); Publication Date: 2023-03-07

Family

ID=85372121

Family Applications (1)

Application Number: CN202211436512.3A (Pending; published as CN115760864A (en)); Priority Date: 2022-11-16; Filing Date: 2022-11-16; Title: Image segmentation method and device, electronic equipment and storage medium

Country Status (1)

Country: CN; Link: CN115760864A (en)

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination