WO2022183984A1 - Image segmentation method and apparatus, computer device, and storage medium (图像分割方法、装置、计算机设备及存储介质) - Google Patents


Info

Publication number
WO2022183984A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature map
segmented
sample
model
Prior art date
Application number
PCT/CN2022/077951
Other languages
English (en)
French (fr)
Inventor
余双
冀炜
马锴
郑冶枫
Original Assignee
Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技(深圳)有限公司)
Priority to EP22762449.1A (published as EP4287117A4)
Publication of WO2022183984A1
Priority to US18/074,906 (published as US20230106468A1)

Classifications

    All classifications fall under G (Physics) — G06 (Computing; Calculating or Counting) — G06T (Image data processing or generation, in general):

    • G06T7/00 Image analysis — G06T7/10 Segmentation; Edge detection — G06T7/11 Region-based segmentation
    • G06T3/00 Geometric image transformations in the plane of the image — G06T3/40 Scaling of whole images or parts thereof — G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T5/00 Image enhancement or restoration — G06T5/50 using two or more images, e.g. averaging or subtraction
    • G06T5/70 Denoising; Smoothing
    • G06T7/0002 Inspection of images, e.g. flaw detection — G06T7/0012 Biomedical image inspection
    • G06T9/00 Image coding
    • G06T2200/32 Indexing scheme involving image mosaicing
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/20224 Image subtraction
    • G06T2207/30041 Eye; Retina; Ophthalmic

Definitions

  • the embodiments of the present application relate to the field of computer technologies, and in particular, to an image segmentation method, apparatus, computer device, and storage medium.
  • for example, image segmentation techniques can be used to extract the region where a body part is located from an image.
  • the embodiments of the present application provide an image segmentation method, apparatus, computer equipment and storage medium, which can improve the accuracy of image segmentation.
  • the technical solution includes the following contents.
  • in one aspect, an image segmentation method is provided, executed by a computer device, the method comprising:
  • the original image is encoded based on a prior knowledge vector to obtain a target feature map, where the original image includes a target object, the prior knowledge vector includes multiple prior knowledge weights, and each prior knowledge weight is used to represent the accuracy corresponding to an annotator, the accuracy being the accuracy with which the annotator annotates the region where any object is located in any image;
  • the target feature map is decoded to obtain a first segmented image of the original image, where the first segmented image indicates a target area where the target object is located in the original image;
  • image reconstruction is performed on the first segmented image based on the prior knowledge vector to obtain a plurality of labeled segmented images, where each labeled segmented image corresponds to a prior knowledge weight and each labeled segmented image indicates the target area annotated by the corresponding annotator;
  • the target feature map is processed based on the plurality of labeled segmented images to obtain a second segmented image of the original image.
  • in another aspect, an image segmentation device is provided, the device comprising:
  • the encoding module is used to encode the original image based on the prior knowledge vector to obtain the target feature map, where the original image includes the target object, the prior knowledge vector includes multiple prior knowledge weights, and each prior knowledge weight is used to represent the accuracy corresponding to an annotator, the accuracy being the accuracy with which the annotator annotates the region where any object is located in any image;
  • a decoding module configured to decode the target feature map to obtain a first segmented image of the original image, where the first segmented image indicates a target area where the target object is located in the original image;
  • the reconstruction module is configured to perform image reconstruction on the first segmented image based on the prior knowledge vector to obtain a plurality of labeled segmented images, where each labeled segmented image corresponds to a prior knowledge weight and each labeled segmented image indicates the target area annotated by the corresponding annotator;
  • the processing module is configured to process the target feature map based on the plurality of labeled segmentation images to obtain a second segmented image of the original image.
  • the processing module includes:
  • a first determining unit configured to determine an uncertainty image based on the differences between the plurality of labeled segmented images, the uncertainty image indicating the differences between a plurality of target regions, each target region being the area indicated by one of the labeled segmented images;
  • a first fusion unit configured to fuse the target feature map and the uncertainty image to obtain the second segmented image.
  • optionally, each labeled segmented image includes a first weight corresponding to each of multiple pixels in the original image, where the first weight is used to indicate the possibility that the corresponding pixel is within the target area;
  • the first determining unit is configured to: determine a difference image between each labeled segmented image and an average image, the average image being the average of the multiple labeled segmented images; determine, for pixels located at the same position in the multiple difference images, the sum of the squares of their pixel values; determine the square root of the ratio between the square sum corresponding to each position and the target number as the second weight of that position, the target number being the number of the multiple labeled segmented images; and construct the uncertainty image based on the second weights of the multiple positions.
  • optionally, the first fusion unit is configured to: determine an average image of the multiple labeled segmented images; determine the product of the target feature map and the uncertainty image, and determine the sum of that product and the target feature map as the first fusion feature map; determine the product of the target feature map and the average image, and determine the sum of that product and the target feature map as the second fusion feature map; splice the first fusion feature map and the second fusion feature map to obtain a spliced fusion feature map; and convolve the spliced fusion feature map to obtain the second segmented image.
  • the encoding module includes:
  • a first encoding unit configured to encode the original image to obtain a first feature map of the original image
  • a second fusion unit configured to fuse the prior knowledge vector with the first feature map to obtain a second feature map
  • the first decoding unit is configured to decode the second feature map to obtain the target feature map.
  • the reconstruction module includes:
  • a splicing unit for splicing the original image and the first segmented image to obtain a spliced image
  • the second encoding unit is used to encode the spliced image to obtain a third feature map
  • a third fusion unit configured to fuse the prior knowledge vector with the third feature map to obtain a fourth feature map
  • the second decoding unit is configured to decode the fourth feature map to obtain the multiple labeled segmented images.
  • the step of encoding the original image based on the prior knowledge vector to obtain the target feature map is performed by the first image segmentation model
  • the step of decoding the target feature map to obtain the first segmented image of the original image is performed by the first image segmentation model
  • the step of performing image reconstruction on the first segmented image based on the prior knowledge vector to obtain a plurality of labeled segmented images is performed by an image reconstruction model
  • the step of processing the target feature map based on the plurality of labeled segmentation images to obtain a second segmented image of the original image is performed by a second image segmentation model.
  • the apparatus further includes:
  • the acquisition module is used to acquire a sample original image, a plurality of sample annotated segmented images, and the prior knowledge vector, where the sample original image includes a sample object, each sample annotated segmented image corresponds to a prior knowledge weight, each sample annotated segmented image indicates the sample area where the sample object is located in the sample original image, and each sample annotated segmented image is annotated by a corresponding annotator;
  • the encoding module is further configured to call the first image segmentation model, encode the sample original image based on the prior knowledge vector, and obtain a target sample feature map;
  • the decoding module is further configured to call the first image segmentation model to decode the feature map of the target sample to obtain a first sample segmented image of the sample original image, and the first sample segmented image indicates the sample area where the sample object is located in the sample original image;
  • the reconstruction module is further configured to call the image reconstruction model to perform image reconstruction on the first sample segmented image based on the prior knowledge vector, obtaining a plurality of predicted annotated segmented images, where each predicted annotated segmented image corresponds to a prior knowledge weight and each predicted annotated segmented image indicates a predicted sample area;
  • the processing module is further configured to call the second image segmentation model, process the target sample feature map based on the plurality of predicted annotation segmentation images, and obtain the predicted segmentation image of the sample original image;
  • a weighted fusion module configured to perform weighted fusion on the plurality of sample annotated segmentation images based on the prior knowledge vector to obtain a fused annotation segmentation image
  • a training module configured to train the first image segmentation model, the image reconstruction model and the second image segmentation model based on the difference between the predicted segmented image and the fused annotated segmented image.
  • the training module includes:
  • a second determining unit configured to determine a first loss value based on the difference between the predicted segmented image and the fused annotated segmented image
  • a training unit configured to train the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value.
  • the training unit is configured to determine a second loss value based on the difference between the first sample segmented image and the fused annotation segmented image; based on the first loss value and the second loss value, the first image segmentation model, the image reconstruction model and the second image segmentation model are trained.
  • the training unit is configured to determine a third loss value based on the difference between the plurality of predicted annotated segmented images and corresponding sample annotated segmented images; based on the first loss value and the third loss value, the first image segmentation model, the image reconstruction model and the second image segmentation model are trained.
  • the image reconstruction model includes an encoding sub-model, a fusion sub-model and a decoding sub-model;
  • the reconstruction module is used to: splice the sample original image and the first sample segmented image to obtain a first sample spliced image; call the encoding sub-model to encode the first sample spliced image to obtain a first sample feature map; call the fusion sub-model to fuse the prior knowledge vector with the first sample feature map to obtain a second sample feature map; and call the decoding sub-model to decode the second sample feature map to obtain the plurality of predicted annotated segmented images.
  • the apparatus further includes:
  • a splicing module configured to splice the sample original image and the fused annotated segmented image to obtain a second sample spliced image;
  • the reconstruction module is further configured to call the encoding sub-model to encode the second sample spliced image to obtain a third sample feature map;
  • the training unit is configured to determine a fourth loss value based on the difference between the third sample feature map and the first sample feature map, and to train the first image segmentation model, the image reconstruction model, and the second image segmentation model based on the first loss value and the fourth loss value.
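  • To make the training objective above concrete, the following is a minimal PyTorch sketch of how the four loss terms could be combined. It assumes binary cross-entropy for the segmentation terms, an L2 term for feature consistency, and prior knowledge weights that sum to 1; none of these choices, nor the function and argument names, are given in the text.

```python
import torch
import torch.nn.functional as F

def training_losses(pred_seg, first_seg, pred_annot, sample_annot,
                    prior, third_feat, first_feat):
    """All segmentation tensors hold values in [0, 1].

    pred_seg:     predicted segmented image of the sample original image, (H, W)
    first_seg:    first sample segmented image, (H, W)
    pred_annot:   predicted annotated segmented images, (N, H, W)
    sample_annot: sample annotated segmented images, (N, H, W)
    prior:        prior knowledge vector, (N,), assumed to sum to 1
    third_feat / first_feat: third and first sample feature maps
    """
    # fused annotated segmented image: prior-weighted fusion of the annotations
    fused = (prior.view(-1, 1, 1) * sample_annot).sum(dim=0)
    l1 = F.binary_cross_entropy(pred_seg, fused)           # first loss value
    l2 = F.binary_cross_entropy(first_seg, fused)          # second loss value
    l3 = F.binary_cross_entropy(pred_annot, sample_annot)  # third loss value
    l4 = F.mse_loss(third_feat, first_feat)                # fourth loss value
    return l1 + l2 + l3 + l4  # joint objective for the three models
```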
  • in another aspect, a computer device is provided, including a processor and a memory, where the memory stores at least one computer program, and the at least one computer program is loaded and executed by the processor to implement the operations performed by the image segmentation method described in the above aspect.
  • in another aspect, a computer-readable storage medium is provided, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor to implement the operations performed by the image segmentation method described in the above aspect.
  • in another aspect, a computer program product or computer program is provided, including computer program code stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device implements the operations performed in the image segmentation method described in the above aspects.
  • FIG. 1 is a schematic structural diagram of an implementation environment provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of an image segmentation method provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of an image segmentation method provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of a method for obtaining a second feature map provided by an embodiment of the present application
  • FIG. 5 is a schematic diagram of annotated images of a plurality of annotators provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of an image segmentation method provided by an embodiment of the present application.
  • FIG. 7 is a comparison diagram of segmented images obtained in a variety of ways, provided by an embodiment of the present application.
  • FIG. 8 is a comparison diagram of segmented images obtained in a variety of ways, provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a model training method provided by an embodiment of the present application.
  • FIG. 10 is a flowchart of obtaining a predicted segmentation image provided by an embodiment of the present application.
  • FIG. 11 is a flowchart of a model training process provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the terms "first", "second", "third", "fourth", "fifth", "sixth", etc. may be used herein to describe various concepts, but unless specifically noted, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another.
  • the first feature map can be referred to as the second feature map, and similarly, the second feature map can be referred to as the first feature map, without departing from the scope of the present application.
  • for example, the multiple prior knowledge weights include 3 prior knowledge weights; "each" refers to each of the 3 prior knowledge weights, and "any one" refers to any one of the 3 prior knowledge weights, which can be the first, the second, or the third.
  • the original image is first encoded to obtain a feature map of the original image, and then the feature map is decoded to obtain a segmented image, which can indicate the area where the target object in the original image is located.
  • however, this image segmentation method is simplistic, and its segmentation accuracy is poor.
  • the image segmentation method provided by the embodiment of the present application is executed by a computer device.
  • the computer device is a terminal or a server.
  • the server is an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • the terminal is a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • optionally, a plurality of servers can form a blockchain, in which the servers are nodes, and the image segmentation method provided by the embodiments of this application can be executed by any node on the blockchain.
  • a server that adopts the image segmentation method provided in the embodiments of the present application can segment any image and store the obtained segmented image in the blockchain, so as to share it with the other servers in the blockchain.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • the implementation environment includes a terminal 101 and a server 102 .
  • the terminal 101 and the server 102 are connected through a wireless or wired network.
  • a target application provided by the server 102 is installed on the terminal 101, and the terminal 101 can realize functions such as data transmission and message interaction through the target application.
  • the target application is a target application in the operating system of the terminal 101 or a target application provided by a third party.
  • the target application is a medical diagnosis application, and the medical diagnosis application has the function of image segmentation.
  • the medical diagnosis application can also have other functions, such as a comment function, a navigation function, and the like.
  • the terminal 101 is used to log in to the target application based on a user ID and send the original image to the server 102 through the target application; the server 102 is used to receive the original image sent by the terminal 101, perform image segmentation on the original image to obtain the segmented image of the original image, and return the obtained segmented image to the terminal 101; the terminal 101 can then display the received segmented image.
  • for example, the terminal shoots the user's eye to obtain an eye image and sends the eye image to a server with an image segmentation function.
  • the server uses the image segmentation method provided in the embodiments of the present application to obtain the segmented image of the eye image, which indicates the areas where the optic cup and the optic disc are located in the eye image; a doctor can then determine the state of the user's eye according to the segmented image.
  • FIG. 2 is a flowchart of an image segmentation method provided by an embodiment of the present application, which is executed by a computer device. As shown in FIG. 2 , the method includes the following steps.
  • the computer device encodes the original image based on the prior knowledge vector to obtain a target feature map.
  • the original image includes a target object, and the original image is any image.
  • optionally, the original image is a medical image and the target object is a body part. For example, the original image is an eye image and the target object is the optic cup or optic disc in the eye; or, the original image is a human lung image and the target object is a diseased object in the human lung.
  • the prior knowledge vector includes a plurality of prior knowledge weights, each prior knowledge weight is used to represent the accuracy corresponding to an annotator, which is the accuracy of the annotator in annotating the region where any object is located in any image; the target feature map is used to represent the feature information contained in the original image.
  • the computer device decodes the target feature map to obtain a first segmented image of the original image.
  • the first segmented image indicates the target area where the target object is located in the original image. Since the target feature map contains the feature information of the original image and the prior knowledge vector is incorporated into it, the first segmented image obtained by decoding is equivalent to a segmented image obtained after fusing the annotation results of multiple annotators, where the multiple annotators are the annotators corresponding to the multiple prior knowledge weights.
  • the computer device performs image reconstruction on the first segmented image based on the prior knowledge vector to obtain multiple labeled segmented images.
  • each annotated segmented image corresponds to a prior knowledge weight
  • each annotated segmented image indicates a target region annotated by the corresponding annotator. Due to the different annotation levels of different annotators, there may be differences in the target regions indicated by multiple annotated segmentation images.
  • the first segmented image is equivalent to a segmented image obtained after fusing the annotation results of multiple annotators; through the multiple prior knowledge weights in the prior knowledge vector and the first segmented image, labeled segmented images matching the multiple annotators are reconstructed, each indicating the target area where the target object is located in the original image, so that the segmented image of the original image can subsequently be updated based on the reconstructed labeled segmented images, improving its accuracy.
  • the computer device processes the target feature map based on the multiple labeled segmented images to obtain a second segmented image of the original image.
  • since each annotated segmented image corresponds to a prior knowledge weight and represents the annotation result of the original image by the corresponding annotator, processing the target feature map based on the multiple annotated segmented images improves the accuracy of the second segmented image.
  • in the method provided by the embodiments of the present application, the annotated segmented images matching the multiple annotators are reconstructed to indicate the target area where the target object is located in the original image; that is, the annotation results of multiple annotators on the original image are reconstructed.
  • based on the reconstructed annotated segmented images, the second segmented image of the original image is acquired, so that the annotation results corresponding to the multiple annotators are integrated into the second segmented image, ensuring the accuracy of the second segmented image and thereby improving the accuracy of image segmentation.
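  • As a bird's-eye view of these four steps, the following is a minimal sketch that assumes the three models described later (the first image segmentation model, the image reconstruction model, and the second image segmentation model) as callables; the function and method names are illustrative, not taken from the text.

```python
def segment(original, prior, seg_model_1, recon_model, seg_model_2):
    # encode the original image based on the prior knowledge vector
    target_feat = seg_model_1.encode(original, prior)
    # decode the target feature map into the first segmented image
    first_seg = seg_model_1.decode(target_feat)
    # reconstruct one labeled segmented image per annotator
    labeled_segs = recon_model(original, first_seg, prior)
    # process the target feature map with the labeled segmented images
    return seg_model_2(target_feat, labeled_segs)  # second segmented image
```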
  • FIG. 3 is a flowchart of an image segmentation method provided by an embodiment of the present application, which is executed by a computer device. As shown in FIG. 3 , the method includes the following steps.
  • the computer device encodes the original image to obtain a first feature map of the original image.
  • the first feature map is used to represent feature information contained in the original image, the original image includes a target object, and the original image is an arbitrary image.
  • optionally, the original image is a medical image and the target object is a body part; for example, the original image is an eye image and the target object is the optic cup or optic disc in the eye; or, the original image is a human lung image and the target object is a diseased object in the human lung.
  • the step 301 includes: calling a first encoding sub-model in the first image segmentation model to encode the original image to obtain a first feature map of the original image.
  • the first image segmentation model is used to obtain a model of the segmented image of the original image
  • the first image segmentation model is U-Net (a convolutional neural network for two-dimensional image segmentation).
  • the first encoding sub-model is used to obtain a feature map of the original image, for example, the first encoding sub-model is an encoder in U-Net.
  • the original image is encoded by the first encoding sub-model in the first image segmentation model, so that the first feature map contains feature information of the original image, so as to ensure the accuracy of the obtained first feature map.
  • optionally, the first encoding sub-model includes multiple first convolution modules, and the process of acquiring the first feature map includes: according to the arrangement order of the multiple first convolution modules, calling the first first convolution module to encode the original image to obtain the first first reference feature map; then calling the current first convolution module to encode the first reference feature map output by the previous first convolution module, obtaining the first reference feature map corresponding to the current first convolution module; and so on, until the first reference feature map output by the last first convolution module is obtained, which is determined as the first feature map.
  • optionally, the sizes of the first reference feature maps output by the multiple first convolution modules gradually decrease.
  • the encoding sub-model adopts a down-sampling method to gradually enhance the features contained in the feature map, improving the accuracy of the first feature map.
  • optionally, the first encoding sub-model includes n first convolution modules, where the input of the first first convolution module is the original image and the input of the i-th first convolution module is the first reference feature map output by the (i-1)-th first convolution module, i being an integer greater than 1 and not greater than n, and n being an integer greater than 1. The process of obtaining the first feature map then includes: calling the first first convolution module to encode the original image, obtaining the first first reference feature map; calling the i-th first convolution module to encode the (i-1)-th first reference feature map, obtaining the i-th first reference feature map, until the n-th first reference feature map is obtained; and determining the n-th first reference feature map as the first feature map, as sketched below. The sizes of the first reference feature maps output by the n first convolution modules gradually decrease.
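  • The following is a minimal PyTorch sketch of such an encoding sub-model with n stacked first convolution modules; the block structure, channel widths, and names are illustrative assumptions rather than details given in the text.

```python
import torch
import torch.nn as nn

class ConvModule(nn.Module):
    """One first convolution module: convolve, activate, then down-sample."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # each first reference feature map shrinks by half
        )
    def forward(self, x):
        return self.block(x)

class FirstEncoder(nn.Module):
    def __init__(self, channels=(3, 64, 128, 256, 512)):
        super().__init__()
        self.stages = nn.ModuleList(
            ConvModule(channels[i], channels[i + 1])
            for i in range(len(channels) - 1)
        )
    def forward(self, image):
        ref_maps = []              # the first reference feature maps, in order
        x = image
        for stage in self.stages:  # module i encodes the output of module i-1
            x = stage(x)
            ref_maps.append(x)
        return ref_maps            # the last entry is the first feature map
```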
  • the computer device fuses the prior knowledge vector with the first feature map to obtain a second feature map.
  • the prior knowledge vector includes multiple prior knowledge weights, each prior knowledge weight is used to represent the accuracy corresponding to an annotator, and the accuracy is the accuracy of the annotator marking the region where any object is located in any image.
  • the prior knowledge weight can reflect the annotation level of the corresponding annotator. Since the annotation levels of multiple annotators differ, the prior knowledge weights corresponding to the annotators also differ: the larger the prior knowledge weight, the higher the annotation level of the corresponding annotator, that is, the higher the accuracy with which the annotator annotates the region where an object is located in an image; the smaller the prior knowledge weight, the lower the annotation level of the corresponding annotator, that is, the lower that accuracy.
  • optionally, the prior knowledge vector is set arbitrarily; for example, the prior knowledge vector is [0.1, 0.1, 0.4, 0.4], that is, the prior knowledge vector includes the prior knowledge weights corresponding to 4 annotators, where the prior knowledge weight corresponding to two of the annotators is 0.1 and the prior knowledge weight corresponding to the other two is 0.4.
  • in this way, the obtained second feature map not only includes the features of the original image but also incorporates the prior knowledge weights corresponding to multiple annotators, so that the features included in the second feature map are dynamically associated with the prior knowledge vector and are affected by it, which enhances the dynamic representation capability of the features included in the second feature map and improves the accuracy of the second feature map.
  • the step 302 includes: calling a knowledge inference sub-model in the first image segmentation model, and fusing the prior knowledge vector with the first feature map to obtain the second feature map.
  • the knowledge inference sub-model is used to fuse the prior knowledge vector with the first feature map.
  • optionally, the knowledge inference sub-model is ConvLSTM (Convolutional Long Short-Term Memory, a convolutional long short-term memory network).
  • the prior knowledge vector is fused with the first feature map of the original image, which enhances the dynamic representation ability of the features contained in the second feature map.
  • optionally, the size of the prior knowledge vector is expanded so that the expanded prior knowledge vector has the same size as the first feature map, and the expanded vector is fused with the first feature map to enhance the features contained in the first feature map, obtaining the fused second feature map.
  • optionally, the first feature map, the prior knowledge vector, and the second feature map satisfy the following relationship:

$$h_t = \mathrm{ConvLSTM}\left(f_5,\; h_{t-1}\right)$$

  where $h_t$ represents the enhanced feature map, $f_5$ represents the first feature map, $\mathrm{ConvLSTM}(\cdot)$ represents the convolutional long short-term memory model, $h_{t-1}$ represents the feature map before enhancement, and $t$ represents the iteration round of feature enhancement; the feature map obtained after the final iteration is the second feature map.
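  • The following is a simplified sketch of this knowledge-inference step: the prior knowledge vector is broadcast to the spatial size of the first feature map f5 and fused over t gated iterations in the spirit of h_t = ConvLSTM(f5, h_{t-1}). The cell below is a reduced single-state gate rather than a full ConvLSTM, and the layer names, channel sizes, and number of iterations are assumptions.

```python
import torch
import torch.nn as nn

class KnowledgeInference(nn.Module):
    def __init__(self, feat_ch, num_annotators, steps=3):
        super().__init__()
        self.steps = steps
        self.expand = nn.Conv2d(num_annotators, feat_ch, kernel_size=1)
        self.gates = nn.Conv2d(2 * feat_ch, 2 * feat_ch, kernel_size=3, padding=1)

    def forward(self, f5, prior):  # f5: (B, C, H, W); prior: (B, num_annotators)
        b, _, h, w = f5.shape
        # size expansion: tile the prior knowledge vector to the map's size
        p = prior[:, :, None, None].expand(b, prior.size(1), h, w)
        hidden = self.expand(p)    # h_0 is derived from the prior knowledge vector
        for _ in range(self.steps):  # t rounds of feature enhancement
            z, g = self.gates(torch.cat([f5, hidden], dim=1)).chunk(2, dim=1)
            hidden = torch.sigmoid(z) * hidden + (1 - torch.sigmoid(z)) * torch.tanh(g)
        return hidden              # the second feature map
```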
  • the computer device decodes the second feature map to obtain the target feature map.
  • the target feature map is used to represent the feature information contained in the original image.
  • a decoding method is used to refine the features contained in the feature map and improve the accuracy of the features contained in the target feature map.
  • the step 303 includes: calling the first decoding sub-model in the first image segmentation model to decode the second feature map to obtain the target feature map.
  • the first decoding sub-model is used to enhance the features included in the feature map, for example, the first decoding sub-model is a decoder in U-Net.
  • optionally, the first image segmentation model further includes a first encoding sub-model, the first encoding sub-model includes a plurality of first convolution modules, and the first decoding sub-model includes a plurality of second convolution modules. The process of acquiring the target feature map then includes: according to the arrangement order of the multiple second convolution modules, calling the first second convolution module to decode the second feature map, obtaining the first second reference feature map; calling the current second convolution module to decode the second reference feature map output by the previous second convolution module together with the first reference feature map of the same size, obtaining the second reference feature map corresponding to the current second convolution module; and so on, until the second reference feature map output by the last second convolution module is obtained, which is determined as the target feature map.
  • optionally, the sizes of the second reference feature maps output by the multiple second convolution modules gradually increase.
  • an up-sampling method is used to gradually refine the features contained in the feature map to improve the accuracy of the features contained in the target feature map.
  • optionally, the first image segmentation model includes a first encoding sub-model and a first decoding sub-model. The first encoding sub-model includes n first convolution modules, where the input of the first first convolution module is the original image and the input of the i-th first convolution module is the first reference feature map output by the (i-1)-th first convolution module, i being an integer greater than 1 and not greater than n, and n being an integer greater than 1. The first decoding sub-model includes n second convolution modules, where the input of the first second convolution module is the second feature map, and the input of the i-th second convolution module is the second reference feature map output by the (i-1)-th second convolution module together with the first reference feature map output by the (n-i+1)-th first convolution module; these two feature maps are of equal size.
  • the process of obtaining the target feature map then includes: calling the first first convolution module to encode the original image, obtaining the first first reference feature map; calling the i-th first convolution module to encode the (i-1)-th first reference feature map, obtaining the i-th first reference feature map, until the n-th first reference feature map is obtained, and determining the n-th first reference feature map as the first feature map; fusing the prior knowledge vector with the first feature map to obtain the second feature map; calling the first second convolution module to decode the second feature map, obtaining the first second reference feature map; and calling the i-th second convolution module to decode the (i-1)-th second reference feature map and the (n-i+1)-th first reference feature map, obtaining the i-th second reference feature map, until the n-th second reference feature map is obtained, which is determined as the target feature map. The sizes of the first reference feature maps output by the n first convolution modules gradually decrease, and the sizes of the second reference feature maps output by the n second convolution modules gradually increase.
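  • The following sketch illustrates such a decoding sub-model with U-Net-style skip connections; pairing each second convolution module with an equal-size first reference feature map follows one plausible reading of the (n-i+1) indexing, and the module structure is an assumption.

```python
import torch
import torch.nn as nn

class UpModule(nn.Module):
    """One second convolution module: up-sample, fuse with a skip map, convolve."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.conv = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1)
    def forward(self, x, skip):
        x = self.up(x)  # the second reference feature maps grow in size
        return torch.relu(self.conv(torch.cat([x, skip], dim=1)))

def decode(second_feature_map, encoder_maps, up_modules):
    # encoder_maps: first reference feature maps in production order (sizes
    # decrease); requires len(up_modules) == len(encoder_maps) - 1.
    x = second_feature_map                # same size as encoder_maps[-1]
    for i, module in enumerate(up_modules):
        skip = encoder_maps[-(i + 2)]     # the equal-size first reference map
        x = module(x, skip)
    return x                              # the target feature map
```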
  • in the embodiments of the present application, the target feature map is obtained by encoding the original image into the first feature map, fusing it with the prior knowledge vector, and decoding the resulting second feature map. In another embodiment, it is not necessary to perform the above steps 301-303, and other methods can be adopted to encode the original image according to the prior knowledge vector to obtain the target feature map.
  • the first image segmentation model is invoked, and the original image is encoded based on the prior knowledge vector to obtain the target feature map.
  • the target feature map is obtained through the first image segmentation model, so as to improve the accuracy of the target feature map.
  • the computer device decodes the target feature map to obtain a first segmented image of the original image.
  • the first segmented image indicates the target area where the target object is located in the original image.
  • the first segmented image includes weights corresponding to multiple pixels in the original image, where the weights are used to represent the possibility that the corresponding pixels are in the target area.
  • the pixel value of each pixel point represents the weight corresponding to the pixel point located at the same position in the original image. For any position in the original image, the pixel value of the pixel at the same position in the first segmented image is the weight of the pixel at that position in the original image.
  • optionally, the first segmented image is represented in the form of a heat map. For example, when the weight is 0, the color corresponding to the pixel is blue; when the weight is 1, the color corresponding to the pixel is red; and when the weight is between 0 and 1, the color corresponding to the pixel is a transition color between blue and red, so that as the weight increases, the corresponding color gradually changes from blue to red.
  • since the target feature map not only contains the feature information of the original image but also incorporates the prior knowledge vector, the first segmented image obtained by decoding it is equivalent to a segmented image obtained after fusing the annotation results of multiple annotators, where the multiple annotators are the annotators corresponding to the multiple prior knowledge weights.
  • the step 304 includes: invoking the first image segmentation model, decoding the target feature map, and obtaining the first segmented image of the original image.
  • the computer device splices the original image and the first segmented image to obtain a spliced image.
  • the spliced image not only contains the information of the original image but also contains the information indicating the target area where the target object is located in the original image, which enriches the information contained in the spliced image so that multiple labeled segmented images can be reconstructed subsequently.
  • the computer device encodes the spliced image to obtain a third feature map.
  • the third feature map is used to represent the feature information contained in the original image and the information used to indicate the target area where the target object is located in the original image.
  • the step 306 includes: calling an encoding sub-model in the image reconstruction model to encode the spliced image to obtain the third feature map.
  • the image reconstruction model is used to reconstruct the labeled segmented images corresponding to the multiple prior knowledge weights, and the encoding sub-model is used to obtain the feature map of the spliced image.
  • the encoding submodel is the encoder in U-Net.
  • the encoding sub-model in the image reconstruction model is the same as the first encoding sub-model in the first image segmentation model in the foregoing step 301, and details are not described herein again.
  • the computer device fuses the prior knowledge vector with the third feature map to obtain a fourth feature map.
  • the fourth feature map not only includes the feature information contained in the original image but also incorporates the prior knowledge weights corresponding to multiple annotators, so that the labeled segmented images corresponding to the prior knowledge weights can subsequently be reconstructed from the fourth feature map.
  • the step 307 includes: calling a fusion sub-model in the image reconstruction model, and fusing the prior knowledge vector with the third feature map to obtain a fourth feature map.
  • the fusion sub-model is similar to the knowledge inference sub-model in the foregoing step 302, and details are not described herein again.
  • the computer device decodes the fourth feature map to obtain multiple labeled segmented images.
  • each annotated segmented image corresponds to a prior knowledge weight
  • each annotated segmented image indicates the target area annotated by the corresponding annotator.
  • due to the different annotation levels of different annotators, the target areas indicated by the multiple labeled segmented images may vary. As shown in FIG. 5, taking the original image as an eye image as an example, three annotators annotate the optic cup and optic disc in the eye image; since the annotation levels of the three annotators differ, the target areas indicated by the resulting optic cup annotated images and optic disc annotated images differ.
  • optionally, each labeled segmented image includes a first weight corresponding to each of a plurality of pixels in the original image, the first weight indicating the possibility that the corresponding pixel is in the target area; from the multiple first weights included in a labeled segmented image, the target area annotated by the corresponding annotator can be determined, the target area being the area where the target object is located in the original image.
  • optionally, in each labeled segmented image, the pixel value of each pixel is the first weight of the pixel at the same position in the original image: for any position in the original image, the pixel value of the pixel at that position in the labeled segmented image is the first weight of the pixel at that position in the original image.
  • since the fourth feature map includes the feature information of the first segmented image and of the original image and also incorporates the prior knowledge vector, and the first segmented image is equivalent to a fusion of the annotation results of multiple annotators (the annotators corresponding to the multiple prior knowledge weights), decoding the fourth feature map restores the annotation results of the multiple annotators on the original image, that is, the annotated segmented image corresponding to each annotator, so that the segmented image of the original image can be updated subsequently.
  • step 308 includes: calling a decoding sub-model in the image reconstruction model to decode the fourth feature map to obtain a plurality of labeled segmented images.
  • the decoding sub-model is similar to the first decoding sub-model in the foregoing step 303, and the first decoding sub-model is included in the first image segmentation model, and details are not described herein again.
  • in the embodiments of the present application, the original image is introduced and fused with the prior knowledge vector to reconstruct the multiple labeled segmented images. In another embodiment, it is not necessary to perform the above steps 305-308, and other methods can be adopted to perform image reconstruction on the first segmented image based on the prior knowledge vector to obtain the multiple labeled segmented images.
  • the image reconstruction model is invoked, and based on the prior knowledge vector, image reconstruction is performed on the first segmented image to obtain multiple labeled segmented images.
  • through the image reconstruction model, based on the prior knowledge vector and the first segmented image, the multiple labeled segmented images corresponding to the prior knowledge weights are reconstructed, ensuring the accuracy of the labeled segmented images.
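  • Putting steps 305-308 together, the following is a minimal sketch of the image reconstruction model; the encoding, fusion, and decoding sub-models are assumed to be modules like those sketched above, each returning a single feature map, and the per-annotator output head is an illustrative choice.

```python
import torch
import torch.nn as nn

class ImageReconstructionModel(nn.Module):
    def __init__(self, encoder, fusion, decoder, feat_ch, num_annotators):
        super().__init__()
        self.encoder, self.fusion, self.decoder = encoder, fusion, decoder
        # one output channel per annotator-specific labeled segmented image
        self.head = nn.Conv2d(feat_ch, num_annotators, kernel_size=1)

    def forward(self, original, first_seg, prior):
        spliced = torch.cat([original, first_seg], dim=1)  # the spliced image
        third = self.encoder(spliced)                      # third feature map
        fourth = self.fusion(third, prior)                 # fourth feature map
        decoded = self.decoder(fourth)
        return torch.sigmoid(self.head(decoded))  # multiple labeled segmented images
```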
  • the computer device determines the uncertainty image based on the difference between the multiple labeled segmented images.
  • the uncertainty image indicates the difference between the target regions, and each target region is the region indicated by the annotated segmented image.
  • each annotated segmented image corresponds to a prior knowledge weight, that is, each annotated segmented image is equivalent to the annotation result of the original image by the corresponding annotator, the annotator being the one corresponding to that prior knowledge weight. Because the annotation levels of the multiple annotators differ, the target areas indicated by the multiple annotated segmented images will differ; therefore, an uncertainty image determined from the differences between the multiple annotated segmented images can indicate the disputed areas among the multiple target areas annotated by the multiple annotators.
  • this step 309 includes the following steps 3091-3094.
  • in step 3091, the computer device determines a difference image between each labeled segmented image and an average image, where the average image is the average of the plurality of labeled segmented images.
  • each annotated segmented image includes a first weight for each of a plurality of pixels in the original image, and the average image includes, for each pixel in the original image, the average of the multiple first weights corresponding to that pixel, the multiple first weights being the first weights of the pixel at the same position in the multiple labeled segmented images. The average image can thus reflect the consistency between the target regions indicated by the multiple labeled segmented images.
  • each difference image includes multiple difference values, each difference value representing the difference between a first weight and the corresponding average value, where the first weight is a weight in the labeled segmented image corresponding to that difference image and the average value is the corresponding value in the average image.
  • in this way, the multiple difference images are obtained by first determining the average image of the multiple labeled segmented images and then determining the difference image between each labeled segmented image and the average image.
  • this step 3091 includes: determining the average value of the first weights corresponding to the pixels located at the same position in the multiple labeled segmented images, and constructing the average value image based on the obtained multiple average values.
  • optionally, for each annotated segmented image, the differences between the multiple first weights in that annotated segmented image and the corresponding average values are determined, and the obtained differences form the difference image corresponding to that annotated segmented image.
  • that is, in the difference image corresponding to a labeled segmented image, the pixel value of any pixel is the difference between the first weight of the pixel at the same position in that labeled segmented image and the average value of the pixel at the same position in the average image.
  • for any position, the squares of the pixel values of the pixels at that position in the multiple difference images are determined, and their sum is taken as the square sum of pixel values corresponding to that position; by repeating this for every position, the square sums corresponding to the multiple positions are obtained.
  • the number of targets is the number of multiple labeled segmented images
  • the second weight is used to indicate the difference between the labeling results of the corresponding positions in the multiple labeled segmented images, and the labeling result indicates whether the pixel at the corresponding position is in the target area.
  • the uncertainty image includes second weights corresponding to multiple pixels in the original image.
  • optionally, the multiple labeled segmented images and the uncertainty image satisfy the following relationship:

$$U_{map} = \sqrt{\frac{1}{N_0}\sum_{i_0=1}^{N_0}\left(S_{i_0} - \bar{S}\right)^2}$$

  where $U_{map}$ represents the uncertainty image; $N_0$ represents the number of labeled segmented images, and is a positive integer not less than 2; $i_0$ represents the sequence number of a labeled segmented image among the multiple labeled segmented images, and is greater than or equal to 1 and less than or equal to $N_0$; $S_{i_0}$ represents the $i_0$-th labeled segmented image; and $\bar{S}$ represents the average image.
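  • In code, this reduces to a per-pixel standard deviation across the labeled segmented images; the following is a minimal sketch under that reading, with the tensor shape as an assumption.

```python
import torch

def uncertainty_image(labeled_segs: torch.Tensor) -> torch.Tensor:
    # labeled_segs: (N0, H, W), one labeled segmented image per annotator
    mean = labeled_segs.mean(dim=0, keepdim=True)      # the average image
    diff = labeled_segs - mean                         # the difference images
    sq_sum = (diff ** 2).sum(dim=0)                    # square sum per position
    return torch.sqrt(sq_sum / labeled_segs.shape[0])  # the second weights
```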
  • the computer device fuses the target feature map and the uncertainty image to obtain a second segmented image.
  • each target area is the area indicated by one labeled segmented image; since the target feature map not only contains the feature information of the original image but also incorporates the prior knowledge vector, fusing the target feature map with the uncertainty image allows the uncertain areas among the multiple labeled segmented images to be distinguished, improving the accuracy of the target area indicated by the second segmented image.
  • the step 310 includes the following steps 3101-3105.
  • This step is the same as the method of determining the average value image in the above step 3091, and will not be repeated here.
  • the first fusion feature map is used to represent inconsistent information among multiple labeled segmented images, and the multiple labeled segmented images correspond to multiple prior knowledge weights.
  • the first fusion feature map is obtained by determining the product of the target feature map and the uncertainty image, and then determining the sum of that product and the target feature map as the first fusion feature map. In this way, the features of the target feature map in the uncertain region, that is, the region indicated by the uncertainty image, are enhanced, improving the accuracy of the first fusion feature map.
  • step 3102 includes: determining the pixel-level product of the target feature map and the uncertainty image, and determining the first fusion feature map as the pixel-level sum of the obtained product and the target feature map .
  • the pixel-level product refers to the product of the pixel values of pixels located at the same position in the target feature map and the uncertainty image, and the pixel-level sum refers to the sum of the pixel values of pixels located at the same position in the obtained product and the target feature map.
  • the method further includes: performing smoothing processing on the uncertainty image, and performing maximum value processing on the smoothed uncertainty image.
  • the smoothing process can be in the form of Gaussian smoothing.
  • smoothing may change the multiple weight values contained in the uncertainty image; therefore, through maximum value processing, the smoothed uncertainty image is compared with the unsmoothed uncertainty image: for each position, the larger of the two weights corresponding to that position is determined as the weight after the maximum operation, and repeating this for every position yields the uncertainty image after maximum value processing.
  • through smoothing, the multiple weights contained in the uncertainty image become smoother and transition gradually, which expands the coverage of the uncertain area so that inconsistent areas among the multiple labeled segmented images can be effectively perceived and captured; the maximum value processing then ensures the accuracy of the weights contained in the uncertainty image, improving the accuracy of the uncertainty image.
  • the uncertainty image after smoothing and maximum value processing satisfies the following relationship:

$$\mathrm{Soft}\left(U_{map}\right) = \sigma_{max}\left(F_{Gauss}\left(U_{map}\right),\; U_{map}\right)$$

  where $\mathrm{Soft}(U_{map})$ represents the uncertainty image after maximum value processing; $\sigma_{max}$ represents the maximum function, which retains the higher of the pixel values at the same position in the smoothed uncertainty image and the original uncertainty image; $F_{Gauss}$ represents the convolution operation with Gaussian kernel $k$; and $U_{map}$ represents the original uncertainty image.
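  • The following sketch implements this soft-attention step as Gaussian smoothing of the uncertainty image followed by an element-wise maximum with the original; the kernel size and sigma are assumptions, since the text only specifies a Gaussian kernel k.

```python
import torch
import torch.nn.functional as F

def soft_attention(u_map: torch.Tensor, k: int = 5, sigma: float = 1.0) -> torch.Tensor:
    # u_map: (H, W). Build a normalized k x k Gaussian kernel.
    ax = torch.arange(k, dtype=torch.float32) - (k - 1) / 2
    g1d = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    kernel = torch.outer(g1d, g1d)
    kernel = (kernel / kernel.sum()).view(1, 1, k, k)
    # smoothing expands the coverage of the uncertain area
    smoothed = F.conv2d(u_map[None, None], kernel, padding=k // 2)[0, 0]
    # maximum value processing keeps the higher weight at each position
    return torch.maximum(smoothed, u_map)
```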
  • the second fusion feature map is used to represent consistent information among multiple labeled segmented images, and the multiple labeled segmented images correspond to multiple prior knowledge weights.
  • the second fusion feature map is obtained by determining the product of the target feature map and the average image, and then determining the sum of that product and the target feature map as the second fusion feature map.
  • step 3103 includes: determining the pixel-level product of the target feature map and the average image, and determining the second fusion feature map as the pixel-level sum of the obtained product and the target feature map.
  • the pixel-level product refers to the product of the pixel values of pixels located at the same position in the target feature map and the average image, and the pixel-level sum refers to the sum of the pixel values of pixels located at the same position in the obtained product and the target feature map.
  • the method further includes: performing smoothing processing on the average value image, and performing maximum value processing on the smoothed average value image.
  • This step is the same as the process of performing the smoothing process on the uncertain image and performing the maximum value processing on the smoothed uncertainty image in the above step 3102, and will not be repeated here.
  • in the embodiments of the present application, both the uncertainty image and the average image can be subjected to smoothing and maximum value processing; the first fusion feature map and the second fusion feature map obtained in the above steps 3102 and 3103 then satisfy the following relationship:

$$F'_j = F_1 \otimes \mathrm{Soft}\left(A_j\right) + F_1, \quad j \in \{1, 2\}$$

  where $j$ is an index taking the values 1 and 2; $F'_j$ represents the fusion feature map ($F'_1$ is the first fusion feature map, $F'_2$ is the second fusion feature map); $F_1$ represents the target feature map; $\mathrm{Soft}(A_j)$ represents the processed uncertainty image or average image ($\mathrm{Soft}(A_1)$ is the uncertainty image, $\mathrm{Soft}(A_2)$ is the average image); and $\otimes$ represents the pixel-level product.
  • optionally, the size of the first fusion feature map is B*C1*H*W, the size of the second fusion feature map is B*C2*H*W, and the size of the spliced fusion feature map after splicing is B*(C1+C2)*H*W.
  • since the spliced fusion feature map includes both the features of the target feature map enhanced in the highly consistent area and the features enhanced in the uncertain area, convolving the spliced fusion feature map allows the determined target area to be distinguished from other areas, improving the accuracy of the target area indicated by the second segmented image, that is, the accuracy of the second segmented image.
  • Taking an eye image as an example, the obtained second segmented image indicates the region where the optic cup is located in the eye image and the region where the optic disc is located in the eye image. The second segmented image then satisfies the following relationship:

  $O = \mathrm{Conv}_{1\times 1}\big(\mathrm{Concat}\big(F_{cup}^{\,1},\ F_{disc}^{\,1},\ F_{cup}^{\,2},\ F_{disc}^{\,2}\big)\big)$

  • where $O$ represents the second segmented image; $\mathrm{Conv}_{1\times 1}(\cdot)$ represents the convolution; $\mathrm{Concat}(\cdot)$ represents the splicing processing; $F_{cup}^{\,1}$ denotes the first fusion feature map corresponding to the optic cup, $F_{disc}^{\,1}$ the first fusion feature map corresponding to the optic disc, $F_{cup}^{\,2}$ the second fusion feature map corresponding to the optic cup, and $F_{disc}^{\,2}$ the second fusion feature map corresponding to the optic disc.
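  • A hedged sketch of this step (PyTorch; the channel counts, the two output channels and the sigmoid activation are assumptions, not taken from the patent):

```python
import torch
import torch.nn as nn

# Four fusion feature maps with an assumed shape of (B, 8, H, W) each.
f_cup1, f_disc1, f_cup2, f_disc2 = (torch.rand(2, 8, 64, 64) for _ in range(4))

spliced = torch.cat([f_cup1, f_disc1, f_cup2, f_disc2], dim=1)  # Concat(.)
conv1x1 = nn.Conv2d(spliced.shape[1], 2, kernel_size=1)         # Conv_1x1(.)
second_seg = torch.sigmoid(conv1x1(spliced))  # channel 0: optic cup, channel 1: optic disc
```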
  • the uncertainty image is obtained first, and then the second segmented image is obtained based on the target feature map and the uncertainty image.
  • the above steps 309-310 need not be performed.
  • Other methods can be adopted to process the target feature map based on a plurality of labeled segmented images to obtain a second segmented image of the original image.
  • the second image segmentation model is invoked, and the target feature map is processed based on the multiple labeled segmentation images to obtain the second segmented image of the original image.
  • the second image segmentation model is used to obtain the second segmented image.
  • inconsistent information and consistent information corresponding to a plurality of labeled segmented images are used to ensure the accuracy of the second segmented image.
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are used to perform image segmentation on the original image to obtain a second segmented image.
  • labeled segmented images matching the multiple annotators are reconstructed to indicate the target area where the target object is located in the original image, that is, the multiple annotation results of the original image by the multiple annotators are reconstructed.
  • the second segmented image of the original image is then acquired, so that the annotation results corresponding to the multiple annotators are integrated into the second segmented image, which ensures the accuracy of the second segmented image and thereby improves the accuracy of image segmentation.
  • Moreover, the prior knowledge vector can be introduced and embedded into the image features, so that the dynamic representation ability of the model is improved.
  • the present application provides a soft attention mechanism, which performs smoothing and maximum value processing on the uncertainty image: the smoothing expands the coverage of the uncertain area so as to effectively perceive and capture the inconsistent areas among the multiple labeled segmented images, and the maximum value processing ensures the accuracy of the weights contained in the uncertainty image, which improves the accuracy of the uncertainty image and thereby improves the performance of subsequent image segmentation.
  • The methods provided by the embodiments of the present application can be applied to the medical field to perform image segmentation on medical images. As shown in FIG. 8, on different datasets in the medical field, the segmented images obtained by the image segmentation method provided by the embodiments of the present application are compared with those obtained by other segmentation models in the prior art. It can be seen from the comparison that the segmented images obtained by the image segmentation method provided in the embodiments of the present application are more accurate.
  • the process of performing image segmentation on the original image can be performed by using the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • the first image segmentation model, the image reconstruction model and the second image segmentation model need to be trained, and the training process is detailed in the following embodiments.
  • FIG. 9 is a flowchart of a model training method provided by an embodiment of the present application, which is applied to a computer device. As shown in FIG. 9 , the method includes the following steps.
  • the computer device acquires an original image of the sample, a plurality of labeled and segmented images of the sample, and a prior knowledge vector.
  • the sample original image includes a sample object, each sample labeled segmented image corresponds to one prior knowledge weight, each sample labeled segmented image indicates the sample area where the sample object is located in the sample original image, and each sample labeled segmented image is marked on the sample original image by the corresponding annotator, that is, each sample labeled segmented image is the real annotation result of the corresponding annotator.
  • the original image of the sample is an eye image
  • the multiple annotators are multiple eye doctors.
  • the computer device invokes the first image segmentation model, encodes the original image of the sample based on the prior knowledge vector, and obtains a feature map of the target sample.
  • This step is the same as the above steps 301-303, and will not be repeated here.
  • the computer device invokes the first image segmentation model, decodes the feature map of the target sample, and obtains a first sample segmentation image of the original sample image.
  • the first sample segmented image indicates a sample area where the sample object is located in the sample original image.
  • This step is the same as the above-mentioned step 304, and will not be repeated here.
  • the computer device splices the sample original image and the first sample segmented image to obtain a first sample spliced image.
  • This step is the same as the above-mentioned step 305, and will not be repeated here.
  • the computer device invokes the encoding sub-model to encode the spliced image of the first sample to obtain a feature map of the first sample.
  • the image reconstruction model includes an encoding sub-model, a fusion sub-model, and a decoding sub-model. This step is the same as the above-mentioned step 306, and will not be repeated here.
  • the computer device invokes the fusion sub-model, and fuses the prior knowledge vector with the first sample feature map to obtain a second sample feature map.
  • This step is the same as the above-mentioned step 307, and will not be repeated here.
  • the computer device invokes the decoding sub-model to decode the feature map of the second sample to obtain multiple predicted and labeled segmented images.
  • each predicted and labeled segmented image corresponds to a prior knowledge weight
  • each predicted labeled segmented image indicates a sample area labeled by the corresponding annotator.
  • This step is the same as the above-mentioned step 308, and will not be repeated here.
  • the sample original image and the first sample segmented image are first spliced, and then the encoding sub-model, fusion sub-model and decoding sub-model in the image reconstruction model are invoked to obtain multiple predicted labeled segmented images.
  • In some embodiments, steps 904 to 907 need not be performed, and other methods can be used to call the image reconstruction model and perform image reconstruction on the first sample segmented image according to the prior knowledge vector to obtain multiple predicted labeled segmented images.
  • the computer device invokes the second image segmentation model to process the feature map of the target sample based on the plurality of predicted annotated segmentation images to obtain a predicted segmented image of the original sample image.
  • an image reconstruction model is used to obtain a predicted and labeled segmented image.
  • the predicted and labeled segmented image includes an optic cup predicted labeled segmented image and an optic disc predicted labeled segmented image.
  • Accordingly, the obtained uncertainty image includes an optic cup uncertainty image and an optic disc uncertainty image; the average image of the multiple optic cup predicted labeled segmented images is determined as the optic cup consistency image, and the average image of the multiple optic disc predicted labeled segmented images is determined as the optic disc consistency image. Then, through the second image segmentation model, the target sample feature map is respectively fused with the optic cup uncertainty image, the optic disc uncertainty image, the optic cup consistency image and the optic disc consistency image, the fused feature maps are spliced, and the spliced feature map is convolved to obtain the predicted segmented image.
  • Based on the prior knowledge vector, the computer device performs weighted fusion on the multiple sample labeled segmented images to obtain a fused labeled segmented image.
  • the prior knowledge vector includes multiple prior knowledge weights, and multiple prior knowledge weights are in one-to-one correspondence with multiple sample labeled segmentation images
  • the multiple prior knowledge weights in the prior knowledge vector are used to perform weighted fusion on the multiple sample labeled segmented images, and the obtained fused labeled segmented image is used as the final result marked by the multiple annotators, so that the fused labeled segmented image can serve as a supervision value for training the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • In some embodiments, the multiple prior knowledge weights, the multiple sample labeled segmented images and the fused labeled segmented image satisfy the following relationship:

  $GT_{soft} = \sum_{i_1=1}^{N_1} v_{i_1}\, GT_{i_1}$

  • where $GT_{soft}$ represents the fused labeled segmented image; $N_1$ represents the total number of the multiple prior knowledge weights, and $N_1$ is a positive integer greater than or equal to 2; $i_1$ is used to represent the serial number of the prior knowledge weight and of the sample labeled segmented image, and $i_1$ is a positive integer greater than or equal to 1 and less than or equal to $N_1$; $v_{i_1}$ denotes the $i_1$-th prior knowledge weight, and $GT_{i_1}$ denotes the $i_1$-th sample labeled segmented image.
  • In some embodiments, each sample labeled segmented image includes weights corresponding to multiple pixels in the sample original image, and the pixel value of a pixel is the weight corresponding to the pixel at the same position in the sample original image. Step 909 then includes: based on the prior knowledge vector, performing weighted fusion on the pixel values of pixels located at the same position in the multiple sample labeled segmented images to obtain the fusion weight corresponding to each position, and forming the fused labeled segmented image from the fusion weights corresponding to the multiple positions.
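  • A minimal sketch of this weighted fusion (PyTorch; normalizing the prior knowledge weights to sum to 1 is an assumption):

```python
import torch

def fuse_labels(masks: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
    """masks: (N1, H, W) sample labeled segmented images;
    prior: (N1,) prior knowledge weights. Returns GT_soft."""
    w = prior / prior.sum()                        # assumed normalization
    return (w.view(-1, 1, 1) * masks).sum(dim=0)   # pixel-wise weighted fusion

gt_soft = fuse_labels(torch.rand(6, 64, 64),
                      torch.tensor([0.2, 0.1, 0.2, 0.2, 0.2, 0.1]))
```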
  • the computer device trains the first image segmentation model, the image reconstruction model, and the second image segmentation model based on the difference between the predicted segmented image and the fused annotated segmented image.
  • the fused labeled segmented image is equivalent to the real labeled segmented image of the sample original image, that is, the fused labeled segmented image indicates the sample area where the sample object is located in the sample original image.
  • the predicted segmented image is predicted by the first image segmentation model, the image reconstruction model and the second image segmentation model; therefore, based on the difference between the predicted segmented image and the fused labeled segmented image, the inaccuracy of the first image segmentation model, the image reconstruction model and the second image segmentation model can be determined and used for the subsequent adjustment of the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • the step 910 includes the following steps 9101-9102.
  • the first loss value is used to represent the difference between the predicted segmented image and the fused labeled segmented image; the larger the loss value, the lower the accuracy of the first image segmentation model, the image reconstruction model and the second image segmentation model, and the smaller the loss value, the higher their accuracy.
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are trained with the first loss value so as to reduce the first loss value and improve the accuracy of the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • the predicted segmented image is obtained through the first image segmentation model, the image reconstruction model and the second image segmentation model, wherein the knowledge inference sub-model is a sub-model in the first image segmentation model.
  • a fused labeled segmented image is obtained, a first loss value is determined based on the difference between the predicted segmented image and the fused labeled segmented image, and the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the determined first loss value.
  • this step 9102 includes the following three manners.
  • The first manner: a second loss value is determined based on the difference between the first sample segmented image and the fused labeled segmented image, and the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the first loss value and the second loss value.
  • the second loss value is used to represent the difference between the first sample segmented image and the fused labeled segmented image; the greater the difference, the larger the second loss value, and the smaller the difference, the smaller the second loss value.
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are trained so as to reduce the first loss value and the second loss value, thereby improving the accuracy of the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • In some embodiments, the process of training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the second loss value includes: determining the first sum of the first loss value and the second loss value, and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first sum.
  • The second manner: a third loss value is determined based on the differences between the multiple predicted labeled segmented images and the corresponding sample labeled segmented images, and the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the first loss value and the third loss value.
  • the third loss value is a reconstruction loss, which is used to represent the difference between a plurality of predicted annotated segmentation images and corresponding sample annotated segmentation images.
  • In some embodiments, the multiple predicted labeled segmented images, the multiple sample labeled segmented images and the third loss value satisfy the following relationship:

  $loss_{rec} = \sum_{i_2=1}^{N_1} L_{BCE}\big(M_{i_2},\ GT_{i_2}\big)$

  • where $loss_{rec}$ represents the third loss value; $N_1$ represents the total number of the multiple prior knowledge weights, that is, the number of the multiple predicted labeled segmented images, and $N_1$ is a positive integer greater than or equal to 2; $i_2$ represents the serial number of the predicted labeled segmented image and of the sample labeled segmented image; $L_{BCE}$ is the binary cross-entropy loss function; $M_{i_2}$ denotes the $i_2$-th predicted labeled segmented image, and $GT_{i_2}$ denotes the corresponding sample labeled segmented image.
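  • A sketch of the reconstruction loss under these definitions (PyTorch; the tensor shapes are assumptions):

```python
import torch
import torch.nn.functional as F

preds = [torch.rand(1, 64, 64) for _ in range(6)]            # predicted labeled segmented images
targets = [torch.rand(1, 64, 64).round() for _ in range(6)]  # sample labeled segmented images

# Binary cross-entropy between each predicted image and its sample annotation.
loss_rec = sum(F.binary_cross_entropy(p, t) for p, t in zip(preds, targets))
```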
  • In some embodiments, the process of training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the third loss value includes: determining the second sum of the first loss value and the third loss value, and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the second sum.
  • The third manner: the sample original image and the fused labeled segmented image are spliced to obtain a second sample spliced image; the encoding sub-model is called to encode the second sample spliced image to obtain a third sample feature map; a fourth loss value is determined based on the difference between the third sample feature map and the first sample feature map; and the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the first loss value and the fourth loss value.
  • the fourth loss value is the consistency loss, which is used to represent the difference between the third sample feature map and the first sample feature map.
  • the process of acquiring the stitched image of the second sample is the same as the above-mentioned step 305, and will not be repeated here.
  • the process of invoking the coding sub-model to obtain the third sample feature map is the same as the above step 905 and will not be repeated here.
  • the first sample feature map is obtained by calling the encoding sub-model to encode the first sample spliced image
  • the first sample spliced image is obtained by splicing the sample original image and the first sample segmented image
  • the third sample feature map is obtained by calling the encoding sub-model to encode the second sample spliced image
  • the second sample spliced image is obtained by splicing the sample original image and the fusion annotation segmentation image
  • the first sample segmented image is obtained by prediction, while the fused labeled segmented image is the real result marked by the multiple annotators; therefore, through the fourth loss value, the difference between the first sample feature map corresponding to the predicted result and the third sample feature map corresponding to the real result, both output by the same encoding sub-model, can be determined. This difference reflects the difference between the predicted result and the real result, and thus reflects the accuracy of the encoding sub-model.
  • In some embodiments, the process of training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the fourth loss value includes: determining the third sum of the first loss value and the fourth loss value, and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the third sum.
  • In some embodiments, the encoding sub-model includes multiple third convolution modules. In the process of calling the multiple third convolution modules to encode the first sample spliced image, the first third convolution module is called to encode the first sample spliced image to obtain the first third reference feature map; the current third convolution module is then called to encode the third reference feature map output by the previous third convolution module to obtain the third reference feature map corresponding to the current third convolution module, until the third reference feature map output by the last third convolution module is obtained; and the third reference feature map output by the last third convolution module is determined as the first sample feature map.
  • Similarly, in the process of encoding the second sample spliced image, multiple fourth reference feature maps corresponding to the second sample spliced image can be obtained, and the fourth reference feature map output by the last third convolution module is determined as the third sample feature map. The fourth loss value then satisfies the following relationship:

  $loss_{con} = \frac{1}{Q}\sum_{i_3=1}^{Q}\big\| R_{i_3} - \hat{R}_{i_3} \big\|_2^2$

  • where $loss_{con}$ represents the fourth loss value; $Q$ represents the number of third convolution modules included in the encoding sub-model, and $Q$ is a positive integer greater than or equal to 2; $i_3$ represents the serial number of the third convolution module, and $i_3$ is a positive integer greater than or equal to 1 and less than or equal to $Q$; $R_{i_3}$ denotes the $i_3$-th third reference feature map, and $\hat{R}_{i_3}$ denotes the $i_3$-th fourth reference feature map.
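  • A sketch of the consistency loss (PyTorch; using the squared L2 distance between corresponding reference feature maps is an assumption, as are the shapes):

```python
import torch

# Q per-module reference feature maps from encoding the first and second
# sample spliced images (assumed shapes, halving resolution per module).
ref_pred = [torch.rand(2, 8 * 2 ** q, 64 >> q, 64 >> q) for q in range(4)]
ref_real = [r + 0.01 * torch.randn_like(r) for r in ref_pred]

# Mean squared difference, averaged over the Q convolution modules.
loss_con = sum(torch.mean((a - b) ** 2)
               for a, b in zip(ref_pred, ref_real)) / len(ref_pred)
```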
  • The above describes only three manners of training the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • In some embodiments, any two of the above three manners can be combined, for example, the first manner with the second manner, or the second manner with the third manner; alternatively, all three manners can be combined.
  • In some embodiments, when the three manners are combined, step 9102 includes: determining a second loss value based on the difference between the first sample segmented image and the fused labeled segmented image; determining a third loss value based on the differences between the multiple predicted labeled segmented images and the corresponding sample labeled segmented images; splicing the sample original image and the fused labeled segmented image to obtain a second sample spliced image, calling the encoding sub-model to encode the second sample spliced image to obtain a third sample feature map, and determining a fourth loss value based on the difference between the third sample feature map and the first sample feature map; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value, the second loss value, the third loss value and the fourth loss value.
  • the process of training the first image segmentation model, the image reconstruction model and the second image segmentation model includes: based on The first loss value, the second loss value, the third loss value and the fourth loss value determine the total loss value, and based on the total loss value, the first image segmentation model, the image reconstruction model and the second image segmentation model are trained.
  • In some embodiments, the first loss value, the second loss value, the third loss value, the fourth loss value and the total loss value satisfy the following relationship:

  $L = L_{BCE}(M,\ GT_{soft}) + L_{BCE}(P_1,\ GT_{soft}) + \lambda\, loss_{rec} + loss_{con}$

  • where $L$ represents the total loss value; $L_{BCE}$ is the binary cross-entropy loss function; $M$ represents the predicted segmented image, so that $L_{BCE}(M, GT_{soft})$ corresponds to the first loss value; $P_1$ represents the first sample segmented image, so that $L_{BCE}(P_1, GT_{soft})$ corresponds to the second loss value; $GT_{soft}$ represents the fused labeled segmented image; $\lambda$ represents a hyperparameter used to balance the third loss value and the second loss value, and can be set to 0.7; $loss_{rec}$ represents the third loss value; and $loss_{con}$ represents the fourth loss value.
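  • Combining the terms, a sketch of the total loss (PyTorch; the input tensors and the placeholder values for loss_rec and loss_con are assumptions, computed as in the sketches above):

```python
import torch
import torch.nn.functional as F

lam = 0.7  # hyperparameter balancing the loss terms, set to 0.7 per the text

# Assumed inputs: m = predicted segmented image, p1 = first sample segmented
# image, gt_soft = fused labeled segmented image (all values in [0, 1]).
m = torch.rand(1, 64, 64)
p1 = torch.rand(1, 64, 64)
gt_soft = torch.rand(1, 64, 64)
loss_rec = torch.tensor(0.10)   # placeholder for the reconstruction loss
loss_con = torch.tensor(0.05)   # placeholder for the consistency loss

total = (F.binary_cross_entropy(m, gt_soft)      # first loss value
         + F.binary_cross_entropy(p1, gt_soft)   # second loss value
         + lam * loss_rec                        # third loss value
         + loss_con)                             # fourth loss value
```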
  • In the model training method provided by the embodiments of the present application, labeled segmented images matching the multiple annotators are reconstructed to indicate the target area where the target object is located in the original image, that is, the multiple annotation results of the original image by the multiple annotators are reconstructed.
  • The annotation results corresponding to the multiple annotators are integrated into the second segmented image, so as to ensure the accuracy of the second segmented image, thereby improving the accuracy of image segmentation.
  • In addition, by considering the differences between the reconstructed multiple predicted labeled segmented images and the multiple sample labeled segmented images, the difference between the third sample feature map and the first sample feature map, and the difference between the first sample segmented image and the fused labeled segmented image, the accuracy of the first image segmentation model, the image reconstruction model and the second image segmentation model is improved.
  • the model training method provided by the embodiment of the present application trains the model based on the sample annotated segmented images corresponding to a plurality of annotators.
  • For comparison, in the process of model training, the model can also be trained by using the sample labeled segmented images of a single annotator.
  • Table 1 lists the sample labeled segmented images annotated by different annotators, and the accuracy of the models after training with the various configurations is shown in Table 1. From the accuracy in Table 1, it can be seen that the model trained with the sample labeled segmented images of multiple annotators achieves a higher accuracy rate.
  • With the method provided by the embodiments of the present application, the labeled segmented images of multiple annotators can be reconstructed, the correlation between the obtained predicted segmented images and the sample labeled segmented images can be enhanced, and the uncertainty between the annotators can be estimated.
  • the method provided by the embodiments of the present application can perform image segmentation on the original image by using the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • Tables 2 and 3 both take medical images as an example, and compare the accuracy of image segmentation of medical images by using the model provided by the present application and the image segmentation model provided by the prior art.
  • Table 2 takes the eye image as an example, and by performing image segmentation on the eye image, the segmented image corresponding to the optic cup in the eye and the segmented image corresponding to the optic disc in the eye are determined. From the data in Table 2, it can be known that the accuracy of the model provided by the embodiment of the present application is the highest whether the segmented image corresponding to the optic cup or the segmented image corresponding to the optic disc is acquired.
  • Table 3 takes kidney images, brain images, tumor images and other medical images as examples. From the data in Table 3, it can be seen that no matter which kind of medical image the segmented image is obtained for, the model provided in the embodiments of the present application achieves the highest accuracy rate. That is to say, the segmented images obtained by the image segmentation method provided by the present application have a high accuracy rate and a good image segmentation effect.
  • Model 1 to Model 6 are models obtained by training with the labeled segmented images of annotators 1 to 6, respectively. The multiple models are evaluated by employing different prior knowledge vectors, including a single annotator's prior knowledge weight, a random annotator's prior knowledge weight, and the average prior knowledge weight. As shown in Table 4, for the prior knowledge vector of a single annotator, the prior knowledge weight of the selected annotator is 1, and the prior knowledge weights of the other annotators are 0. Taking the eye image as an example, the eye image is segmented by the various models to obtain segmented images of the eye image. It can be seen from Table 4 that the image segmentation method provided in this application always achieves superior performance under different prior knowledge vectors.
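  • The three kinds of prior knowledge vectors used for evaluation could be constructed as follows (a sketch; six annotators are assumed, matching the setup described for Table 4):

```python
import torch

n = 6  # number of annotators

# Single annotator: the selected annotator's weight is 1, all others are 0.
single = torch.zeros(n)
single[0] = 1.0

# Random annotator: a randomly chosen one-hot prior.
random_prior = torch.zeros(n)
random_prior[torch.randint(n, (1,)).item()] = 1.0

# Average prior knowledge weight: equal weight for every annotator.
average = torch.full((n,), 1.0 / n)
```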
  • FIG. 12 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of the present application. As shown in FIG. 12 , the apparatus includes:
  • the encoding module 1201 is used for encoding the original image based on the prior knowledge vector to obtain the target feature map, the original image includes the target object, the prior knowledge vector includes multiple prior knowledge weights, and each prior knowledge weight is used to represent a The accuracy corresponding to the annotator, which is the accuracy with which the annotator marks the region where any object is located in any image;
  • a decoding module 1202 configured to decode the target feature map to obtain a first segmented image of the original image, where the first segmented image indicates the target area where the target object is located in the original image;
  • the reconstruction module 1203 is configured to perform image reconstruction on the first segmented image based on the prior knowledge vector to obtain a plurality of labeled segmented images, each labeled segmented image corresponds to a prior knowledge weight, and each labeled segmented image indicates a corresponding The target area marked by the annotator;
  • the processing module 1204 is configured to process the target feature map based on the multiple labeled segmented images to obtain a second segmented image of the original image.
  • the processing module 1204 includes:
  • the first determining unit 1241 is configured to determine an uncertainty image based on the difference between the multiple labeled segmented images, the uncertainty image indicates the difference between multiple target regions, and each target region is an area indicated by the labeled segmented image ;
  • the first fusion unit 1242 is configured to fuse the target feature map and the uncertainty image to obtain a second segmented image.
  • each annotated segmented image includes a first weight corresponding to a plurality of pixels in the original image, and the first weight is used to indicate the possibility that the corresponding pixel is in the target area;
  • the first determining unit 1241 is configured to: determine the difference image between each labeled segmented image and an average value image, the average value image being the average value image of the multiple labeled segmented images; determine the square sum of the pixel values of pixels located at the same position in the multiple difference images; determine the square root of the ratio between the square sum corresponding to each position and a target number as the second weight of each position, the target number being the number of the multiple labeled segmented images; and construct the uncertainty image based on the second weights of the multiple positions.
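  • The computation performed by this unit amounts to a per-pixel standard deviation across the labeled segmented images; a sketch (PyTorch, with assumed shapes):

```python
import torch

def uncertainty(masks: torch.Tensor) -> torch.Tensor:
    """masks: (N, H, W) labeled segmented images (first weights per pixel).
    Returns the (H, W) uncertainty image of second weights."""
    mean = masks.mean(dim=0, keepdim=True)      # average value image
    sq_sum = ((masks - mean) ** 2).sum(dim=0)   # square sum of the difference images
    return torch.sqrt(sq_sum / masks.shape[0])  # sqrt(square sum / target number)
```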
  • the first fusion unit 1242 is configured to determine the average image of multiple labeled segmented images; determine the product of the target feature map and the uncertainty image, and combine the product determined this time with the target feature map The sum is determined as the first fusion feature map; the product of the target feature map and the average image is determined, and the sum of the product determined this time and the target feature map is determined as the second fusion feature map; the first fusion feature map and The second fusion feature map is spliced to obtain a spliced fusion feature map; the spliced fusion feature map is convolved to obtain a second segmented image.
  • the encoding module 1201 includes:
  • the first encoding unit 1211 is used to encode the original image to obtain the first feature map of the original image
  • the second fusion unit 1212 is configured to fuse the prior knowledge vector with the first feature map to obtain a second feature map
  • the first decoding unit 1213 is configured to decode the second feature map to obtain the target feature map.
  • the reconstruction module 1203 includes:
  • a splicing unit for splicing the original image and the first segmented image to obtain a spliced image
  • the second encoding unit 1231 is used to encode the spliced image to obtain a third feature map
  • the third fusion unit 1232 is used to fuse the prior knowledge vector with the third feature map to obtain the fourth feature map;
  • the second decoding unit 1233 is configured to decode the fourth feature map to obtain a plurality of labeled segmented images.
  • the original image is encoded based on the prior knowledge vector, and the step of obtaining the target feature map is performed by the first image segmentation model;
  • the step of decoding the target feature map to obtain the first segmented image of the original image is performed by the first image segmentation model
  • image reconstruction is performed on the first segmented image, and the steps of obtaining a plurality of labeled segmented images are performed by the image reconstruction model;
  • the step of processing the target feature map based on the plurality of labeled segmented images to obtain a second segmented image of the original image is performed by the second image segmentation model.
  • the apparatus further includes:
  • the acquisition module 1205 is used to acquire a sample original image, a plurality of sample annotated segmentation images and a priori knowledge vector, the sample original image includes a sample object, each sample annotated segmentation image corresponds to a prior knowledge weight, and each sample annotated segmentation image Indicate the sample area where the sample object is located in the sample original image, and each sample annotated segmented image is annotated by the corresponding annotator;
  • the encoding module 1201 is further configured to call the first image segmentation model, encode the original image of the sample based on the prior knowledge vector, and obtain the feature map of the target sample;
  • the decoding module 1202 is further configured to call the first image segmentation model, decode the feature map of the target sample, and obtain a first sample segmentation image of the sample original image, where the first sample segmentation image indicates where the sample object is located in the sample original image. the sample area;
  • the reconstruction module 1203 is further configured to call the image reconstruction model, and based on the prior knowledge vector, perform image reconstruction on the first sample segmented image to obtain a plurality of predicted and labeled segmented images, each predicted and labeled segmented image is associated with a priori Corresponding knowledge weights, each predicted label segmentation image indicates the predicted sample area;
  • the processing module 1204 is further configured to call the second image segmentation model to segment the image based on a plurality of predicted annotations, and to process the feature map of the target sample to obtain a predicted segmented image of the original image of the sample;
  • a weighted fusion module 1206, configured to perform weighted fusion of a plurality of sample annotated segmented images based on the prior knowledge vector, to obtain a fused annotated segmented image;
  • the training module 1207 is configured to train the first image segmentation model, the image reconstruction model and the second image segmentation model based on the difference between the predicted segmented image and the fused annotated segmented image.
  • the training module 1207 includes:
  • the second determining unit 1271 is configured to determine the first loss value based on the difference between the predicted segmented image and the fused annotated segmented image;
  • the training unit 1272 is configured to train the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value.
  • the training unit 1272 is configured to determine the second loss value based on the difference between the first sample segmented image and the fused annotation segmented image; based on the first loss value and the second loss value, determine the second loss value
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are trained.
  • the training unit 1272 is configured to determine the third loss value based on the difference between the multiple predicted labeled segmented images and the corresponding sample labeled segmented images; based on the first loss value and the third loss value , training the first image segmentation model, the image reconstruction model and the second image segmentation model.
  • the image reconstruction model includes an encoding sub-model, a fusion sub-model and a decoding sub-model;
  • the reconstruction module 1203 is used for splicing the original sample image and the first sample segmented image to obtain the first sample spliced image; calling the encoding sub-model to encode the first sample spliced image to obtain the first sample feature Figure; call the fusion sub-model to fuse the prior knowledge vector with the first sample feature map to obtain the second sample feature map; call the decoding sub-model to decode the second sample feature map to obtain multiple predicted and labeled segmentation images .
  • the apparatus further includes:
  • the splicing module 1208 is used for splicing the original image of the sample and the fused annotated segmented image to obtain a second spliced image of the sample;
  • the reconstruction module 1203 is further configured to call the encoding sub-model to encode the spliced image of the second sample to obtain a feature map of the third sample;
  • the training unit 1272 is configured to determine a fourth loss value based on the difference between the third sample feature map and the first sample feature map, and to train the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the fourth loss value.
  • It should be noted that the image segmentation apparatus provided in the above embodiments is illustrated only by the division of the above functional modules. In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the image segmentation apparatus and the image segmentation method embodiments provided by the above embodiments belong to the same concept, and the specific implementation process thereof is detailed in the method embodiments, which will not be repeated here.
  • the embodiment of the present application also provides a computer device, the computer device includes a processor and a memory, and at least one computer program is stored in the memory, and the at least one computer program is loaded and executed by the processor to implement the following steps:
  • the original image is encoded based on the prior knowledge vector, and the target feature map is obtained.
  • the original image includes the target object, and the prior knowledge vector includes multiple prior knowledge weights.
  • Each prior knowledge weight is used to represent the accuracy corresponding to an annotator.
  • the accuracy is the accuracy with which the annotator marks the region where any object is located in any image;
  • the target feature map is decoded to obtain a first segmented image of the original image, the first segmented image indicating the target area where the target object is located in the original image; image reconstruction is performed on the first segmented image based on the prior knowledge vector to obtain multiple labeled segmented images, each labeled segmented image corresponds to one prior knowledge weight, and each labeled segmented image indicates the target area marked by the corresponding annotator;
  • based on the multiple labeled segmented images, the target feature map is processed to obtain a second segmented image of the original image.
  • the at least one computer program is loaded and executed by the processor to implement the following steps:
  • an uncertainty image is determined based on the differences between the multiple labeled segmented images, the uncertainty image indicates the difference between multiple target areas, and each target area is an area indicated by one labeled segmented image;
  • the target feature map and the uncertainty image are fused to obtain the second segmented image.
  • In some embodiments, each labeled segmented image includes first weights corresponding to multiple pixels in the original image, and the first weight is used to represent the possibility that the corresponding pixel is in the target area; the at least one computer program is loaded and executed by the processor to implement the following steps:
  • the difference image between each labeled segmented image and an average value image is determined, the average value image being the average value image of the multiple labeled segmented images; the square sum of the pixel values of pixels located at the same position in the multiple difference images is determined; the square root of the ratio between the square sum corresponding to each position and a target number is determined as the second weight of each position, the target number being the number of the multiple labeled segmented images; and an uncertainty image is constructed based on the second weights of the multiple positions.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining the average value image of the multiple labeled segmented images; determining the product of the target feature map and the uncertainty image, and determining the sum of the determined product and the target feature map as a first fusion feature map; determining the product of the target feature map and the average value image, and determining the sum of the determined product and the target feature map as a second fusion feature map; splicing the first fusion feature map and the second fusion feature map to obtain a spliced fusion feature map; and convolving the spliced fusion feature map to obtain the second segmented image.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: encoding the original image to obtain a first feature map of the original image; fusing the prior knowledge vector with the first feature map to obtain a second feature map; and decoding the second feature map to obtain the target feature map.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the original image and the first segmented image to obtain a spliced image; encoding the spliced image to obtain a third feature map; fusing the prior knowledge vector with the third feature map to obtain a fourth feature map; and decoding the fourth feature map to obtain the multiple labeled segmented images.
  • the original image is encoded based on the prior knowledge vector, and the step of obtaining the target feature map is performed by the first image segmentation model;
  • the step of decoding the target feature map to obtain the first segmented image of the original image is performed by the first image segmentation model
  • image reconstruction is performed on the first segmented image, and the steps of obtaining a plurality of labeled segmented images are performed by the image reconstruction model;
  • the step of processing the target feature map based on the plurality of labeled segmented images to obtain a second segmented image of the original image is performed by the second image segmentation model.
  • the at least one computer program is loaded and executed by the processor to implement the following steps:
  • a sample original image, multiple sample labeled segmented images and the prior knowledge vector are acquired; the sample original image includes a sample object, each sample labeled segmented image corresponds to one prior knowledge weight, each sample labeled segmented image indicates the sample area where the sample object is located in the sample original image, and each sample labeled segmented image is annotated by the corresponding annotator;
  • the first image segmentation model is called to encode the sample original image based on the prior knowledge vector to obtain a target sample feature map, and to decode the target sample feature map to obtain a first sample segmented image of the sample original image, the first sample segmented image indicating the sample area where the sample object is located in the sample original image;
  • the image reconstruction model is called to perform image reconstruction on the first sample segmented image based on the prior knowledge vector to obtain multiple predicted labeled segmented images; each predicted labeled segmented image corresponds to one prior knowledge weight, and each predicted labeled segmented image indicates the predicted sample area;
  • the second image segmentation model is called to process the target sample feature map based on the multiple predicted labeled segmented images to obtain a predicted segmented image of the sample original image;
  • based on the prior knowledge vector, weighted fusion is performed on the multiple sample labeled segmented images to obtain a fused labeled segmented image;
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the difference between the predicted segmented image and the fused annotated segmented image.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a first loss value based on the difference between the predicted segmented image and the fused labeled segmented image; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a second loss value based on the difference between the first sample segmented image and the fused labeled segmented image; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the second loss value.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a third loss value based on the differences between the multiple predicted labeled segmented images and the corresponding sample labeled segmented images; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the third loss value.
  • In some embodiments, the image reconstruction model includes an encoding sub-model, a fusion sub-model and a decoding sub-model; the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the sample original image and the first sample segmented image to obtain a first sample spliced image; calling the encoding sub-model to encode the first sample spliced image to obtain a first sample feature map; calling the fusion sub-model to fuse the prior knowledge vector with the first sample feature map to obtain a second sample feature map; and calling the decoding sub-model to decode the second sample feature map to obtain the multiple predicted labeled segmented images.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the sample original image and the fused labeled segmented image to obtain a second sample spliced image; calling the encoding sub-model to encode the second sample spliced image to obtain a third sample feature map; determining a fourth loss value based on the difference between the third sample feature map and the first sample feature map; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the fourth loss value.
  • FIG. 14 shows a structural block diagram of a terminal 1400 provided by an exemplary embodiment of the present application.
  • the terminal 1400 includes: a processor 1401 and a memory 1402 .
  • the processor 1401 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 1401 may also include a main processor and a coprocessor; the main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in a standby state.
  • the processor 1401 may be integrated with a GPU (Graphics Processing Unit, image processor), and the GPU is used for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1401 may further include an AI (Artificial Intelligence, artificial intelligence) processor, where the AI processor is used to process computing operations related to machine learning.
  • Memory 1402 may include one or more computer-readable storage media, which may be non-transitory.
  • the non-transitory computer-readable storage medium in the memory 1402 is used to store at least one computer program, and the at least one computer program is used to be executed by the processor 1401 to implement the methods provided by the method embodiments in this application. Image segmentation method.
  • the terminal 1400 may optionally further include: a peripheral device interface 1403 and at least one peripheral device.
  • the processor 1401, the memory 1402 and the peripheral device interface 1403 may be connected through a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1403 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 1404 , a display screen 1405 , a camera assembly 1406 , an audio circuit 1407 and a power supply 1408 .
  • the peripheral device interface 1403 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 1401 and the memory 1402 .
  • In some embodiments, the processor 1401, the memory 1402 and the peripheral device interface 1403 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1401, the memory 1402 and the peripheral device interface 1403 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 1404 is used for receiving and transmitting RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals.
  • Radio frequency circuitry 1404 communicates with communication networks and other communication devices via electromagnetic signals.
  • the radio frequency circuit 1404 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the display screen 1405 is used for displaying UI (User Interface, user interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • the display screen 1405 is a touch display screen, the display screen 1405 also has the ability to acquire touch signals on or above the surface of the display screen 1405 .
  • the touch signal may be input to the processor 1401 as a control signal for processing.
  • the camera assembly 1406 is used to capture images or video.
  • the camera assembly 1406 includes a front camera and a rear camera.
  • Audio circuitry 1407 may include a microphone and speakers.
  • the microphone is used to collect the sound waves of the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1401 for processing, or to the radio frequency circuit 1404 to realize voice communication.
  • Power supply 1408 is used to power various components in terminal 1400 .
  • the power source 1408 may be alternating current, direct current, a primary battery, or a rechargeable battery.
  • The structure shown in FIG. 14 does not constitute a limitation on the terminal 1400, which may include more or fewer components than shown, combine some components, or adopt a different component arrangement.
  • FIG. 15 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 1500 may vary greatly due to different configurations or performance, and may include one or more processors (Central Processing Unit, CPU) 1501 and one or more memories 1502, where at least one computer program is stored in the memory 1502, and the at least one computer program is loaded and executed by the processor 1501 to implement the methods provided by the above method embodiments.
  • the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and the server may also include other components for implementing device functions, which will not be described here.
  • Embodiments of the present application further provide a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor to implement the following steps:
  • the original image is encoded based on the prior knowledge vector, and the target feature map is obtained.
  • the original image includes the target object, and the prior knowledge vector includes multiple prior knowledge weights.
  • Each prior knowledge weight is used to represent the accuracy corresponding to an annotator.
  • the accuracy is the accuracy with which the annotator marks the region where any object is located in any image;
  • the target feature map is decoded to obtain a first segmented image of the original image, the first segmented image indicating the target area where the target object is located in the original image; image reconstruction is performed on the first segmented image based on the prior knowledge vector to obtain multiple labeled segmented images, each labeled segmented image corresponds to one prior knowledge weight, and each labeled segmented image indicates the target area marked by the corresponding annotator;
  • based on the multiple labeled segmented images, the target feature map is processed to obtain a second segmented image of the original image.
  • the at least one computer program is loaded and executed by the processor to implement the following steps:
  • an uncertainty image is determined based on the differences between the multiple labeled segmented images, the uncertainty image indicates the difference between multiple target areas, and each target area is an area indicated by one labeled segmented image;
  • the target feature map and the uncertainty image are fused to obtain the second segmented image.
  • In some embodiments, each labeled segmented image includes first weights corresponding to multiple pixels in the original image, and the first weight is used to represent the possibility that the corresponding pixel is in the target area; the at least one computer program is loaded and executed by the processor to implement the following steps:
  • the difference image between each labeled segmented image and an average value image is determined, the average value image being the average value image of the multiple labeled segmented images; the square sum of the pixel values of pixels located at the same position in the multiple difference images is determined; the square root of the ratio between the square sum corresponding to each position and a target number is determined as the second weight of each position, the target number being the number of the multiple labeled segmented images; and an uncertainty image is constructed based on the second weights of the multiple positions.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining the average value image of the multiple labeled segmented images; determining the product of the target feature map and the uncertainty image, and determining the sum of the determined product and the target feature map as a first fusion feature map; determining the product of the target feature map and the average value image, and determining the sum of the determined product and the target feature map as a second fusion feature map; splicing the first fusion feature map and the second fusion feature map to obtain a spliced fusion feature map; and convolving the spliced fusion feature map to obtain the second segmented image.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: encoding the original image to obtain a first feature map of the original image; fusing the prior knowledge vector with the first feature map to obtain a second feature map; and decoding the second feature map to obtain the target feature map.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the original image and the first segmented image to obtain a spliced image; encoding the spliced image to obtain a third feature map; fusing the prior knowledge vector with the third feature map to obtain a fourth feature map; and decoding the fourth feature map to obtain the multiple labeled segmented images.
  • the original image is encoded based on the prior knowledge vector, and the step of obtaining the target feature map is performed by the first image segmentation model;
  • the step of decoding the target feature map to obtain the first segmented image of the original image is performed by the first image segmentation model
  • image reconstruction is performed on the first segmented image, and the steps of obtaining a plurality of labeled segmented images are performed by the image reconstruction model;
  • the step of processing the target feature map based on the plurality of labeled segmented images to obtain a second segmented image of the original image is performed by the second image segmentation model.
  • the at least one computer program is loaded and executed by the processor to implement the following steps:
  • a sample original image, multiple sample labeled segmented images and the prior knowledge vector are acquired; the sample original image includes a sample object, each sample labeled segmented image corresponds to one prior knowledge weight, each sample labeled segmented image indicates the sample area where the sample object is located in the sample original image, and each sample labeled segmented image is annotated by the corresponding annotator;
  • the first image segmentation model is called to encode the sample original image based on the prior knowledge vector to obtain a target sample feature map, and to decode the target sample feature map to obtain a first sample segmented image of the sample original image, the first sample segmented image indicating the sample area where the sample object is located in the sample original image;
  • the image reconstruction model is called to perform image reconstruction on the first sample segmented image based on the prior knowledge vector to obtain multiple predicted labeled segmented images; each predicted labeled segmented image corresponds to one prior knowledge weight, and each predicted labeled segmented image indicates the predicted sample area;
  • the second image segmentation model is called to process the target sample feature map based on the multiple predicted labeled segmented images to obtain a predicted segmented image of the sample original image;
  • based on the prior knowledge vector, weighted fusion is performed on the multiple sample labeled segmented images to obtain a fused labeled segmented image;
  • the first image segmentation model, the image reconstruction model and the second image segmentation model are trained based on the difference between the predicted segmented image and the fused annotated segmented image.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a first loss value based on the difference between the predicted segmented image and the fused labeled segmented image; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a second loss value based on the difference between the first sample segmented image and the fused labeled segmented image; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the second loss value.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: determining a third loss value based on the differences between the multiple predicted labeled segmented images and the corresponding sample labeled segmented images; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the third loss value.
  • In some embodiments, the image reconstruction model includes an encoding sub-model, a fusion sub-model and a decoding sub-model; the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the sample original image and the first sample segmented image to obtain a first sample spliced image; calling the encoding sub-model to encode the first sample spliced image to obtain a first sample feature map; calling the fusion sub-model to fuse the prior knowledge vector with the first sample feature map to obtain a second sample feature map; and calling the decoding sub-model to decode the second sample feature map to obtain the multiple predicted labeled segmented images.
  • In some embodiments, the at least one computer program is loaded and executed by the processor to implement the following steps: splicing the sample original image and the fused labeled segmented image to obtain a second sample spliced image; calling the encoding sub-model to encode the second sample spliced image to obtain a third sample feature map; determining a fourth loss value based on the difference between the third sample feature map and the first sample feature map; and training the first image segmentation model, the image reconstruction model and the second image segmentation model based on the first loss value and the fourth loss value.
  • Embodiments of the present application also provide a computer program product or computer program, where the computer program product or computer program includes computer program code, and the computer program code is stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device implements the operations performed in the image segmentation method of the above-mentioned embodiment.


Abstract

本申请实施例公开了一种图像分割方法、装置、计算机设备及存储介质,属于计算机技术领域。该方法包括:计算机设备基于先验知识向量对原始图像进行编码,得到目标特征图(201),对目标特征图进行解码,得到原始图像的第一分割图像(202),基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像(203),基于多个标注分割图像,对目标特征图进行处理,得到原始图像的第二分割图像(204)。本申请实施例提供的方法,通过引入用于表示多个标注者的标注准确度的先验知识权重,重构出与多个标注者相匹配的标注分割图像,通过多个标注分割图像及目标特征图来获取原始图像的第二分割图像,提高了图像分割的准确性。

Description

图像分割方法、装置、计算机设备及存储介质
本申请要求于2021年03月03日提交、申请号为202110234267.7、发明名称为“图像分割方法、装置、计算机设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机技术领域,特别涉及一种图像分割方法、装置、计算机设备及存储介质。
背景技术
随着计算机技术的发展,图像分割技术的应用越来越广泛,在很多领域中都需要进行图像分割。例如,在医疗领域能够采用图像分割技术从图像中提取某个身体部位的图像。
发明内容
本申请实施例提供了一种图像分割方法、装置、计算机设备及存储介质,能够提高图像分割的准确性。所述技术方案包括如下内容。
一方面,提供了一种图像分割方法,由计算机设备执行,所述方法包括:
基于先验知识向量对原始图像进行编码,得到目标特征图,所述原始图像包括目标物体,所述先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,所述准确度为所述标注者在任一图像中标注任一物体所在区域的准确度;
对所述目标特征图进行解码,得到所述原始图像的第一分割图像,所述第一分割图像指示所述目标物体在所述原始图像中所处的目标区域;
基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,所述每个标注分割图像指示对应的标注者所标注的所述目标区域;
基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像。
另一方面,提供了一种图像分割装置,所述装置包括:
编码模块,用于基于先验知识向量对原始图像进行编码,得到目标特征图,所述原始图像包括目标物体,所述先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,所述准确度为所述标注者在任一图像中标注任一物体所在区域的准确度;
解码模块,用于对所述目标特征图进行解码,得到所述原始图像的第一分割图像,所述第一分割图像指示所述目标物体在所述原始图像中所处的目标区域;
重构模块,用于基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,所述每个标注分割图像指示对应的标注者所标注的所述目标区域;
处理模块,用于基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像。
在一种可能实现方式中,所述处理模块,包括:
第一确定单元，用于基于所述多个标注分割图像之间的差异，确定不确定性图像，所述不确定性图像指示多个所述目标区域之间的差异，每个所述目标区域为所述标注分割图像指示的区域；
第一融合单元,用于将所述目标特征图与所述不确定性图像进行融合,得到所述第二分割图像。
在另一种可能实现方式中,所述每个标注分割图像包括所述原始图像中的多个像素点对应的第一权重,所述第一权重用于表示对应的像素点在所述目标区域内的可能性;
所述第一确定单元,用于确定所述每个标注分割图像与平均值图像之间的差值图像,所述平均值图像为所述多个标注分割图像的平均值图像;确定多个差值图像中位于相同位置的像素点的像素值的平方和;将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为所述每个位置的第二权重,所述目标个数为所述多个标注分割图像的个数;基于多个位置的第二权重,构建所述不确定性图像。
在另一种可能实现方式中,所述第一融合单元,用于确定所述多个标注分割图像的平均值图像;确定所述目标特征图与所述不确定性图像的乘积,将本次确定的乘积与所述目标特征图之和,确定为第一融合特征图;确定所述目标特征图与所述平均值图像的乘积,将本次确定的乘积与所述目标特征图之和,确定为第二融合特征图;将所述第一融合特征图及所述第二融合特征图进行拼接,得到拼接融合特征图;对所述拼接融合特征图进行卷积,得到所述第二分割图像。
在另一种可能实现方式中,所述编码模块,包括:
第一编码单元,用于对所述原始图像进行编码,得到所述原始图像的第一特征图;
第二融合单元,用于将所述先验知识向量与所述第一特征图进行融合,得到第二特征图;
第一解码单元,用于对所述第二特征图进行解码,得到所述目标特征图。
在另一种可能实现方式中,所述重构模块,包括:
拼接单元,用于将所述原始图像及所述第一分割图像进行拼接,得到拼接图像;
第二编码单元,用于对所述拼接图像进行编码,得到第三特征图;
第三融合单元,用于将所述先验知识向量与所述第三特征图进行融合,得到第四特征图;
第二解码单元,用于对所述第四特征图进行解码,得到所述多个标注分割图像。
在另一种可能实现方式中,
所述基于先验知识向量对原始图像进行编码,得到目标特征图的步骤由第一图像分割模型执行;
所述对所述目标特征图进行解码,得到所述原始图像的第一分割图像的步骤由所述第一图像分割模型执行;
所述基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像的步骤由图像重构模型执行;
所述基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像的步骤由第二图像分割模型执行。
在另一种可能实现方式中,所述装置还包括:
获取模块,用于获取样本原始图像、多个样本标注分割图像及所述先验知识向量,所述样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,所述每个样本标注分割图像指示所述样本物体在所述样本原始图像中所处的样本区域,且所述每个样本标注分割图像由对应的标注者标注;
所述编码模块,还用于调用所述第一图像分割模型,基于所述先验知识向量对所述样本原始图像进行编码,得到目标样本特征图;
所述解码模块,还用于调用所述第一图像分割模型,对所述目标样本特征图进行解码,得到所述样本原始图像的第一样本分割图像,所述第一样本分割图像指示所述样本物体在所述样本原始图像中所处的样本区域;
所述重构模块，还用于调用所述图像重构模型，基于所述先验知识向量，对所述第一样本分割图像进行图像重构，得到多个预测标注分割图像，每个预测标注分割图像与一个先验知识权重对应，所述每个预测标注分割图像指示预测到的所述样本区域；
所述处理模块,还用于调用所述第二图像分割模型,基于所述多个预测标注分割图像,对所述目标样本特征图进行处理,得到所述样本原始图像的预测分割图像;
加权融合模块,用于基于所述先验知识向量,对所述多个样本标注分割图像进行加权融合,得到融合标注分割图像;
训练模块,用于基于所述预测分割图像与所述融合标注分割图像之间的差异,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
在另一种可能实现方式中,所述训练模块,包括:
第二确定单元,用于基于所述预测分割图像与所述融合标注分割图像之间的差异,确定第一损失值;
训练单元,用于基于所述第一损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
在另一种可能实现方式中,所述训练单元,用于基于所述第一样本分割图像与所述融合标注分割图像之间的差异,确定第二损失值;基于所述第一损失值及所述第二损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
在另一种可能实现方式中,所述训练单元,用于基于所述多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值;基于所述第一损失值及所述第三损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
在另一种可能实现方式中,所述图像重构模型包括编码子模型、融合子模型及解码子模型;
所述重构模块,用于将所述样本原始图像及所述第一样本分割图像进行拼接,得到第一样本拼接图像;调用所述编码子模型,对所述第一样本拼接图像进行编码,得到第一样本特征图;调用所述融合子模型,将所述先验知识向量与所述第一样本特征图进行融合,得到第二样本特征图;调用所述解码子模型,对所述第二样本特征图进行解码,得到所述多个预测标注分割图像。
在另一种可能实现方式中,所述装置还包括:
拼接模块,用于将所述样本原始图像及所述融合标注分割图像进行拼接,得到第二样本拼接图像;
所述重构模块,还用于调用所述编码子模型,对所述第二样本拼接图像进行编码,得到第三样本特征图;
所述训练单元,用于基于所述第三样本特征图与所述第一样本特征图之间的差异,确定第四损失值;基于所述第一损失值及所述第四损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
另一方面,提供了一种计算机设备,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条计算机程序,所述至少一条计算机程序由所述处理器加载并执行以实现如上述方面所述的图像分割方法所执行的操作。
另一方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条计算机程序,所述至少一条计算机程序由处理器加载并执行以实现如上述方面所述的图像分割方法所执行的操作。
再一方面,提供了一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机程序代码,所述计算机程序代码存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机程序代码,处理器执行所述计算机程序代码,使得所述计算机设备实现如上述方面所述的图像分割方法中所执行的操作。
本申请实施例提供的方法、装置、计算机设备及存储介质,在对原始图像进行分割的过程中,通过引入用于表示多个标注者的标注准确度的先验知识权重,重构出与多个标注者相匹配的标注分割图像,以指示目标物体在原始图像中所处的目标区域,即重构出了多个标注者对原始图像的多种标注结果,之后,通过多个标注分割图像及原始图像的目标特征图来获取原始图像的第二分割图像,使得第二分割图像中融入了多个标注者对应的标注结果,保证第二分割图像的准确性,从而提高了图像分割的准确性。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请实施例的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请实施例提供的一种实施环境的结构示意图;
图2是本申请实施例提供的一种图像分割方法的流程图;
图3是本申请实施例提供的一种图像分割方法的流程图;
图4是本申请实施例提供的一种获取第二特征图方法的流程图;
图5是本申请实施例提供的一种多个标注者的标注图像的示意图;
图6是本申请实施例提供的一种图像分割方法的流程图;
图7是本申请实施例提供的一种多种方式的分割图像的对比图;
图8是本申请实施例提供的一种多种方式的分割图像的对比图;
图9是本申请实施例提供的一种模型训练方法的流程图;
图10是本申请实施例提供的一种获取预测分割图像的流程图;
图11是本申请实施例提供的一种模型训练过程的流程图;
图12是本申请实施例提供的一种图像分割装置的结构示意图;
图13是本申请实施例提供的一种图像分割装置的结构示意图;
图14是本申请实施例提供的一种终端的结构示意图;
图15是本申请实施例提供的一种服务器的结构示意图。
具体实施方式
为使本申请实施例的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
本申请所使用的术语“第一”、“第二”、“第三”、“第四”、“第五”、“第六”等可在本文中用于描述各种概念,但除非特别说明,这些概念不受这些术语限制。这些术语仅用于将一个概念与另一个概念区分。举例来说,在不脱离本申请的范围的情况下,能够将第一特征图称为第二特征图,且类似地,能够将第二特征图称为第一特征图。
本申请所使用的术语“至少一个”、“多个”、“每个”、“任一”,至少一个包括一个、两个或两个以上,多个包括两个或两个以上,而每个是指对应的多个中的每一个,任一是指多个中的任意一个。举例来说,多个先验知识权重包括3个先验知识权重,而每个是指这3个先验知识权重中的每一个先验知识权重,任一是指这3个先验知识权重中的任意一个先验知识权重,能够是第一个,或者,是第二个,或者,是第三个。
通常在对原始图像进行分割处理时,会先对原始图像进行编码,得到原始图像的特征图,再对该特征图进行解码,得到分割图像,该分割图像能够指示原始图像中的目标物体所在的区域。但这种图像分割方式简单,图像分割的准确性差。
本申请实施例提供的图像分割方法，由计算机设备执行。可选地，该计算机设备为终端或服务器。可选地，该服务器是独立的物理服务器，或者，是多个物理服务器构成的服务器集群或者分布式系统，或者，是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、CDN(Content Delivery Network,内容分发网络)、以及大数据和人工智能平台等基础云计算服务的云服务器。可选地，该终端是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表等，但并不局限于此。
在一种可能实现方式中,多个服务器可组成为一区块链,服务器为区块链上的节点,且本申请实施例提供的图像分割方法,能够应用于该区块链中的任一服务器。该服务器采用本申请实施例提供的图像分割方法,能够对任一图像进行分割,将得到的分割图像存储于区块链中,从而与区块链中的其他服务器进行共享。
图1是本申请实施例提供的一种实施环境的示意图。参见图1,该实施环境包括终端101和服务器102。终端101和服务器102之间通过无线或者有线网络连接。终端101上安装由服务器102提供服务的目标应用,终端101能够通过该目标应用实现例如数据传输、消息交互等功能。可选地,目标应用为终端101操作系统中的目标应用,或者为第三方提供的目标应用。例如,目标应用为医疗诊断应用,该医疗诊断应用具有图像分割的功能,当然,该医疗诊断应用还能够具有其他功能,例如,点评功能、导航功能等。
终端101用于基于用户标识登录目标应用,通过目标应用将原始图像发送至服务器102,服务器102用于接收终端101发送的原始图像,对该原始图像进行图像分割,获取该原始图像的分割图像,并将获取到的分割图像返回至终端101,终端101能够显示接收到的分割图像。
本申请实施例提供的方法,可用于多种场景。
例如,医疗场景下:
终端拍摄用户眼部,得到用户的眼部图像,将该眼部图像发送至具有图像分割功能的服务器,该服务器在接收到该眼部图像后,采用本申请实施例提供的图像分割方法,获取该眼部图像的分割图像,以确定该眼部图像中视杯和视盘所处的区域,之后医生能够根据该分割图像中视杯和视盘所处的区域,确定该用户的眼部状态。
图2是本申请实施例提供的一种图像分割方法的流程图,由计算机设备执行,如图2所示,该方法包括以下步骤。
201、计算机设备基于先验知识向量对原始图像进行编码,得到目标特征图。
其中,原始图像包括目标物体,该原始图像为任意的图像,例如,在医疗场景下,该原始图像为医学图像,该目标物体为某一身体部位,如,该原始图像为眼部图像,该目标物体为眼部中的视杯或视盘;或者,该原始图像为人体肺部图像,该目标物体为人体肺部的病变物体。先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,该准确度为标注者在任一图像中标注任一物体所在区域的准确度,目标特征图用于表示该原始图像包含的特征信息。
202、计算机设备对目标特征图进行解码,得到原始图像的第一分割图像。
其中,第一分割图像指示目标物体在原始图像中所处的目标区域。由于目标特征图包含了原始图像的特征信息,且该目标特征图中融入了先验知识向量,采用解码的方式来获取的第一分割图像,相当于是多个标注者的标注结果融合后的分割图像,该多个标注者为多个先验知识权重所对应的标注者。
203、计算机设备基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像。
其中,每个标注分割图像与一个先验知识权重对应,每个标注分割图像指示对应的标注者所标注的目标区域。由于不同的标注者的标注水平不同,则多个标注分割图像所指示的目标区域可能存在差异。第一分割图像相当于是多个标注者的标注结果融合后的分割图像,通过该先验知识向量中的多个先验知识权重及第一分割图像,重构出与多个标注者相匹配的标注分割图像,以指示目标物体在原始图像中所处的目标区域,以便后续能够基于重构出的多个标注分割图像来更新原始图像的分割图像,提高分割图像的准确性。
204、计算机设备基于多个标注分割图像对目标特征图进行处理,得到原始图像的第二分割图像。
由于每个标注分割图像对应于一个先验知识权重,每个标注分割图像表示对应的标注者对原始图像进行标注的标注结果,则基于该多个标注分割图像对目标特征图进行处理,提高了第二分割图像的准确性。
本申请实施例提供的方法,在对原始图像进行分割的过程中,通过引入用于表示多个标注者的标注准确度的先验知识权重,重构出与多个标注者相匹配的标注分割图像,以指示目标物体在原始图像中所处的目标区域,即重构出了多个标注者对原始图像的多种标注结果,之后,通过多个标注分割图像及原始图像的目标特征图来获取原始图像的第二分割图像,使得第二分割图像中融入了多个标注者对应的标注结果,保证第二分割图像的准确性,从而提高了图像分割的准确性。
图3是本申请实施例提供的一种图像分割方法的流程图,由计算机设备执行,如图3所示,该方法包括以下步骤。
301、计算机设备对原始图像进行编码,得到原始图像的第一特征图。
其中,该第一特征图用于表示该原始图像包含的特征信息,该原始图像包括目标物体,该原始图像为任意的图像,例如,在医疗场景下,该原始图像为医学图像,该目标物体为某一身体部位,如,该原始图像为眼部图像,该目标物体为眼部中的视杯或视盘;或者,该原始图像为人体肺部图像,该目标物体为人体肺部的病变物体。
在一种可能实现方式中,该步骤301包括:调用第一图像分割模型中的第一编码子模型,对该原始图像进行编码,得到该原始图像的第一特征图。
其中,第一图像分割模型用于获取原始图像的分割图像的模型,例如,该第一图像分割模型为U-Net(一种二维图像分割的卷积神经网络)。该第一编码子模型用于获取原始图像的特征图,例如,该第一编码子模型为U-Net中的编码器。通过第一图像分割模型中的第一编码子模型对原始图像进行编码,使第一特征图包含了原始图像的特征信息,以保证获取到的第一特征图的准确性。
可选地,该第一编码子模型包括多个第一卷积模块,则获取第一特征图的过程包括:按照多个第一卷积模块的排列顺序,调用第一个第一卷积模块,对该原始图像进行编码,得到第一个第一参考特征图,调用当前的第一卷积模块对上一个第一卷积模块输出的第一参考特征图进行编码,得到当前的第一卷积模块对应的第一参考特征图,直至得到最后一个第一卷积模块输出的第一参考特征图,将最后一个第一卷积模块输出的第一参考特征图确定为该第一特征图。
其中,在该第一编码子模型包括的多个第一卷积模块中,按照多个第一卷积模块的排列顺序,多个第一卷积模块输出的第一参考特征图的尺寸逐渐减小。通过编码子模型中的多个第一卷积模块,按照多个尺度对原始图像进行编码,即该编码子模型采用一种下采样的方式,逐渐增强特征图包含的特征,以提高第一特征图的准确性。
可选地,该第一编码子模型包括n个第一卷积模块,第1个第一卷积模块的输入为原始图像,第i个卷积模块的输入为第i-1个卷积模块输出的第一参考特征图,其中,i为大于1且不大于n的整数,n为大于1的整数;则获取第一特征图的过程包括:调用第1个第一卷积模块,对该原始图像进行编码,得到第1个第一参考特征图,调用第i个第一卷积模块,对第i-1个第一参考特征图进行编码,得到第i个第一参考特征图,直至得到第n个参考特征图,将第n个第一参考特征图确定为该第一特征图。
其中,按照由第1个第一卷积模块至第n个第一卷积模块的排列顺序,n个第一卷积模块输出的第一参考特征图的尺寸逐渐减小。
302、计算机设备将先验知识向量与第一特征图进行融合,得到第二特征图。
其中,先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对 应的准确度,该准确度为标注者在任一图像中标注任一物体所在区域的准确度,该先验知识权重能够体现出对应的标注者的标注水平。由于多个标注者的标注水平有高低之分,则每个标注者对应的先验知识权重也有大小之分,先验知识权重越大,表示对应的标注者的标注水平越高,即该标注者在图像中标注物体所在区域的准确度越高;先验知识权重越小,表示对应的标注者的标注水平越低,即该标注者在图像中标注物体所在区域的准确度越低。
可选地，该先验知识向量能够任意设置，例如，该先验知识向量为[0.1,0.1,0.4,0.4]，即该先验知识向量包括了4个标注者对应的先验知识权重，两个标注者对应的先验知识权重为0.1，两个标注者对应的先验知识权重为0.4。
通过将先验知识向量与第一特征图进行融合,以使得到的第二特征图不仅包括了原始图像中的特征,还融入了多个标注者对应的先验知识权重,使得第二特征图包含的特征与该先验知识向量之间动态关联,且第二特征图包含的特征受到该先验知识向量的影响,增强了第二特征图包含的特征的动态表征能力,提高了第二特征图包含的特征的准确性。
在一种可能实现方式中,该步骤302包括:调用第一图像分割模型中的知识推断子模型,将先验知识向量与第一特征图进行融合,得到第二特征图。
其中，该知识推断子模型用于将先验知识向量与第一特征图进行融合。例如，该知识推断子模型为ConvLSTM(Convolutional Long Short-Term Memory,卷积长短期记忆模型)。通过该知识推断子模型，将先验知识向量与原始图像的第一特征图进行融合，增强了第二特征图包含的特征的动态表征能力。如图4所示，对先验知识向量进行尺寸扩张，以使尺寸扩张后的先验知识向量与第一特征图的尺寸相同，之后，通过卷积长短期记忆模型，将扩张后的先验知识向量与第一特征图进行融合，以增强第一特征图包含的特征，得到融合后的第二特征图。
可选地，第一特征图、先验知识向量及第二特征图满足以下关系：
h_t = ConvLSTM(f_5, h_{t-1})，t = 1, 2, …, T
其中，h_t 用于表示增强后的特征图，f_5 用于表示第一特征图，ConvLSTM(·) 用于表示卷积长短期记忆模型，h_{t-1} 表示增强前的特征图，t 表示特征增强迭代轮次，t = 1, 2, …, T 用于表示迭代处理，T 为不小于2的正整数；当 t=1 时，h_0 为该先验知识向量；当 t=T 时，h_T 为该第二特征图。
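为便于理解上述迭代融合过程，下面给出一段示意性代码草图（仅为辅助理解的假设性示例，基于 PyTorch；ConvLSTMCell、fuse_prior 等名称及通道数均为本文虚构，并非本申请的实际实现）：先将先验知识向量扩张为与第一特征图同尺寸的初始状态 h_0，再以第一特征图为输入迭代 T 次，得到第二特征图 h_T。
```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """最简化的卷积 LSTM 单元：一次卷积同时产生输入门、遗忘门、输出门与候选状态。"""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, o, g = torch.chunk(gates, 4, dim=1)
        c = f.sigmoid() * c + i.sigmoid() * g.tanh()
        h = o.sigmoid() * c.tanh()
        return h, c

def fuse_prior(f5, prior, cell, T=2):
    """f5: 第一特征图 (B, C, H, W)；prior: 先验知识向量 (N,)。
    先把 prior 扩张成 (B, C, H, W) 作为 h_0，再迭代 T 次得到第二特征图 h_T。"""
    B, C, H, W = f5.shape
    proj = prior.new_zeros(C)
    proj[: prior.numel()] = prior            # 尺寸扩张的一种假设性做法：权重放入前 N 个通道
    h = proj.view(1, C, 1, 1).expand(B, C, H, W).contiguous()
    c = torch.zeros_like(h)
    for _ in range(T):                       # 对应公式中的 t = 1..T
        h, c = cell(f5, (h, c))
    return h                                 # h_T 即融合后的第二特征图

# 用法示例
cell = ConvLSTMCell(in_ch=64, hid_ch=64)
f5 = torch.randn(2, 64, 16, 16)
prior = torch.tensor([0.1, 0.1, 0.4, 0.4])
h_T = fuse_prior(f5, prior, cell)            # 形状 (2, 64, 16, 16)
```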
303、计算机设备对第二特征图进行解码,得到目标特征图。
其中,目标特征图用于表示该原始图像包含的特征信息。在得到第二特征图后,采用解码的方式,以细化特征图包含的特征,提高了目标特征图包含的特征的准确性。
在一种可能实现方式中,该步骤303包括:调用第一图像分割模型中的第一解码子模型,对该第二特征图进行解码,得到该目标特征图。
其中,该第一解码子模型用于增强特征图包含的特征,例如,该第一解码子模型为U-Net中的解码器。
可选地,第一图像分割模型还包括第一编码子模型,该第一编码子模型包括多个第一卷积模块,该第一解码子模型包括多个第二卷积模块,则获取目标特征图的过程包括:按照多个第二卷积模块的排列顺序,调用第一个第二卷积模块,对该第二特征图进行解码,得到第一个第二参考特征图,调用当前的第二卷积模块,对上一个第二卷积模块输出的第二参考特征图及与该第二参考特征图尺寸相等的第一参考特征图进行解码,得到当前的第二卷积模块对应的第二参考特征图,直至得到最后一个第二卷积模块输出的第二参考特征图,将最后一个第二卷积模块输出的第二参考特征图确定为目标特征图。
其中，在该第一解码子模型包括的多个第二卷积模块中，按照多个第二卷积模块的排列顺序，多个第二卷积模块输出的第二参考特征图的尺寸逐渐增大。通过解码子模型中的多个第二卷积模块，采用一种上采样的方式，逐渐细化特征图包含的特征，以提高目标特征图包含的特征的准确性。
可选地,第一图像分割模型包括第一编码子模型和第一解码子模型,该第一编码子模型包括n个第一卷积模块,第1个第一卷积模块的输入为原始图像,第i个卷积模块的输入为第i-1个卷积模块输出的第一参考特征图,其中,i为大于1且不大于n的整数,n为大于1的整数;第一解码子模型包括n个第二卷积模块,第1个第二卷积模块的输入为第二特征图,第i个卷积模块的输入为第i-1个第二卷积模块输出的参考特征图和第n-i+1个第一卷积模块输出的参考特征图,第i-1个第二卷积模块输出的参考特征图与第n-i+1个第一卷积模块输出的参考特征图的尺寸相等。
则基于原始图像及先验知识向量，获取目标特征图的过程，包括：调用第1个第一卷积模块，对该原始图像进行编码，得到第1个第一参考特征图，调用第i个第一卷积模块，对第i-1个第一参考特征图进行编码，得到第i个第一参考特征图，直至得到第n个第一参考特征图，将第n个第一参考特征图确定为该第一特征图；将先验知识向量与第一特征图进行融合，得到第二特征图；调用第1个第二卷积模块，对第二特征图进行解码，得到第1个第二参考特征图，调用第i个第二卷积模块，对第i-1个第二参考特征图及第n-i+1个第一参考特征图进行解码，得到第i个第二参考特征图，直至得到第n个第二参考特征图，将第n个第二参考特征图确定为目标特征图。
其中,按照第1个第一卷积模块至第n个第一卷积模块的排列顺序,n个第一卷积模块输出的第一参考特征图的尺寸逐渐减小。按照第1个第二卷积模块至第n个第二卷积模块的排列顺序,n个第二卷积模块输出的第二参考特征图的尺寸逐渐增大。
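下面给出一段与上述多尺度编码-解码过程对应的示意性代码草图（假设性示例，基于 PyTorch；TinyUNet 等名称、层数与通道数均为演示用假设，且为简洁起见省略了先验知识融合步骤）：编码侧特征图逐级缩小，解码侧逐级放大并与尺寸相等的编码特征拼接。
```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """n=3 的最小化编码-解码草图，演示跳跃连接的拼接方式。"""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2, self.enc3 = block(3, 16), block(16, 32), block(32, 64)
        self.dec1, self.dec2, self.dec3 = block(64, 64), block(64 + 32, 32), block(32 + 16, 16)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        e1 = self.enc1(x)                  # 第1个第一参考特征图（尺寸最大）
        e2 = self.enc2(self.pool(e1))      # 第2个第一参考特征图
        e3 = self.enc3(self.pool(e2))      # 第3个第一参考特征图，即第一特征图
        # 实际方案会在此处将先验知识向量与 e3 融合得到第二特征图，此处省略
        d1 = self.dec1(e3)                                   # 第1个第二卷积模块
        d2 = self.dec2(torch.cat([self.up(d1), e2], dim=1))  # 拼接尺寸相等的编码特征
        d3 = self.dec3(torch.cat([self.up(d2), e1], dim=1))  # 输出目标特征图
        return d3

feat = TinyUNet()(torch.randn(1, 3, 64, 64))   # -> (1, 16, 64, 64)
```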
需要说明的是,本申请实施例是将原始图像编码后的第一特征图与先验知识向量融合后,对得到的第二特征图进行解码来获取目标特征图的,而在另一实施例中,无需执行上述步骤301-303,能够采取其他方式,根据先验知识向量对原始图像进行编码,得到目标特征图。
在一种可能实现方式中,调用第一图像分割模型,基于先验知识向量对原始图像进行编码,得到目标特征图。通过第一图像分割模型,来获取目标特征图,以提高目标特征图的准确性。
304、计算机设备对目标特征图进行解码,得到原始图像的第一分割图像。
其中，第一分割图像指示目标物体在原始图像中所处的目标区域。可选地，该第一分割图像包括原始图像中的多个像素点对应的权重，该权重用于表示对应的像素点在该目标区域内的可能性。在该第一分割图像包括的多个像素点中，每个像素点的像素值表示原始图像中位于相同位置的像素点对应的权重。对于原始图像中的任一位置，在该第一分割图像中位于相同位置的像素点的像素值，即为该原始图像中该位置的像素点的权重。
可选地,该第一分割图像以热力图的形式表示,在该第一分割图像中,像素点对应的权重越大,该像素点对应的颜色越深,像素点对应的权重越小,该像素点对应的颜色越浅。例如,在该第一分割图像中,权重为0时,该权重对应的像素点所对应的颜色为蓝色,权重为1时,该权重对应的像素点所对应的颜色为红色,权重介于0至1之间时,该权重对应的像素点所对应的颜色为蓝色变为红色之间的过渡色,例如,权重由0变为1,该权重对应的像素点所对应的颜色由蓝色逐渐变为红色。
由于目标特征图不仅包含了原始图像的特征信息,并且该目标特征图中还融入了先验知识向量,对目标特征图采用解码的方式来获取第一分割图像,该第一分割图像相当于是多个标注者的标注结果融合后的分割图像,该多个标注者为多个先验知识权重所对应的标注者。
在一种可能实现方式中,该步骤304包括:调用第一图像分割模型,对目标特征图进行解码,得到原始图像的第一分割图像。
可选地,调用该第一图像分割模型中的卷积子模型,对该目标特征图进行卷积,得到该第一分割图像。
305、计算机设备将原始图像及第一分割图像进行拼接,得到拼接图像。
由于第一分割图像指示目标物体在原始图像中所处的目标区域,通过将原始图像与第一分割图像进行拼接,使得拼接图像不仅包含了原始图像所包含的信息,还包含了用于指示目标物体在原始图像中所处的目标区域的信息,丰富了拼接图像包含的信息,以便后续能够重构出多个标注分割图像。
306、计算机设备对拼接图像进行编码,得到第三特征图。
其中,第三特征图用于表示原始图像包含的特征信息及用于指示目标物体在原始图像中所处的目标区域的信息。
在一种可能实现方式中,该步骤306包括:调用图像重构模型中的编码子模型,对该拼接图像进行编码,得到该第三特征图。
其中,图像重构模型用于重构多个先验知识权重所对应的标注分割图像,该编码子模型用于获取拼接图像的特征图。例如,该编码子模型为U-Net中的编码器。该图像重构模型中的编码子模型与上述步骤301中的第一图像分割模型中的第一编码子模型同理,在此不再赘述。
307、计算机设备将先验知识向量与第三特征图进行融合,得到第四特征图。
通过将先验知识向量与第三特征图融合,使得第四特征图不仅包括原始图像包含的特征信息,还融入了多个标注者对应的先验知识权重,以便后续能够根据第四特征图来重构出每个先验知识权重对应的标注分割图像。
在一种可能实现方式中,该步骤307包括:调用图像重构模型中的融合子模型,将先验知识向量与第三特征图进行融合,得到第四特征图。
其中,融合子模型与上述步骤302中的知识推断子模型类似,在此不再赘述。
308、计算机设备对第四特征图进行解码,得到多个标注分割图像。
其中,每个标注分割图像与一个先验知识权重对应,每个标注分割图像指示对应的标注者所标注的目标区域,由于不同的标注者的标注水平不同,则多个标注分割图像中所指示的目标区域可能存在差异。如图5所示,以原始图像为眼部图像为例,3个标注者对眼部图像中的视杯和视盘进行标注,由于3个标注者的标注水平不同,则标注得到的视杯标注图像和视盘标注图像所指示的目标区域存在差异。
可选地，每个标注分割图像包括原始图像中的多个像素点对应的第一权重，第一权重用于表示对应的像素点在目标区域内的可能性，则通过该标注分割图像包括的多个第一权重，能够确定出对应的标注者所标注的目标区域，该目标区域为目标物体在原始图像中所处的区域。对于任一标注分割图像包括的多个像素点，每个像素点的像素值即为该标注分割图像包含的第一权重。对于原始图像中的任一位置的像素点，在该标注分割图像中位于相同位置的像素点的像素值，即为该原始图像中该位置的像素点的第一权重。
在本申请实施例中,第四特征图不仅包含了第一分割图像的特征信息、原始图像的特征信息,还融入了先验知识向量,并且该第一分割图像相当于是多个标注者的标注结果融合后的分割图像,该多个标注者为多个先验知识权重所对应的标注者,则通过采用解码的方式,对第四特征图进行处理,能够重构出多个先验知识权重所对应的标注分割图像,还原出多个标注者对原始图像的标注结果,即每个标注者对应的标注分割图像,以便后续更新原始图像的分割图像。
在一种可能实现方式中,该步骤308包括:调用图像重构模型中的解码子模型,对第四特征图进行解码,得到多个标注分割图像。
其中,该解码子模型与上述步骤303中的第一解码子模型类似,该第一解码子模型包含于第一图像分割模型中,在此不再赘述。
需要说明的是,本申请实施例是通过引入原始图像,将原始图像及先验知识向量进行融合,重构出多个标注分割图像的,而在另一实施例中,无需执行上述步骤305-308,能够采取其他方式,基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像。
在一种可能实现方式中,调用图像重构模型,基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像。通过图像重构模型,基于先验知识向量和第一分割图像,重构出多个先验知识权重对应的标注分割图像,以保证标注分割图像的准确性。
309、计算机设备基于多个标注分割图像之间的差异,确定不确定性图像。
其中,不确定性图像指示目标区域之间的差异,每个目标区域为标注分割图像指示的区域。每个标注分割图像对应一个先验知识权重,即每个标注分割图像相当于对应的标注者对原始图像进行标注的标注结果,该标注者为先验知识权重所对应的标注者。由于多个标注者的标注水平不同,多个标注分割图像所指示的目标区域会存在差异,因此,通过多个标注分割图像之间的差异,能够确定出不确定性图像,该不确定性图像能够指示出多个标注者所标注的多个目标区域中存在争议的区域。
在一种可能实现方式中,该步骤309包括以下步骤3091-3094。
3091、确定多个标注分割图像与平均值图像之间的差值图像。
其中,平均值图像为该多个标注分割图像的平均值图像。每个标注分割图像包括原始图像中多个像素点对应的第一权重,该平均值图像包括原始图像中每个像素点对应的多个第一权重的平均值,多个第一权重是该像素点在多个标注分割图像中对应的第一权重,该平均值图像能够体现出多个标注分割图像所指示的目标区域之间的一致性,每个差值图像包括多个差值,每个差值表示一个第一权重与对应的平均值之间的差值,该第一权重为该差值图像对应的标注分割图像中的权重,该平均值为该平均值图像中的平均值。通过确定多个标注分割图像的平均值图像,之后,再确定每个标注分割图像与该平均值图像之间的差值图像,从而得到多个差值图像。
在一种可能实现方式中,该步骤3091包括:确定多个标注分割图像中位于相同位置的像素点对应的第一权重的平均值,基于得到的多个平均值构建该平均值图像,对于每个标注分割图像,确定该标注分割图像中的多个第一权重与对应的平均值之间的差值,将得到的多个差值构成该标注分割图像对应的差值图像。
3092、确定多个差值图像中位于相同位置的像素点的像素值的平方和。
其中，在任一差值图像中，任一像素点的像素值为：该差值图像对应的标注分割图像中与该像素点位于相同位置的像素点所对应的第一权重，与平均值图像中与该像素点位于相同位置的像素点所对应的平均值之间的差值。
对于任一位置,确定该多个差值图像中位于该位置的像素点的像素值的平方,将多个差值图像在该位置对应的像素值的平方之和,确定为该位置对应的像素值的平方和,重复上述方式,即可得到多个位置对应的像素值的平方和。
3093、将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为每个位置的第二权重。
其中，目标个数为多个标注分割图像的个数，第二权重用于表示对应的位置在多个标注分割图像中的标注结果的差异，标注结果表示位于对应位置的像素点是否在目标区域内。
3094、基于多个位置的第二权重构建不确定性图像。
其中,该不确定性图像包括原始图像中的多个像素点对应的第二权重。
在一种可能实现方式中，该多个标注分割图像及该不确定性图像满足以下关系：
U_map = sqrt( (1/N_0) · Σ_{i_0=1}^{N_0} (S_{i_0} − S̄)² )，其中 S̄ = (1/N_0) · Σ_{i_0=1}^{N_0} S_{i_0}
其中，U_map 表示不确定性图像；N_0 表示多个标注分割图像的个数，N_0 为不小于2的正整数；i_0 表示多个标注分割图像中标注分割图像的序号，i_0 大于等于1，且小于等于 N_0；S_{i_0} 表示第 i_0 个标注分割图像；S̄ 表示多个标注分割图像的平均值图像。
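按照上述步骤3091-3094，下面给出一段计算不确定性图像的示意性代码草图（假设性示例，基于 NumPy；uncertainty_map 等名称为本文虚构，并非本申请的实际实现）。
```python
import numpy as np

def uncertainty_map(seg_maps):
    """seg_maps: (N0, H, W)，每张为一个标注分割图像（像素值为第一权重）。"""
    mean = seg_maps.mean(axis=0)                    # 平均值图像
    diff = seg_maps - mean                          # N0 张差值图像
    return np.sqrt((diff ** 2).sum(axis=0) / len(seg_maps))  # 每个位置的第二权重

maps = np.stack([np.random.rand(8, 8) for _ in range(4)])   # 4 张示意的标注分割图像
u = uncertainty_map(maps)                                   # (8, 8)，即 U_map
```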
310、计算机设备将目标特征图与不确定性图像进行融合,得到第二分割图像。
由于不确定性图像能够指示多个目标区域之间的差异,每个目标区域为标注分割图像指示的区域,目标特征图中不仅包含了原始图像的特征信息,还融合了先验知识向量,因此,通过将目标特征图与不确定性图像进行融合,能够将多个标注分割图像中不确定的区域进行区分开,以提高第二分割图像所指示的目标区域的准确性。
在一种可能实现方式中,该步骤310包括以下步骤3101-3105。
3101、确定多个标注分割图像的平均值图像。
该步骤与上述步骤3091中确定平均值图像的方式同理,在此不再赘述。
3102、确定目标特征图与不确定性图像的乘积,将本次确定的乘积与目标特征图之和,确定为第一融合特征图。
其中,该第一融合特征图用于表示多个标注分割图像之间的不一致信息,该多个标注分割图像与多个先验知识权重对应。通过确定目标特征图与不确定性图像的乘积,之后,将确定的乘积与该目标特征图之和,确定为该第一融合特征图。通过获取第一融合特征图,使得目标特征图在不确定区域中的特征得到增强,该不确定区域为该不确定性图像所指示的区域,以提高第一融合特征图的准确性。
在一种可能实现方式中,该步骤3102包括:确定目标特征图与不确定性图像的像素级乘积,将得到的乘积与该目标特征图的像素级之和,确定为该第一融合特征图。其中,像素级乘积是指目标特征图与不确定性图像中位于相同位置的像素点的像素值的乘积,像素级之和是指得到的乘积与目标特征图中位于相同位置的像素点的像素值的和值。
在一种可能实现方式中,在步骤3102之前,该方法还包括:对不确定性图像进行平滑处理,对平滑处理后的不确定性图像进行最大值处理。
其中，平滑处理能够采用高斯平滑处理的方式。平滑处理后的不确定性图像包含的多个权重值可能会发生改变，因此，通过最大值处理方式，将平滑处理后的不确定性图像与平滑处理前的不确定性图像进行比对，对于平滑处理后的不确定性图像与平滑处理前的不确定性图像中的任一相同位置，将该位置对应的两个权重中的最大值，确定为该位置经最大值处理后的权重，重复上述方式，得到最大值处理后的不确定性图像。通过采用平滑处理的方式，以使不确定性图像包含的多个权重趋于平滑，实现过渡的效果，以扩大不确定区域的覆盖范围，从而有效地感知和捕捉多个标注分割图像之间的不一致区域，并且通过最大值处理，以保证不确定性图像包含的权重的准确性，提高了不确定性图像的准确性。
可选地,对不确定性图像进行平滑处理,对平滑处理后的不确定性图像进行最大值处理,满足以下关系:
Soft(U_map) = Ω_max(F_Gauss(U_map, k), U_map)
其中，Soft(U_map) 表示最大值处理后的不确定性图像；Ω_max 表示最大函数，用于保留平滑处理后的不确定性图像和原始的不确定性图像中相同位置的较高像素值；F_Gauss 用于表示高斯核为 k 的卷积运算；U_map 表示原始的不确定性图像。
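下面给出与上式对应的示意性代码草图（假设性示例，基于 SciPy/NumPy；soft_attention 为本文虚构的名称，以 sigma 参数近似对应此处的高斯核 k）。
```python
import numpy as np
from scipy.ndimage import gaussian_filter

def soft_attention(u_map, sigma=1.0):
    smoothed = gaussian_filter(u_map, sigma=sigma)  # F_Gauss：高斯平滑，扩大不确定区域覆盖
    return np.maximum(smoothed, u_map)              # Ω_max：保留相同位置的较高像素值

soft_u = soft_attention(np.random.rand(8, 8))       # 最大值处理后的不确定性图像
```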
3103、确定目标特征图与平均值图像的乘积,将本次确定的乘积与目标特征图之和,确定为第二融合特征图。
其中，该第二融合特征图用于表示多个标注分割图像之间的一致信息，该多个标注分割图像与多个先验知识权重对应。通过确定目标特征图与平均值图像的乘积，之后，将确定的乘积与该目标特征图之和，确定为该第二融合特征图。通过获取第二融合特征图，使得目标特征图中多个标注分割图像均标注为目标区域的区域中的特征得到增强，提高了第二融合特征图的准确性。
在一种可能实现方式中，该步骤3103包括：确定目标特征图与平均值图像的像素级乘积，将得到的乘积与该目标特征图的像素级之和，确定为该第二融合特征图。其中，像素级乘积是指目标特征图与平均值图像中位于相同位置的像素点的像素值的乘积，像素级之和是指得到的乘积与目标特征图中位于相同位置的像素点的像素值的和值。
在一种可能实现方式中,在步骤3103之前,该方法还包括:对平均值图像进行平滑处理,对平滑处理后的平均值图像进行最大值处理。
该步骤与上述步骤3102中对不确定性图像进行平滑处理,并对平滑处理后的不确定性图像进行最大值处理的过程同理,在此不再赘述。
需要说明的是，在本申请实施例中，在执行上述步骤3102和3103获取第一融合特征图和第二融合特征图之前，均能够对不确定性图像及平均值图像进行平滑处理和最大值处理，则上述步骤3102和3103获取到的第一融合特征图和第二融合特征图，能够满足以下关系式：
F_fuse^j = F_1 ⊗ Soft(A_j) + F_1
其中，j 用于表示代号，j 的取值为1和2；F_fuse^j 表示融合特征图，当 j 为1时，F_fuse^1 表示第一融合特征图，当 j 为2时，F_fuse^2 表示第二融合特征图；F_1 表示目标特征图；Soft(A_j) 表示不确定性图像或者平均值图像，当 j 为1时，Soft(A_1) 表示不确定性图像，当 j 为2时，Soft(A_2) 表示平均值图像；⊗ 用于表示像素级乘积。
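下面给出与上述关系式对应的示意性代码草图（假设性示例，基于 NumPy；fuse 等名称为本文虚构，输入数据仅作演示）。
```python
import numpy as np

def fuse(f1, soft_a):
    """f1: 目标特征图 (C, H, W)；soft_a: 平滑及最大值处理后的不确定性图像或平均值图像 (H, W)。"""
    return f1 * soft_a[None] + f1        # 像素级乘积（逐通道广播）后与原特征图相加

f1 = np.random.rand(16, 8, 8)            # 目标特征图 F_1
u_map = np.random.rand(8, 8)             # Soft(A_1)：不确定性图像
mean_map = np.random.rand(8, 8)          # Soft(A_2)：平均值图像
f_fuse_1 = fuse(f1, u_map)               # 第一融合特征图
f_fuse_2 = fuse(f1, mean_map)            # 第二融合特征图
```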
3104、将第一融合特征图及第二融合特征图进行拼接,得到拼接融合特征图。
例如,该第一融合特征图的尺寸为B*C1*H*W,第二融合特征图的尺寸为B*C2*H*W,拼接后为拼接融合特征图的尺寸为B*(C1+C2)*H*W。
3105、对拼接融合特征图进行卷积,得到第二分割图像。
由于拼接融合特征图包含了目标特征图在高度确定区域中的增强后的特征，及在不确定性区域中的增强后的特征，则在对拼接融合特征图进行卷积时，能够将拼接融合特征图中确定的目标区域与其他区域区分开，以提高第二分割图像所指示的目标区域的准确性，也即是提高了第二分割图像的准确性。
另外，以原始图像为眼部图像、目标物体为眼部中的视杯和视盘为例，则获取的第二分割图像指示视杯在眼部图像中所处的区域，及视盘在眼部图像中所处的区域。则第二分割图像满足以下关系：
O = Conv_1×1(Concat(F_cup^1, F_disc^1, F_cup^2, F_disc^2))
其中，O 表示第二分割图像，Conv_1×1(·) 表示卷积，Concat(·) 表示拼接处理；F_cup^1 表示视杯对应的第一融合特征图，F_disc^1 表示视盘对应的第一融合特征图，F_cup^2 表示视杯对应的第二融合特征图，F_disc^2 表示视盘对应的第二融合特征图。
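下面给出上述拼接与 1×1 卷积步骤的示意性代码草图（假设性示例，基于 PyTorch；通道数与尺寸均为演示用假设）。
```python
import torch
import torch.nn as nn

c1, c2 = 16, 16
fuse_a = torch.randn(1, c1, 8, 8)              # 第一融合特征图，尺寸 B*C1*H*W
fuse_b = torch.randn(1, c2, 8, 8)              # 第二融合特征图，尺寸 B*C2*H*W
cat = torch.cat([fuse_a, fuse_b], dim=1)       # 拼接融合特征图，尺寸 B*(C1+C2)*H*W
conv_1x1 = nn.Conv2d(c1 + c2, 1, kernel_size=1)   # Conv_1×1
second_seg = conv_1x1(cat).sigmoid()           # 第二分割图像（每个像素为目标区域概率）
```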
需要说明的是,本申请实施例是先获取不确定性图像,之后基于目标特征图及不确定性图像来获取第二分割图像,而在另一实施例中,无需执行上述步骤309-310,能够采取其他方式,基于多个标注分割图像对目标特征图进行处理,得到原始图像的第二分割图像。
在一种可能实现方式中,调用第二图像分割模型,基于多个标注分割图像对目标特征图进行处理,得到原始图像的第二分割图像。其中,第二图像分割模型用于获取第二分割图像。通过第二图像分割模型,利用多个标注分割图像对应的不一致信息和一致信息,以保证第二分割图像的准确性。如图6所示,以原始图像为眼部图像为例,通过第一图像分割模型、图像重构模型及第二图像分割模型对原始图像进行图像分割,得到第二分割图像。
本申请实施例提供的方法，在对原始图像进行分割的过程中，通过引入用于表示多个标注者的标注准确度的先验知识权重，重构出与多个标注者相匹配的标注分割图像，以指示目标物体在原始图像中所处的目标区域，即重构出了多个标注者对原始图像的多种标注结果，之后，通过多个标注分割图像及原始图像的目标特征图来获取原始图像的第二分割图像，使得第二分割图像中融入了多个标注者对应的标注结果，保证第二分割图像的准确性，从而提高了图像分割的准确性。
并且,通过第一分割图像模型中的知识推断子模型及图像重构模型中的融合子模型,在获取原始图像的分割图像的过程中,能够引入先验知识向量,使先验知识向量嵌入到原始图像的特征中,提高模型的动态表示能力。
并且，本申请提供了一种软注意力机制，对不确定性图像进行平滑处理和最大值处理，以扩大不确定区域的覆盖范围，从而有效地感知和捕捉多个标注分割图像之间的不一致区域，并且通过最大值处理，以保证不确定性图像包含的权重的准确性，提高了不确定性图像的准确性，从而提高后续图像分割的性能。
如图7所示,通过本申请提供的图像分割方法及现有技术中提供的其他图像分割模型,对原始图像进行图像分割,并对得到的分割图像进行对比。通过对比可知,本申请实施例提供的图像分割方法得到的分割图像更准确。
本申请实施例提供的方法，能够应用于医疗领域，能够对医疗领域中的图像进行图像分割。如图8所示，在医疗领域的不同数据集上，将本申请实施例提供的图像分割方法，与现有技术中的其他分割模型得到的分割图像进行对比。通过对比可知，本申请实施例提供的图像分割方法得到的分割图像更准确。
基于上述图3所示实施例可知,对原始图像进行图像分割的过程,能够采用第一图像分割模型、图像重构模型及第二图像分割模型来执行,在调用第一图像分割模型、图像重构模型及第二图像分割模型之前,需要对第一图像分割模型、图像重构模型及第二图像分割模型进行训练,训练过程详见下述实施例。
图9是本申请实施例提供的一种模型训练方法的流程图,应用于计算机设备中,如图9所示,该方法包括以下步骤。
901、计算机设备获取样本原始图像、多个样本标注分割图像及先验知识向量。
其中,样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,每个样本标注分割图像指示样本物体在样本原始图像中所处的样本区域,且每个样本标注分割图像由对应的标注者对该样本原始图像标注得到的,每个样本标注分割图像为对应的标注者的真实标注结果。例如,在医疗领域,该样本原始图像为眼部图像,该多个标注者为多个眼部医生。
902、计算机设备调用第一图像分割模型,基于先验知识向量对样本原始图像进行编码,得到目标样本特征图。
该步骤与上述步骤301-303同理,在此不再赘述。
903、计算机设备调用第一图像分割模型,对目标样本特征图进行解码,得到样本原始图像的第一样本分割图像。
其中,第一样本分割图像指示样本物体在样本原始图像中所处的样本区域。
该步骤与上述步骤304同理,在此不再赘述。
904、计算机设备将样本原始图像及第一样本分割图像进行拼接,得到第一样本拼接图像。
该步骤与上述步骤305同理,在此不再赘述。
905、计算机设备调用编码子模型,对第一样本拼接图像进行编码,得到第一样本特征图。
在本申请实施例中,图像重构模型包括编码子模型、融合子模型及解码子模型。该步骤与上述步骤306同理,在此不再赘述。
906、计算机设备调用融合子模型,将先验知识向量与第一样本特征图进行融合,得到第二样本特征图。
该步骤与上述步骤307同理,在此不再赘述。
907、计算机设备调用解码子模型,对第二样本特征图进行解码,得到多个预测标注分割图像。
其中,每个预测标注分割图像与一个先验知识权重对应,每个预测标注分割图像指示对应的标注者所标注的样本区域。
该步骤与上述步骤308同理,在此不再赘述。
需要说明的是,本申请实施例是先将样本原始图像及第一样本分割图像进行拼接,之后调用图像重构模型中的编码子模型、融合子模型及解码子模型,来获取多个预测标注分割图像的,而在另一实施例中,无需执行步骤904-907,能够采取其他方式,调用图像重构模型,根据先验知识向量,对第一样本分割图像进行图像重构,得到多个预测标注分割图像。
908、计算机设备调用第二图像分割模型,基于多个预测标注分割图像对目标样本特征图进行处理,得到样本原始图像的预测分割图像。
该步骤与上述步骤309-310同理，在此不再赘述。如图10所示，以样本原始图像为眼部图像为例，通过图像重构模型获取预测标注分割图像，该预测标注分割图像包括视杯预测标注分割图像和视盘预测标注分割图像，获取到的不确定性图像包括视杯不确定性图像和视盘不确定性图像，并将多个视杯预测标注分割图像的平均值图像，确定为视杯一致性图像，将多个视盘预测标注分割图像的平均值图像，确定为视盘一致性图像，之后，通过第二图像分割模型，将目标样本特征图分别与视杯不确定性图像、视盘不确定性图像、视杯一致性图像及视盘一致性图像进行融合，并将融合后的多个特征图进行拼接，并对拼接得到的特征图进行卷积处理，得到该预测分割图像。
909、计算机设备基于先验知识向量,对多个样本标注分割图像进行加权融合,得到融合标注分割图像。
由于该先验知识向量包括多个先验知识权重，多个先验知识权重与多个样本标注分割图像一一对应，则通过先验知识向量中的多个先验知识权重，将多个样本标注分割图像进行加权融合，将得到的融合标注分割图像，作为多个标注者所标注的最终结果，以便后续将该融合标注分割图像作为监督值，对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中，该多个先验知识权重、多个样本标注分割图像及融合标注分割图像满足以下关系：
GT_soft = Σ_{i_1=1}^{N_1} w_{i_1} · GT_{i_1}
其中，GT_soft 表示融合标注分割图像；N_1 表示多个先验知识权重的总个数，N_1 为大于等于2的正整数；i_1 用于表示先验知识权重及样本标注分割图像的序号，i_1 为大于等于1、且小于等于 N_1 的正整数；GT_{i_1} 表示第 i_1 个样本标注分割图像；w_{i_1} 表示第 i_1 个先验知识权重，且第 i_1 个样本标注分割图像 GT_{i_1} 与第 i_1 个先验知识权重 w_{i_1} 对应。
在一种可能实现方式中,每个样本标注分割图像包括原始图像中多个像素点对应的权重,则该步骤909包括:基于先验知识向量,将多个样本标注分割图像中位于相同位置的像素点的像素值进行加权融合,得到每个位置对应的融合权重,将多个位置对应的融合权重构成该融合标注分割图像。其中,对于任一样本标注分割图像中任一位置的像素点,该像素点的像素值为该原始图像中位于相同位置的像素点对应的权重。
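下面给出上述逐像素加权融合的示意性代码草图（假设性示例，基于 NumPy；权重与图像尺寸仅作演示）。
```python
import numpy as np

weights = np.array([0.1, 0.1, 0.4, 0.4])       # 先验知识向量中的 N1 个先验知识权重
gts = np.random.rand(4, 8, 8)                  # N1 张样本标注分割图像（像素值为权重）
gt_soft = np.tensordot(weights, gts, axes=1)   # GT_soft：逐像素加权求和，形状 (8, 8)
```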
910、计算机设备基于预测分割图像与融合标注分割图像之间的差异,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
由于融合标注分割图像相当于样本原始图像的真实标注分割图像,即该融合标注分割图像指示样本物体在样本原始图像中所处的样本区域,该样本区域为该样本物体在样本原始图像中所处的真实区域,该预测分割图像是通过第一图像分割模型、图像重构模型及第二图像分割模型预测到的,则基于预测分割图像与融合标注分割图像之间的差异,能够确定第一图像分割模型、图像重构模型及第二图像分割模型的不准确度,以便后续对第一图像分割模型、 图像重构模型及第二图像分割模型进行调整。
在一种可能实现方式中,该步骤910包括以下步骤9101-9102。
9101、基于预测分割图像与融合标注分割图像之间的差异,确定第一损失值。
其中,该第一损失值用于表示预测分割图像与融合标注分割图像之间的差异,损失值越大,表示第一图像分割模型、图像重构模型及第二图像分割模型的准确度越低,损失值越小,表示第一图像分割模型、图像重构模型及第二图像分割模型的准确度越高。
9102、基于第一损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
通过第一损失值对第一图像分割模型、图像重构模型及第二图像分割模型进行训练，以减小第一损失值，提高第一图像分割模型、图像重构模型及第二图像分割模型的准确度。如图11所示，通过第一图像分割模型、图像重构模型及第二图像分割模型，来获取预测分割图像，其中，该知识推断子模型为第一图像分割模型中的子模型。通过对多个样本标注分割图像进行加权融合，得到融合标注分割图像，之后，基于预测分割图像与融合标注分割图像之间的差异，确定第一损失值，并基于确定的第一损失值对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该步骤9102包括以下三种方式。
第一种方式:基于第一样本分割图像与融合标注分割图像之间的差异,确定第二损失值,基于第一损失值及第二损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
其中,该第二损失值用于表示第一样本分割图像与融合标注分割图像之间的差异,第一样本分割图像与融合标注分割图像之间的差异越大,该第二损失值越大,第一样本分割图像与融合标注分割图像之间的差异越小,该第二损失值越小。
通过第一损失值及第二损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练,以减小第一损失值及第二损失值,提高第一图像分割模型、图像重构模型及第二图像分割模型的准确度。
在一种可能实现方式中,基于第一损失值及第二损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练的过程,包括:确定第一损失值与第二损失值的第一和值,基于第一和值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
第二种方式:基于多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值,基于第一损失值及第三损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
其中,该第三损失值为重构损失,用于表示多个预测标注分割图像与对应的样本标注分割图像之间的差异。
在一种可能实现方式中，多个预测标注分割图像、多个样本标注分割图像及第三损失值，满足以下关系：
loss_rec = (1/N_1) · Σ_{i_2=1}^{N_1} L_BCE(P_{i_2}, GT_{i_2})
其中，loss_rec 表示第三损失值；N_1 表示多个先验知识权重的总个数，也即是多个预测标注分割图像的个数，N_1 为大于等于2的正整数；i_2 表示预测标注分割图像及样本标注分割图像的序号；L_BCE 为二元交叉熵损失函数；GT_{i_2} 表示第 i_2 个样本标注分割图像；P_{i_2} 表示第 i_2 个预测标注分割图像。
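下面给出第三损失值计算的示意性代码草图（假设性示例，基于 PyTorch；对 N_1 张图取平均为本文的假设写法）。
```python
import torch
import torch.nn.functional as F

preds = torch.rand(4, 1, 8, 8)                     # N1 张预测标注分割图像（已过 sigmoid）
gts = torch.randint(0, 2, (4, 1, 8, 8)).float()    # 对应的 N1 张样本标注分割图像
loss_rec = F.binary_cross_entropy(preds, gts)      # 对所有像素及 N1 张图取平均的 BCE
```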
在一种可能实现方式中，基于第一损失值及第三损失值，对第一图像分割模型、图像重构模型及第二图像分割模型进行训练的过程，包括：确定第一损失值与第三损失值的第二和值，根据第二和值，对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
第三种方式:将样本原始图像及融合标注分割图像进行拼接,得到第二样本拼接图像,调用编码子模型,对第二样本拼接图像进行编码,得到第三样本特征图,基于第三样本特征图与第一样本特征图之间的差异,确定第四损失值,基于第一损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
其中,第四损失值为一致性损失,用于表示第三样本特征图与第一样本特征图之间的差异。获取第二样本拼接图像的过程,与上述步骤305同理,在此不再赘述。调用编码子模型获取第三样本特征图的过程,与上述步骤905同理在此不再赘述。
由于第一样本特征图是调用编码子模型第一样本拼接图像进行编码得到的,该第一样本拼接图像是由样本原始图像及第一样本分割图像拼接得到的,该第三样本特征图是调用编码子模型对第二样本拼接图像进行编码得到的,该第二样本拼接图像是由样本原始图像及融合标注分割图像拼接得到的,该第一样本分割图像是预测得到的,该融合标注分割图像是多个标注者标注的真实结果,则通过第四损失值,能够确定同一个编码子模型输出的预测结果所对应的第一样本特征图与真实结果对应的第三样本特征图之间的差异,能够反映出预测结果与真实结果之间的差异,从而反映出编码子模型的准确性。
在一种可能实现方式中,基于第一损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练的过程,包括:确定第一损失值与第四损失值的第三和值,基于第三和值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中，该编码子模型包括多个第三卷积模块，在调用多个第三卷积模块对第一样本拼接图像进行编码的过程中，调用第一个第三卷积模块对第一样本拼接图像进行编码，得到第一个第三参考特征图，调用当前的第三卷积模块对上一个第三卷积模块输出的第三参考特征图进行编码，得到当前的第三卷积模块对应的第三参考特征图，直至得到最后一个第三卷积模块输出的第三参考特征图，将最后一个第三卷积模块输出的第三参考特征图确定为该第一样本特征图。同理，在调用多个第三卷积模块对第二样本拼接图像进行编码的过程中，按照上述方式，也能够得到第二样本拼接图像对应的多个第四参考特征图，并将最后一个第三卷积模块输出的第四参考特征图确定为该第三样本特征图；则第四损失值，满足以下关系：
loss_con = (1/Q) · Σ_{i_3=1}^{Q} ‖f_{i_3} − g_{i_3}‖²
其中，loss_con 表示第四损失值；Q 表示编码子模型包括的第三卷积模块的个数，Q 为大于等于2的正整数；i_3 表示第三卷积模块的序号，i_3 为大于等于1、且小于等于 Q 的正整数；f_{i_3} 表示第 i_3 个第三卷积模块输出的第三参考特征图；g_{i_3} 表示第 i_3 个第三卷积模块输出的第四参考特征图；‖·‖² 表示特征图之间差异的平方度量（例如逐元素差值的平方和）。
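下面给出第四损失值计算的示意性代码草图（假设性示例，基于 PyTorch；以均方误差作为特征图差异度量是本文的假设，原文未限定具体度量方式）。
```python
import torch
import torch.nn.functional as F

# 第三参考特征图（第一样本拼接图像一路）与第四参考特征图（第二样本拼接图像一路）
feats_pred = [torch.randn(1, c, s, s) for c, s in [(16, 32), (32, 16), (64, 8)]]
feats_gt = [f + 0.1 * torch.randn_like(f) for f in feats_pred]
loss_con = sum(F.mse_loss(a, b) for a, b in zip(feats_pred, feats_gt)) / len(feats_pred)
```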
需要说明的是,上述仅是以三种方式分别对第一图像分割模型、图像重构模型及第二图像分割模型进行训练进行说明的,而在另一实施例中,上述三种方式能够两两结合,例如,第一种方式与第二种方式结合,第二种方式与第三种方式结合;或者,上述三种方式结合。
在一种可能实现方式中,该步骤9102包括:基于第一样本分割图像与融合标注分割图像之间的差异,确定第二损失值,基于多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值,将样本原始图像及融合标注分割图像进行拼接,得到第二样本拼接图像,调用编码子模型,对第二样本拼接图像进行编码,得到第三样本特征图,基于第三样本特征图与第一样本特征图之间的差异,确定第四损失值,基于第一损失值、第二损失值、第三损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
可选地,基于第一损失值、第二损失值、第三损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练的过程,包括:基于第一损失值、第二损失值、第三损失值及第四损失值,确定总损失值,基于总损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
可选地,第一损失值、第二损失值、第三损失值、第四损失值及总损失值,满足以下关系:
L = L_BCE(P_1, GT_soft) + L_BCE(M, GT_soft) + α·loss_con + (1−α)·loss_rec
其中，L 表示总损失值；L_BCE 为二元交叉熵损失函数；P_1 表示第一样本分割图像，L_BCE(P_1, GT_soft) 即第二损失值；GT_soft 表示融合标注分割图像；M 表示预测分割图像，L_BCE(M, GT_soft) 即第一损失值；α 表示超参数，用于平衡第四损失值和第三损失值，例如能够设置为0.7；loss_rec 表示第三损失值；loss_con 表示第四损失值。
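下面给出总损失组合的示意性代码草图（假设性示例，基于 PyTorch；loss_rec、loss_con 以常数示意，实际应由前述公式计算得到）。
```python
import torch
import torch.nn.functional as F

p1 = torch.rand(1, 1, 8, 8)         # 第一样本分割图像 P_1
m = torch.rand(1, 1, 8, 8)          # 预测分割图像 M
gt_soft = torch.rand(1, 1, 8, 8)    # 融合标注分割图像 GT_soft（软标签）
loss_rec = torch.tensor(0.3)        # 第三损失值（此处用常数示意）
loss_con = torch.tensor(0.2)        # 第四损失值（此处用常数示意）
alpha = 0.7
total = (F.binary_cross_entropy(p1, gt_soft)      # 第二损失值
         + F.binary_cross_entropy(m, gt_soft)     # 第一损失值
         + alpha * loss_con + (1 - alpha) * loss_rec)
```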
本申请实施例提供的模型训练方式,在对原始图像进行分割的过程中,通过引入用于表示多个标注者的标注准确度的先验知识权重,重构出与多个标注者相匹配的标注分割图像,以指示目标物体在原始图像中所处的目标区域,即重构出了多个标注者对原始图像的多种标注结果,之后,通过多个标注分割图像及原始图像的目标特征图来获取原始图像的第二分割图像,使得第二分割图像中融入了多个标注者对应的标注结果,保证第二分割图像的准确性,从而提高了图像分割的准确性。
并且,通过考虑到重构的多个预测标注分割图像与多个样本标注分割图像之间的差异、第三样本特征图与第一样本特征图之间的差异及第一样本分割图像与融合标注分割图像之间的差异,以提高第一图像分割模型、图像重构模型及第二图像分割模型的准确性。
本申请实施例提供的模型训练方法，是基于多个标注者对应的样本标注分割图像对模型进行训练的。现有技术在模型训练过程中，采用单个标注者的样本标注分割图像对模型进行训练。表1给出了采用不同标注者标注的样本标注分割图像对多种模型进行训练后的模型准确率。通过表1中的准确率可知，采用多个标注者的样本标注分割图像训练后的模型准确率更高。
并且，通过提供的图像重构模型，能够重构出多个标注者的标注分割图像，能够增强得到的预测分割图像与样本标注分割图像之间的相关性，能够估计出标注者之间的不确定性。
表1（原表格以图片形式提供，具体数值无法从文本恢复）
本申请实施例提供的方法，能够通过第一图像分割模型、图像重构模型和第二图像分割模型，对原始图像进行图像分割。表2和表3均是以医学图像为例，通过本申请提供的模型及现有技术提供的图像分割模型，对医学图像进行图像分割的准确率进行比对。表2是以眼部图像为例，通过对眼部图像进行图像分割，以确定眼部中的视杯对应的分割图像及眼部中的视盘对应的分割图像。通过表2中的数据可知，无论是获取视杯对应的分割图像，还是获取视盘对应的分割图像，本申请实施例提供的模型的准确率均最高。表3是以医学图像中的肾部图像、脑部图像、肿瘤图像等为例，通过表3中的数据可知，无论是获取任一种医学图像的分割图像，本申请实施例提供的模型的准确率均最高。这表明，本申请提供的图像分割方法获取到的分割图像的准确率高，且图像分割的效果好。
表2（原表格以图片形式提供，具体数值无法从文本恢复）
表3（原表格以图片形式提供，具体数值无法从文本恢复）
基于本申请提供的图像分割方法，在不同的先验知识权重条件下，对不同的模型的准确率进行比对，如表4所示。其中，模型1至模型6分别是通过采用标注者1至标注者6的标注分割图像训练得到的模型。通过采用不同的先验知识向量对多种模型进行评估，多种不同的先验知识向量包括单一标注者、随机标注者的先验知识权重和平均先验知识权重。如表4所示，对于单一标注者的先验知识向量，选择的标注者的先验知识权重为1，其他标注者的先验知识权重为0。以眼部图像为例，通过多种模型对眼部图像进行分割，得到眼部图像的分割图像，每种模型得到视杯的分割图像的准确率及视盘的分割图像的准确率如表4所示，通过表4可知，本申请提供的图像分割方法在不同先验知识向量条件下始终取得优越的性能。
表4（原表格以图片形式提供，具体数值无法从文本恢复）
图12是本申请实施例提供的一种图像分割装置的结构示意图,如图12所示,该装置包括:
编码模块1201,用于基于先验知识向量对原始图像进行编码,得到目标特征图,原始图像包括目标物体,先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,该准确度为标注者在任一图像中标注任一物体所在区域的准确度;
解码模块1202,用于对目标特征图进行解码,得到原始图像的第一分割图像,第一分割图像指示目标物体在原始图像中所处的目标区域;
重构模块1203,用于基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,每个标注分割图像指示对应的标注者所标注的目标区域;
处理模块1204,用于基于多个标注分割图像对目标特征图进行处理,得到原始图像的第二分割图像。
在一种可能实现方式中,如图13所示,处理模块1204,包括:
第一确定单元1241,用于基于多个标注分割图像之间的差异,确定不确定性图像,不确定性图像指示多个目标区域之间的差异,每个目标区域为标注分割图像指示的区域;
第一融合单元1242,用于将目标特征图与不确定性图像进行融合,得到第二分割图像。
在另一种可能实现方式中,每个标注分割图像包括原始图像中的多个像素点对应的第一权重,第一权重用于表示对应的像素点在目标区域内的可能性;
第一确定单元1241,用于确定每个标注分割图像与平均值图像之间的差值图像,平均值图像为多个标注分割图像的平均值图像;确定多个差值图像中位于相同位置的像素点的像素值的平方和;将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为每个位置的第二权重,目标个数为多个标注分割图像的个数;基于多个位置的第二权重构建不确定性图像。
在另一种可能实现方式中,第一融合单元1242,用于确定多个标注分割图像的平均值图像;确定目标特征图与不确定性图像的乘积,将本次确定的乘积与目标特征图之和,确定为第一融合特征图;确定目标特征图与平均值图像的乘积,将本次确定的乘积与目标特征图之和,确定为第二融合特征图;将第一融合特征图及第二融合特征图进行拼接,得到拼接融合特征图;对拼接融合特征图进行卷积,得到第二分割图像。
在另一种可能实现方式中,如图13所示,编码模块1201,包括:
第一编码单元1211,用于对原始图像进行编码,得到原始图像的第一特征图;
第二融合单元1212,用于将先验知识向量与第一特征图进行融合,得到第二特征图;
第一解码单元1213,用于对第二特征图进行解码,得到目标特征图。
在另一种可能实现方式中,如图13所示,重构模块1203,包括:
拼接单元,用于将原始图像及第一分割图像进行拼接,得到拼接图像;
第二编码单元1231,用于对拼接图像进行编码,得到第三特征图;
第三融合单元1232,用于将先验知识向量与第三特征图进行融合,得到第四特征图;
第二解码单元1233,用于对第四特征图进行解码,得到多个标注分割图像。
在另一种可能实现方式中,
基于先验知识向量对原始图像进行编码,得到目标特征图的步骤由第一图像分割模型执行;
对目标特征图进行解码,得到原始图像的第一分割图像的步骤由第一图像分割模型执行;
基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像的步骤由图像重构模型执行;
基于多个标注分割图像对目标特征图进行处理,得到原始图像的第二分割图像的步骤由第二图像分割模型执行。
在另一种可能实现方式中,如图13所示,装置还包括:
获取模块1205,用于获取样本原始图像、多个样本标注分割图像及先验知识向量,样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,每个样本标注分割图像指示样本物体在样本原始图像中所处的样本区域,且每个样本标注分割图像由对应的标注者标注;
编码模块1201,还用于调用第一图像分割模型,基于先验知识向量对样本原始图像进行编码,得到目标样本特征图;
解码模块1202,还用于调用第一图像分割模型,对目标样本特征图进行解码,得到样本原始图像的第一样本分割图像,第一样本分割图像指示样本物体在样本原始图像中所处的样本区域;
重构模块1203,还用于调用图像重构模型,基于先验知识向量,对第一样本分割图像进行图像重构,得到多个预测标注分割图像,每个预测标注分割图像与一个先验知识权重对应,每个预测标注分割图像指示预测到的样本区域;
处理模块1204,还用于调用第二图像分割模型,基于多个预测标注分割图像,对目标样本特征图进行处理,得到样本原始图像的预测分割图像;
加权融合模块1206,用于基于先验知识向量,对多个样本标注分割图像进行加权融合,得到融合标注分割图像;
训练模块1207,用于基于预测分割图像与融合标注分割图像之间的差异,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在另一种可能实现方式中,如图13所示,训练模块1207,包括:
第二确定单元1271,用于基于预测分割图像与融合标注分割图像之间的差异,确定第一损失值;
训练单元1272，用于基于第一损失值，对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在另一种可能实现方式中,训练单元1272,用于基于第一样本分割图像与融合标注分割图像之间的差异,确定第二损失值;基于第一损失值及第二损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在另一种可能实现方式中,训练单元1272,用于基于多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值;基于第一损失值及第三损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在另一种可能实现方式中,图像重构模型包括编码子模型、融合子模型及解码子模型;
重构模块1203,用于将样本原始图像及第一样本分割图像进行拼接,得到第一样本拼接图像;调用编码子模型,对第一样本拼接图像进行编码,得到第一样本特征图;调用融合子模型,将先验知识向量与第一样本特征图进行融合,得到第二样本特征图;调用解码子模型,对第二样本特征图进行解码,得到多个预测标注分割图像。
在另一种可能实现方式中,如图13所示,装置还包括:
拼接模块1208,用于将样本原始图像及融合标注分割图像进行拼接,得到第二样本拼接图像;
重构模块1203,还用于调用编码子模型,对第二样本拼接图像进行编码,得到第三样本特征图;
训练单元1272,用于基于第三样本特征图与第一样本特征图之间的差异,确定第四损失值;基于第一损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
需要说明的是:上述实施例提供的图像分割装置,仅以上述各功能模块的划分进行举例说明,实际应用中,能够根据需要而将上述功能分配由不同的功能模块完成,即将计算机设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的图像分割装置与图像分割方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本申请实施例还提供了一种计算机设备,该计算机设备包括处理器和存储器,存储器中存储有至少一条计算机程序,该至少一条计算机程序由处理器加载并执行以实现如下步骤:
基于先验知识向量对原始图像进行编码,得到目标特征图,原始图像包括目标物体,先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,准确度为标注者在任一图像中标注任一物体所在区域的准确度;
对目标特征图进行解码,得到原始图像的第一分割图像,第一分割图像指示目标物体在原始图像中所处的目标区域;
基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,每个标注分割图像指示对应的标注者所标注的目标区域;
基于多个标注分割图像,对目标特征图进行处理,得到原始图像的第二分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于多个标注分割图像之间的差异,确定不确定性图像,不确定性图像指示多个目标区域之间的差异,每个目标区域为标注分割图像指示的区域;
将目标特征图与不确定性图像进行融合,得到第二分割图像。
在一种可能实现方式中,每个标注分割图像包括原始图像中的多个像素点对应的第一权重,第一权重用于表示对应的像素点在目标区域内的可能性;该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
确定每个标注分割图像与平均值图像之间的差值图像,平均值图像为多个标注分割图像的平均值图像;
确定多个差值图像中位于相同位置的像素点的像素值的平方和;
将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为每个位置的第二权重,目标个数为多个标注分割图像的个数;
基于多个位置的第二权重,构建不确定性图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
确定多个标注分割图像的平均值图像;
确定目标特征图与不确定性图像的乘积,将本次确定的乘积与目标特征图之和,确定为第一融合特征图;
确定目标特征图与平均值图像的乘积,将本次确定的乘积与目标特征图之和,确定为第二融合特征图;
将第一融合特征图及第二融合特征图进行拼接,得到拼接融合特征图;
对拼接融合特征图进行卷积,得到第二分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
对原始图像进行编码,得到原始图像的第一特征图;
将先验知识向量与第一特征图进行融合,得到第二特征图;
对第二特征图进行解码,得到目标特征图。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将原始图像及第一分割图像进行拼接,得到拼接图像;
对拼接图像进行编码,得到第三特征图;
将先验知识向量与第三特征图进行融合,得到第四特征图;
对第四特征图进行解码,得到多个标注分割图像。
在一种可能实现方式中,基于先验知识向量对原始图像进行编码,得到目标特征图的步骤由第一图像分割模型执行;
对目标特征图进行解码,得到原始图像的第一分割图像的步骤由第一图像分割模型执行;
基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像的步骤由图像重构模型执行;
基于多个标注分割图像,对目标特征图进行处理,得到原始图像的第二分割图像的步骤由第二图像分割模型执行。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
获取样本原始图像、多个样本标注分割图像及先验知识向量,样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,每个样本标注分割图像指示样本物体在样本原始图像中所处的样本区域,且每个样本标注分割图像由对应的标注者标注;
调用第一图像分割模型,基于先验知识向量对样本原始图像进行编码,得到目标样本特征图;
调用第一图像分割模型,对目标样本特征图进行解码,得到样本原始图像的第一样本分割图像,第一样本分割图像指示样本物体在样本原始图像中所处的样本区域;
调用图像重构模型,基于先验知识向量,对第一样本分割图像进行图像重构,得到多个预测标注分割图像,每个预测标注分割图像与一个先验知识权重对应,每个预测标注分割图像指示预测到的样本区域;
调用第二图像分割模型,基于多个预测标注分割图像,对目标样本特征图进行处理,得到样本原始图像的预测分割图像;
基于先验知识向量,对多个样本标注分割图像进行加权融合,得到融合标注分割图像;
基于预测分割图像与融合标注分割图像之间的差异,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于预测分割图像与融合标注分割图像之间的差异,确定第一损失值;
基于第一损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于第一样本分割图像与融合标注分割图像之间的差异,确定第二损失值;
基于第一损失值及第二损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值;
基于第一损失值及第三损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,图像重构模型包括编码子模型、融合子模型及解码子模型;该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将样本原始图像及第一样本分割图像进行拼接,得到第一样本拼接图像;
调用编码子模型,对第一样本拼接图像进行编码,得到第一样本特征图;
调用融合子模型,将先验知识向量与第一样本特征图进行融合,得到第二样本特征图;
调用解码子模型,对第二样本特征图进行解码,得到多个预测标注分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将样本原始图像及融合标注分割图像进行拼接,得到第二样本拼接图像;
调用编码子模型,对第二样本拼接图像进行编码,得到第三样本特征图;
基于第三样本特征图与第一样本特征图之间的差异,确定第四损失值;
基于第一损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
可选地,计算机设备提供为终端。图14示出了本申请一个示例性实施例提供的终端1400的结构框图。终端1400包括有:处理器1401和存储器1402。
处理器1401可以包括一个或多个处理核心,比如4核心处理器、8核心处理器等。处理器1401也可以包括主处理器和协处理器,主处理器是用于对在唤醒状态下的数据进行处理的处理器,也称CPU(Central Processing Unit,中央处理器);协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,处理器1401可以集成有GPU(Graphics Processing Unit,图像处理器),GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,处理器1401还可以包括AI(Artificial Intelligence,人工智能)处理器,该AI处理器用于处理有关机器学习的计算操作。
存储器1402可以包括一个或多个计算机可读存储介质,该计算机可读存储介质可以是非暂态的。在一些实施例中,存储器1402中的非暂态的计算机可读存储介质用于存储至少一个计算机程序,该至少一个计算机程序用于被处理器1401所执行以实现本申请中方法实施例提供的图像分割方法。
在一些实施例中,终端1400还可选包括有:外围设备接口1403和至少一个外围设备。处理器1401、存储器1402和外围设备接口1403之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口1403相连。具体地,外围设备包括:射频电路1404、显示屏1405、摄像头组件1406、音频电路1407和电源1408中的至少一种。
外围设备接口1403可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到处理器1401和存储器1402。在一些实施例中,处理器1401、存储器1402和外围设备接口1403被集成在同一芯片或电路板上;在一些其他实施例中,处理器1401、存储器1402和外围设备接口1403中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路1404用于接收和发射RF(Radio Frequency,射频)信号，也称电磁信号。射频电路1404通过电磁信号与通信网络以及其他通信设备进行通信。射频电路1404将电信号转换为电磁信号进行发送，或者，将接收到的电磁信号转换为电信号。
显示屏1405用于显示UI(User Interface,用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏1405是触摸显示屏时,显示屏1405还具有采集在显示屏1405的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器1401进行处理。
摄像头组件1406用于采集图像或视频。可选地,摄像头组件1406包括前置摄像头和后置摄像头。
音频电路1407可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器1401进行处理,或者输入至射频电路1404以实现语音通信。
电源1408用于为终端1400中的各个组件进行供电。电源1408可以是交流电、直流电、一次性电池或可充电电池。
本领域技术人员可以理解,图14中示出的结构并不构成对终端1400的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
可选地,计算机设备提供为服务器。图15是本申请实施例提供的一种服务器的结构示意图,该服务器1500可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上处理器(Central Processing Units,CPU)1501和一个或一个以上的存储器1502,其中,存储器1502中存储有至少一条计算机程序,至少一条计算机程序由处理器1501加载并执行以实现上述各个方法实施例提供的方法。当然,该服务器还可以具有有线或无线网络接口、键盘及输入输出接口等部件,以便进行输入输出,该服务器还可以包括其他用于实现设备功能的部件,在此不做赘述。
本申请实施例还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有至少一条计算机程序,该至少一条计算机程序由处理器加载并执行以实现如下步骤:
基于先验知识向量对原始图像进行编码,得到目标特征图,原始图像包括目标物体,先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,准确度为标注者在任一图像中标注任一物体所在区域的准确度;
对目标特征图进行解码,得到原始图像的第一分割图像,第一分割图像指示目标物体在原始图像中所处的目标区域;
基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,每个标注分割图像指示对应的标注者所标注的目标区域;
基于多个标注分割图像,对目标特征图进行处理,得到原始图像的第二分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于多个标注分割图像之间的差异,确定不确定性图像,不确定性图像指示多个目标区域之间的差异,每个目标区域为标注分割图像指示的区域;
将目标特征图与不确定性图像进行融合,得到第二分割图像。
在一种可能实现方式中,每个标注分割图像包括原始图像中的多个像素点对应的第一权重,第一权重用于表示对应的像素点在目标区域内的可能性;该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
确定每个标注分割图像与平均值图像之间的差值图像,平均值图像为多个标注分割图像的平均值图像;
确定多个差值图像中位于相同位置的像素点的像素值的平方和;
将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为每个位置的第二权重,目标个数为多个标注分割图像的个数;
基于多个位置的第二权重,构建不确定性图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
确定多个标注分割图像的平均值图像;
确定目标特征图与不确定性图像的乘积,将本次确定的乘积与目标特征图之和,确定为第一融合特征图;
确定目标特征图与平均值图像的乘积,将本次确定的乘积与目标特征图之和,确定为第二融合特征图;
将第一融合特征图及第二融合特征图进行拼接,得到拼接融合特征图;
对拼接融合特征图进行卷积,得到第二分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
对原始图像进行编码,得到原始图像的第一特征图;
将先验知识向量与第一特征图进行融合,得到第二特征图;
对第二特征图进行解码,得到目标特征图。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将原始图像及第一分割图像进行拼接,得到拼接图像;
对拼接图像进行编码,得到第三特征图;
将先验知识向量与第三特征图进行融合,得到第四特征图;
对第四特征图进行解码,得到多个标注分割图像。
在一种可能实现方式中,基于先验知识向量对原始图像进行编码,得到目标特征图的步骤由第一图像分割模型执行;
对目标特征图进行解码,得到原始图像的第一分割图像的步骤由第一图像分割模型执行;
基于先验知识向量,对第一分割图像进行图像重构,得到多个标注分割图像的步骤由图像重构模型执行;
基于多个标注分割图像,对目标特征图进行处理,得到原始图像的第二分割图像的步骤由第二图像分割模型执行。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
获取样本原始图像、多个样本标注分割图像及先验知识向量,样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,每个样本标注分割图像指示样本物体在样本原始图像中所处的样本区域,且每个样本标注分割图像由对应的标注者标注;
调用第一图像分割模型,基于先验知识向量对样本原始图像进行编码,得到目标样本特征图;
调用第一图像分割模型,对目标样本特征图进行解码,得到样本原始图像的第一样本分割图像,第一样本分割图像指示样本物体在样本原始图像中所处的样本区域;
调用图像重构模型,基于先验知识向量,对第一样本分割图像进行图像重构,得到多个预测标注分割图像,每个预测标注分割图像与一个先验知识权重对应,每个预测标注分割图像指示预测到的样本区域;
调用第二图像分割模型,基于多个预测标注分割图像,对目标样本特征图进行处理,得到样本原始图像的预测分割图像;
基于先验知识向量,对多个样本标注分割图像进行加权融合,得到融合标注分割图像;
基于预测分割图像与融合标注分割图像之间的差异,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于预测分割图像与融合标注分割图像之间的差异,确定第一损失值;
基于第一损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于第一样本分割图像与融合标注分割图像之间的差异,确定第二损失值;
基于第一损失值及第二损失值，对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
基于多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值;
基于第一损失值及第三损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
在一种可能实现方式中,图像重构模型包括编码子模型、融合子模型及解码子模型;该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将样本原始图像及第一样本分割图像进行拼接,得到第一样本拼接图像;
调用编码子模型,对第一样本拼接图像进行编码,得到第一样本特征图;
调用融合子模型,将先验知识向量与第一样本特征图进行融合,得到第二样本特征图;
调用解码子模型,对第二样本特征图进行解码,得到多个预测标注分割图像。
在一种可能实现方式中,该至少一条计算机程序由处理器加载并执行,以实现如下步骤:
将样本原始图像及融合标注分割图像进行拼接,得到第二样本拼接图像;
调用编码子模型,对第二样本拼接图像进行编码,得到第三样本特征图;
基于第三样本特征图与第一样本特征图之间的差异,确定第四损失值;
基于第一损失值及第四损失值,对第一图像分割模型、图像重构模型及第二图像分割模型进行训练。
本申请实施例还提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机程序代码,该计算机程序代码存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机程序代码,处理器执行该计算机程序代码,使得该计算机设备实现如上述实施例的图像分割方法中所执行的操作。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请实施例的可选实施例,并不用以限制本申请实施例,凡在本申请实施例的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (16)

  1. 一种图像分割方法,由计算机设备执行,所述方法包括:
    基于先验知识向量对原始图像进行编码,得到目标特征图,所述原始图像包括目标物体,所述先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,所述准确度为所述标注者在任一图像中标注任一物体所在区域的准确度;
    对所述目标特征图进行解码,得到所述原始图像的第一分割图像,所述第一分割图像指示所述目标物体在所述原始图像中所处的目标区域;
    基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,所述每个标注分割图像指示对应的标注者所标注的所述目标区域;
    基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像。
  2. 根据权利要求1所述的方法,其中,所述基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像,包括:
    基于所述多个标注分割图像之间的差异,确定不确定性图像,所述不确定性图像指示多个所述目标区域之间的差异,每个所述目标区域为所述标注分割图像指示的区域;
    将所述目标特征图与所述不确定性图像进行融合,得到所述第二分割图像。
  3. 根据权利要求2所述的方法,其中,所述每个标注分割图像包括所述原始图像中的多个像素点对应的第一权重,所述第一权重用于表示对应的像素点在所述目标区域内的可能性;
    所述基于所述多个标注分割图像之间的差异,确定不确定性图像,包括:
    确定所述每个标注分割图像与平均值图像之间的差值图像,所述平均值图像为所述多个标注分割图像的平均值图像;
    确定多个差值图像中位于相同位置的像素点的像素值的平方和;
    将每个位置对应的平方和与目标个数之间的比值的开方,分别确定为所述每个位置的第二权重,所述目标个数为所述多个标注分割图像的个数;
    基于多个位置的第二权重,构建所述不确定性图像。
  4. 根据权利要求2所述的方法,其中,所述将所述目标特征图与所述不确定性图像进行融合,得到所述第二分割图像,包括:
    确定所述多个标注分割图像的平均值图像;
    确定所述目标特征图与所述不确定性图像的乘积,将本次确定的乘积与所述目标特征图之和,确定为第一融合特征图;
    确定所述目标特征图与所述平均值图像的乘积,将本次确定的乘积与所述目标特征图之和,确定为第二融合特征图;
    将所述第一融合特征图及所述第二融合特征图进行拼接,得到拼接融合特征图;
    对所述拼接融合特征图进行卷积,得到所述第二分割图像。
  5. 根据权利要求1所述的方法,其中,所述基于先验知识向量对原始图像进行编码,得到目标特征图,包括:
    对所述原始图像进行编码,得到所述原始图像的第一特征图;
    将所述先验知识向量与所述第一特征图进行融合,得到第二特征图;
    对所述第二特征图进行解码,得到所述目标特征图。
  6. 根据权利要求1所述的方法,其中,所述基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像,包括:
    将所述原始图像及所述第一分割图像进行拼接,得到拼接图像;
    对所述拼接图像进行编码,得到第三特征图;
    将所述先验知识向量与所述第三特征图进行融合,得到第四特征图;
    对所述第四特征图进行解码,得到所述多个标注分割图像。
  7. 根据权利要求1所述的方法,其中,
    所述基于先验知识向量对原始图像进行编码,得到目标特征图的步骤由第一图像分割模型执行;
    所述对所述目标特征图进行解码,得到所述原始图像的第一分割图像的步骤由所述第一图像分割模型执行;
    所述基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像的步骤由图像重构模型执行;
    所述基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像的步骤由第二图像分割模型执行。
  8. 根据权利要求7所述的方法,其中,所述方法还包括:
    获取样本原始图像、多个样本标注分割图像及所述先验知识向量,所述样本原始图像包括样本物体,每个样本标注分割图像与一个先验知识权重对应,所述每个样本标注分割图像指示所述样本物体在所述样本原始图像中所处的样本区域,且所述每个样本标注分割图像由对应的标注者标注;
    调用所述第一图像分割模型,基于所述先验知识向量对所述样本原始图像进行编码,得到目标样本特征图;
    调用所述第一图像分割模型,对所述目标样本特征图进行解码,得到所述样本原始图像的第一样本分割图像,所述第一样本分割图像指示所述样本物体在所述样本原始图像中所处的所述样本区域;
    调用所述图像重构模型,基于所述先验知识向量,对所述第一样本分割图像进行图像重构,得到多个预测标注分割图像,每个预测标注分割图像与一个先验知识权重对应,所述每个预测标注分割图像指示预测到的所述样本区域;
    调用所述第二图像分割模型,基于所述多个预测标注分割图像,对所述目标样本特征图进行处理,得到所述样本原始图像的预测分割图像;
    基于所述先验知识向量,对所述多个样本标注分割图像进行加权融合,得到融合标注分割图像;
    基于所述预测分割图像与所述融合标注分割图像之间的差异,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
  9. 根据权利要求8所述的方法,其中,所述基于所述预测分割图像与所述融合标注分割图像之间的差异,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练,包括:
    基于所述预测分割图像与所述融合标注分割图像之间的差异,确定第一损失值;
    基于所述第一损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
  10. 根据权利要求9所述的方法，其中，所述基于所述第一损失值，对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练，包括：
    基于所述第一样本分割图像与所述融合标注分割图像之间的差异,确定第二损失值;
    基于所述第一损失值及所述第二损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
  11. 根据权利要求9所述的方法,其中,所述基于所述第一损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练,包括:
    基于所述多个预测标注分割图像与对应的样本标注分割图像之间的差异,确定第三损失值;
    基于所述第一损失值及所述第三损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
  12. 根据权利要求9所述的方法,其中,所述图像重构模型包括编码子模型、融合子模型及解码子模型;
    所述调用所述图像重构模型,基于所述先验知识向量,对所述第一样本分割图像进行图像重构,得到多个预测标注分割图像,包括:
    将所述样本原始图像及所述第一样本分割图像进行拼接,得到第一样本拼接图像;
    调用所述编码子模型,对所述第一样本拼接图像进行编码,得到第一样本特征图;
    调用所述融合子模型,将所述先验知识向量与所述第一样本特征图进行融合,得到第二样本特征图;
    调用所述解码子模型,对所述第二样本特征图进行解码,得到所述多个预测标注分割图像。
  13. 根据权利要求12所述的方法,其中,所述方法还包括:
    将所述样本原始图像及所述融合标注分割图像进行拼接,得到第二样本拼接图像;
    调用所述编码子模型,对所述第二样本拼接图像进行编码,得到第三样本特征图;
    所述基于所述第一损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练,包括:
    基于所述第三样本特征图与所述第一样本特征图之间的差异,确定第四损失值;
    基于所述第一损失值及所述第四损失值,对所述第一图像分割模型、所述图像重构模型及所述第二图像分割模型进行训练。
  14. 一种图像分割装置,所述装置包括:
    编码模块,用于基于先验知识向量对原始图像进行编码,得到目标特征图,所述原始图像包括目标物体,所述先验知识向量包括多个先验知识权重,每个先验知识权重用于表示一个标注者对应的准确度,所述准确度为所述标注者在任一图像中标注任一物体所在区域的准确度;
    解码模块,用于对所述目标特征图进行解码,得到所述原始图像的第一分割图像,所述第一分割图像指示所述目标物体在所述原始图像中所处的目标区域;
    重构模块,用于基于所述先验知识向量,对所述第一分割图像进行图像重构,得到多个标注分割图像,每个标注分割图像与一个先验知识权重对应,所述每个标注分割图像指示对应的标注者所标注的所述目标区域;
    处理模块,用于基于所述多个标注分割图像,对所述目标特征图进行处理,得到所述原始图像的第二分割图像。
  15. 一种计算机设备，所述计算机设备包括处理器和存储器，所述存储器中存储有至少一条计算机程序，所述至少一条计算机程序由所述处理器加载并执行以实现如权利要求1至13任一权利要求所述的图像分割方法所执行的操作。
  16. 一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条计算机程序,所述至少一条计算机程序由处理器加载并执行以实现如权利要求1至13任一权利要求所述的图像分割方法所执行的操作。
PCT/CN2022/077951 2021-03-03 2022-02-25 图像分割方法、装置、计算机设备及存储介质 WO2022183984A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22762449.1A EP4287117A4 (en) 2021-03-03 2022-02-25 IMAGE SEGMENTATION METHOD AND APPARATUS, COMPUTER DEVICE AND STORAGE MEDIUM
US18/074,906 US20230106468A1 (en) 2021-03-03 2022-12-05 Image segmentation method and apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110234267.7A CN112598686B (zh) 2021-03-03 2021-03-03 图像分割方法、装置、计算机设备及存储介质
CN202110234267.7 2021-03-03

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/074,906 Continuation US20230106468A1 (en) 2021-03-03 2022-12-05 Image segmentation method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022183984A1 true WO2022183984A1 (zh) 2022-09-09

Family

ID=75210146

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/077951 WO2022183984A1 (zh) 2021-03-03 2022-02-25 图像分割方法、装置、计算机设备及存储介质

Country Status (4)

Country Link
US (1) US20230106468A1 (zh)
EP (1) EP4287117A4 (zh)
CN (1) CN112598686B (zh)
WO (1) WO2022183984A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598686B (zh) * 2021-03-03 2021-06-04 腾讯科技(深圳)有限公司 图像分割方法、装置、计算机设备及存储介质
CN113705691B (zh) * 2021-08-30 2024-04-09 深圳平安智慧医健科技有限公司 基于人工智能的图像标注校验方法、装置、设备及介质
CN114399640B (zh) * 2022-03-24 2022-07-15 之江实验室 一种不确定区域发现与模型改进的道路分割方法及装置
CN114742807A (zh) * 2022-04-24 2022-07-12 北京医准智能科技有限公司 基于x光图像的胸片识别方法、装置、电子设备和介质
CN116580194B (zh) * 2023-05-04 2024-02-06 山东省人工智能研究院 融合几何信息的软注意力网络的血管分割方法
CN116524206B (zh) * 2023-06-30 2023-10-03 深圳须弥云图空间科技有限公司 目标图像的识别方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598727A (zh) * 2018-11-28 2019-04-09 北京工业大学 一种基于深度神经网络的ct图像肺实质三维语义分割方法
US20200258235A1 (en) * 2019-02-07 2020-08-13 Vysioneer INC. Method and apparatus for automated target and tissue segmentation using multi-modal imaging and ensemble machine learning models
CN111563910A (zh) * 2020-05-13 2020-08-21 上海鹰瞳医疗科技有限公司 眼底图像分割方法及设备
CN111583282A (zh) * 2020-05-18 2020-08-25 联想(北京)有限公司 图像分割方法、装置、设备及存储介质
CN112365512A (zh) * 2020-11-18 2021-02-12 南开大学 训练图像分割模型的方法、图像分割的方法及其装置
CN112598686A (zh) * 2021-03-03 2021-04-02 腾讯科技(深圳)有限公司 图像分割方法、装置、计算机设备及存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7580556B2 (en) * 2004-01-26 2009-08-25 Drvision Technologies Llc Image region partitioning using pre-labeled regions
CN102298605B (zh) * 2011-06-01 2013-04-17 清华大学 基于有向图非等概率随机搜索的图像自动标注方法及装置
CN110599491B (zh) * 2019-09-04 2024-04-12 腾讯医疗健康(深圳)有限公司 基于先验信息的眼部图像分割方法、装置、设备及介质
CN111652887B (zh) * 2020-05-13 2023-04-07 腾讯科技(深圳)有限公司 图像分割模型训练方法、装置、计算机设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598727A (zh) * 2018-11-28 2019-04-09 北京工业大学 一种基于深度神经网络的ct图像肺实质三维语义分割方法
US20200258235A1 (en) * 2019-02-07 2020-08-13 Vysioneer INC. Method and apparatus for automated target and tissue segmentation using multi-modal imaging and ensemble machine learning models
CN111563910A (zh) * 2020-05-13 2020-08-21 上海鹰瞳医疗科技有限公司 眼底图像分割方法及设备
CN111583282A (zh) * 2020-05-18 2020-08-25 联想(北京)有限公司 图像分割方法、装置、设备及存储介质
CN112365512A (zh) * 2020-11-18 2021-02-12 南开大学 训练图像分割模型的方法、图像分割的方法及其装置
CN112598686A (zh) * 2021-03-03 2021-04-02 腾讯科技(深圳)有限公司 图像分割方法、装置、计算机设备及存储介质

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JI WEI; YU SHUANG; WU JUNDE; MA KAI; BIAN CHENG; BI QI; LI JINGJING; LIU HANRUO; CHENG LI; ZHENG YEFENG: "Learning Calibrated Medical Image Segmentation via Multi-rater Agreement Modeling", 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 20 June 2021 (2021-06-20), pages 12336 - 12346, XP034010538, DOI: 10.1109/CVPR46437.2021.01216 *
See also references of EP4287117A4

Also Published As

Publication number Publication date
EP4287117A1 (en) 2023-12-06
US20230106468A1 (en) 2023-04-06
CN112598686A (zh) 2021-04-02
EP4287117A4 (en) 2024-06-12
CN112598686B (zh) 2021-06-04

Similar Documents

Publication Publication Date Title
WO2022183984A1 (zh) 图像分割方法、装置、计算机设备及存储介质
CN109377544B (zh) 一种人脸三维图像生成方法、装置和可读介质
CN110162670B (zh) 用于生成表情包的方法和装置
CN113658309B (zh) 三维重建方法、装置、设备以及存储介质
CN114820905B (zh) 虚拟形象生成方法、装置、电子设备及可读存储介质
CN111598168B (zh) 图像分类方法、装置、计算机设备及介质
JP7384943B2 (ja) 文字生成モデルのトレーニング方法、文字生成方法、装置、機器及び媒体
CN112839223B (zh) 图像压缩方法、装置、存储介质及电子设备
CN111680123B (zh) 对话模型的训练方法、装置、计算机设备及存储介质
CN112069309A (zh) 信息获取方法、装置、计算机设备及存储介质
CN109977905B (zh) 用于处理眼底图像的方法和装置
US20220391425A1 (en) Method and apparatus for processing information
WO2024012251A1 (zh) 语义分割模型训练方法、装置、电子设备及存储介质
CN112785493A (zh) 模型的训练方法、风格迁移方法、装置、设备及存储介质
CN115205925A (zh) 表情系数确定方法、装置、电子设备及存储介质
CN115222862A (zh) 虚拟人衣物生成方法、装置、设备、介质及程序产品
CN117078509A (zh) 模型训练方法、照片生成方法及相关设备
WO2022095640A1 (zh) 对图像中的树状组织进行重建的方法、设备及存储介质
CN114792355A (zh) 虚拟形象生成方法、装置、电子设备和存储介质
CN111583102B (zh) 人脸图像处理方法、装置、电子设备及计算机存储介质
CN116402914A (zh) 用于确定风格化图像生成模型的方法、装置及产品
CN111414737A (zh) 故事生成模型训练方法、装置、设备及存储介质
CN114758130B (zh) 图像处理及模型训练方法、装置、设备和存储介质
CN116883708A (zh) 图像分类方法、装置、电子设备及存储介质
CN115862794A (zh) 病历文本生成方法、装置、计算机设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22762449

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022762449

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022762449

Country of ref document: EP

Effective date: 20230831

NENP Non-entry into the national phase

Ref country code: DE