CN114463361A - Network model training method, device, equipment, medium and program product - Google Patents


Info

Publication number
CN114463361A
CN114463361A
Authority
CN
China
Prior art keywords
image
training
model
sample
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210138535.XA
Other languages
Chinese (zh)
Inventor
刘佳
王兆玮
孙钦佩
杨叶辉
王晓荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210138535.XA
Publication of CN114463361A

Classifications

    • G06T7/13: Image analysis; Segmentation; Edge detection
    • G06F18/2155: Generating training patterns; Bootstrap methods characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G06T5/00: Image enhancement or restoration
    • G06T7/143: Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • G06T2207/20076: Probabilistic image processing
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30016: Biomedical image processing; Brain
    • G06T2207/30048: Biomedical image processing; Heart; Cardiac
    • G06T2207/30056: Biomedical image processing; Liver; Hepatic


Abstract

The disclosure provides a network model training method and apparatus, an electronic device, a non-transitory computer-readable storage medium storing computer instructions, and a computer program product, and relates to the technical field of deep learning, in particular to the technical field of image segmentation. The method first generates a probability map corresponding to a target object based on the labeling information of a plurality of first sample images for the target object. It then pre-trains a model using the probability atlas together with unlabeled images to obtain a pre-training model containing high-quality network parameters, and on this basis further trains the pre-training model with a small number of labeled medical images, so that the resulting image segmentation model achieves higher image segmentation precision.

Description

Network model training method, device, equipment, medium and program product
Technical Field
The disclosure relates to the technical field of deep learning, in particular to the technical field of image segmentation, and discloses a network model training method and device, electronic equipment, a non-transitory computer readable storage medium storing computer instructions, and a computer program product.
Background
Medical image segmentation mainly comprises two types of segmentation: structural segmentation (such as brain tissue, lung, liver, heart, etc.) and lesion segmentation. In recent years, deep learning has achieved good results in medical image segmentation, offering high robustness, high precision, and high speed. Generally speaking, deep learning requires a large amount of labeled data to complete model training. However, medical images are mainly three-dimensional, their quality is poor compared with that of traditional natural images, and labeling them is difficult and time-consuming. As a result, the amount of labeled data in the field of medical image segmentation is small, which greatly limits the application of deep learning in this field.
Disclosure of Invention
The present disclosure provides at least a network model training method, apparatus, device, program product, and storage medium.
According to an aspect of the present disclosure, there is provided a network model training method, including:
generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images for the target object;
based on the probability map and a plurality of second sample images, performing model training with the objective of recovering the image blocks masked out of the second sample images, to obtain a pre-training model;
and training the pre-training model based on the labeling information of the target object of the plurality of third sample images and the third sample images to obtain an image segmentation model for the target object.
According to another aspect of the present disclosure, there is provided a network model training method, including:
generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images for the target object;
and, based on the probability atlas and a plurality of second sample images, performing model training with the objective of recovering the masked image blocks of the second sample images, to obtain an image restoration model.
According to another aspect of the present disclosure, there is provided a network model training apparatus including:
the first map determining module is used for generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images aiming at the target object;
the pre-training module is used for performing model training, based on the probability map and the plurality of second sample images, with the objective of recovering the image blocks masked out of the second sample images, to obtain a pre-training model;
the segmentation model training module is used for training the pre-training model based on the plurality of third sample images and their labeling information for the target object, to obtain an image segmentation model for the target object.
according to another aspect of the present disclosure, there is provided a network model training apparatus including:
the second map determining module is used for generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images aiming at the target object;
and the restoration model training module is used for performing model training, based on the probability map and the plurality of second sample images, with the objective of recovering the masked image blocks of the second sample images, to obtain an image restoration model.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method in any of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the method in any of the embodiments of the present disclosure.
According to the disclosed technique, pre-training with the probability map and unlabeled images (the second sample images) yields a pre-training model containing high-quality network parameters. On this basis, a small number of labeled medical images (the third sample images) are used to further train the pre-training model, and the resulting image segmentation model can determine segmentation results for medical images with higher segmentation precision.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is one of the flow diagrams of a network model training method according to the present disclosure;
FIG. 2 is a flow chart of a training method of pre-training a model according to the present disclosure;
FIG. 3 is a schematic block diagram of an encoder according to the present disclosure;
FIG. 4 is a flow chart of a method of training an image segmentation model according to the present disclosure;
FIG. 5 is a second flow chart of a network model training method according to the present disclosure;
FIG. 6 is a third flowchart of a network model training method according to the present disclosure;
FIG. 7 is a fourth flowchart of a network model training method according to the present disclosure;
FIG. 8 is one of the schematic structural diagrams of a network model training apparatus according to the present disclosure;
FIG. 9 is a second schematic diagram of the network model training apparatus according to the present disclosure;
fig. 10 is a schematic structural diagram of an electronic device according to the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As described in the background above, labeled data is scarce in medical image segmentation: medical images are mainly three-dimensional, their quality is poor compared with natural images, and labeling them is difficult and time-consuming, which greatly limits the application of deep learning in this field.
To address this deficiency, the present disclosure provides at least a network model training method, apparatus, device, program product, and storage medium. In this method, pre-training with the probability atlas and unlabeled images (the second sample images) yields a pre-training model containing high-quality network parameters. On this basis, a small number of labeled medical images (the third sample images) are used to further train the pre-training model, and the resulting image segmentation model can determine segmentation results for medical images with high segmentation precision.
The network model training method of the present disclosure is explained below with specific embodiments.
Fig. 1 shows a flowchart of a network model training method of an embodiment of the present disclosure, an execution subject of which may be a device with computing capabilities. As shown in fig. 1, the network model training method of the embodiment of the present disclosure may include the following steps:
and S110, generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images aiming at the target object.
The first sample image may be a medical image with labeling information, for example, an abdomen image with labeling information corresponding to the liver. The labeling information may specifically include, for each pixel point in the abdomen image, whether that point belongs to the liver.
The probability map gives, for each pixel point in an image of a preset size, the probability that the point belongs to the target object. The image of the preset size has the same resolution and size as the second sample images, so the probability map can be combined with each second sample image for model training, yielding a pre-training model with higher image restoration precision.
And S120, based on the probability atlas and the plurality of second sample images, performing model training with the objective of recovering the masked image blocks of the second sample images, to obtain a pre-training model.
The second sample image is an image without labeling information for the target object, and may be, for example, an abdomen image without labeling information corresponding to the liver. The second sample images need not come from the same source as the first sample images, but both need to include the target object; for example, both need to include the liver.
Because the second sample images do not need labeling information, they are easy to obtain, so a large number of them can be used to train the pre-training model. The pre-training model can therefore be trained sufficiently, giving it strong image recovery (restoration) capability and high precision. High restoration capability indicates that the model can extract accurate image features; further training the model under the guidance of these accurate features yields an image segmentation model with high segmentation precision.
The position of the target object in the corresponding image is relatively fixed, so that the pre-training model is trained by combining the probability map, and the accuracy of the pre-training model for recovering the image block corresponding to the target object can be improved. For example, the position of the liver in the abdominal image is relatively fixed, the probability map can accurately represent whether the corresponding pixel point belongs to the liver, the pre-training model is trained by combining the probability map corresponding to the liver, and the precision of restoring the image block corresponding to the liver by the pre-training model can be improved.
S130, training the pre-training model based on the labeling information of the target object of the multiple third sample images and the third sample images to obtain an image segmentation model for the target object.
The third sample image may be a medical image with labeling information, for example, an abdomen image with labeling information corresponding to the liver. The third sample images may be the same images as the first sample images or different images, which is not limited in this disclosure.
And performing supervised learning on the image segmentation of the target object by using the labeling information of the third sample image on the target object, so as to obtain an image segmentation model with higher segmentation precision on the target object.
Because the pre-training model is obtained by training in combination with the probability map, it can extract accurate image features. On this basis, the parameters of the pre-training model are taken as initial parameters for the further training of the segmentation model. The prior information in the plurality of first sample images, i.e., the information in the probability map, is thereby fused into the pre-training model obtained from the second sample images, which improves the transfer capability of the pre-training model for image segmentation and can therefore improve the segmentation precision for objects or structures in images.
In some embodiments, the probability map may be generated using the following steps:
firstly, generating a mask image of each first sample image for a target object based on the labeling information of the first sample image for the target object; then, based on the mask image corresponding to each first sample image, a probability map corresponding to the target object is generated.
The mask image is the same size and resolution as the second sample image.
The mask image may be a binary image representing whether each pixel point in the mask image is a target object, and therefore, when the mask image is determined, the label information of the target object based on the first sample image is required. Specifically, when the labeling information of a certain pixel point indicates that the pixel point belongs to the target object, the pixel value of the pixel point in the mask image is 1; and when the marking information of the pixel point indicates that the pixel point does not belong to the target object, the pixel value of the pixel point in the mask image is 0.
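The binarization step above can be sketched as follows. This is an illustrative NumPy sketch, not code from the disclosure; the tiny label volume and the label value 2 are assumed for demonstration.

```python
import numpy as np

def make_mask_image(label_volume: np.ndarray, target_label: int) -> np.ndarray:
    # 1 where the voxel is annotated as the target object (e.g. liver), 0 elsewhere
    return (label_volume == target_label).astype(np.uint8)

# toy 3D label volume in which the value 2 marks the target object
labels = np.zeros((2, 2, 2), dtype=np.int32)
labels[0, 0, 0] = 2
mask = make_mask_image(labels, target_label=2)  # a binary mask image, as described above
```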
After obtaining the mask image of each first sample image, summing the pixel values of the pixel points at the same position in each mask image, and then performing an averaging operation to obtain a probability of whether the corresponding pixel point belongs to the target object, for example, the probability of a certain pixel point can be determined by using the following formula:
$$P_{(x,y,z)} = \frac{1}{N} \sum_{i=1}^{N} I_i(x,y,z)$$
where $P_{(x,y,z)}$ denotes the probability for the pixel point at position $(x, y, z)$; $N$ denotes the number of mask images; and $I_i(x,y,z)$ denotes the pixel value at position $(x, y, z)$ in the $i$-th mask image.
According to the method, the accurate probability of whether each pixel point belongs to the target object can be determined, and then the probability of each pixel point is utilized to form a probability map. The probability map can accurately reflect the position information of the target object in the image, and can play a good guiding role in the segmentation and restoration of the target object.
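The sum-then-average construction of the probability map can be sketched as below; this is an illustrative NumPy sketch in which small 2D masks stand in for the aligned 3D mask images.

```python
import numpy as np

def probability_atlas(mask_images):
    # voxel-wise mean of the aligned binary masks: the value at each position is
    # the fraction of samples in which that position belongs to the target object
    stacked = np.stack(mask_images, axis=0).astype(np.float64)
    return stacked.mean(axis=0)

masks = [np.array([[1, 0], [1, 1]]), np.array([[1, 0], [0, 1]])]
atlas = probability_atlas(masks)  # [[1.0, 0.0], [0.5, 1.0]]
```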
Since the resolution and the size of different first sample images may be different, in order to improve the accuracy of the generated probability map, before the mask images of the first sample images are generated, each first sample image may be preprocessed to unify all the first sample images to a preset resolution and a preset size. And then generating a corresponding mask image based on the preprocessed first sample image.
The preprocessing may include a first preprocessing operation and a second preprocessing operation. Illustratively, a certain first sample image may be preprocessed by the following steps, and a mask image corresponding to the first sample image is generated:
firstly, performing a first preprocessing operation on the first sample image to obtain a first image with a preset resolution; then, carrying out second preprocessing operation on the first image to obtain a second image with a preset size; and finally, generating a mask image corresponding to the target object based on the labeling information of the first sample image for the target object and the second image.
Illustratively, the preset resolution may be an image resolution of 1mm by 1 mm; specifically, the first sample image may be unified into a first image having a preset resolution by using a tri-linear interpolation algorithm. The preset size may be the maximum size of all the first sample images, and the second preprocessing operation may be a padding method.
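The two preprocessing operations can be sketched as follows. This is a hedged illustration: `scipy.ndimage.zoom` with `order=1` performs (tri)linear interpolation, and the spacing values and target shape here are assumed for demonstration, not taken from the disclosure.

```python
import numpy as np
from scipy.ndimage import zoom  # order=1 gives (tri)linear interpolation

def preprocess(volume, spacing, target_spacing=(1.0, 1.0, 1.0),
               target_shape=(6, 6, 6)):
    # first preprocessing operation: resample to the preset resolution
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    resampled = zoom(volume, factors, order=1)
    # second preprocessing operation: pad up to the preset (maximum) size
    pads = [(0, max(0, t - s)) for s, t in zip(resampled.shape, target_shape)]
    return np.pad(resampled, pads, mode="constant")

vol = np.ones((4, 4, 4))
out = preprocess(vol, spacing=(1.0, 1.0, 1.0))
```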
In some embodiments, the training of the pre-trained model may be performed using the following steps:
firstly, negating each probability included in the probability map to obtain a target map; and then, based on the target atlas and the plurality of second sample images, performing model training by taking the image blocks masked out in the recovered second sample images as targets until a first cut-off condition of the training is met, and obtaining a pre-training model. The first cut-off condition may specifically be the number of iterations, or may also be the image restoration accuracy of the pre-training model.
The larger a probability value in the probability map, the more likely the corresponding pixel point belongs to the target object, and the easier that point is to learn; the smaller a non-zero value, the harder that point is to segment. Therefore, in order to learn the edge of the target object more accurately, a negation operation can be performed on each probability in the probability map before model training. The purpose of the negation operation is to change originally large probabilities into small ones.
For example, the negation operation may specifically be that no negation operation is performed on the probability with a value of zero, and for the probability with a value of non-zero, a value obtained by subtracting the probability from 1 is calculated, and the obtained value is used as a result of the negation operation to form the probability in the target map.
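The negation operation just described amounts to one vectorized line; a minimal NumPy sketch:

```python
import numpy as np

def negate_atlas(atlas):
    # zero probabilities are left untouched; non-zero probabilities become 1 - p,
    # so confident (large-p) points get small values and hard points get large ones
    return np.where(atlas > 0, 1.0 - atlas, 0.0)

a = np.array([0.0, 0.2, 1.0])
t = negate_atlas(a)  # [0.0, 0.8, 0.0]
```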
Before training of the pre-training model, each second sample image needs to be divided into a plurality of image blocks, and at least one image block in each second sample image needs to be masked. For example, for a certain second sample image, a plurality of image blocks formed by dividing the second sample image may be arranged in a queue, then the order of the image blocks in the queue is scrambled, and then 75% of the image blocks arranged at the tail of the queue are masked.
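The split / shuffle / mask-75% procedure can be sketched as follows. A 2D image and a fixed seed are assumed for brevity; the disclosure's sample images are three-dimensional.

```python
import numpy as np

def split_and_mask(image, patch, mask_ratio=0.75, seed=0):
    # cut the sample into non-overlapping patches and arrange them in a queue
    h, w = image.shape
    patches = [image[i:i + patch, j:j + patch]
               for i in range(0, h, patch) for j in range(0, w, patch)]
    # scramble the order of the queue
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(patches))
    # mask the trailing 75% of the shuffled queue
    n_keep = int(len(patches) * (1 - mask_ratio))
    visible_idx, masked_idx = order[:n_keep], order[n_keep:]
    return patches, visible_idx, masked_idx

img = np.arange(16).reshape(4, 4)
patches, vis, msk = split_and_mask(img, patch=2)
```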
When the pre-training model is trained, the remaining image blocks of each second sample image are input into the pre-training model to be trained, and the model outputs a predicted restored image for each second sample image. Image restoration loss information is then determined based on the second sample images, the predicted restored images, and the target map. Finally, based on this loss information, the model is trained with the objective of recovering the masked image blocks of each second sample image, to obtain the trained pre-training model.
Exemplarily, when determining the image restoration loss information, the loss of each pixel point in each masked image block is first determined based on the second sample images and the predicted restored images. This loss is then weighted by the probability of the corresponding pixel point in the target map, which drives the pre-training model to focus during training on difficult, infrequent pixel points such as edge points, effectively improving training precision. Finally, the weighted losses of all pixel points in all masked image blocks are summed to obtain the image restoration loss information.
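The weighted summation can be sketched as below. A squared-error per-pixel loss is assumed here for illustration (the disclosure does not fix the per-pixel loss), and `masked` marks the pixels belonging to masked-off blocks.

```python
import numpy as np

def weighted_restoration_loss(pred, target, weight_map, masked):
    # per-pixel loss (squared error assumed), weighted by the negated-atlas
    # probability, restricted to pixels of masked-off image blocks, then summed
    per_pixel = (pred - target) ** 2
    return float(np.sum(per_pixel * weight_map * masked))

pred = np.array([1.0, 0.0])
target = np.array([0.0, 0.0])
w = np.array([0.5, 1.0])  # negated-atlas weights
m = np.array([1.0, 0.0])  # 1 where the pixel was masked off
loss = weighted_restoration_loss(pred, target, w, m)  # 0.5
```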
For example, as shown in fig. 2, the image blocks 2B remaining after part of a second sample image 2A is masked off (or their information) may be input into the encoder of the pre-training model. The encoder performs image feature processing on the input to obtain encoded information 2C, and the information of the masked image blocks is inserted into the queue or list formed by the information of the unmasked image blocks, according to the positions of the masked blocks in the image. The information in this queue or list is then input into the decoder of the pre-training model, which processes the input features to obtain decoded information 2D; this includes the information of the restored image blocks and, of course, that of the unmasked image blocks. Finally, the restored second sample image 2E, i.e., the above-described predicted restored image, can be generated based on the decoded information.
All the masked image patches are commonly represented by a learnable vector, i.e., all the masked patches share the vector, so that the pre-trained model knows that the position is masked.
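A minimal sketch of the shared mask token. NumPy stands in for a deep-learning framework here: `mask_token` would be a learnable parameter updated by the optimizer, and the dimensions are assumed for illustration.

```python
import numpy as np

embed_dim, n_patches = 8, 4
# one vector shared by every masked position; in a real model this is a
# learnable parameter, here it is simply initialized randomly
mask_token = np.random.default_rng(0).normal(size=embed_dim)

def insert_mask_tokens(visible_tokens, visible_idx, n_patches):
    # rebuild the full token sequence: encoder outputs at visible positions,
    # the shared mask token everywhere else
    seq = np.tile(mask_token, (n_patches, 1))
    seq[visible_idx] = visible_tokens
    return seq

seq = insert_mask_tokens(np.ones((2, embed_dim)), [0, 2], n_patches)
```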
For example, an image block or information of an image block may be encoded by an encoder as shown in fig. 3.
The training process of the pre-training model needs no supervision information, i.e., no labeling information; it is a self-supervised learning and training process. This reduces the requirements on training samples: because unlabeled training samples are used, a large number of training samples can be obtained easily, which improves training precision.
In some embodiments, the training of the pre-training model based on the labeling information of the target object of the multiple third sample images and the third sample images to obtain the image segmentation model for the target object may specifically be implemented by using the following steps:
firstly, each third sample image is divided into a plurality of image blocks, and the image blocks corresponding to the third sample images are input into a pre-training model to obtain a prediction division image corresponding to each third sample image; then, determining image segmentation loss information based on the labeling information of each third sample image to the target object and each prediction segmentation image; and finally, training the pre-training model based on the image segmentation loss information until a second cutoff condition of training is met, and obtaining an image segmentation model for the target object. The second cutoff condition may specifically be the number of iterations, or the segmentation accuracy of the image segmentation model.
For example, as shown in fig. 4, information of an image block 4B or an image block corresponding to a certain third sample image 4A is input to an encoder in a pre-training model, the encoder performs encoding processing on the input image block or information to obtain encoded information 4C, then further performs information processing on the obtained encoded information 4C to obtain processed information 4D, then inputs the processed information 4D to a decoder in the pre-training model, and obtains decoded information 4E after the input information or feature is processed by the decoder, where the decoded information 4E includes partition information of a target object, and finally, a predicted partition image 4F can be generated based on the decoded information 4E.
The image segmentation loss information includes a plurality of types of image segmentation sub-loss information, and may be, for example, cross entropy sub-loss information or Dice sub-loss information. The pre-training model may be trained based on the image segmentation sub-loss information of the plurality of classes, and the image segmentation model for the target object may be obtained by:
firstly, determining target loss information based on image segmentation sub-loss information of a plurality of categories; and finally, training the pre-training model based on the target loss information to obtain an image segmentation model for the target object.
When determining the target loss information, the sum of the plurality of categories of image segmentation sub-loss information may be used as the target loss information. Alternatively, the target loss information may be obtained by a weighted summation of the plurality of categories of image segmentation sub-loss information.
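A minimal sketch of the two sub-losses and their weighted combination follows. The NumPy formulation, the function names, and the default equal weights are illustrative assumptions; the patent does not fix a specific formula:

```python
import numpy as np

def cross_entropy_sub_loss(pred, target, eps=1e-7):
    # Binary cross-entropy between predicted foreground probabilities and labels.
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

def dice_sub_loss(pred, target, eps=1e-7):
    # 1 - Dice coefficient; eps keeps the ratio defined for empty masks.
    inter = np.sum(pred * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps))

def target_loss(pred, target, weights=(1.0, 1.0)):
    # Weighted summation of the sub-losses; equal weights reduce to a plain sum.
    return (weights[0] * cross_entropy_sub_loss(pred, target)
            + weights[1] * dice_sub_loss(pred, target))

target = np.array([[1.0, 0.0], [1.0, 0.0]])
good = np.array([[0.99, 0.01], [0.99, 0.01]])
bad = np.array([[0.01, 0.99], [0.01, 0.99]])
print(target_loss(good, target) < target_loss(bad, target))  # True
```

An accurate prediction yields a smaller target loss than an inverted one, as the final comparison shows.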
The initial parameters of the image segmentation model are the parameters obtained by training the pre-training model; that is, high-quality initial parameters are used for training the image segmentation model. This transfer-learning approach improves segmentation accuracy while reducing the number of labeled training samples required, which makes it well suited to training segmentation models for medical image segmentation. For example, an Adam optimizer with a learning rate of 10^-5 may be used for 200 training epochs in a specific training process, finally obtaining an image segmentation model for liver segmentation.
In summary, as shown in fig. 5, the network model training method of the present disclosure may include the following steps:
firstly, a probability map is generated using a plurality of first sample images having labeling information for the target object;
secondly, self-supervised learning is performed using the probability map and a plurality of second sample images without labeling information, with recovering the masked-off image blocks in each second sample image as the goal, to obtain a pre-training model;
and thirdly, the pre-training model is further trained using a plurality of third sample images having labeling information for the target object, to obtain a trained image segmentation model.
As shown in fig. 6, in the first step, when determining the probability map, a mask image corresponding to each first sample image is first generated, and the probability map is then generated from each mask image and the labeling information of each first sample image.
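The probability map can be understood as a per-pixel frequency over the aligned mask images. The following sketch is illustrative (the helper name `probability_map` is hypothetical, and it assumes binary masks already resampled to the common preset size):

```python
import numpy as np

def probability_map(mask_images):
    # mask_images: binary masks (1 = target object), all at a common preset size;
    # the per-pixel mean estimates P(pixel belongs to the target object).
    return np.stack(mask_images, axis=0).astype(np.float32).mean(axis=0)

masks = [np.array([[1, 0], [1, 1]]),
         np.array([[1, 0], [0, 1]])]
prob = probability_map(masks)  # per-pixel frequencies: [[1., 0.], [0.5, 1.]]
print(prob)
```

A pixel marked as the target object in every mask receives probability 1, while a pixel never marked receives 0.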
As shown in fig. 6, in the second step, when the pre-training model is trained, a self-supervised learning framework is first constructed. The image blocks remaining after part of the image blocks of each second sample image are masked off are then input into the pre-training model to be trained, which outputs a predicted restored image. Next, loss information is determined for each pixel point in each masked-off image block based on the predicted restored image and the second sample image, and the per-pixel losses are weighted. Finally, image restoration loss information is determined based on the weighted loss information; for example, it may be the sum of the weighted individual losses.
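One way to realize a weighted per-pixel restoration loss restricted to the masked-off blocks is sketched below. The squared-error form and the use of a map as a per-pixel weight are assumptions for illustration; the patent leaves the exact loss formula open:

```python
import numpy as np

def restoration_loss(pred, original, weight_map, masked_region):
    """Per-pixel loss on the masked-off blocks, weighted by a map.

    pred, original: restored and ground-truth images, shape (H, W)
    weight_map:     per-pixel weights, e.g. derived from the probability map
    masked_region:  binary map, 1 where image blocks were masked off
    """
    per_pixel = (pred - original) ** 2 * weight_map
    return float(np.sum(per_pixel * masked_region))

original = np.ones((4, 4))
pred = np.zeros((4, 4))
weight_map = np.full((4, 4), 0.5)
masked = np.zeros((4, 4))
masked[:2, :2] = 1  # one 2x2 block was masked off
print(restoration_loss(pred, original, weight_map, masked))  # 2.0
```

Only the four masked pixels contribute, each with squared error 1 scaled by weight 0.5, so the total is 2.0.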
In this embodiment, the pre-training model is first trained on a large number of general training samples without labeling information so that it learns general image features, and is then transferred to the target task in a targeted manner; this pre-training can be realized with self-supervised learning techniques. The trained parameters of the pre-training model are used as the initialization parameters of the image segmentation model for transfer learning, finally yielding a high-accuracy image segmentation model. The method of this embodiment can be applied in the field of medical image segmentation, where labeled data is scarce. By combining self-supervised learning with transfer learning to obtain the pre-training model, and by fusing prior knowledge about the target object, namely the probability map, into the pre-training, the pre-training model is prompted to focus its learning on important regions of the target object during training, which further improves the segmentation accuracy of the image segmentation model obtained after transfer to the target-object segmentation task.
The method of this embodiment is well suited to medical image segmentation: based on the characteristics of objects in the medical imaging field, probability-map priors are applied within a self-supervised learning framework, and the pre-training model obtained by training under that framework is then used for transfer learning in medical image segmentation. On one hand, a pre-training model obtained by self-supervised learning can effectively improve the accuracy of a medical image segmentation task when few samples are available; on the other hand, combining prior knowledge (the probability map) with the self-supervised learning framework allows the framework to focus more specifically on the target object of image segmentation, further improving the transferability of the pre-training model to the downstream task (the medical image segmentation task).
As shown in fig. 7, the present disclosure further provides a training method of an image restoration model, which may specifically include the following steps:
and S710, generating a probability map corresponding to the target object based on the labeling information of the plurality of first sample images aiming at the target object.
And S720, based on the probability map and the plurality of second sample images, performing model training by taking the masked image blocks in the recovered second sample images as targets to obtain an image recovery model.
Steps S710 to S720 are the same as steps S110 to S120 in the above embodiment, and the image restoration model corresponds to the pre-training model, so the same contents are not repeated herein.
In some embodiments, the probability map includes the probability that each pixel point in an image of a preset size belongs to the target object. Performing model training based on the probability map and the plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain an image recovery model may be realized by the following steps:
firstly, a negation operation is performed on each probability included in the probability map to obtain a target map; then, based on the target map and the plurality of second sample images, model training is performed with recovering the masked-off image blocks in each second sample image as the goal, to obtain the image recovery model.
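The negation operation itself is simple; a sketch follows, assuming probabilities are stored as floats in [0, 1] (the helper name is hypothetical):

```python
import numpy as np

def negate_probability_map(prob_map):
    # Pixels unlikely to belong to the target object (e.g. edges, background)
    # receive large weights, steering training toward the hard-to-learn regions.
    return 1.0 - prob_map

prob_map = np.array([[0.9, 0.2], [0.5, 0.0]])
target_map = negate_probability_map(prob_map)  # [[0.1, 0.8], [0.5, 1.0]]
print(target_map)
```

The resulting target map can then serve as the per-pixel weight source during restoration training.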
Training with the probability map and unlabeled images reduces the requirements on training samples, since a large number of unlabeled samples are easy to obtain. Moreover, combining prior knowledge (the probability map) with self-supervised learning allows the learning to focus more specifically on the target object, yielding a higher-accuracy image recovery model.
The larger a value in the probability map, the more likely the corresponding pixel point belongs to the target object and the easier that pixel is to learn; the smaller the value, the harder the pixel point generally is to segment. Therefore, in order to learn the edges of the target object more accurately, a negation operation can be performed on each probability in the probability map before model training, which helps improve the accuracy of restoring the edges of the target object.
Based on the same inventive concept, an embodiment of the present disclosure further provides a network model training apparatus corresponding to the network model training method, for training an image segmentation model.
Fig. 8 is a schematic structural diagram of a network model training apparatus provided in an embodiment of the present disclosure, which includes:
the first atlas determination module 810 is configured to generate a probability atlas corresponding to the target object based on the labeling information of the plurality of first sample images for the target object.
The pre-training module 820 is configured to perform model training based on the probability map and the plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain a pre-training model.
The segmentation model training module 830 is configured to train the pre-training model based on a plurality of third sample images and their labeling information for the target object, to obtain an image segmentation model for the target object.
In some embodiments, the probability map includes probabilities that respective pixel points in an image of a preset size belong to a target object;
the pre-training module 820 is specifically configured to:
negation operation is carried out on each probability included in the probability map to obtain a target map;
and performing model training based on the target map and the plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain a pre-training model.
In some embodiments, pre-training module 820 is specifically configured to:
for each second sample image, dividing the second sample image into a plurality of image blocks, and masking at least one image block;
inputting the residual image blocks of each second sample image into a pre-training model to be trained to obtain a prediction restoration image corresponding to each second sample image;
determining image restoration loss information based on each second sample image, each predicted and restored image and the target map;
and training the pre-training model to be trained based on the image restoration loss information, with recovering the masked-off image blocks in each second sample image as the goal, to obtain the trained pre-training model.
In some embodiments, the image of the preset size has the same resolution and size as the second sample image.
In some embodiments, the first map determination module 810 is specifically configured to:
for each first sample image, generating a mask image of the first sample image for the target object based on the labeling information of the first sample image for the target object;
and generating a probability map corresponding to the target object based on the mask image corresponding to each first sample image.
In some embodiments, the first map determination module 810 is specifically configured to:
performing first preprocessing operation on the first sample image to obtain a first image with a preset resolution;
performing second preprocessing operation on the first image to obtain a second image with a preset size;
and generating a mask image corresponding to the target object based on the labeling information of the first sample image for the target object and the second image.
In some embodiments, the segmentation model training module 830 is specifically configured to:
dividing each third sample image into a plurality of image blocks, and inputting the image blocks corresponding to each third sample image into the pre-training model to obtain a predicted segmentation image corresponding to each third sample image;
determining image segmentation loss information based on the labeling information of each third sample image to the target object and each predicted segmentation image;
and training the pre-training model based on the image segmentation loss information to obtain an image segmentation model for the target object.
In some embodiments, the image segmentation loss information comprises a plurality of classes of image segmentation sub-loss information;
the segmentation model training module 830 is specifically configured to:
determining target loss information based on the image segmentation sub-loss information of the plurality of categories;
and training the pre-training model based on the target loss information to obtain an image segmentation model aiming at the target object.
Based on the same inventive concept, the embodiment of the present disclosure further provides a network model training device corresponding to the network model training method, which is used for training an image recovery model.
As shown in fig. 9, a schematic structural diagram of a network model training apparatus provided in the embodiment of the present disclosure includes:
a second map determining module 910, configured to generate a probability map corresponding to the target object based on the labeling information of the plurality of first sample images for the target object;
and the recovery model training module 920, configured to perform model training based on the probability map and the plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain an image recovery model.
In some embodiments, the probability map includes probabilities that respective pixel points in an image of a preset size belong to a target object;
the recovery model training module 920 is specifically configured to:
negation operation is carried out on each probability included in the probability map to obtain a target map;
and performing model training based on the target map and the plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain an image recovery model.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of related users all comply with the provisions of relevant laws and regulations, and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 10 illustrates a schematic block diagram of an example electronic device 1000 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 10, the device 1000 includes a computing unit 1010, which may perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 1020 or a computer program loaded from a storage unit 1080 into a random access memory (RAM) 1030. The RAM 1030 may also store various programs and data required for the operation of the device 1000. The computing unit 1010, the ROM 1020, and the RAM 1030 are connected to each other by a bus 1040. An input/output (I/O) interface 1050 is also connected to the bus 1040.
A number of components in device 1000 are connected to I/O interface 1050, including: an input unit 1060 such as a keyboard, a mouse, or the like; an output unit 1070 such as various types of displays, speakers, and the like; a storage unit 1080, such as a magnetic disk, optical disk, or the like; and a communication unit 1090 such as a network card, modem, wireless communication transceiver, or the like. A communication unit 1090 allows device 1000 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1010 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1010 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 1010 performs the various methods and processes described above, such as the network model training method. For example, in some embodiments, the network model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1080. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1000 via the ROM 1020 and/or the communication unit 1090. When loaded into the RAM 1030 and executed by the computing unit 1010, the computer program may perform one or more steps of the network model training method described above. Alternatively, in other embodiments, the computing unit 1010 may be configured to perform the network model training method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server with a combined blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (24)

1. A network model training method, comprising:
generating a probability map corresponding to a target object based on labeling information of a plurality of first sample images for the target object;
performing model training based on the probability map and a plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain a pre-training model;
and training the pre-training model based on a plurality of third sample images and their labeling information for the target object, to obtain an image segmentation model for the target object.
2. The method according to claim 1, wherein the probability map comprises probabilities that respective pixel points in an image of a preset size belong to a target object;
the model training is performed by taking the image blocks masked off in each second sample image as targets to recover based on the probability map and the plurality of second sample images to obtain a pre-training model, and the method comprises the following steps:
negation operation is carried out on each probability included in the probability map to obtain a target map;
and performing model training by taking the image blocks masked out in the recovered second sample images as targets based on the target atlas and the plurality of second sample images to obtain a pre-training model.
3. The method according to claim 2, wherein the model training with the object of recovering the masked image blocks in the second sample images as the object based on the target atlas and the second sample images to obtain a pre-training model comprises:
for each second sample image, dividing the second sample image into a plurality of image blocks, and masking at least one image block;
inputting the residual image blocks of each second sample image into a pre-training model to be trained to obtain a prediction restoration image corresponding to each second sample image;
determining image restoration loss information based on each second sample image, each predicted and restored image and the target map;
and training the pre-training model to be trained by taking the image blocks masked out in each second sample image as targets based on the image recovery loss information to obtain the trained pre-training model.
4. A method according to claim 2 or 3, wherein the image of the preset size has the same resolution and size as the second sample image.
5. The method according to any one of claims 1 to 4, wherein the generating a probability map corresponding to a target object based on the labeling information of the plurality of first sample images for the target object comprises:
for each first sample image, generating a mask image of the first sample image for a target object based on the labeling information of the first sample image for the target object;
and generating a probability map corresponding to the target object based on the mask image corresponding to each first sample image.
6. The method of claim 5, wherein the generating a mask image of the first sample image for a target object based on the labeling information of the first sample image for the target object comprises:
performing first preprocessing operation on the first sample image to obtain a first image with a preset resolution;
performing second preprocessing operation on the first image to obtain a second image with a preset size;
and generating a mask image corresponding to the target object based on the labeling information of the first sample image for the target object and the second image.
7. The method according to any one of claims 1 to 6, wherein the training the pre-trained model based on labeling information of a target object for a plurality of third sample images and third sample images to obtain an image segmentation model for the target object comprises:
dividing each third sample image into a plurality of image blocks, and inputting the image blocks of each third sample image into the pre-training model to obtain a predicted segmentation image corresponding to each third sample image;
determining image segmentation loss information based on the labeling information of each third sample image to the target object and each predicted segmentation image;
and training the pre-training model based on the image segmentation loss information to obtain an image segmentation model for the target object.
8. The method of claim 7, wherein the image segmentation loss information includes a plurality of classes of image segmentation sub-loss information;
the training the pre-training model based on the image segmentation loss information to obtain an image segmentation model for the target object, including:
determining target loss information based on the image segmentation sub-loss information of the plurality of classes;
and training the pre-training model based on the target loss information to obtain an image segmentation model for the target object.
9. The method of any of claims 1 to 8, wherein the first sample image comprises a medical image; and/or the presence of a gas in the gas,
the second sample image comprises a medical image; and/or the presence of a gas in the gas,
the third sample image comprises a medical image.
10. A network model training method, comprising:
generating a probability map corresponding to a target object based on labeling information of a plurality of first sample images for the target object;
and performing model training based on the probability map and a plurality of second sample images, with recovering the masked-off image blocks in each second sample image as the goal, to obtain an image recovery model.
11. The method according to claim 10, wherein the probability map comprises probabilities that respective pixel points in an image of a preset size belong to a target object;
the model training is performed by taking the image blocks masked off in each second sample image as targets to recover based on the probability atlas and the plurality of second sample images to obtain an image recovery model, and the method comprises the following steps:
negation operation is carried out on each probability included in the probability map to obtain a target map;
and performing model training by taking the image blocks masked out in the recovered second sample images as targets based on the target atlas and the plurality of second sample images to obtain an image recovery model.
12. A network model training apparatus comprising:
the first map determining module is used for generating a probability map corresponding to a target object based on the labeling information of a plurality of first sample images for the target object;
the pre-training module is used for carrying out model training by taking the image blocks which are masked off in the recovered second sample images as targets based on the probability map and the plurality of second sample images to obtain a pre-training model;
and the segmentation model training module is used for training the pre-training model based on the labeling information of the target object of the third sample images and the third sample images to obtain an image segmentation model for the target object.
13. The apparatus according to claim 12, wherein the probability map includes probabilities that respective pixel points in an image of a preset size belong to a target object;
the pre-training module is specifically configured to:
negation operation is carried out on each probability included in the probability map to obtain a target map;
and performing model training by taking the image blocks masked out in the recovered second sample images as targets based on the target atlas and the plurality of second sample images to obtain a pre-training model.
14. The apparatus of claim 13, wherein the pre-training module is specifically configured to:
for each second sample image, dividing the second sample image into a plurality of image blocks, and masking at least one image block;
inputting the residual image blocks of each second sample image into a pre-training model to be trained to obtain a prediction restoration image corresponding to each second sample image;
determining image restoration loss information based on each second sample image, each predicted and restored image and the target map;
and training the pre-training model to be trained by taking the image blocks masked out in each second sample image as targets based on the image recovery loss information to obtain the trained pre-training model.
15. The apparatus of claim 13 or 14, wherein the image of the preset size has the same resolution and size as the second sample image.
16. The apparatus according to any one of claims 12 to 15, wherein the first atlas determination module is specifically configured to:
for each first sample image, generating a mask image of the first sample image for a target object based on the labeling information of the first sample image for the target object;
and generating a probability map corresponding to the target object based on the mask image corresponding to each first sample image.
17. The apparatus of claim 16, wherein the first map determination module is specifically configured to:
performing a first preprocessing operation on the first sample image to obtain a first image with a preset resolution;
performing a second preprocessing operation on the first image to obtain a second image with a preset size;
and generating the mask image corresponding to the target object based on the labeling information of the first sample image for the target object and the second image.
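Claim 17 leaves the two preprocessing operations unspecified; one common pairing is resampling to a preset resolution followed by cropping or padding to a preset size. The second step can be sketched as below (the crop/pad choice and the top-left anchoring are assumptions).

```python
# Hypothetical second preprocessing for claim 17: force a 2-D image to a
# preset height/width by cropping overflow and zero-padding shortfall
# (top-left anchored for brevity; real pipelines often center the crop).
def crop_or_pad(image, out_h, out_w, fill=0):
    rows = [row[:out_w] + [fill] * max(out_w - len(row), 0)
            for row in image[:out_h]]
    rows += [[fill] * out_w for _ in range(out_h - len(rows))]
    return rows
```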
18. The apparatus of any one of claims 12 to 17, wherein the segmentation model training module is specifically configured to:
dividing each third sample image into a plurality of image blocks, and inputting the image blocks corresponding to each third sample image into the pre-training model to obtain a predicted segmented image corresponding to each third sample image;
determining image segmentation loss information based on the labeling information of each third sample image for the target object and each predicted segmented image;
and training the pre-training model based on the image segmentation loss information to obtain an image segmentation model for the target object.
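The block division in claim 18 (and in the masking pipeline of claim 14) presupposes splitting an image into non-overlapping blocks. A minimal sketch, assuming the image dimensions are divisible by the block size:

```python
# Illustrative block division for claims 14 and 18: split a 2-D image into
# non-overlapping bh x bw blocks, in row-major order (divisibility of the
# image dimensions by the block size is assumed).
def split_into_blocks(image, bh, bw):
    h, w = len(image), len(image[0])
    return [[row[x:x + bw] for row in image[y:y + bh]]
            for y in range(0, h, bh) for x in range(0, w, bw)]
```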
19. The apparatus of claim 18, wherein the image segmentation loss information comprises a plurality of classes of image segmentation sub-loss information;
the segmentation model training module is specifically configured to:
determining target loss information based on the image segmentation sub-loss information of the plurality of classes;
and training the pre-training model based on the target loss information to obtain an image segmentation model for the target object.
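Claim 19 does not fix how the sub-loss information of the plurality of classes is combined into the target loss; a weighted sum over the per-class sub-losses (e.g. cross-entropy and Dice terms) is one conventional choice, sketched here under that assumption.

```python
# Hypothetical combination rule for claim 19: the target loss as a weighted
# sum of per-class image segmentation sub-losses (the weights, and the sum
# itself, are assumptions; the claim names no combination rule).
def combine_sub_losses(sub_losses, weights=None):
    """sub_losses: list of scalar losses, one per class of sub-loss."""
    if weights is None:
        weights = [1.0] * len(sub_losses)
    return sum(w * l for w, l in zip(weights, sub_losses))
```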
20. A network model training apparatus comprising:
a second map determination module, configured to generate a probability map corresponding to a target object based on labeling information of a plurality of first sample images for the target object;
and a recovery model training module, configured to perform model training, based on the probability map and a plurality of second sample images, with the objective of recovering the image blocks masked out of the second sample images, to obtain an image recovery model.
21. The apparatus of claim 20, wherein the probability map comprises probabilities that respective pixel points in an image of a preset size belong to the target object;
the recovery model training module is specifically configured to:
performing a negation operation on each probability included in the probability map to obtain a target map;
and performing model training, based on the target map and the plurality of second sample images, with the objective of recovering the image blocks masked out of the second sample images, to obtain the image recovery model.
22. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 11.
23. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1 to 11.
24. A computer program product comprising a computer program/instructions, wherein the computer program/instructions, when executed by a processor, implement the method of any one of claims 1 to 11.
CN202210138535.XA 2022-02-15 2022-02-15 Network model training method, device, equipment, medium and program product Pending CN114463361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210138535.XA CN114463361A (en) 2022-02-15 2022-02-15 Network model training method, device, equipment, medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210138535.XA CN114463361A (en) 2022-02-15 2022-02-15 Network model training method, device, equipment, medium and program product

Publications (1)

Publication Number Publication Date
CN114463361A true CN114463361A (en) 2022-05-10

Family

ID=81412753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210138535.XA Pending CN114463361A (en) 2022-02-15 2022-02-15 Network model training method, device, equipment, medium and program product

Country Status (1)

Country Link
CN (1) CN114463361A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117474932A (en) * 2023-12-27 2024-01-30 苏州镁伽科技有限公司 Object segmentation method and device, electronic equipment and storage medium
CN117474932B (en) * 2023-12-27 2024-03-19 苏州镁伽科技有限公司 Object segmentation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112966742A (en) Model training method, target detection method and device and electronic equipment
CN113963110B (en) Texture map generation method and device, electronic equipment and storage medium
CN114186632A (en) Method, device, equipment and storage medium for training key point detection model
CN113627536B (en) Model training, video classification method, device, equipment and storage medium
CN112580666A (en) Image feature extraction method, training method, device, electronic equipment and medium
CN115565177B (en) Character recognition model training, character recognition method, device, equipment and medium
CN114549840A (en) Training method of semantic segmentation model and semantic segmentation method and device
CN114973279B (en) Training method and device for handwritten text image generation model and storage medium
CN113344089A (en) Model training method and device and electronic equipment
CN113393371A (en) Image processing method and device and electronic equipment
CN114511743B (en) Detection model training, target detection method, device, equipment, medium and product
CN115147680A (en) Pre-training method, device and equipment of target detection model
CN114078097A (en) Method and device for acquiring image defogging model and electronic equipment
CN114549904A (en) Visual processing and model training method, apparatus, storage medium, and program product
CN114494747A (en) Model training method, image processing method, device, electronic device and medium
CN114463361A (en) Network model training method, device, equipment, medium and program product
CN113361519B (en) Target processing method, training method of target processing model and device thereof
CN115565186A (en) Method and device for training character recognition model, electronic equipment and storage medium
CN113379592A (en) Method and device for processing sensitive area in picture and electronic equipment
CN113903071A (en) Face recognition method and device, electronic equipment and storage medium
CN114445668A (en) Image recognition method and device, electronic equipment and storage medium
CN114093006A (en) Training method, device and equipment of living human face detection model and storage medium
CN115641481A (en) Method and device for training image processing model and image processing
CN113947146A (en) Sample data generation method, model training method, image detection method and device
CN113361719A (en) Incremental learning method based on image processing model and image processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination