WO2023151581A1 - Method, device, storage medium and program product for image processing - Google Patents

Method, device, storage medium and program product for image processing Download PDF

Info

Publication number
WO2023151581A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
training
images
segmentation model
image segmentation
Prior art date
Application number
PCT/CN2023/074970
Other languages
English (en)
French (fr)
Inventor
边成
杨延展
李永会
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司 filed Critical 北京字节跳动网络技术有限公司
Publication of WO2023151581A1 publication Critical patent/WO2023151581A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Definitions

  • Various implementations of the present disclosure relate to the computer field, and more specifically, to an image processing method, device, storage medium, and computer program product.
  • Image segmentation is a typical task in image processing, and image segmentation methods based on machine learning have become one of the current mainstreams. In image segmentation methods based on machine learning, the labeling accuracy of training images will greatly affect the accuracy of image segmentation.
  • Weakly annotated images generally include, for example, images annotated based on a predetermined segmentation pattern (e.g., a circular, elliptical, or square pattern), while strongly annotated images generally include, for example, images annotated on a pixel-by-pixel basis; strongly annotated images therefore have more precise annotation information.
  • a method for image processing includes: obtaining a training image set, the training image set including a group of strongly labeled images and a group of weakly labeled images, the group of strongly labeled images having more accurate annotation information than the group of weakly labeled images; determining, for an image segmentation model to be trained, first gradient information associated with a target weakly labeled image in the group of weakly labeled images and second gradient information associated with the group of strongly labeled images; determining, based on the similarity between the first gradient information and the second gradient information, a target training weight associated with the target weakly labeled image; and training the image segmentation model using the group of strongly labeled images and the group of weakly labeled images, wherein the influence of the target weakly labeled image on the training is determined based on the target training weight.
  • an electronic device comprising a memory and a processor, wherein the memory is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method of the first aspect.
  • a computer-readable storage medium on which one or more computer instructions are stored, wherein the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.
  • a computer program product comprising one or more computer instructions, wherein the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.
  • the training weight for a single weakly labeled sample can be determined based on the strongly labeled samples. Therefore, the embodiments of the present disclosure can adaptively adjust the influence of weakly labeled samples on the training process, thereby improving the performance of the image segmentation model.
  • Figure 1 shows an example strongly annotated image and an example weakly annotated image
  • Figure 2 shows a schematic block diagram of a computing device capable of implementing some embodiments of the present disclosure
  • Fig. 3 shows a schematic diagram of determining training weights according to some embodiments of the present disclosure
  • FIG. 4 shows a schematic diagram of training an image segmentation model according to some embodiments of the present disclosure.
  • FIG. 5 shows a flowchart of an example method for image processing according to some embodiments of the present disclosure.
  • image segmentation is an important task, which can extract the region where the object of interest is located from the image.
  • image segmentation technology based on machine learning has become the focus of attention.
  • In image segmentation technology based on machine learning, the labeling accuracy of the training image set will directly affect the accuracy of the image segmentation model.
  • In this disclosure, the terms "model", "neural network", "learning model", "learning network", and "network" are used interchangeably.
  • training image samples can be divided into strongly labeled images and weakly labeled images, for example, where strongly labeled images have more accurate label information than weakly labeled images.
  • weakly annotated images generally include, for example, images annotated based on a predetermined segmentation pattern (e.g., a circular, elliptical, or square pattern), and strongly annotated images typically include, for example, images that are annotated on a pixel-by-pixel basis.
  • Fig. 1 shows exemplary strongly annotated images and weakly annotated images in the medical field.
  • a strongly annotated image 100 has, for example, pixel-by-pixel annotation information 120 that accurately identifies the outline of a lesion in the image.
  • the annotation information 140-1 to 140-4 corresponding to the weakly annotated images 130-1 to 130-4 cannot accurately indicate the outline of the lesion, which may be roughly annotated based on a predetermined segmentation pattern, for example.
  • a solution for image processing is provided.
  • a training image set can be obtained, wherein the training image set includes a group of strongly labeled images and a group of weakly labeled images.
  • the first gradient information and the second gradient information of the image segmentation model to be trained can be determined, and based on the similarity between the first gradient information and the second gradient information, the target training weights associated with the target weakly labeled image can be determined.
  • the first gradient information is determined based on a target weakly labeled image in a group of weakly labeled images
  • the second gradient information is determined based on the group of strongly labeled images.
  • the image segmentation model can be trained by using a set of strongly annotated images and a set of weakly annotated images, wherein the influence of the target weakly annotated image on the training is determined based on the target training weights.
  • the embodiments of the present disclosure can determine the training weight for a single weakly labeled sample based on the strongly labeled samples. Therefore, the embodiments of the present disclosure can adaptively adjust the influence of weakly labeled samples on the training process, thereby improving the performance of the image segmentation model.
  • FIG. 2 illustrates a schematic block diagram of an example computing device 200 that may be used to implement embodiments of the present disclosure. It should be understood that the device 200 shown in FIG. 2 is exemplary only and should not constitute any limitation on the functionality and scope of the implementations described in this disclosure. As shown in FIG. 2, components of the device 200 may include, but are not limited to, one or more processors or processing units 210, a memory 220, a storage device 230, one or more communication units 240, one or more input devices 250, and one or more output devices 260.
  • the device 200 may be implemented as various user terminals or service terminals.
  • the service terminal may be a server, a large computing device, etc. provided by various service providers.
  • User terminals may be any type of mobile, stationary, or portable terminal, including mobile handsets, multimedia computers, multimedia tablets, Internet nodes, communicators, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, personal communication system (PCS) devices, personal navigation devices, personal digital assistants (PDAs), audio/video players, digital cameras/camcorders, pointing devices, television receivers, radio broadcast receivers, e-book devices, gaming devices, or any combination thereof, including accessories and peripherals for these devices or any combination thereof.
  • device 200 can support any type of user-directed interface (such as a "wearable" device).
  • the processing unit 210 may be an actual or virtual processor and is capable of performing various processes according to programs stored in the memory 220. In a multi-processor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capability of the device 200.
  • the processing unit 210 may also be referred to as a central processing unit (CPU), a microprocessor, a controller, or a microcontroller.
  • Device 200 typically includes a plurality of computer storage media. Such media can be any available media that is accessible by device 200, including but not limited to, volatile and nonvolatile media, removable and non-removable media.
  • Memory 220 can be volatile memory (e.g., registers, cache, random access memory (RAM)), nonvolatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof.
  • Memory 220 may include one or more training modules 225 that are configured to perform the functions of the various implementations described herein. The training module 225 can be accessed and executed by the processing unit 210 to implement the corresponding functions.
  • Storage device 230 may be a removable or non-removable medium, and may include machine-readable media that can be used to store information and/or data and that can be accessed within device 200 .
  • device 200 may be implemented in a single computing cluster or as a plurality of computing machines capable of communicating via communication links.
  • device 200 may operate in a networked environment using logical connections to one or more other servers, personal computers (PCs), or another general network node.
  • the device 200 can also communicate through the communication unit 240, as required, with one or more external devices (not shown), such as a database 245, other storage devices, servers, and display devices, with one or more devices that allow users to interact with the device 200, or with any device (e.g., a network card, a modem, etc.) that enables the device 200 to communicate with one or more other computing devices. Such communication may be performed via an input/output (I/O) interface (not shown).
  • the input device 250 may be one or more of various input devices, such as a mouse, a keyboard, a trackball, a voice input device, a camera, and the like.
  • Output device 260 may be one or more output devices, such as a display, speakers, printer, or the like.
  • the device 200 may obtain a training image set 270 , and the training image set 270 may include a set of strongly annotated images 272 and a set of weakly annotated images 274 .
  • the set of strongly annotated images 272 may be annotated on a pixel-by-pixel basis, and the set of weakly annotated images 274 may be annotated based on a predetermined segmentation pattern.
  • the device 200 may train an image segmentation model 280 based on the set of training images 270 .
  • the trained parameters of the image segmentation model 280 may be stored in the storage device 230, or provided to other computing devices for deployment of the image segmentation model 280.
  • the device 200 may also provide a trained image segmentation model for determining an image segmentation result based on the received input image.
  • the input image may be, for example, a medical image
  • the image segmentation model 280 may be configured to determine segmentation information associated with lesion sites in the medical image.
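The deployment step described above, feeding an input image to a trained model and returning per-pixel segmentation information, can be sketched as follows. The callable `model` interface and the toy thresholding "model" are illustrative assumptions, not the disclosure's actual network:

```python
import numpy as np

def segment_image(model, image):
    """Apply a trained segmentation model to a single input image.

    `model` is assumed to be a callable mapping an (H, W, 3) array to an
    (H, W, C) array of per-pixel class scores; the interface is illustrative.
    """
    scores = model(image)           # (H, W, C) per-pixel class scores
    return scores.argmax(axis=-1)   # (H, W) predicted class indices

# Toy stand-in "model": class 1 wherever the red channel exceeds 0.5.
def toy_model(image):
    fg = (image[..., 0] > 0.5).astype(float)
    return np.stack([1.0 - fg, fg], axis=-1)

image = np.zeros((4, 4, 3))
image[1:3, 1:3, 0] = 1.0            # a bright square standing in for a lesion
mask = segment_image(toy_model, image)
```

A real deployment would replace `toy_model` with the trained image segmentation model 280 and post-process the predicted mask as needed.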
  • the training module 225 may acquire a training image set 270, which may include, for example, a set of strongly annotated images 272 and a set of weakly annotated images 274.
  • the set of strongly labeled images 272 can be expressed as $\mathcal{D}_S = \{(x_i, y_i)\}_{i=1}^{N}$, and the set of weakly labeled images 274 can be expressed as $\mathcal{D}_W = \{(x_k, y_k)\}_{k=1}^{M}$, where $x$ denotes an input image of size $H \times W \times 3$, and $y \in \{0, 1\}^{H \times W \times C}$ denotes the corresponding annotation information, i.e., the corresponding mask with $C$ categories.
  • the number N of strongly labeled images in the training image set 270 may be smaller than the number M of weakly labeled images.
  • the training module 225 can determine the training weights through a gradient descent method, and this problem can be expressed as a two-layer (bilevel) optimization problem, for example of the form:
$$\min_{\omega}\; \mathcal{L}_S(\theta^*(\omega)) \quad \text{s.t.} \quad \theta^*(\omega) = \arg\min_{\theta}\; \mathcal{L}_S(\theta) + \sum_{k=1}^{M} \omega_k\, \ell(x_k, y_k; \theta)$$
where $\omega_k$ is the training weight of the $k$-th weakly labeled image and $\mathcal{L}_S$ is the loss over the strongly labeled images.
  • the training module 225 may decompose training into a training weight determination phase and a parameter adjustment phase. In the stage of determining the training weights, the training module 225 can determine the training weights for the weakly labeled samples, and the parameters of the image segmentation model do not change. In the parameter adjustment stage, the training module 225 can adjust the parameters of the image segmentation model based on the determined training weights.
  • the training module 225 may also perform the training weight determination phase and the parameter adjustment phase in an overlapping manner. For example, the training module 225 may perform weight determination according to the adjusted model parameters after performing several parameter-adjustment iterations; it may then perform several more parameter-adjustment iterations based on the determined weights, and further perform weight determination according to the updated model parameters. Based on such overlapping execution of weight determination and parameter adjustment, the embodiments of the present disclosure can further improve the performance of the trained image segmentation model.
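The overlapping execution of the two phases can be sketched as a simple alternating loop; the `adjust_params` and `update_weights` callables are hypothetical stand-ins for the parameter-adjustment and weight-determination phases described above:

```python
import numpy as np

def train_alternating(adjust_params, update_weights, theta, weights,
                      num_rounds=3, adjust_iters=5):
    """Alternate parameter-adjustment and weight-determination phases.

    `adjust_params(theta, weights)` returns updated model parameters and
    `update_weights(theta, weights)` returns updated per-sample weights;
    both callables are placeholders for the two phases described above.
    """
    for _ in range(num_rounds):
        for _ in range(adjust_iters):             # several parameter iterations
            theta = adjust_params(theta, weights)
        weights = update_weights(theta, weights)  # re-derive weights afterwards
    return theta, weights

# Toy callables just to exercise the loop structure.
theta, weights = train_alternating(
    lambda t, w: t + 0.01 * w.sum(),   # toy parameter update
    lambda t, w: 0.9 * w,              # toy weight refresh
    0.0, np.ones(4))
```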
  • the training module 225 may determine a training weight for each weakly labeled sample.
  • the training module 225 can determine the upper-layer gradient according to the following formula: $\frac{\partial \mathcal{L}_S(\theta^*(\omega))}{\partial \omega_k} = -\nabla_\theta \mathcal{L}_S(\theta)^{\mathsf{T}} H^{-1} \nabla_\theta \ell(x_k, y_k; \theta)$, where $H$ is the Hessian of the lower-layer objective.
  • the training module 225 can also use the identity matrix $I$ to approximate the inverse Hessian matrix: $\frac{\partial \mathcal{L}_S(\theta^*(\omega))}{\partial \omega_k} \approx -\nabla_\theta \mathcal{L}_S(\theta)^{\mathsf{T}} \nabla_\theta \ell(x_k, y_k; \theta) \quad (4)$
  • formula (4) actually represents the similarity between the gradient of the image segmentation model corresponding to the weakly labeled image and the gradient of the image segmentation model corresponding to the strongly labeled image.
  • the weakly labeled image can be determined to have a beneficial effect on the training of the model, and thus can be assigned a larger training weight.
  • the training module 225 can input one or more strongly labeled images 305 in the group of strongly labeled images 272 into the initial image segmentation model 315 to determine a first prediction result 320. Further, the training module 225 can determine a corresponding loss function 330 based on the mask information $Y_S$ 340 corresponding to the one or more strongly labeled images 305, and thus determine the gradient information (also referred to as second gradient information) corresponding to the one or more strongly labeled images 305: $g_S = \nabla_\theta \mathcal{L}_S(\theta)$
  • the gradient information corresponding to the one or more strongly labeled images 305 may be determined based on the mean value of the gradients of the individual strongly labeled images. That is, the training module 225 may determine multiple gradients of the image segmentation model based on multiple strongly labeled images, and determine the gradient information based on the average value of the multiple gradients.
  • the training module 225 can also input the target weakly labeled image 310 (denoted as $x_k$) into the image segmentation model 315 to determine a second prediction result 325. Further, the training module 225 can determine a corresponding loss function based on the mask information $y_k$ 345 corresponding to the target weakly labeled image 310, and thus determine the gradient information (also referred to as first gradient information) corresponding to the weakly labeled image 310: $g_k = \nabla_\theta \ell(x_k, y_k; \theta)$
  • the training module 225 can determine the similarity 350 between the first gradient information and the second gradient information, which can be expressed, for example, as: $s_k = g_S^{\mathsf{T}} g_k \quad (8)$
  • the training module 225 can determine the target training weight $\omega_k$ based on the similarity 350 determined by formula (8), and this process can be expressed, for example, as: $\omega_k \leftarrow \max\!\left(0,\; \omega_k + \eta\, s_k\right) \quad (9)$
  • the target weakly labeled image 310 may be assigned an initial training weight, and iteratively adjusted according to (9) to determine the final target training weight.
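Putting the pieces together, the weight-determination step can be sketched as follows. A tiny linear model with an MSE loss stands in for the segmentation network, and the additive update with clipping to non-negative values is an assumption; the source only specifies that the weight follows the gradient similarity g_S^T g_k:

```python
import numpy as np

def grad_mse(theta, x, y):
    """Gradient of a linear per-pixel predictor's MSE loss w.r.t. theta.

    x: (P, F) per-pixel features, y: (P,) labels. A tiny stand-in for the
    segmentation network's parameter gradient.
    """
    return 2.0 * x.T @ (x @ theta - y) / len(y)

def weak_sample_weight(theta, strong_batch, weak_sample, lr=0.1, w0=1.0):
    """Weight a weakly labeled sample by gradient similarity.

    The strong gradient is the mean over the strong batch; the similarity is
    the dot product g_S.T @ g_k. The additive update and clipping are
    illustrative assumptions, not taken verbatim from the disclosure.
    """
    g_strong = np.mean([grad_mse(theta, x, y) for x, y in strong_batch], axis=0)
    g_weak = grad_mse(theta, *weak_sample)
    similarity = g_strong @ g_weak       # aligned gradients -> positive value
    return max(0.0, w0 + lr * similarity)

theta = np.zeros(2)
strong = [(np.eye(2), np.array([1.0, 1.0]))]
# A weak sample whose gradient agrees with the strong batch gains weight;
# one whose gradient opposes it loses weight.
w_aligned = weak_sample_weight(theta, strong, (np.eye(2), np.array([1.0, 1.0])))
w_opposed = weak_sample_weight(theta, strong, (np.eye(2), np.array([-1.0, -1.0])))
```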
  • the embodiments of the present disclosure can determine the training weight for each weakly labeled sample, thereby improving the granularity of control over weakly labeled samples in mixed sample training.
  • the training module 225 can further train the image segmentation model 315 based on the determined training weights. In some embodiments, the training module 225 can adjust the parameters of the image segmentation model 315 based on formula (2), for example. Formula (2) includes a first loss part corresponding to the strongly annotated images, $\mathcal{L}_S = \frac{1}{N}\sum_{i=1}^{N} \ell(x_i, y_i; \theta)$, and a second loss part corresponding to the weakly annotated images, $\mathcal{L}_W = \frac{1}{M}\sum_{k=1}^{M} \omega_k\, \ell(x_k, y_k; \theta)$.
  • the training module 225 can also optimize the training of the image segmentation model based on pseudo-labels.
  • the training module 225 may first construct a hybrid image 415 based on a pair of images in the training image set 270 (i.e., the first image 405 and the second image 410).
  • the process of constructing the hybrid image 415 can be expressed as: $x_m = \mathcal{M}_i \odot x_i + (\mathbf{1} - \mathcal{M}_i) \odot x_j$, where $\mathcal{M}_i$ denotes the first mask information and $\odot$ denotes element-wise multiplication.
  • the training module 225 can determine a first part of the first image $x_i$ based on first mask information of the first image, where the first mask information is used to indicate a predetermined area to be preserved.
  • the training module 225 can construct second mask information based on the first mask information, where the second mask information is used to indicate that the area outside the predetermined area will be preserved. The training module 225 can also determine a second part of the second image $x_j$ based on the second mask information.
  • the training module 225 can then build the blended image $x_m$ based on the first part and the second part.
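The mask-based construction above amounts to a CutMix-style combination, which can be sketched as follows (the array shapes and the NumPy implementation are illustrative):

```python
import numpy as np

def build_mixed_image(x_i, x_j, mask):
    """Combine two images with a binary mask.

    `mask` is 1 inside the region of the first image that is preserved; its
    complement (the second mask information) keeps the rest from x_j.
    """
    mask = mask[..., None]                  # broadcast the mask over channels
    return mask * x_i + (1.0 - mask) * x_j

x_i = np.ones((4, 4, 3))                    # first image (all ones)
x_j = np.zeros((4, 4, 3))                   # second image (all zeros)
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0                          # preserve the top-left block of x_i
x_m = build_mixed_image(x_i, x_j, mask)
```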
  • the first image 405 and the second image 410 may also be sampled from the training image set 270 based on different sampling methods.
  • the training module 225 may determine the probability for sampling the first image 405 based on the magnitude of the target training weight of each image in the training image set 270. It should be understood that, for a strongly labeled image, its weight may be set to a predetermined weight value (for example, 1). Additionally, the training module 225 may sample the second image 410 from the training image set 270 based on uniform probability sampling, for example.
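The two sampling schemes, weight-proportional for the first image and uniform for the second, can be sketched with the standard library; the list-of-images representation is illustrative:

```python
import random

def sample_pair(images, weights, rng=random):
    """Sample the first image in proportion to its training weight and the
    second image uniformly, mirroring the two different sampling methods.

    Strongly labeled images are assumed to carry a fixed weight (e.g., 1).
    """
    first = rng.choices(images, weights=weights, k=1)[0]  # weight-proportional
    second = rng.choice(images)                           # uniform sampling
    return first, second

rng = random.Random(0)
images = ["strong_0", "weak_0", "weak_1"]
weights = [1.0, 0.0, 2.0]      # a zero-weight image is never drawn first
pairs = [sample_pair(images, weights, rng) for _ in range(100)]
```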
  • the training module 225 may determine an objective function for training, where the objective function includes a first part associated with the first image, a second part associated with the second image, and a third part associated with the mixed image; the first part is based on a first training weight of the first image, the second part is based on a second training weight of the second image, and the third part is based on a third training weight, which is determined based on the first training weight and the second training weight.
  • the objective function can be expressed, for example, as: $\mathcal{L} = \omega_i\, \ell_i + \omega_j\, \ell_j + \omega_m\, \ell_m \quad (11)$, where $\ell_i$, $\ell_j$, and $\ell_m$ are the loss parts for the first image, the second image, and the mixed image, and $\omega_m$ is determined from $\omega_i$ and $\omega_j$.
  • the training module 225 can adjust the parameters of the image segmentation model 315 based on the objective function (11).
  • the training module 225 can utilize a dual-model structure to perform the model training process. Specifically, the training module 225 can construct an auxiliary model 420 having the same structure as the image segmentation model 315 to be trained (also referred to as the main model 315). In some embodiments, the main model 315 and the auxiliary model 420 may have different initialization parameters, for example.
  • the training module 225 may apply the first image 405 to the main model 315 to determine a prediction result 425 and determine a loss 445 corresponding to the first image 405 based on the difference with the annotation information 460 .
  • This loss 445 can be used to determine the first part of the objective function
  • the training module 225 can apply the second image 410 to the auxiliary model 420 to determine the prediction result 440 , and determine the loss 455 corresponding to the second image 410 based on the difference with the annotation information 475 .
  • This loss 455 can be used to determine the second part of the objective function
  • the training module 225 may apply the blended image 415 to the primary model 315 to determine a prediction 430 , and apply the blended image 415 to the auxiliary model 420 to determine a prediction 435 .
  • the training module 225 can also determine a third loss part based on the difference between the first prediction result and first mixed label information, where the first mixed label information is determined by processing the first image and the second image using the second image segmentation model.
  • the training module 225 can also determine a fourth loss part based on the difference between the second prediction result and second mixed annotation information, where the second mixed annotation information is determined by processing the first image and the second image using the first image segmentation model.
  • the third loss part can be expressed as $\ell\big(f(x_m; \theta_1), \hat{y}^{(2)}\big)$, and the fourth loss part can be expressed as $\ell\big(f(x_m; \theta_2), \hat{y}^{(1)}\big)$, where $\theta_1$ and $\theta_2$ represent the parameters of the main model 315 and the auxiliary model 420 respectively, and $\hat{y}^{(1)}$ and $\hat{y}^{(2)}$ represent the pseudo-label information 465 and the pseudo-label information 470 corresponding to the mixed image 415. Specifically, the pseudo-label information can be expressed as: $\hat{y}^{(t)} = \mathcal{M}_i \odot f(x_i; \theta_t) + (\mathbf{1} - \mathcal{M}_i) \odot f(x_j; \theta_t), \; t \in \{1, 2\}$
  • the training module 225 can determine the third part of the objective function based on the third loss part and the fourth loss part. This process can be expressed, for example, as: $\ell_m = \ell\big(f(x_m; \theta_1), \hat{y}^{(2)}\big) + \ell\big(f(x_m; \theta_2), \hat{y}^{(1)}\big)$, where $\hat{y}^{(1)}$ and $\hat{y}^{(2)}$ are the pseudo-labels for the mixed image produced by the main model and the auxiliary model, respectively.
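A minimal sketch of the cross pseudo-label supervision on the mixed image, assuming per-pixel probability maps and using mean-squared error as a stand-in for the (unspecified here) segmentation loss:

```python
import numpy as np

def mixed_pseudo_label(pred_i, pred_j, mask):
    """Mix one model's predictions on the two source images with the same
    mask that was used to build the mixed image."""
    mask = mask[..., None]
    return mask * pred_i + (1.0 - mask) * pred_j

def cross_pseudo_loss(pred_main, pred_aux, pseudo_from_aux, pseudo_from_main):
    """Third part of the objective: each model's prediction on the mixed
    image is supervised by the other model's mixed pseudo-label."""
    third = np.mean((pred_main - pseudo_from_aux) ** 2)   # third loss part
    fourth = np.mean((pred_aux - pseudo_from_main) ** 2)  # fourth loss part
    return third + fourth

mask = np.zeros((2, 2))
mask[0, 0] = 1.0
pred_i = np.ones((2, 2, 1))
pred_j = np.zeros((2, 2, 1))
label = mixed_pseudo_label(pred_i, pred_j, mask)
loss = cross_pseudo_loss(label, label, label, label)
```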
  • the embodiments of the present disclosure can use the pseudo-label mechanism to further extract useful information in weakly labeled images, thereby improving the performance of the trained image segmentation model.
  • training module 225 may iteratively perform a training weight determination phase and a model parameter adjustment phase, as discussed above.
  • the training module 225 may perform a predetermined number of parameter adjustments on the image segmentation model using a set of strongly labeled images and a set of weakly labeled images based on the target training weights, so as to determine an updated image segmentation model.
  • the training module 225 may determine an updated training weight for the target weakly labeled image based on the updated image segmentation model. Further, the training module 225 may use the set of strongly labeled images and the set of weakly labeled images to train the updated image segmentation model based on the updated training weights.
  • FIG. 5 shows a flowchart of a method 500 for image processing according to some implementations of the present disclosure.
  • Method 500 may be implemented by the computing device 200, for example by the training module 225 in the memory 220 of the computing device 200.
  • the computing device 200 obtains a training image set, the training image set including a set of strongly annotated images and a set of weakly annotated images, where the set of strongly annotated images has more accurate annotation information than the set of weakly annotated images.
  • the computing device 200 determines first gradient information associated with a target weakly labeled image in a set of weakly labeled images and second gradient information associated with a set of strongly labeled images of an image segmentation model to be trained.
  • the computing device 200 determines target training weights associated with the target weakly annotated image based on the similarity between the first gradient information and the second gradient information.
  • computing device 200 trains an image segmentation model using a set of strongly annotated images and a set of weakly annotated images, wherein the impact of the target weakly annotated image on training is determined based on the target training weights.
  • the training image set includes a plurality of medical images, and the image segmentation model is configured to determine segmentation information associated with a lesion site in the medical images.
  • the set of strongly labeled images includes a plurality of strongly labeled images
  • determining the second gradient information includes: determining a plurality of gradients of the image segmentation model based on the plurality of strongly labeled images; and based on an average of the plurality of gradients, Determine second gradient information.
  • determining target training weights associated with the target weakly labeled image includes: determining initial training weights associated with the target weakly labeled image; and based on the similarity, adjusting the initial training weights to determine the target training weights.
  • the method further includes determining the similarity based on the transpose of the second gradient information and the first gradient information.
  • using a set of strongly annotated images and a set of weakly annotated images to train an image segmentation model includes: based on a set of strongly annotated images, determining a first loss part associated with training; based on a set of weakly annotated images and target training weights, determine a second loss component associated with training; and adjust parameters of the image segmentation model based on the first loss component and the second loss component.
  • using a set of strongly labeled images and a set of weakly labeled images to train the image segmentation model includes: constructing a mixed image based on the first image and the second image in the training image set; determining an objective function for training,
  • the objective function includes a first part associated with the first image, a second part associated with the second image, and a third part associated with the blended image, the first part being based on the first training weight of the first image, the second part being based on the second training weight of the second image, and the third part being based on a third training weight, the third training weight being determined based on the first training weight and the second training weight; and adjusting parameters of the image segmentation model based on the objective function.
  • constructing the blended image includes: determining a first part of the first image based on first mask information of the first image, the first mask information being used to indicate that a predetermined area is to be preserved; constructing second mask information based on the first mask information, the second mask information being used to indicate that the area outside the predetermined area will be preserved; determining a second part of the second image based on the second mask information; and constructing the blended image based on the first part and the second part.
  • the image segmentation model is a first image segmentation model
  • determining the first part and the second part of the objective function comprises: applying the first image to the first image segmentation model to determine the first part of the objective function; and applying the second image to a second image segmentation model different from the first image segmentation model to determine the second part of the objective function.
  • determining the third part of the objective function includes: applying the blended image to the first image segmentation model to determine a first prediction result, and to the second image segmentation model to determine a second prediction result; determining a third loss part based on the difference between the first prediction result and first mixed annotation information, the first mixed annotation information being determined by processing the first image and the second image using the second image segmentation model; determining a fourth loss part based on the difference between the second prediction result and second mixed annotation information, the second mixed annotation information being determined by processing the first image and the second image using the first image segmentation model; and determining the third part of the objective function based on the third loss part and the fourth loss part.
  • using a set of strongly labeled images and a set of weakly labeled images to train the image segmentation model includes: performing, based on the target training weights, a predetermined number of parameter adjustments on the image segmentation model using the set of strongly labeled images and the set of weakly labeled images, so as to determine an updated image segmentation model; determining updated training weights for the target weakly labeled images based on the updated image segmentation model; and training the updated image segmentation model using the set of strongly labeled images and the set of weakly labeled images based on the updated training weights.
  • the method further includes providing a trained image segmentation model for determining an image segmentation result based on the received input image.
  • For example, the functionality described herein may be implemented at least in part by hardware logic components such as field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), and complex programmable logic devices (CPLDs).
  • Program codes for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes can be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

根据本公开的实施例,提供了一种用于图像处理的方法、设备、存储介质和程序产品。在此描述的方法包括:获取训练图像集,训练图像集包括一组强标注图像和一组弱标注图像;确定待训练的图像分割模型的第一梯度信息和第二梯度信息;基于第一梯度信息与第二梯度信息之间的相似性,确定与目标弱标注图像相关联的目标训练权重;以及利用一组强标注图像和一组弱标注图像来训练图像分割模型,其中目标弱标注图像对训练的影响基于目标训练权重而被确定。通过确定针对单个弱标注样本的训练权重,本公开的实施例能够提高图像分割模型的性能。

Description

用于图像处理的方法、设备、存储介质和程序产品
相关申请的交叉引用
本申请要求申请号为202210126764.X,题为“用于图像处理的方法、设备、存储介质和程序产品”、申请日为2022年2月10日的中国发明专利申请的优先权,通过引用的方式将该申请整本并入本文。
技术领域
本公开的各实现方式涉及计算机领域,更具体地,涉及图像处理的方法、设备、存储介质和计算机程序产品。
背景技术
图像分割是图像处理中的一类典型任务,基于机器学习的图像分割方法已经成为当前的主流之一。在基于机器学习的图像分割方法中,训练图像的标注准确性将极大地影响图像分割的准确程度。
在一些特定领域中，人们通常难以获得足够数量的高质量标注结果。尤其，在医学图像处理领域中，人们难以获得足够数量的像素级别标注结果，这将直接影响医学图像处理的准确性。
目前,如何利用弱标注图像与强标注图像的结合来训练图像处理模型已经成为当前的热点。弱标注图像通常包括例如基于预定的分割样式(例如,圆形样式、椭圆形样式、方形样式等)而被标注的图像,强标注图像通常包括例如基于逐像素而被标注的图像,因此强标注图像具有更加精确的标注信息。
发明内容
在本公开的第一方面，提供了一种用于图像处理的方法。该方法包括：获取训练图像集，训练图像集包括一组强标注图像和一组弱标注图像，一组强标注图像具有比一组弱标注图像更加精确的标注信息；确定待训练的图像分割模型的、与一组弱标注图像中目标弱标注图像相关联的第一梯度信息和与一组强标注图像相关联的第二梯度信息；基于第一梯度信息与第二梯度信息之间的相似性，确定与目标弱标注图像相关联的目标训练权重；以及利用一组强标注图像和一组弱标注图像来训练图像分割模型，其中目标弱标注图像对训练的影响基于目标训练权重而被确定。
在本公开的第二方面,提供了一种电子设备,包括:存储器和处理器;其中存储器用于存储一条或多条计算机指令,其中一条或多条计算机指令被处理器执行以实现根据本公开的第一方面的方法。
在本公开的第三方面,提供了一种计算机可读存储介质,其上存储有一条或多条计算机指令,其中一条或多条计算机指令被处理器执行实现根据本公开的第一方面的方法。
在本公开的第四方面,提供了一种计算机程序产品,其包括一条或多条计算机指令,其中一条或多条计算机指令被处理器执行实现根据本公开的第一方面的方法。
根据本公开的实施例，能够基于强标注样本来确定针对单个弱标注样本的训练权重。由此，本公开的实施例能够适应性地调整弱标注样本对于训练过程的影响，从而能够提高图像分割模型的性能。
附图说明
结合附图并参考以下详细说明,本公开各实施例的上述和其他特征、优点及方面将变得更加明显。在附图中,相同或相似的附图标注表示相同或相似的元素,其中:
图1示出了示例强标注图像和示例弱标注图像;
图2示出了能够实施本公开的一些实施例的计算设备的示意性框图;
图3示出了根据本公开的一些实施例的确定训练权重的示意图;
图4示出了根据本公开的一些实施例的训练图像分割模型的示意图；以及
图5示出了根据本公开的一些实施例的用于图像处理的示例方法的流程图。
具体实施方式
下面将参照附图更详细地描述本公开的实施例。虽然附图中显示了本公开的某些实施例,然而应当理解的是,本公开可以通过各种形式来实现,而且不应该被解释为限于这里阐述的实施例,相反提供这些实施例是为了更加透彻和完整地理解本公开。应当理解的是,本公开的附图及实施例仅用于示例性作用,并非用于限制本公开的保护范围。
在本公开的实施例的描述中,术语“包括”及其类似用语应当理解为开放性包含,即“包括但不限于”。术语“基于”应当理解为“至少部分地基于”。术语“一个实施例”或“该实施例”应当理解为“至少一个实施例”。术语“第一”、“第二”等等可以指代不同的或相同的对象。下文还可能包括其他明确的和隐含的定义。
如以上讨论的,在图像处理过程中,图像分割是一类重要的任务,其能够从图像中提取感兴趣目标所在的区域。随着人工智能技术的发展,基于机器学习的图像分割技术已经成为人们关注的焦点。
在基于机器学习的图像分割技术中,训练图像集的标注准确性将直接影响到图像分割模型的准确性。在本公开中,术语“模型”、“神经网络”、“学习模型”、“学习网络”、和“网络”可替换地使用。
在一些领域的图像处理中,通常很难获得足够数量的精确标注的训练图像样本。通常地,训练图像样本例如可以分为强标注图像和弱标注图像,其中强标注图像具有比弱标注图像更加精确的标注信息。
示例性地，弱标注图像通常包括例如基于预定的分割样式（例如，圆形样式、椭圆形样式、方形样式等）而被标注的图像，强标注图像通常包括例如基于逐像素而被标注的图像。
值得注意的是,在医学图像处理领域中,人们通常难以获得足够数量的强标注图像。图1示出了医学领域中示例性的强标注图像和弱标注图像。如图1所示,强标注图像100例如具有基于逐像素的标注信息120,其准确地标识了图像中病灶的轮廓。相反地,弱标注图像130-1至130-4所对应的标注信息140-1至140-4则无法精确地指示病灶的轮廓,其例如可以是基于预定的分割样式而被粗略地标注。
因此，人们期望能够利用强标注图像与弱标注图像的组合来协同地训练模型。然而，弱标注图像的使用也带来了模型性能下降的问题。这对于医学图像处理而言是无法接受的。
根据本公开的实现,提供了一种用于图像处理的方案。在该方案中,可以获取训练图像集,其中训练图像集包括一组强标注图像和一组弱标注图像。进一步地,可以确定待训练的图像分割模型的第一梯度信息和第二梯度信息,并基于第一梯度信息与第二梯度信息之间的相似性来确定与目标弱标注图像相关联的目标训练权重。第一梯度信息基于与一组弱标注图像中目标弱标注图像而被确定,第二梯度信息基于该组强标注图像而被确定。
进一步地,可以利用一组强标注图像和一组弱标注图像来训练图像分割模型,其中目标弱标注图像对训练的影响基于目标训练权重而被确定。
基于这样的方式，本公开的实施例能够基于强标注样本来确定针对单个弱标注样本的训练权重。由此，本公开的实施例能够适应性地调整弱标注样本对于训练过程的影响，从而能够提高图像分割模型的性能。
以下参考附图来说明本公开的基本原理和若干示例实现。
示例设备
图2示出了可以用来实施本公开的实施例的示例计算设备200的示意性框图。应当理解，图2所示出的设备200仅仅是示例性的，而不应当构成对本公开所描述的实现的功能和范围的任何限制。如图2所示，设备200的组件可以包括但不限于一个或多个处理器或处理单元210、存储器220、存储设备230、一个或多个通信单元240、一个或多个输入设备250以及一个或多个输出设备260。
在一些实施例中,设备200可以被实现为各种用户终端或服务终端。服务终端可以是各种服务提供方提供的服务器、大型计算设备等。用户终端诸如是任何类型的移动终端、固定终端或便携式终端,包括移动手机、多媒体计算机、多媒体平板、互联网节点、通信器、台式计算机、膝上型计算机、笔记本计算机、上网本计算机、平板计算机、个人通信系统(PCS)设备、个人导航设备、个人数字助理(PDA)、音频/视频播放器、数码相机/摄像机、定位设备、电视接收器、无线电广播接收器、电子书设备、游戏设备或者其任意组合,包括这些设备的配件和外设或者其任意组合。还可预见到的是,设备200能够支持任何类型的针对用户的接口(诸如“可佩戴”电路等)。
处理单元210可以是实际或虚拟处理器并且能够根据存储器220中存储的程序来执行各种处理。在多处理器系统中，多个处理单元并行执行计算机可执行指令，以提高设备200的并行处理能力。处理单元210也可以被称为中央处理单元（CPU）、微处理器、控制器、微控制器。
设备200通常包括多个计算机存储介质。这样的介质可以是设备200可访问的任何可以获得的介质，包括但不限于易失性和非易失性介质、可拆卸和不可拆卸介质。存储器220可以是易失性存储器（例如寄存器、高速缓存、随机访问存储器（RAM））、非易失性存储器（例如，只读存储器（ROM）、电可擦除可编程只读存储器（EEPROM）、闪存）或其某种组合。存储器220可以包括一个或多个训练模块225，这些程序模块被配置为执行本文所描述的各种实现的功能。训练模块225可以由处理单元210访问和运行，以实现相应功能。存储设备230可以是可拆卸或不可拆卸的介质，并且可以包括机器可读介质，其能够用于存储信息和/或数据并且可以在设备200内被访问。
设备200的组件的功能可以以单个计算集群或多个计算机器来实现,这些计算机器能够通过通信连接进行通信。因此,设备200可以使用与一个或多个其他服务器、个人计算机(PC)或者另一个一般网络节点的逻辑连接来在联网环境中进行操作。设备200还可以根据需要通过通信单元240与一个或多个外部设备(未示出)进行通信,外部设备诸如数据库245、其他存储设备、服务器、显示设备等,与一个或多个使得用户与设备200交互的设备进行通信,或者与使得设备200与一个或多个其他计算设备通信的任何设备(例如,网卡、调制解调器等)进行通信。这样的通信可以经由输入/输出(I/O)接口(未示出)来执行。
输入设备250可以是一个或多个各种输入设备,例如鼠标、键盘、追踪球、语音输入设备、相机等。输出设备260可以是一个或多个输出设备,例如显示器、扬声器、打印机等。
在一些实施例中，如图2所示，设备200可以获取训练图像集270，该训练图像集270可以包括一组强标注图像272和一组弱标注图像274。在一些实施例中，该组强标注图像272可以基于逐像素而被标注，该组弱标注图像274可以基于预定的分割样式而被标注。
在一些实施例中，如图2所示，设备200可以基于训练图像集270来训练图像分割模型280。示例性地，该图像分割模型280的训练参数可以被存储在存储设备230中，或者被提供至其他计算设备以用于部署图像分割模型280。
在一些实施例中,设备200还可以提供经训练的图像分割模型,以用于基于接收的输入图像来确定图像分割结果。示例性地,该输入图像例如可以是医学图像,图像分割模型280可以被配置用于确定与医学图像中的病灶部位相关联的分割信息。
基本原理
如参考图2介绍的，训练模块225可以获取训练图像集270（表示为$\mathcal{D}$），其例如可以包括一组强标注图像272和一组弱标注图像274。该组强标注图像272可以表示为 $\mathcal{D}_S=\{(x_i, y_i)\}_{i=1}^{N}$，该组弱标注图像274可以表示为 $\mathcal{D}_W=\{(x_k, y_k)\}_{k=1}^{M}$，其中 $x\in\mathbb{R}^{H\times W\times 3}$ 表示尺寸为H×W×3的输入图像，而y∈{0,1}H×W×C则表示对应的标注信息，其为具有C个类别的对应掩码。在一些实施例中，训练图像集270中强标注图像的数目N可以远小于弱标注图像的数目M。
由此，确定训练权重的问题可以表示为：为弱标注图像添加对应的指示符 $\Gamma=\{\gamma_k \mid \gamma_k\in[0,1], k\in[1,M]\}$。在一些实施例中，训练模块225可以通过梯度下降的方法来确定训练权重，该问题可以表示为双层优化问题：

$$\min_{\Gamma} \mathcal{L}_S\big(\theta^{*}(\Gamma)\big), \quad \text{s.t.}\ \theta^{*}(\Gamma)=\arg\min_{\theta} \mathcal{L}_{train}(\theta, \Gamma) \tag{1}$$

其中θ表示图像分割模型的参数。进一步地，下层优化问题所对应的损失函数可以表示为：

$$\mathcal{L}_{train}(\theta, \Gamma)=\sum_{i=1}^{N} \ell\big(f(x_i;\theta), y_i\big)+\sum_{k=1}^{M} \gamma_k\, \ell\big(f(x_k;\theta), y_k\big) \tag{2}$$
在一些实施例中,训练模块225可以将训练分解为训练权重确定阶段和参数调整阶段。在训练权重确定阶段,训练模块225可以确定针对弱标注样本的训练权重,图像分割模型的参数不发生变化。在参数调整阶段,训练模块225可以基于所确定的训练权重来调整图像分割模型的参数。
在一些实施例中,训练模块225还可以交叠地执行训练权重确定阶段和参数调整阶段。例如,训练模块225可以在执行若干次参数调整迭代后,再根据调整后的模型参数执行权重确定;并基于所确定的权重再执行若干次参数调整迭代,并进一步根据更新的模型参数来执行权重确定。基于如此交叠的执行权重确定和参数调整,本公开的实施例可以进一步提高经训练的图像分割模型的性能。
确定训练权重
在一些实施例中,为了平衡强标注样本和弱标注样本对于模型训练的影响程度,训练模块225可以确定针对每个弱标注样本的训练权重。
对于公式(1)中的上层优化问题，训练模块225可以根据以下公式来确定上层梯度：

$$\nabla_{\gamma_k} \mathcal{L}_S=\frac{\partial \mathcal{L}_S(\theta^{*})}{\partial \theta^{*}}\cdot \frac{\partial \theta^{*}}{\partial \gamma_k} \tag{3}$$

进一步地，公式(3)还可以表示为：

$$\nabla_{\gamma_k} \mathcal{L}_S=-\,\nabla_{\theta} \mathcal{L}_S(\theta^{*})^{\mathrm{T}}\, H^{-1}\, \nabla_{\theta}\, \ell\big(f(x_k;\theta^{*}), y_k\big) \tag{4}$$

其中，海森矩阵 $H=\nabla_{\theta}^{2}\, \mathcal{L}_{train}(\theta^{*})$。
在一些实施例中，为了减少计算开销，训练模块225还可以利用单位矩阵I来近似逆海森矩阵：

$$H^{-1}\approx I \tag{5}$$
由此,公式(4)实际表征的是弱标注图像所对应的图像分割模型的梯度与强标注图像所对应的图像分割模型的梯度之间的相似性。当特定弱标注图像所对应的梯度与整体强标注图像所对应的梯度接近时,则该弱标注图像可以被确定为对模型的训练具有有利的作用,因而可以被分配具有较大的训练权重。
具体地，以下将参考图3来描述确定训练权重的示例过程。如图3所示，训练模块225可以将一组强标注图像272中的一个或多个强标注图像305输入至初始的图像分割模型315，从而确定第一预测结果320。进一步地，训练模块225可以基于与一个或多个强标注图像305对应的掩码信息Y_S 340来确定对应的损失函数330，并从而可以确定与一个或多个强标注图像305所对应的梯度信息（也称为第二梯度信息）：

$$g_S=\nabla_{\theta}\, \mathcal{L}_S(\theta) \tag{6}$$
示例性地，与一个或多个强标注图像305所对应的梯度信息可以基于每个强标注图像的梯度的均值来确定。也即，训练模块225可以基于多个强标注图像来确定图像分割模型的多个梯度，并基于该多个梯度的均值来确定该梯度信息。
进一步地，如图3所示，训练模块225还可以将目标弱标注图像310（表示为x_k）输入至图像分割模型315，以确定第二预测结果325。进一步地，训练模块225可以基于与目标弱标注图像310对应的掩码信息y_k 345来确定对应的损失函数350，并从而可以确定与弱标注图像310所对应的梯度信息（也称为第一梯度信息）：

$$g_k=\nabla_{\theta}\, \ell\big(f(x_k;\theta), y_k\big) \tag{7}$$
进一步地，训练模块225可以确定第一梯度信息和第二梯度信息之间的相似性350，其例如可以表示为：

$$s_k=g_S^{\mathrm{T}}\, g_k \tag{8}$$

其中上标T表示转置运算。
进一步地，训练模块225可以基于公式(8)所确定的相似性350来确定目标训练权重γ_k，该过程可以表示为：

$$\gamma_k \leftarrow \operatorname{Proj}_{[0,1]}\big(\gamma_k+\alpha\, s_k\big) \tag{9}$$

其中α表示权重更新的步长。在一些实施例中，目标弱标注图像310可以被分配具有初始的训练权重，并根据公式(9)迭代地调整，以确定最终的目标训练权重。
根据这样的方式,本公开的实施例可以确定针对每个弱标注样本的训练权重,从而提高了混合样本训练中对于弱标注样本的控制粒度。
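上述基于梯度相似性的权重确定过程可以通过如下示意性草图来体现（以Python编写的最小化示例；函数名、步长与初始权重均为示例性假设，梯度向量假定已由模型反向传播得到并展平为一维向量）：

```python
import numpy as np

def target_weight(grad_strong, grad_weak, gamma_init=0.5, step=0.1, iters=1):
    """按公式(8)-(9)的思路：以强标注梯度与弱标注梯度的内积作为相似性，
    迭代调整弱标注样本的训练权重，并投影回[0, 1]区间。"""
    gamma = gamma_init
    for _ in range(iters):
        similarity = float(np.dot(grad_strong, grad_weak))   # g_S^T · g_k
        gamma = min(1.0, max(0.0, gamma + step * similarity))  # 投影回[0, 1]
    return gamma

# 与强标注梯度方向一致的弱标注样本，其权重被增大
print(target_weight(np.array([1.0, 0.0]), np.array([1.0, 0.0])))   # 0.6
# 与强标注梯度方向相反的弱标注样本，其权重被减小直至下界
print(target_weight(np.array([1.0, 0.0]), np.array([-10.0, 0.0])))  # 0.0
```

由此，对模型训练有利（梯度方向与强标注集一致）的弱标注样本获得较大权重，反之权重趋于零。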
模型参数调整
训练模块225可以进一步基于所确定的训练权重来训练图像分割模型315。在一些实施例中，训练模块225例如可以基于公式(2)来调整图像分割模型315的参数。公式(2)包括与强标注图像所对应的第一损失部分 $\sum_{i=1}^{N} \ell\big(f(x_i;\theta), y_i\big)$，还包括与弱标注图像所对应的第二损失部分 $\sum_{k=1}^{M} \gamma_k\, \ell\big(f(x_k;\theta), y_k\big)$。
在一些实施例中，考虑到弱标注样本本身标注信息可能不够精确，训练模块225还可以基于伪标签来优化图像分割模型的训练。
具体地，以下将参考图4来描述调整模型参数的示例过程。如图4所示，训练模块225首先可以基于训练图像集270中的一对图像（即第一图像405和第二图像410）来构建混合图像415。构建混合图像415的过程可以表示为：

$$x_m=M_c \odot x_i+(1-M_c)\odot x_j \tag{10}$$

其中x_m表示混合图像415；x_i表示第一图像；y_i表示第一图像x_i的标注信息；x_j表示第二图像；c表示y_i中的一个类别，其对应的二进制掩码为M_c；⊙表示逐元素相乘。
具体地，训练模块225可以基于第一图像x_i的第一掩码信息M_c来确定第一图像x_i的第一部分 M_c⊙x_i，其中第一掩码信息用于指示预定区域将被保留。
进一步地，训练模块225可以基于第一掩码信息构建第二掩码信息 (1−M_c)，第二掩码信息用于指示预定区域外的区域将被保留。训练模块225还可以基于第二掩码信息确定第二图像x_j的第二部分 (1−M_c)⊙x_j。
进一步地，训练模块225可以基于第一部分和第二部分构建混合图像x_m。
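上述混合图像的构建步骤可以示意如下（基于NumPy的示意性草图；图像与掩码的形状约定均为示例性假设）：

```python
import numpy as np

def build_mixed_image(x_i, x_j, mask):
    """按公式(10)的思路构建混合图像：掩码指示的预定区域取自第一图像，
    其余区域取自第二图像。x_i, x_j: (H, W, 3)；mask: (H, W)，取值为0或1。"""
    m = mask[..., None].astype(x_i.dtype)   # 扩展通道维以便广播
    return m * x_i + (1.0 - m) * x_j

x_i = np.ones((2, 2, 3))        # 示例：全1的第一图像
x_j = np.zeros((2, 2, 3))       # 示例：全0的第二图像
mask = np.array([[1, 0], [0, 1]])
x_m = build_mixed_image(x_i, x_j, mask)
print(x_m[0, 0])   # [1. 1. 1.]：掩码内区域来自第一图像
print(x_m[0, 1])   # [0. 0. 0.]：掩码外区域来自第二图像
```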
在一些实施例中,第一图像405和第二图像410还可以基于不同的采样方法而从训练图像集270中被采样。示例性地,训练模块225可以基于训练图像集270中各图像的目标权重的大小来确定用于采样第一图像405的概率。应当理解,对于强标注图像,其权重大小可以被设置为预定的权重值(例如,1)。附加地,训练模块225例如可以基于均匀概率采样的方式来从训练图像集270中采样第二图像410。
进一步地，训练模块225可以确定用于训练的目标函数，目标函数包括与第一图像相关联的第一部分、与第二图像相关联的第二部分和与混合图像相关联的第三部分，第一部分基于第一图像的第一训练权重，第二部分基于第二图像的第二训练权重，第三部分基于第三训练权重，第三训练权重基于第一训练权重和第二训练权重而被确定。
在一些实施例中，该目标函数例如可以表示为：

$$\mathcal{L}=\gamma_i\,\mathcal{L}_i+\gamma_j\,\mathcal{L}_j+\lambda\,\gamma_m\,\mathcal{L}_m \tag{11}$$

其中，$\gamma_i\,\mathcal{L}_i$ 表示第一部分，$\gamma_j\,\mathcal{L}_j$ 表示第二部分，$\lambda\,\gamma_m\,\mathcal{L}_m$ 表示第三部分，其中λ为超参数；$\mathcal{L}_i$、$\mathcal{L}_j$ 和 $\mathcal{L}_m$ 分别为与第一图像、第二图像和混合图像相关联的损失，第三训练权重 $\gamma_m$ 基于 $\gamma_i$ 和 $\gamma_j$ 而被确定。
进一步地，训练模块225可以基于目标函数(11)来调整图像分割模型315的参数。
在一些实施例中,如图4所示,训练模块225可以利用双模型结构来执行模型训练过程。具体地,训练模块225可以构建与待训练的图像分割模型315(也称为主模型315)具有相同结构的辅模型420。在一些实施例中,主模型315和辅模型420例如可以具有不同的初始化参数。
如图4所示,在确定目标函数的第一部分时,训练模块225可以将第一图像405应用于主模型315,以确定预测结果425,并基于与标注信息460的差异来确定与第一图像405对应的损失445。该损失445可以用于确定目标函数的第一部分
进一步地,训练模块225可以将第二图像410应用于辅模型420,以确定预测结果440,并基于与标注信息475的差异来确定与第二图像410对应的损失455。该损失455可以用于确定目标函数的第二部分
如图4所示,在确定目标函数的第三部分时,训练模块225可以将混合图像415应用于主模型315以确定预测结果430,并将混合图像415应用于辅模型420以确定预测结果435。
在一些实施例中，训练模块225还可以基于第一预测结果与第一混合标注信息之间的差异，确定第三损失部分，第一混合标注信息是利用第二图像分割模型处理第一图像和第二图像而确定的。
进一步地,训练模块225还可以基于第二预测结果与第二混合标注信息之间的差异,确定第四损失部分,第二混合标注信息是利用第一图像分割模型处理第一图像和第二图像而确定的。
示例性地，第三损失部分可以表示为 $\ell\big(f(x_m;\theta_1), \hat{y}_2\big)$，第四损失部分可以表示为 $\ell\big(f(x_m;\theta_2), \hat{y}_1\big)$，其中θ_1和θ_2分别表示主模型315和辅模型420的参数，而 $\hat{y}_1$ 和 $\hat{y}_2$ 则分别表示与混合图像415所对应的伪标注信息465和伪标注信息470。具体地，$\hat{y}_1$ 可以表示为：

$$\hat{y}_1=M_c \odot f(x_i;\theta_1)+(1-M_c)\odot f(x_j;\theta_1) \tag{12}$$

其中 $f(x_i;\theta_1)$ 和 $f(x_j;\theta_1)$ 表示由主模型315生成的针对第一图像x_i和第二图像x_j的预测。应当理解，$\hat{y}_2$ 也可以类似地被确定。
在一些实施例中，训练模块225可以基于第三损失部分和第四损失部分来确定目标函数的第三部分。该过程例如可以表示为：

$$\mathcal{L}_m=\ell\big(f(x_m;\theta_1), \hat{y}_2\big)+\ell\big(f(x_m;\theta_2), \hat{y}_1\big) \tag{13}$$
基于这样的方式,本公开的实施例能够利用伪标签机制来进一步提取弱标注图像中的有用信息,从而提高训练得到的图像分割模型的性能。
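伪标注信息的拼接及交叉损失的计算可以示意如下（示意性草图；此处以均方误差代替实际的分割损失，仅用于说明双模型交叉监督的结构，并非原文的具体实现）：

```python
import numpy as np

def mixed_pseudo_label(pred_i, pred_j, mask):
    """按公式(12)的思路：用同一模型对第一图像和第二图像的预测，
    按掩码拼接出混合图像的伪标注。pred_*: (H, W, C)；mask: (H, W)。"""
    m = mask[..., None].astype(pred_i.dtype)
    return m * pred_i + (1.0 - m) * pred_j

def mix_loss(pred_m_main, pred_m_aux, pseudo_main, pseudo_aux):
    """第三损失部分（主模型对混合图像的预测对照辅模型的伪标注）与
    第四损失部分（辅模型预测对照主模型伪标注）之和；均方形式为示例性假设。"""
    l3 = float(np.mean((pred_m_main - pseudo_aux) ** 2))
    l4 = float(np.mean((pred_m_aux - pseudo_main) ** 2))
    return l3 + l4

mask = np.ones((2, 2))
pseudo = mixed_pseudo_label(np.ones((2, 2, 1)), np.zeros((2, 2, 1)), mask)
print(pseudo[0, 0])   # [1.]：掩码全为1时，伪标注完全来自第一图像的预测
print(mix_loss(np.ones((2, 2, 1)), np.ones((2, 2, 1)), pseudo, pseudo))  # 0.0
```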
迭代机制
在一些实施例中,如上文所讨论的,训练模块225可以迭代地执行训练权重确定阶段和模型参数调整阶段。
在一些实施例中,训练模块225可以基于目标训练权重,利用一组强标注图像和一组弱标注图像,对图像分割模型执行预定次数的参数调整,以确定更新图像分割模型。
附加地,训练模块225可以基于更新图像分割模型,确定针对目标弱标注图像的更新训练权重。进一步地,训练模块225可以基于更新训练权重,并利用一组强标注图像和一组弱标注图像训练更新图像分割模型。
示例性地,可以通过以下伪代码来反映本公开的迭代机制:
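以下给出一个基于上文描述重构的示意性草图（以Python编写；采用线性模型与均方误差作为示例性假设，用以展示权重确定阶段与参数调整阶段的交替执行，并非实际的分割模型实现）：

```python
import numpy as np

def grad_mse(theta, x, y):
    # 单个样本的均方误差损失关于线性模型参数theta的梯度
    return 2.0 * (theta @ x - y) * x

def iterative_training(strong, weak, theta, lr=0.01, step=0.1, rounds=3, inner=5):
    """交替执行权重确定与参数调整：
    1) 权重确定阶段：固定模型参数，按梯度相似性更新各弱标注样本的权重；
    2) 参数调整阶段：固定权重，对模型执行预定次数(inner)的参数调整；
    重复rounds轮。strong/weak: [(x, y), ...]。"""
    gamma = np.full(len(weak), 0.5)          # 弱标注样本的初始训练权重
    for _ in range(rounds):
        # 阶段一：权重确定阶段（模型参数不发生变化）
        g_s = np.mean([grad_mse(theta, x, y) for x, y in strong], axis=0)
        for k, (x, y) in enumerate(weak):
            similarity = g_s @ grad_mse(theta, x, y)       # 公式(8)
            gamma[k] = np.clip(gamma[k] + step * similarity, 0.0, 1.0)
        # 阶段二：参数调整阶段（基于所确定的权重）
        for _ in range(inner):
            g = sum(grad_mse(theta, x, y) for x, y in strong)
            g = g + sum(gamma[k] * grad_mse(theta, x, y)
                        for k, (x, y) in enumerate(weak))
            theta = theta - lr * g
    return theta, gamma
```

其中rounds、inner、lr与step等超参数均为示意取值，实际训练中可根据模型规模与数据量进行调整。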
示例过程
图5示出了根据本公开一些实现的用于图像处理的方法500的流程图。方法500可以由计算设备200来实现,例如可以被实现在计算设备200的存储器220中的训练模块225处。
如图5所示,在框510,计算设备200获取训练图像集,训练图像集包括一组强标注图像和一组弱标注图像,一组强标注图像具有比一组弱标注图像更加精确的标注信息。
在框520,计算设备200确定待训练的图像分割模型的、与一组弱标注图像中目标弱标注图像相关联的第一梯度信息和与一组强标注图像相关联的第二梯度信息。
在框530,计算设备200基于第一梯度信息与第二梯度信息之间的相似性,确定与目标弱标注图像相关联的目标训练权重。
在框540,计算设备200利用一组强标注图像和一组弱标注图像来训练图像分割模型,其中目标弱标注图像对训练的影响基于目标训练权重而被确定。
在一些实施例中，训练图像集包括多个医学图像，并且图像分割模型被配置用于确定与医学图像中的病灶部位相关联的分割信息。
在一些实施例中,一组强标注图像包括多个强标注图像,并且确定第二梯度信息包括:基于多个强标注图像,确定图像分割模型的多个梯度;以及基于多个梯度的均值,确定第二梯度信息。
在一些实施例中,确定与目标弱标注图像相关联的目标训练权重包括:确定与目标弱标注图像相关联的初始训练权重;以及基于相似性,调整初始训练权重以确定目标训练权重。
在一些实施例中,方法还包括:基于第二梯度信息的转置和第一梯度信息确定相似性。
在一些实施例中,利用一组强标注图像和一组弱标注图像来训练图像分割模型包括:基于一组强标注图像,确定与训练相关联的第一损失部分;基于一组弱标注图像和目标训练权重,确定与训练相关联的第二损失部分;以及基于第一损失部分和第二损失部分,调整图像分割模型的参数。
在一些实施例中,利用一组强标注图像和一组弱标注图像来训练图像分割模型包括:基于训练图像集中的第一图像和第二图像,构建混合图像;确定用于训练的目标函数,目标函数包括与第一图像相关联的第一部分、与第二图像相关联的第二部分和与混合图像相关联的第三部分,第一部分基于第一图像的第一训练权重,第二部分基于第二图像的第二训练权重,第三部分基于第三训练权重,第三训练权重基于第一训练权重和第二训练权重而被确定;以及基于目标函数,调整图像分割模型的参数。
在一些实施例中,构建混合图像包括:基于与第一图像的第一掩码信息,确定第一图像的第一部分,第一掩码信息用于指示预定区域将被保留;基于第一掩码信息,构建第二掩码信息,第二掩码信息用于指示预定区域外的区域将被保留;基于第二掩码信息,确定第二图像的第二部分;以及基于第一部分和第二部分,构建混合图像。
在一些实施例中,图像分割模型为第一图像分割模型,确定目标函数的第一部分和第二部分包括:将第一图像应用于第一图像分割模型,以确定目标函数的第一部分;以及将第二图像应用于与第一图像分割模型不同的第二图像分割模型,以确定目标函数的第二部分。
在一些实施例中,确定目标函数的第三部分包括:将混合图像应用于第一图像分割模型以确定第一预测结果,并应用于第二图像分割模型以确定第二预测结果;基于第一预测结果与第一混合标注信息之间的差异,确定第三损失部分,第一混合标注信息是利用第二图像分割模型处理第一图像和第二图像而确定的;基于第二预测结果与第二混合标注信息之间的差异,确定第四损失部分,第二混合标注信息是利用第一图像分割模型处理第一图像和第二图像而确定的;以及基于第三损失部分和第四损失部分,确定目标函数的第三部分。
在一些实施例中,利用一组强标注图像和一组弱标注图像来训练图像分割模型包括:基于目标训练权重,利用一组强标注图像和一组弱标注图像,对图像分割模型执行预定次数的参数调整,以确定更新图像分割模型;基于更新图像分割模型,确定针对目标弱标注图像的更新训练权重;以及基于更新训练权重,并利用一组强标注图像和一组弱标注图像训练更新图像分割模型。
在一些实施例中,方法还包括:提供经训练的图像分割模型,以用于基于接收的输入图像来确定图像分割结果。
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如，非限制性地，可以使用的示范类型的硬件逻辑部件包括：场可编程门阵列（FPGA）、专用集成电路（ASIC）、专用标准产品（ASSP）、芯片上系统的系统（SOC）、复杂可编程逻辑设备（CPLD）等等。
用于实施本公开的方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器或控制器，使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行，作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
此外,虽然采用特定次序描绘了各操作,但是这应当理解为要求这样操作以所示出的特定次序或以顺序次序执行,或者要求所有图示的操作应被执行以取得期望的结果。在一定环境下,多任务和并行处理可能是有利的。同样地,虽然在上面论述中包含了若干具体实现细节,但是这些不应当被解释为对本公开的范围的限制。在单独的实现的上下文中描述的某些特征还可以组合地实现在单个实现中。相反地,在单个实现的上下文中描述的各种特征也可以单独地或以任何合适的子组合的方式实现在多个实现中。
尽管已经采用特定于结构特征和/或方法逻辑动作的语言描述了本主题,但是应当理解所附权利要求书中所限定的主题未必局限于上面描述的特定特征或动作。相反,上面所描述的特定特征和动作仅仅是实现权利要求书的示例形式。

Claims (16)

  1. 一种用于图像处理的方法,包括:
    获取训练图像集,所述训练图像集包括一组强标注图像和一组弱标注图像,所述一组强标注图像具有比所述一组弱标注图像更加精确的标注信息;
    确定待训练的图像分割模型的、与所述一组弱标注图像中目标弱标注图像相关联的第一梯度信息和与所述一组强标注图像相关联的第二梯度信息;
    基于所述第一梯度信息与所述第二梯度信息之间的相似性,确定与所述目标弱标注图像相关联的目标训练权重;以及
    利用所述一组强标注图像和所述一组弱标注图像来训练所述图像分割模型,其中所述目标弱标注图像对所述训练的影响基于所述目标训练权重而被确定。
  2. 根据权利要求1所述的方法,其中所述训练图像集包括多个医学图像,并且所述图像分割模型被配置用于确定与医学图像中的病灶部位相关联的分割信息。
  3. 根据权利要求1所述的方法,其中所述一组强标注图像包括多个强标注图像,并且确定所述第二梯度信息包括:
    基于所述多个强标注图像,确定所述图像分割模型的多个梯度;以及
    基于所述多个梯度的均值,确定所述第二梯度信息。
  4. 根据权利要求1所述的方法,其中确定与所述目标弱标注图像相关联的目标训练权重包括:
    确定与所述目标弱标注图像相关联的初始训练权重;以及
    基于所述相似性,调整所述初始训练权重以确定所述目标训练权重。
  5. 根据权利要求1所述的方法,还包括:
基于所述第二梯度信息的转置和所述第一梯度信息确定所述相似性。
  6. 根据权利要求1所述的方法,其中利用所述一组强标注图像和所述一组弱标注图像来训练所述图像分割模型包括:
    基于所述一组强标注图像,确定与所述训练相关联的第一损失部分;
    基于所述一组弱标注图像和所述目标训练权重,确定与所述训练相关联的第二损失部分;以及
    基于所述第一损失部分和所述第二损失部分,调整所述图像分割模型的参数。
  7. 根据权利要求1所述的方法,其中利用所述一组强标注图像和所述一组弱标注图像来训练所述图像分割模型包括:
    基于所述训练图像集中的第一图像和第二图像,构建混合图像;
    确定用于所述训练的目标函数,所述目标函数包括与所述第一图像相关联的第一部分、与所述第二图像相关联的第二部分和与所述混合图像相关联的第三部分,所述第一部分基于所述第一图像的第一训练权重,所述第二部分基于所述第二图像的第二训练权重,所述第三部分基于第三训练权重,所述第三训练权重基于所述第一训练权重和所述第二训练权重而被确定;以及
    基于所述目标函数,调整所述图像分割模型的参数。
  8. 根据权利要求7所述的方法,其中构建所述混合图像包括:
    基于与所述第一图像的第一掩码信息,确定所述第一图像的第一部分,所述第一掩码信息用于指示预定区域将被保留;
    基于所述第一掩码信息,构建第二掩码信息,所述第二掩码信息用于指示所述预定区域外的区域将被保留;
    基于所述第二掩码信息,确定所述第二图像的第二部分;以及
    基于所述第一部分和所述第二部分,构建所述混合图像。
  9. 根据权利要求8所述的方法,其中所述图像分割模型为第一图像分割模型,确定所述目标函数的所述第一部分和所述第二部分包括:
    将所述第一图像应用于所述第一图像分割模型,以确定所述目标函数的所述第一部分;以及
    将所述第二图像应用于与所述第一图像分割模型不同的第二图像分割模型,以确定所述目标函数的所述第二部分。
  10. 根据权利要求9所述的方法,其中确定所述目标函数的所述第三部分包括:
    将所述混合图像应用于所述第一图像分割模型以确定第一预测结果,并应用于所述第二图像分割模型以确定第二预测结果;
    基于所述第一预测结果与第一混合标注信息之间的差异,确定第三损失部分,所述第一混合标注信息是利用所述第二图像分割模型处理所述第一图像和所述第二图像而确定的;
    基于所述第二预测结果与第二混合标注信息之间的差异,确定第四损失部分,所述第二混合标注信息是利用所述第一图像分割模型处理所述第一图像和所述第二图像而确定的;以及
    基于所述第三损失部分和所述第四损失部分,确定所述目标函数的所述第三部分。
  11. 根据权利要求1所述的方法,其中利用所述一组强标注图像和所述一组弱标注图像来训练所述图像分割模型包括:
    基于所述目标训练权重,利用所述一组强标注图像和所述一组弱标注图像,对所述图像分割模型执行预定次数的参数调整,以确定更新图像分割模型;
    基于所述更新图像分割模型,确定针对所述目标弱标注图像的更新训练权重;以及
    基于所述更新训练权重,并利用所述一组强标注图像和所述一组弱标注图像训练所述更新图像分割模型。
  12. 根据权利要求1所述的方法,还包括:
    提供经训练的所述图像分割模型,以用于基于接收的输入图像来确定图像分割结果。
  13. 根据权利要求1所述的方法，其中所述一组强标注图像基于逐像素而被标注，所述一组弱标注图像基于预定的分割样式而被标注。
    存储器和处理器;
    其中所述存储器用于存储一条或多条计算机指令,其中所述一条或多条计算机指令被所述处理器执行以实现根据权利要求1至13中任一项所述的方法。
  15. 一种计算机可读存储介质,其上存储有一条或多条计算机指令,其中所述一条或多条计算机指令被处理器执行以实现根据权利要求1至13中任一项所述的方法。
  16. 一种计算机程序产品,包括一条或多条计算机指令,其中所述一条或多条计算机指令被处理器执行以实现根据权利要求1至13中任一项所述的方法。
PCT/CN2023/074970 2022-02-10 2023-02-08 用于图像处理的方法、设备、存储介质和程序产品 WO2023151581A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210126764.XA CN114170481B (zh) 2022-02-10 2022-02-10 用于图像处理的方法、设备、存储介质和程序产品
CN202210126764.X 2022-02-10

Publications (1)

Publication Number Publication Date
WO2023151581A1 true WO2023151581A1 (zh) 2023-08-17

Family

ID=80489735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074970 WO2023151581A1 (zh) 2022-02-10 2023-02-08 用于图像处理的方法、设备、存储介质和程序产品

Country Status (2)

Country Link
CN (1) CN114170481B (zh)
WO (1) WO2023151581A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114170481B (zh) * 2022-02-10 2022-06-17 北京字节跳动网络技术有限公司 用于图像处理的方法、设备、存储介质和程序产品

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110781934A (zh) * 2019-10-15 2020-02-11 深圳市商汤科技有限公司 监督学习、标签预测方法及装置、电子设备和存储介质
CN111932547A (zh) * 2020-09-24 2020-11-13 平安科技(深圳)有限公司 图像中目标物的分割方法、装置、电子设备及存储介质
CN111968124A (zh) * 2020-10-26 2020-11-20 四川省肿瘤医院 基于半监督语义分割的肩部肌骨超声结构分割方法
CN114170481A (zh) * 2022-02-10 2022-03-11 北京字节跳动网络技术有限公司 用于图像处理的方法、设备、存储介质和程序产品

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184303B (zh) * 2015-04-23 2019-08-09 南京邮电大学 一种基于多模态深度学习的图像标注方法
CN109378052B (zh) * 2018-08-31 2019-07-30 透彻影像(北京)科技有限公司 图像标注的预处理方法及系统
CN109359666B (zh) * 2018-09-07 2021-05-28 佳都科技集团股份有限公司 一种基于多特征融合神经网络的车型识别方法及处理终端
US11816566B2 (en) * 2020-05-18 2023-11-14 Microsoft Technology Licensing, Llc Joint learning from explicit and inferred labels
CN111951274A (zh) * 2020-07-24 2020-11-17 上海联影智能医疗科技有限公司 图像分割方法、系统、可读存储介质和设备
CN111967459B (zh) * 2020-10-21 2021-01-22 北京易真学思教育科技有限公司 模型训练方法、图像识别方法、装置、设备及存储介质
CN112668586B (zh) * 2020-12-18 2024-05-14 北京百度网讯科技有限公司 模型训练、图片处理方法及设备、存储介质、程序产品
CN112927172B (zh) * 2021-05-10 2021-08-24 北京市商汤科技开发有限公司 图像处理网络的训练方法和装置、电子设备和存储介质
CN113724132B (zh) * 2021-11-03 2022-02-18 浙江宇视科技有限公司 图像风格迁移处理方法、装置、电子设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110781934A (zh) * 2019-10-15 2020-02-11 深圳市商汤科技有限公司 监督学习、标签预测方法及装置、电子设备和存储介质
CN111932547A (zh) * 2020-09-24 2020-11-13 平安科技(深圳)有限公司 图像中目标物的分割方法、装置、电子设备及存储介质
CN111968124A (zh) * 2020-10-26 2020-11-20 四川省肿瘤医院 基于半监督语义分割的肩部肌骨超声结构分割方法
CN114170481A (zh) * 2022-02-10 2022-03-11 北京字节跳动网络技术有限公司 用于图像处理的方法、设备、存储介质和程序产品

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAN JUNWEN, BI QI, YANG YANZHAN, ZHU PENGFEI, BIAN CHENG: "Label-Efficient Hybrid-Supervised Learning for Medical Image Segmentation", PROCEEDINGS OF THE AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, vol. 36, no. 2, 10 March 2022 (2022-03-10), pages 2026 - 2034, XP093084839, ISSN: 2159-5399, DOI: 10.1609/aaai.v36i2.20098 *

Also Published As

Publication number Publication date
CN114170481B (zh) 2022-06-17
CN114170481A (zh) 2022-03-11

Similar Documents

Publication Publication Date Title
US10565729B2 (en) Optimizations for dynamic object instance detection, segmentation, and structure mapping
US10733431B2 (en) Systems and methods for optimizing pose estimation
Luo et al. Robust discrete code modeling for supervised hashing
US20210275918A1 (en) Unsupervised learning of scene structure for synthetic data generation
CN108399383B (zh) 表情迁移方法、装置存储介质及程序
US20200334867A1 (en) Face synthesis
US10943068B2 (en) N-ary relation prediction over text spans
WO2020244151A1 (zh) 图像处理方法、装置、终端及存储介质
WO2023151581A1 (zh) 用于图像处理的方法、设备、存储介质和程序产品
DE102021124372A1 (de) Wissensentdeckung mit einem neuronalen netz
DE102022129436A1 (de) Bilderzeugung mit einem oder mehreren neuronalen Netzen
WO2023045925A1 (zh) 构建聚类模型的方法、设备、介质和程序产品
US11481599B2 (en) Understanding a query intention for medical artificial intelligence systems using semi-supervised deep learning
CN109657693B (zh) 一种基于相关熵和迁移学习的分类方法
WO2022227214A1 (zh) 分类模型训练方法、装置、终端设备及存储介质
US20190138930A1 (en) Systems and methods for real-time data processing analytics engine with artificial intelligence for target information protection
WO2023246735A1 (zh) 一种项目推荐方法及其相关设备
CN113408265B (zh) 基于人机交互的语义解析方法、装置、设备及存储介质
CN116030375A (zh) 视频特征提取、模型训练方法、装置、设备及存储介质
DE112022001091T5 (de) Objektverfolgung unter verwendung von optischem fluss
Yu et al. Protein function prediction using weak-label learning
Mao et al. A deep learning approach to track Arabidopsis seedlings’ circumnutation from time-lapse videos
Wetstein et al. Detection of acini in histopathology slides: towards automated prediction of breast cancer risk
Ke et al. A high-throughput tumor location system with deep learning for colorectal cancer histopathology image
Li et al. Research and implementation of a fabric printing detection system based on a field programmable gate array and deep neural network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23752361

Country of ref document: EP

Kind code of ref document: A1