CN115439713A - Model training method and device, image segmentation method, equipment and storage medium - Google Patents

Model training method and device, image segmentation method, equipment and storage medium

Info

Publication number
CN115439713A
Authority
CN
China
Prior art keywords
image
sub
model
segmentation
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211053523.3A
Other languages
Chinese (zh)
Inventor
唐晓颖
王仲华
吕俊延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202211053523.3A
Publication of CN115439713A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7753Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/778Active pattern-learning, e.g. online learning of image or video features

Abstract

The invention discloses a model training method and device, an image segmentation method, image segmentation equipment and a storage medium. The model training method is used for training an image filling model and comprises the following steps: acquiring a first image containing a first feature; performing superpixel segmentation processing on the first image to obtain a first number of superpixels; acquiring a first sub-image and a second sub-image; and inputting a first sample image, a second sample image and a third sample image into a preset original filling model for training to obtain the image filling model. The first sample image is an image obtained by performing gray value zeroing processing on the first sub-image and the second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; the image filling model is used for carrying out image restoration operation. The invention enables pre-training of the target model using the unlabeled first image.

Description

Model training method and device, image segmentation method, equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a model training method and apparatus, an image segmentation method, an image segmentation device, and a storage medium.
Background
Currently, a target model is typically pre-trained on a large natural-image database (e.g., ImageNet) to initialize the target model and accelerate its convergence during training.
In the related art, such pre-training is directed at a classification task, which differs substantially from a segmentation task. Meanwhile, there is a large domain gap between the natural images in such a database and medical images; that is, when a pre-training strategy developed on natural images is applied to a medical image dataset, the expected effect cannot be obtained. In addition, pre-training on a large natural-image database cannot make use of unlabeled medical images, which limits the training performance of the target model.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a model training method and device, an image segmentation method, equipment and a storage medium, which can realize the pre-training of a target model using label-free medical images.
A model training method according to an embodiment of a first aspect of the present invention is for training an image filling model, the method including:
acquiring a first image containing a first feature;
performing superpixel segmentation processing on the first image to obtain a first number of superpixels;
acquiring a first sub-image and a second sub-image; wherein the sum of the number of the first sub-images and the second sub-images is a second number; the first sub-image representing the superpixel including the first feature, the second sub-image representing the superpixel including a second feature;
inputting a first sample image, a second sample image and a third sample image into a preset original filling model for training to obtain an image filling model; the first sample image is an image obtained by performing gray value zeroing processing on the first sub-image and the second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; the image filling model is used for carrying out image restoration operation.
The model training method provided by the embodiment of the invention has at least the following beneficial effects: the unlabeled first image is learned through the original filling model, and the prior knowledge of the first sub-image and of the second sub-image is obtained through this learning, thereby laying a foundation for training the image segmentation model in the downstream task. Pre-training of a target model for a segmentation task is thus achieved, and label-free image data can be used for this pre-training, which solves the problem in the related art that medical images cannot be effectively pre-trained on account of small data volume, difficult data annotation, data privacy and similar factors.
According to some embodiments of the invention, the obtaining the first sub-image and the second sub-image comprises:
acquiring a conversion image; the conversion image is an image obtained by performing HSV space conversion on the first image;
obtaining a pixel value of the conversion image according to the tone data of the conversion image, the saturation data of the conversion image and a preset weight value;
obtaining a selection probability according to the pixel value;
and obtaining the first sub-image from the converted image according to the relation between the selection probability and the first characteristic, and obtaining the second sub-image from the converted image according to the relation between the selection probability and the second characteristic.
According to some embodiments of the invention, the method further comprises:
processing the first image; wherein the processing operation comprises any one of an image rotation classification operation, an image filling operation and an image coloring operation;
obtaining a third quantity and a fourth quantity according to the first image after the processing operation and a preset Bayesian optimization strategy;
and updating the first number according to the third number, updating the second number according to the fourth number, and executing the super-pixel segmentation processing on the first image again.
A model training method according to an embodiment of a second aspect of the present invention is used for training an image segmentation model, and the method includes:
obtaining an original segmentation model from the image filling model according to any one of the embodiments of the first aspect;
acquiring a second image containing a first characteristic and label data corresponding to the second image;
inputting the second image and the label data into a preset original segmentation model for training to obtain the image segmentation model; wherein the image segmentation model is used to segment a first feature in the second image.
An image segmentation method according to an embodiment of the third aspect of the present invention, the method comprising:
acquiring current image data;
inputting the current image data into an image segmentation model for segmentation processing to obtain a segmentation result; wherein the image segmentation model is obtained by training according to the model training method of the second aspect.
A model training apparatus according to a fourth aspect embodiment of the present invention includes:
the super-pixel segmentation module is used for acquiring a first image containing first characteristics and performing super-pixel segmentation processing on the first image to obtain a first number of super-pixels;
a superpixel selection module to obtain a first sub-image and a second sub-image; wherein the sum of the number of the first sub-images and the second sub-images is a second number; the first sub-image represents the superpixels including the first feature, the second sub-image represents the superpixels including a second feature;
the super-pixel filling module is used for inputting the first sample image, the second sample image and the third sample image into a preset original filling model for training processing to obtain an image filling model; the first sample image is an image obtained by performing gray value zeroing processing on the first sub-image and the second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; the image filling model is used for carrying out image restoration operation;
the segmentation module is used for acquiring a second image containing a first characteristic and label data corresponding to the second image; the second image and the label data are input into a preset original segmentation model for training processing to obtain the image segmentation model; wherein the image segmentation model is used to segment a first feature in the second image.
An electronic device according to an embodiment of the fifth aspect of the present invention includes:
at least one memory;
at least one processor;
at least one computer program;
the computer program is stored in the memory, and the at least one computer program is executed by the processor to implement:
the method of any one of the first aspect; or
The method of the second aspect.
A computer-readable storage medium according to a sixth aspect of the present invention stores computer-executable instructions for causing a computer to perform:
the method of any one of the first aspect; or
The method of the second aspect.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The invention is further described with reference to the following figures and examples, in which:
FIG. 1 is a schematic flow chart of an embodiment of the present invention for training an image filling model;
FIG. 2 is a schematic flow chart illustrating a process for training an image filling model according to an embodiment of the present invention;
FIG. 3 is another schematic flow chart of the embodiment of the present invention for training the image filling model;
FIG. 4 is a flowchart illustrating a process for training an image segmentation model according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating image segmentation comparison using different training methods according to an embodiment of the present invention;
FIG. 6 is a diagram of a training image fill model and a training image segmentation model according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating an image segmentation method according to an embodiment of the present invention;
FIG. 8 is a block diagram of a model training apparatus according to an embodiment of the present invention;
fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
First, several terms referred to in the present application are explained:
artificial Intelligence (AI): the method is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding human intelligence; artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produces a new intelligent machine that can react in a manner similar to human intelligence, and research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems, among others. The artificial intelligence can simulate the information process of human consciousness and thinking. Artificial intelligence is also a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results.
Superpixel segmentation: refers to the process of subdividing a digital image into a plurality of image sub-regions (sets of pixels, also referred to as superpixels). A superpixel is a small region formed by a series of pixels that are adjacent in position and similar in characteristics such as color, brightness and texture. Most of these small regions retain effective information for further image segmentation and generally do not destroy the boundary information of objects in the image. Replacing a large number of pixels with a small number of superpixels to express image characteristics reduces the complexity of image processing, so superpixel segmentation can serve as a preprocessing step of a segmentation algorithm.
Simple Linear Iterative Clustering (SLIC): a process that converts a color image into five-dimensional feature vectors in the CIELAB color space plus XY coordinates, constructs a distance metric over these feature vectors, and locally clusters the image pixels. The SLIC algorithm is able to generate compact, approximately uniform superpixels.
Bayesian optimization (Bayesian Optimization): a method that uses Bayes' theorem to guide a search for the minimum or maximum of an objective function. Specifically, before each new iteration, the prior knowledge from previous observations is used to steer the optimization, continuously approaching the optimal solution and thereby improving search efficiency.
Gold standard: the ground-truth reference that serves as the final target of the image restoration operation and the image segmentation operation.
At present, compared with natural images, medical images involve factors such as small data volume, difficult data annotation and data privacy; how to reasonably use a large amount of label-free data has therefore become an important research direction for training models that perform image segmentation on medical images.
Based on this, the embodiment of the application provides a model training method and device, an image segmentation method, equipment and a storage medium, wherein the model training method comprises a training method for training an image filling model and a training method for training an image segmentation model. It can be understood that the original segmentation model of the image segmentation model can be obtained by image filling model migration, that is, the training process of the image filling model is a pre-training process of the image segmentation model, and the training method of the image segmentation model is a training process of a downstream task.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence base technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the application provides a model training method and an image segmentation method, which relate to the technical field of artificial intelligence, and in particular to the technical field of image processing. The model training method or the image segmentation method provided by the embodiment of the application can be applied to a terminal, a server, or software running in the terminal or the server. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, smart watch, or the like; the server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), big data and artificial intelligence platforms; the software may be an application that implements the model training method or the image segmentation method, but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Referring to fig. 1, an embodiment of the present application provides a model training method for training an image filling model. The model training method includes, but is not limited to, steps S110 to S140.
S110, acquiring a first image containing a first characteristic;
it should be noted that the image filling model provided in the embodiments of the present application may be used to perform pixel filling operation on an image in any field to achieve image restoration. In the following embodiments, the application of the image filling model to the medical field is taken as an example. And acquiring a first image, wherein the first image is image data containing a first feature, and the first feature and an image feature segmented by an image segmentation model in a downstream task are similar features. For example, if the first image is image data including a skin cancer feature, the task of the image segmentation model is to segment an image of a region where the skin cancer feature is located in the input image data.
S120, performing superpixel segmentation processing on the first image to obtain a first number of superpixels;
it will be appreciated that the first image is subjected to an image segmentation process using superpixel segmentation to construct irregular blocks of pixels (i.e. superpixels) of pixels having similar texture, colour, brightness etc. characteristics with some visual significance. It is to be understood that the super-pixel segmentation process may be any one of Graph-based, NCut, turbopixel, quick-shift, graph-cut a, graph-cut b, and SLIC, and the embodiment of the present application is not particularly limited.
Taking SLIC as an example, it includes a cluster initialization operation and a cluster center initialization operation. The number of clusters is determined in the cluster initialization operation, and this number determines the number of superpixels finally obtained by the segmentation (namely, the first number). In the cluster center initialization operation, after the number of clusters is determined, the center of each cluster is determined by random sampling. It will be appreciated that the choice of cluster centers determines, to some extent, the shape and number of the superpixels formed by the segmentation.
It is understood that, in the initialization operation, the specific value of the first quantity can be initialized according to actual needs or experimental theories, and the embodiment of the present application is not particularly limited.
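By way of illustration only, the superpixel segmentation of step S120 can be sketched with the SLIC implementation in the scikit-image library; the file name and the initial value of K below are placeholders rather than values prescribed by this embodiment:

```python
from skimage import io
from skimage.segmentation import slic

K = 100  # first number of superpixels; illustrative initial value

image = io.imread("first_image.png")  # hypothetical unlabeled first image, (H, W, 3)
# SLIC clusters pixels in CIELAB + XY space into roughly K compact superpixels
segments = slic(image, n_segments=K, compactness=10, start_label=0)
num_superpixels = segments.max() + 1  # actual first number obtained
```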
S130, acquiring a first sub-image and a second sub-image; the sum of the number of the first sub-images and the second sub-images is a second number; the first sub-image representing superpixels containing the first feature, the second sub-image representing superpixels containing the second feature;
it will be appreciated that the selection of the super-pixels affects the ability of the subsequent image filling model and the image segmentation model to some extent, and therefore, in order for the image filling model to learn a priori knowledge of the target data set, even if the image segmentation model can segment the region of the input image data where the first feature is located, the selected super-pixels need to contain the super-pixels corresponding to the first feature and the super-pixels corresponding to the second feature. Wherein the first feature and the second feature are two features of different types, for example: when the first feature is a lesion feature, the second feature is a normal feature. Thus, the first sub-image is a superpixel containing skin cancerous features and the second sub-image is a superpixel containing normal skin features.
It is understood that the a priori knowledge includes information on image color, image pixel density, image content structure, etc. The original filling model of the image filling model learns the density difference and the color difference of the first sub-image and the second sub-image through training processing, and therefore a foundation is laid for the subsequent image segmentation model to perform image segmentation operation.
S140, inputting the first sample image, the second sample image and the third sample image into a preset original filling model for training to obtain an image filling model; the first sample image is an image obtained by performing gray value zeroing processing on a first sub-image and a second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; the image filling model is used for carrying out image restoration operation.
It will be appreciated that the task of the image filling model is to restore an input image with missing superpixels to a normal, complete image; therefore, a neural network model can be selected as the original filling model. The original filling model needs to adopt an encoder-decoder network structure. The final convolution layer of the original filling model needs to be modified, namely, its output channels are set to the three RGB channels of the image data to be filled. Meanwhile, the kernel size of the last convolution layer is set to 1, and the output head is reduced to a single convolution layer, so that the output of the image filling model preserves the decoded semantic information to the maximum extent.
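For illustration, a minimal PyTorch sketch of the modified head is given below; the backbone and its channel count are assumptions, since the embodiment does not prescribe a specific network:

```python
import torch
import torch.nn as nn

class OriginalFillingModel(nn.Module):
    """Encoder-decoder trunk with the modified filling head described above."""

    def __init__(self, backbone: nn.Module, mid_channels: int):
        super().__init__()
        self.backbone = backbone  # any encoder-decoder returning (B, mid_channels, H, W)
        # Single final convolution: kernel size 1, three RGB output channels,
        # preserving the decoded semantic information as far as possible
        self.head = nn.Conv2d(mid_channels, 3, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))
```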
Specifically, the image data in which the gray values of the regions where the selected superpixels are located have been set to zero (i.e., the first sample image), together with the binary images marking those zeroed superpixel regions (i.e., the second sample image and the third sample image), are used as the input data of the original filling model. Through its image restoration capability, the original filling model learns the difference in prior knowledge between the first sub-image and the second sub-image, laying a foundation for the training of the subsequent image segmentation model, i.e., realizing the pre-training of the image segmentation model.
In the training process of the original filling model, the related art often uses a mean square error over the whole image, i.e., a global error, as the loss function. However, in the model training method provided in the embodiment of the present application, in order to make the trained image filling model focus on the superpixel regions to be restored, the error is calculated by the following formula (1):

$$\mathcal{L}(\theta)=\frac{1}{N}\sum_{n=1}^{N} M_{n}\left(f_{\theta}(y)_{n}-x_{n}\right)^{2} \qquad (1)$$

where $y$ represents the first sample image and $x$ the first image before gray value zeroing; $N$ represents the number of pixels in the first sample image; $\theta$ represents the model parameters of the original filling model; $f_{\theta}(y)$ represents the prediction result of the original filling model on the first sample image, namely the filling result; and $M$ is the binary mask given by the second sample image and the third sample image, equal to 1 inside the zeroed superpixel regions and 0 elsewhere. Calculating the training error by formula (1) makes the image filling model obtained by training focus only on the restoration of the missing superpixel regions of the first sample image, while predictions in the other, non-missing regions do not contribute to the loss. It can be understood that in the training process of the original filling model, strict minimization of this error is not pursued, that is, the restoration result is not required to be identical to the gold standard; the greater concern is whether the original filling model can learn the prior knowledge of the first sub-image and of the second sub-image. This prevents the original filling model from paying excessive attention to other, invalid information in the first image out of an excessive pursuit of filling quality, which would cause a performance loss in the downstream task (that is, the image segmentation model training task).
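For illustration, formula (1) can be sketched in PyTorch as follows; the tensor shapes and names are assumptions, not prescribed by this embodiment:

```python
import torch

def masked_fill_loss(pred: torch.Tensor,      # f_theta(y): filling result, (B, 3, H, W)
                     original: torch.Tensor,  # x: first image before zeroing, same shape
                     mask: torch.Tensor) -> torch.Tensor:  # M: 1 inside zeroed superpixels
    # Squared error is counted only inside the masked superpixel regions and
    # averaged over the N pixels of the first sample image, as in formula (1)
    return (mask * (pred - original)).pow(2).sum() / pred.numel()
```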
According to the model training method provided by the embodiment of the application, the unlabeled first image is learned through the original filling model, and the priori knowledge of the first sub-image and the priori knowledge of the second sub-image are obtained through learning, so that a foundation is laid for training of an image segmentation model in a downstream task. Therefore, pre-training of a target model for a segmentation task is achieved, label-free image data can be used for pre-training, and the problem that effective pre-training cannot be conducted due to the fact that medical images in the related technology have the factors of small data volume, difficulty in data labeling, data privacy and the like is solved.
Referring to fig. 2, in some embodiments, step S130 includes, but is not limited to, substeps S210 through substep S240.
S210, acquiring a converted image; wherein the converted image is an image obtained by performing HSV space conversion on the first image;
it is understood that the first image is subjected to a spatial conversion operation, i.e. the first image is converted from RGB space to HSV space, so as to obtain a corresponding converted image. Specifically, see the following formulae (2) and (3).
Figure BDA0003824684940000082
Figure BDA0003824684940000083
Wherein, R ' = R/255, G ' = G/255, B ' = B/255, C max =max(R',G',B'),C min =min(R',G',B'),Δ=C max -C min . R denotes R channel data of the first image in RGB space, G denotes G channel data of the first image in RGB space, and B denotes B channel data of the first image in RGB space. H represents H channel data of the first image in HSV space, and S represents S channel data of the first image in HSV space.
S220, obtaining a pixel value of the converted image according to the tone data, the saturation data and the preset weight value of the converted image;
specifically, H-channel data (i.e., hue data) and S-channel data (i.e., saturation data) of the converted image are obtained, and a new pixel value corresponding to the converted image is calculated according to a pixel value mapping relationship described in the following formula (4), so as to better distinguish the difference between the first sub-image and the second sub-image in terms of image density.
M i =α×H i +β×S i ..
Wherein M is i Pixel values representing a new mapping of the ith converted image; h i H channel data representing the ith converted image; s i S-channel data representing the ith converted image; both α and β represent preset weight values. It can be understood that specific values of the preset weight values may be adaptively set according to actual needs, for example, α is set to 0.55, and β is set to 0.45, which is not specifically limited in this embodiment of the application.
S230, obtaining a selection probability according to the pixel value;
it will be appreciated that in order to select the appropriate second number of superpixels from the first number, a probability calculation is performed for each superpixel in the converted image as described in equation (5) below. Wherein the calculated selection probability represents the probability that the corresponding superpixel is selected.
Figure BDA0003824684940000091
Wherein M is i (j) Representing a converted image M i The jth super pixel in (a); p (M) i (j) Represents the probability that the jth superpixel is selected (i.e., the selection probability); n represents the number of superpixels in each converted image, i.e. the specific value of the first number.
S240, obtaining a first sub-image from the plurality of conversion images according to the relation between the selection probability and the first characteristic, and obtaining a second sub-image from the plurality of conversion images according to the relation between the selection probability and the second characteristic.
It will be appreciated that, from experimental statistics, among the newly mapped pixel values, superpixels with smaller pixel values are more likely to contain the first feature, i.e., the pixel values of the first sub-image are smaller than those of the second sub-image. Therefore, the selection probability of each superpixel is calculated using the reciprocal of the newly mapped pixel value; a certain number of superpixels with higher selection probability are selected as the first sub-image, and a certain number of superpixels with lower selection probability are selected as the second sub-image. In the first image, the gray values of the regions corresponding to the selected first sub-image and second sub-image are set to zero to obtain the first sample image. The first sample image, the second sample image and the third sample image are used as input data of the original filling model, and the original filling model is trained according to the method described in any one of the above embodiments, so that it can learn the prior knowledge of the first sub-image and of the second sub-image, thereby realizing the pre-training of the downstream image segmentation model.
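For illustration only, sub-steps S210 to S240 can be sketched as follows; the per-superpixel mean of the mapped pixel values and the even split of the selected superpixels between the two sub-images are assumptions made for this sketch:

```python
import numpy as np
from skimage.color import rgb2hsv

def select_and_zero(image: np.ndarray, segments: np.ndarray, n_select: int,
                    alpha: float = 0.55, beta: float = 0.45):
    """Select n_select superpixels and zero them out (sub-steps S210-S240)."""
    hsv = rgb2hsv(image)                                   # formulas (2) and (3)
    mapped = alpha * hsv[..., 0] + beta * hsv[..., 1]      # formula (4), per pixel
    labels = np.unique(segments)
    # Mean mapped value per superpixel, then reciprocal-normalized
    # selection probability as in formula (5)
    means = np.array([mapped[segments == j].mean() for j in labels])
    probs = (1.0 / means) / (1.0 / means).sum()
    order = np.argsort(probs)                              # ascending probability
    first_ids = labels[order[-(n_select // 2):]]           # likely first-feature superpixels
    second_ids = labels[order[:n_select // 2]]             # likely second-feature superpixels
    mask1 = np.isin(segments, first_ids)                   # binary second sample image
    mask2 = np.isin(segments, second_ids)                  # binary third sample image
    first_sample = image.copy()
    first_sample[mask1 | mask2] = 0                        # gray value zeroing
    return first_sample, mask1, mask2
```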
Referring to fig. 3, in some embodiments, the model training method provided in the embodiments of the present application further includes steps S310 to S330.
S310, performing a processing operation on the first image; wherein the processing operation comprises any one of an image rotation classification operation, an image filling operation and an image coloring operation;
S320, obtaining a third quantity and a fourth quantity according to the first image after the processing operation and a preset Bayesian optimization strategy;
S330, updating the first quantity according to the third quantity, updating the second quantity according to the fourth quantity, and executing the superpixel segmentation processing of step S120 again.
It can be understood that, in the training process of the original filling model, the number K of superpixels (i.e. the first number) obtained after the superpixel segmentation process is performed on the first image, and the selected number n of superpixels (i.e. the second number) subjected to the training process of the original filling model all affect the training result of the original filling model. Specifically, the first number K and the second number n have the following four relationships:
firstly, when the value of K is small and the value of n is also small, the proportion of the total pixel area of the selected superpixels in the pixel area of the first image is moderate, but at the moment, the superpixels with continuous positions may be selected, thereby influencing the training of the subsequent downstream tasks.
Secondly, when the value of K is large and the value of n is small, the proportion of the selected total super-pixel area in the pixel area of the first image is small, and at the moment, useful prior knowledge is difficult to learn by an original filling model in training processing.
Thirdly, when the value of K is small and the value of n is large, the proportion of the total pixel area of the selected superpixels in the pixel area of the first image is large, which may increase the training cost and difficulty of the original filling model.
Fourthly, when the value of K is large and the value of n is large, the area of each superpixel obtained by the superpixel segmentation processing is moderate, but the values of K and n need to be balanced so that the subsequent training of the original filling model achieves the best effect; if the value of n is too large, the selected superpixels become excessively dispersed, which increases the training difficulty of the subsequent original filling model.
Based on the four relationships, in order to balance the values of K and n and enable the original filling model to successfully learn the prior knowledge of the first sub-image and the prior knowledge of the second sub-image, the embodiment of the application adopts a preset Bayesian optimization strategy to perform numerical search so as to obtain different K values and n values, performs superpixel segmentation processing on the first image according to the different K values, and selects the first sub-image and the second sub-image with the total number of n from the K first superpixels according to the corresponding n values, so that the original filling model is trained according to the method described in any one of the embodiments, and a better recovery effect is expected to be obtained and effective prior knowledge is learned.
It can be understood that, in the embodiment of the present application, the first image needs to be processed again, and the K value and the n value are then determined according to the processed first image and the preset Bayesian optimization strategy, with a self-supervised auxiliary task serving as the optimization objective. The image rotation classification operation, the image filling operation, and the image coloring operation are described in turn below.
First, the image rotation classification operation is described. Superpixel segmentation processing and the gray value zeroing operation are performed on the first image; the segmented superpixel crops are rotated by 0, 90, 180 and 270 degrees respectively and input into a classification network built on part of the weights of the image filling model encoder, which predicts the rotation angle. The accuracy of the rotation angle prediction serves as the evaluation index: the K value and n value selected when this accuracy is highest are taken as the optimal solution, yielding the pair of K and n values to which the image filling model encoder is most sensitive.
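For illustration, the rotation-angle evaluation might be sketched as follows; `encoder` and `head` are hypothetical modules standing in for the partially loaded image filling model encoder and the four-way classification layer:

```python
import torch
import torch.nn as nn

def rotation_accuracy(encoder: nn.Module, head: nn.Module,
                      images: torch.Tensor) -> float:
    """Rotation-angle prediction accuracy used as the evaluation index."""
    correct, total = 0, 0
    for k in range(4):                                  # 0, 90, 180, 270 degrees
        rotated = torch.rot90(images, k, dims=(2, 3))   # rotate the spatial axes
        with torch.no_grad():
            logits = head(encoder(rotated).flatten(1))  # 4-way angle classification
        correct += (logits.argmax(dim=1) == k).sum().item()
        total += rotated.size(0)
    return correct / total
```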
Next, the image filling operation is described. It is similar to the filling task used to train the image filling model, except that a different superpixel segmentation processing and gray value zeroing operation are performed on the first image again, in order to verify the degree of prior knowledge learned by the image filling model. It can be understood that in the image filling operation, the task loss to be minimized is used as the evaluation index, so that the image filling model can achieve the best image restoration effect.
Finally, the image coloring operation is described. Superpixel segmentation processing is performed on the first image, the superpixels are converted from a color image into a black-and-white image, and the image filling model is expected to restore the black-and-white image to a color image, so as to verify the degree of prior knowledge learned by the image filling model. It can be understood that in the image coloring operation, the task loss to be minimized likewise serves as the evaluation index, so that the image filling model can achieve the best image restoration effect.
It can be understood that, for different datasets (i.e., sets of first images), any one of the image rotation classification operation, the image filling operation, and the image coloring operation may be selected, combined with the preset Bayesian optimization strategy, to search for the K value and the n value. This makes the training process of the image filling model more flexible and improves the universality of the image filling model.
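By way of illustration, the search over K and n might be sketched with the scikit-optimize library; the library choice, the search ranges, and the `auxiliary_task_score` function are assumptions for this sketch, not prescribed by the embodiment:

```python
from skopt import gp_minimize
from skopt.space import Integer

def objective(params):
    K, n = params
    # Re-run superpixel segmentation with K clusters, select n superpixels,
    # then evaluate the chosen auxiliary task. gp_minimize minimizes, so a
    # score such as rotation accuracy is negated.
    return -auxiliary_task_score(K, n)  # hypothetical evaluation function

result = gp_minimize(objective,
                     dimensions=[Integer(20, 400, name="K"),
                                 Integer(2, 40, name="n")],
                     n_calls=30, random_state=0)
best_K, best_n = result.x
```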
Referring to fig. 4, the present application provides another model training method for training an image segmentation model, which includes, but is not limited to, steps S410 to S430.
S410, obtaining an original segmentation model according to the image filling model;
it is to be understood that the image filling model trained according to any of the above embodiments is migrated to obtain the original segmentation model. Specifically, during the migration process, the whole weight of the image filling model may be loaded, or only a part of the weights of the encoder in the image filling model may be loaded, which is not specifically described in this embodiment of the present application.
S420, acquiring a second image containing the first characteristic and label data corresponding to the second image;
it will be appreciated that a second image is acquired that is annotated and contains the first feature. Wherein the label data comprises information of the first feature in the second image.
S430, inputting the second image and the label data into the preset original segmentation model for training to obtain the image segmentation model; wherein the image segmentation model is used for segmenting the first feature in the second image.
The second image and the label data are used as input data of the original segmentation model to obtain a segmentation result, and error optimization is performed on the original segmentation model according to the segmentation gold standard and the segmentation result, so as to obtain the image segmentation model. It will be appreciated that the image segmentation model is used to segment the region of the input image in which the first feature is located. Referring to fig. 5, when the DeepLabv3+ deep learning model is used as the original filling model, the model training method provided by the embodiment of the application can effectively improve the segmentation effect of the downstream segmentation task. The first column 100 in fig. 5 represents the input image; the second column 200 represents the segmentation result obtained by using DeepLabv3+ as the original filling model without the model training method of the embodiment of the present application; the third column 300 represents the segmentation result obtained by using DeepLabv3+ as the original filling model together with the model training method of the embodiment of the present application; the fourth column 400 represents the image segmentation gold standard.
Table 1 (reproduced only as an image in the original publication) reports the test performance of each model with and without the model training method provided by the embodiment of the present application.
it can be understood that table 1 is the performance data obtained by selecting three common deep learning models, i.e., U-Net, resNet, deplabv 3 plus, and an slsdepmodel proposed by mosafa as the original filling model for testing. In this test, the data set used included Ham10000, ISIC2017, and PH2. Wherein the Ham10000 dataset comprises 10014 unlabeled skin cancer images, which are used as training data and test data in the training of the image filling model. The ISIC2017 dataset is an ISIC Archive 2017 challenge specific dataset, which is divided into 2000 training data, 150 verification data, and 600 test data. The ISIC2017 dataset is used for verifying the model training method provided by the embodiment of the application in training of an image segmentation model, wherein all data in the dataset comprise annotations. The PH2 data set contains 200 labeled skin cancer images, which serve as additional test data for verifying the model training method provided in the embodiment of the present application in the training of the image segmentation model. It can be understood that, as can be seen from the data in table 1 and fig. 5, after the model training method provided in the embodiment of the present application is used for training the model, the performance of the model is effectively improved, so that the effectiveness of the model training method provided in the embodiment of the present application is proved.
Referring to FIG. 6, in a particular embodiment, the dashed arrows represent the flow of training the image filling model, and the solid arrows represent the flow of training the image segmentation model. Specifically, image 501 is an image including skin cancer features; superpixel segmentation is performed on image 501 to obtain image 502; pixel-value-based probability selection and the gray value zeroing operation are performed on image 502 to obtain image 503; image 503 is used as input data of the original filling model, which performs the restoration operation on it to output image 504; and the original filling model is parameter-adjusted according to the loss value calculated from image 504 and image 505 to obtain the image filling model. Part or all of the weights of the image filling model are then transferred to the original segmentation model; image 506 is used as input data of the original segmentation model to obtain its segmentation result for the skin cancer feature region in image 506 (namely, image 507); and the original segmentation model is parameter-adjusted according to the loss value calculated from image 507 and the gold standard (namely, image 508) to obtain the image segmentation model. During the training of the image filling model, the superpixel strategy based on Bayesian optimization is applied to image 503 to update the number K of superpixels and the number n of selected superpixels used in training the original filling model.
Referring to fig. 7, an embodiment of the present application further provides an image segmentation method, which includes, but is not limited to, steps S710 to S720.
S710, acquiring current image data;
and S720, inputting the current image data into the image segmentation model for segmentation processing to obtain a segmentation result.
Specifically, the image data currently requiring segmentation is acquired and input into the image segmentation model trained by the model training method described in any one of the above embodiments. The image segmentation model segments the region where the first feature is located in the image data according to the learned prior knowledge of the first feature, thereby facilitating subsequent medical processing.
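A minimal inference sketch of steps S710 and S720 is given below; the preprocessing pipeline, the input size, and the file names are placeholders rather than values prescribed by this embodiment:

```python
import torch
from torchvision import transforms
from PIL import Image

preprocess = transforms.Compose([transforms.Resize((256, 256)),
                                 transforms.ToTensor()])

model = torch.load("image_segmentation_model.pt", map_location="cpu")
model.eval()

image = Image.open("current_image.png").convert("RGB")   # current image data
batch = preprocess(image).unsqueeze(0)                   # (1, 3, 256, 256)
with torch.no_grad():
    logits = model(batch)                                # per-pixel class scores
mask = logits.argmax(dim=1)                              # segmentation result
```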
Referring to fig. 8, an embodiment of the present application further provides a model training apparatus, including:
the super-pixel segmentation module is used for acquiring a first image containing first characteristics and performing super-pixel segmentation processing on the first image to obtain a first number of super-pixels;
the super pixel selection module is used for acquiring a first sub-image and a second sub-image; the sum of the number of the first sub-images and the second sub-images is a second number; the first sub-image representing superpixels containing the first feature, the second sub-image representing superpixels containing the second feature;
the super-pixel filling module is used for inputting the first sample image, the second sample image and the third sample image into a preset original filling model for training processing to obtain an image filling model; the first sample image is an image obtained by performing gray value zeroing processing on a first sub-image and a second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; the image filling model is used for carrying out image restoration operation;
the segmentation module is used for acquiring a second image containing the first characteristics and label data corresponding to the second image; the second image and the label data are input into a preset original segmentation model for training processing to obtain an image segmentation model; wherein the image segmentation model is used for segmenting the first feature in the second image.
It can be seen that the contents of the above model training method embodiments are all applicable to this model training apparatus embodiment; the functions specifically implemented by the apparatus embodiment are the same as those of the method embodiments, and the beneficial effects achieved are also the same as those achieved by the method embodiments.
An embodiment of the present application further provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
a program is stored in the memory, and the processor executes at least one program to implement the model training method or the image segmentation method described above. The electronic device may be any intelligent terminal, including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a vehicle-mounted computer, and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic apparatus of another embodiment, the electronic apparatus including:
the processor 901 may be implemented by a general Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute a relevant program to implement the technical solution provided by the embodiment of the present disclosure;
the Memory 902 may be implemented in the form of a Read Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 902 may store an operating system and other application programs; when the technical solution provided by the embodiments of the present disclosure is implemented by software or firmware, the relevant program codes are stored in the memory 902 and called by the processor 901 to execute the model training method or the image segmentation method according to the embodiments of the present disclosure;
an input/output interface 903 for implementing information input and output;
a communication interface 904, configured to implement communication interaction between the device and another device, where communication may be implemented in a wired manner (e.g., USB, network cable, etc.), or in a wireless manner (e.g., mobile network, WIFI, bluetooth, etc.);
a bus 905 that transfers information between various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively connected to each other within the device via a bus 905.
The disclosed embodiment also provides a storage medium, which is a computer-readable storage medium, and the computer-readable storage medium stores computer-executable instructions for causing a computer to execute the above model training method or image segmentation method.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present disclosure are for more clearly illustrating the technical solutions of the embodiments of the present disclosure, and do not constitute a limitation to the technical solutions provided in the embodiments of the present disclosure, and it is obvious to those skilled in the art that the technical solutions provided in the embodiments of the present disclosure are also applicable to similar technical problems with the evolution of technology and the emergence of new application scenarios.
Those skilled in the art will appreciate that the solutions shown in the figures are not intended to limit embodiments of the present disclosure, and may include more or less steps than those shown, or some of the steps may be combined, or different steps.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical division, and an actual implementation may use a different division; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions for causing an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present disclosure have been described above with reference to the accompanying drawings; they do not thereby limit the scope of the claims of the embodiments of the present disclosure. Any modifications, equivalents, and improvements made by those skilled in the art within the scope and spirit of the embodiments of the present disclosure shall fall within the scope of the claims of the embodiments of the present disclosure.

Claims (8)

1. A model training method for training an image filling model, the method comprising:
acquiring a first image containing a first feature;
performing superpixel segmentation processing on the first image to obtain a first number of superpixels;
acquiring a first sub-image and a second sub-image; wherein the total number of the first sub-images and the second sub-images is a second number, the first sub-images represent superpixels containing the first feature, and the second sub-images represent superpixels containing a second feature;
inputting a first sample image, a second sample image and a third sample image into a preset original filling model for training to obtain the image filling model; wherein the first sample image is an image obtained by performing gray value zeroing processing on the first sub-image and the second sub-image of the first image, the second sample image is a binary image of the first sub-image, and the third sample image is a binary image of the second sub-image; and the image filling model is used for performing an image restoration operation.
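As a non-limiting illustration of claim 1, the construction of the three sample images might be sketched as follows. This is a minimal Python sketch that assumes SLIC as the superpixel algorithm and NumPy image arrays; neither choice, nor the helper names, is fixed by the claim, and the selection of the two sub-image label sets is the subject of claim 2.

```python
# Minimal sketch of the claim-1 sample construction (assumptions: SLIC
# superpixels via scikit-image, RGB NumPy input; the claim fixes neither).
import numpy as np
from skimage.segmentation import slic

def build_samples(first_image, first_ids, second_ids, first_number=200):
    # Superpixel segmentation processing: a first number of superpixels.
    labels = slic(first_image, n_segments=first_number)

    # Gray value zeroing on the first and second sub-images -> first sample.
    selected = np.isin(labels, first_ids + second_ids)
    first_sample = first_image.copy()
    first_sample[selected] = 0

    # Binary images of the two sub-image groups -> second and third samples.
    second_sample = np.isin(labels, first_ids).astype(np.uint8)
    third_sample = np.isin(labels, second_ids).astype(np.uint8)
    return first_sample, second_sample, third_sample
```

The three arrays would then serve as the training inputs of the preset original filling model, for example an encoder-decoder inpainting network.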
2. The model training method of claim 1, wherein the obtaining the first sub-image and the second sub-image comprises:
acquiring a converted image; wherein the converted image is an image obtained by performing HSV space conversion on the first image;
obtaining a pixel value of the converted image according to hue data of the converted image, saturation data of the converted image, and a preset weight value;
obtaining a selection probability according to the pixel value; and
obtaining the first sub-image from the converted image according to a relationship between the selection probability and the first feature, and obtaining the second sub-image from the converted image according to a relationship between the selection probability and the second feature.
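A hedged sketch of the claim-2 selection rule follows, assuming scikit-image's rgb2hsv for the HSV space conversion and a single linear weight w between hue and saturation; the concrete weight value and the normalisation of per-superpixel scores into probabilities are assumptions, since the claim only fixes the weighted-combination structure.

```python
import numpy as np
from skimage.color import rgb2hsv

def selection_probability(first_image, labels, w=0.5):
    hsv = rgb2hsv(first_image)                      # HSV space conversion
    hue, saturation = hsv[..., 0], hsv[..., 1]
    pixel_value = w * hue + (1.0 - w) * saturation  # preset weight value

    # Average the pixel value per superpixel, normalise to probabilities.
    ids = np.unique(labels)
    scores = np.array([pixel_value[labels == i].mean() for i in ids])
    probabilities = scores / scores.sum()
    return dict(zip(ids.tolist(), probabilities.tolist()))
```

Superpixels whose probability is judged to match the first feature would then be taken as first sub-images, and those judged to match the second feature as second sub-images.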
3. The model training method according to claim 1 or 2, characterized in that the method further comprises:
performing a processing operation on the first image; wherein the processing operation comprises any one of an image rotation classification operation, an image filling operation and an image coloring operation;
obtaining a third number and a fourth number according to the first image after the processing operation and a preset Bayesian optimization strategy; and
updating the first number according to the third number, updating the second number according to the fourth number, and performing the superpixel segmentation processing on the first image again.
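The claim-3 loop could be realised with an off-the-shelf Bayesian optimiser. The sketch below uses scikit-optimize's gp_minimize as one possible "preset Bayesian optimization strategy", with a hypothetical pretext_loss callback standing in for a full pretext-task run (rotation classification, filling, or coloring); the search ranges are likewise assumptions.

```python
from skopt import gp_minimize
from skopt.space import Integer

def tune_counts(pretext_loss):
    """pretext_loss(n_superpixels, n_selected) -> validation loss of the
    pretext task run with those counts (hypothetical callback)."""
    result = gp_minimize(
        lambda x: pretext_loss(x[0], x[1]),
        [Integer(50, 500),    # search range for the third number (assumed)
         Integer(5, 50)],     # search range for the fourth number (assumed)
        n_calls=20,
        random_state=0,
    )
    third_number, fourth_number = result.x
    return third_number, fourth_number  # then redo superpixel segmentation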
4. A model training method for training an image segmentation model, the method comprising:
obtaining an original segmentation model according to the image filling model obtained by the model training method of any one of claims 1 to 3;
acquiring a second image containing the first feature and label data corresponding to the second image; and
inputting the second image and the label data into the original segmentation model for training to obtain the image segmentation model; wherein the image segmentation model is used for segmenting the first feature in the second image.
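One plausible reading of claim 4 is weight transfer from the trained filling model into the segmentation network before supervised fine-tuning. The PyTorch sketch below assumes both models expose a shared encoder submodule; this structure is an assumption, since the claims do not fix the network architecture.

```python
import torch

def build_segmentation_model(filling_model, segmentation_model):
    # Initialise the segmentation encoder from the trained filling model.
    segmentation_model.encoder.load_state_dict(
        filling_model.encoder.state_dict())
    return segmentation_model

def finetune(model, loader, epochs=10, lr=1e-4):
    # Supervised training on second images and their label data.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()  # binary first-feature mask
    for _ in range(epochs):
        for second_image, label in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(second_image), label)
            loss.backward()
            optimizer.step()
    return model
```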
5. A method of image segmentation, the method comprising:
acquiring current image data;
inputting the current image data into an image segmentation model for segmentation processing to obtain a segmentation result; wherein the image segmentation model is trained according to the model training method as claimed in claim 4.
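Inference under claim 5 then reduces to a single forward pass; in this sketch the 0.5 threshold and the CHW tensor layout are assumptions.

```python
import torch

@torch.no_grad()
def segment(model, current_image):
    model.eval()
    logits = model(current_image.unsqueeze(0))       # add batch dimension
    return (torch.sigmoid(logits) > 0.5).squeeze(0)  # binary segmentation result
```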
6. A model training apparatus, comprising:
a superpixel segmentation module configured to acquire a first image containing a first feature and perform superpixel segmentation processing on the first image to obtain a first number of superpixels;
a superpixel selection module configured to acquire a first sub-image and a second sub-image, wherein the total number of the first sub-images and the second sub-images is a second number, the first sub-images represent superpixels containing the first feature, and the second sub-images represent superpixels containing a second feature;
a superpixel filling module configured to input a first sample image, a second sample image and a third sample image into a preset original filling model for training to obtain an image filling model, wherein the first sample image is an image obtained by performing gray value zeroing processing on the first sub-image and the second sub-image of the first image, the second sample image is a binary image of the first sub-image, the third sample image is a binary image of the second sub-image, and the image filling model is used for performing an image restoration operation; and
a segmentation module configured to acquire a second image containing the first feature and label data corresponding to the second image, and to input the second image and the label data into a preset original segmentation model for training to obtain an image segmentation model, wherein the image segmentation model is used for segmenting the first feature in the second image.
7. An electronic device, comprising:
at least one memory;
at least one processor;
at least one computer program;
wherein the at least one computer program is stored in the at least one memory and is executed by the at least one processor to implement:
the method of any one of claims 1 to 3; or
the method of claim 4.
8. A computer-readable storage medium having computer-executable instructions stored thereon for causing a computer to perform:
the method of any one of claims 1 to 3; or
the method of claim 4.
CN202211053523.3A 2022-08-31 2022-08-31 Model training method and device, image segmentation method, equipment and storage medium Pending CN115439713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211053523.3A CN115439713A (en) 2022-08-31 2022-08-31 Model training method and device, image segmentation method, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211053523.3A CN115439713A (en) 2022-08-31 2022-08-31 Model training method and device, image segmentation method, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115439713A true CN115439713A (en) 2022-12-06

Family

ID=84243691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211053523.3A Pending CN115439713A (en) 2022-08-31 2022-08-31 Model training method and device, image segmentation method, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115439713A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188919A (en) * 2023-04-25 2023-05-30 之江实验室 Test method and device, readable storage medium and electronic equipment


Similar Documents

Publication Publication Date Title
CN111898696B (en) Pseudo tag and tag prediction model generation method, device, medium and equipment
CN107292352B (en) Image classification method and device based on convolutional neural network
CN112308866B (en) Image processing method, device, electronic equipment and storage medium
US20200242353A1 (en) Generating shift-invariant neural network outputs
US20220108478A1 (en) Processing images using self-attention based neural networks
CN114638960A (en) Model training method, image description generation method and device, equipment and medium
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
CN115205150A (en) Image deblurring method, device, equipment, medium and computer program product
JP2023131117A (en) Joint perception model training, joint perception method, device, and medium
CN116452810A (en) Multi-level semantic segmentation method and device, electronic equipment and storage medium
CN115439713A (en) Model training method and device, image segmentation method, equipment and storage medium
CN112115744A (en) Point cloud data processing method and device, computer storage medium and electronic equipment
CN115909336A (en) Text recognition method and device, computer equipment and computer-readable storage medium
CN112906517A (en) Self-supervision power law distribution crowd counting method and device and electronic equipment
CN113569855A (en) Tongue picture segmentation method, equipment and storage medium
CN115115910A (en) Training method, using method, device, equipment and medium of image processing model
CN114897053A (en) Subspace clustering method, subspace clustering device, subspace clustering equipment and storage medium
CN114692715A (en) Sample labeling method and device
US11983903B2 (en) Processing images using self-attention based neural networks
CN117593610B (en) Image recognition network training and deployment and recognition methods, devices, equipment and media
CN114626520B (en) Method, device, equipment and storage medium for training model
CN117218300B (en) Three-dimensional model construction method, three-dimensional model construction training method and device
US11875442B1 (en) Articulated part extraction from sprite sheets using machine learning
CN114782768A (en) Training method of pre-training network model, medical image processing method and equipment
Wang et al. MADB-RemdNet for Few-Shot Learning in Remote Sensing Classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination