WO2023226606A1 - Image segmentation sample generation method and apparatus, method and apparatus for pre-training image segmentation model, and device and medium - Google Patents


Info

Publication number
WO2023226606A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
sample
image segmentation
classification
model
Prior art date
Application number
PCT/CN2023/087460
Other languages
French (fr)
Chinese (zh)
Inventor
李徐泓
熊昊一
刘毅
窦德景
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Publication of WO2023226606A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the present disclosure relates to the field of computer vision and image segmentation, for example, to methods and devices for generating image segmentation samples, pre-training methods and devices for image segmentation models, equipment, and media.
  • Image semantic segmentation is a traditional task in the field of computer vision, aiming to identify and classify every pixel in the image.
  • Image segmentation technology can be applied to a variety of scenarios, such as driving scene understanding and medical image analysis. This technology mainly achieves image semantic segmentation by training image segmentation models.
  • an image segmentation model can be obtained by training a large number of image segmentation samples.
  • Related technologies mainly use existing image classification samples to pre-train the backbone parameters of a deep learning model, and then use image segmentation samples to readjust the pre-trained deep learning model to finally obtain the desired image segmentation model.
  • the present disclosure provides methods and devices for generating image segmentation samples, pre-training methods and devices for image segmentation models, equipment, and media.
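  • a method for generating image segmentation samples, including:
  • obtaining image classification samples, where the image classification samples include sample images and classification labels of the sample images;
  • determining, through an interpretable algorithm, the weight value of each pixel of the sample image under the action of the image classification model;
  • selecting forward label pixels in the sample image based on the weight value of each pixel;
  • forming an image segmentation sample corresponding to the image classification sample based on the forward label pixels and classification labels of the sample image.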
  • a pre-training method for an image segmentation model, including:
  • obtaining an image classification sample set;
  • using the above-mentioned image segmentation sample generation method to process each image classification sample in the image classification sample set, and generating an image segmentation sample set corresponding to the image classification sample set;
  • training all model parameters included in the preset machine learning model according to the image segmentation sample set to obtain a pre-trained image segmentation model.
  • a device for generating image segmentation samples including:
  • the classification sample acquisition module is configured to obtain image classification samples, where the image classification samples include sample images and classification labels of the sample images;
  • the weight value determination module is set to determine the weight value of each pixel of the sample image under the action of the image classification model through an interpretable algorithm
  • the forward label pixel filtering module is set to select forward label pixels in the sample image based on the weight value of each pixel;
  • the image segmentation sample generation module is configured to form an image segmentation sample corresponding to the image classification sample based on the forward label pixels and classification labels of the sample image.
  • a pre-training device for an image segmentation model including:
  • the sample set acquisition module is set to obtain the image classification sample set
  • the image segmentation sample set generation module is configured to use the above-mentioned image segmentation sample generation method to process each image classification sample in the image classification sample set, and generate an image segmentation sample set corresponding to the image classification sample set;
  • the pre-trained image segmentation model acquisition module is configured to train all model parameters included in the preset machine learning model according to the image segmentation sample set to obtain a pre-trained image segmentation model.
  • an electronic device including:
  • one or more processors;
  • a storage device configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the above-mentioned method for generating image segmentation samples, or the above-mentioned pre-training method for the image segmentation model.
  • a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the above-mentioned method for generating image segmentation samples, or the above-mentioned pre-training method for the image segmentation model.
  • Figure 1 is a flow chart of a method for generating image segmentation samples provided by an embodiment of the present disclosure
  • Figure 2a is a flow chart of another method for generating image segmentation samples provided by an embodiment of the present disclosure
  • Figure 2b is a flow chart of another method for generating image segmentation samples provided by an embodiment of the present disclosure
  • Figure 3 is a flow chart of a pre-training method for an image segmentation model provided by an embodiment of the present disclosure
  • Figure 4a is a flow chart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure
  • Figure 4b is a flow chart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure
  • Figure 5 is a logical schematic diagram of pixel weight value denoising provided by an embodiment of the present disclosure
  • Figure 6 is a schematic diagram before binarization processing of an average result provided by an embodiment of the present disclosure.
  • Figure 7 is a schematic diagram after binarization processing of an average result provided by an embodiment of the present disclosure.
  • Figure 8 is a schematic diagram of a device for generating image segmentation samples provided by the present disclosure
  • Figure 9 is a schematic diagram of a pre-training device for an image segmentation model provided by the present disclosure
  • Figure 10 is a schematic diagram of another pre-training device for an image segmentation model provided by the present disclosure
  • Figure 11 is a schematic diagram of another pre-training device for an image segmentation model provided by the present disclosure
  • Figure 12 is a schematic block diagram of an electronic device provided by an embodiment of the present disclosure.
  • Figure 1 is a flow chart of a method for generating image segmentation samples provided by an embodiment of the present disclosure. This embodiment can be applied to the case of adding classification labels to pixels in the sample image. The method can be executed by a device for generating image segmentation samples, which can be implemented by at least one of software and hardware, and can generally be integrated in an electronic device.
  • the method includes the following operations:
  • Image classification samples can be understood as images labeled with classification labels. Image classification samples can generally be used as training samples for training image classification models.
  • the image classification sample may include sample images and classification labels of the sample images.
  • Sample images can be images in any scene.
  • Classification labels can be used to characterize the category to which a sample image belongs.
  • the classification label of a sample image containing zebras can be zebra or animal
  • the classification label of a sample image containing apples can be apples or fruits, etc.
  • the embodiments of the present disclosure do not limit the sample images and the classification labels of the sample images.
  • a sample image of any scene can be obtained, and a classification label corresponding to the sample image can be determined, so that the sample image and the classification label matching the sample image are used as image classification samples.
  • the interpretable algorithm can be an algorithm that makes the model prediction results interpretable and is used to establish the correlation between the model input data and the model prediction process.
  • the image classification model can be any machine learning model that can identify the category to which the image belongs.
  • the types of image classification models can include implementations based on convolutional neural networks (Convolutional Neural Networks, CNN) or recurrent neural networks (Recurrent Neural Networks, RNN).
  • An interpretable algorithm can be selected from known interpretable algorithms, such as a gradient-based algorithm or an integrated-gradients-based algorithm. After the sample image is input to the image classification model, the interpretable algorithm can determine how much each pixel in the sample image contributes when the image classification model determines the category to which the sample image belongs. This degree of contribution is quantified and used as the weight value of the pixel.
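  • A minimal sketch of a gradient-based weight computation, under stated assumptions: `toy_score` is a hypothetical stand-in for the class score of an image classification model, and the gradient is approximated by central finite differences instead of backpropagation. The weight of a pixel is taken as the absolute partial derivative of the score with respect to that pixel; the disclosure does not fix this particular choice.

```python
from typing import Callable, List

Image = List[List[float]]


def toy_score(image: Image) -> float:
    """Hypothetical class score that depends only on the top-left 2x2 patch."""
    return sum(image[r][c] ** 2 for r in range(2) for c in range(2))


def gradient_weights(image: Image,
                     score: Callable[[Image], float],
                     eps: float = 1e-4) -> Image:
    """Approximate |d score / d pixel| for every pixel by central differences.

    The image is perturbed in place one pixel at a time and restored afterwards.
    """
    weights = [[0.0] * len(row) for row in image]
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            image[r][c] = value + eps
            up = score(image)
            image[r][c] = value - eps
            down = score(image)
            image[r][c] = value  # restore the original pixel value
            weights[r][c] = abs(up - down) / (2 * eps)
    return weights


sample = [[0.9, 0.8, 0.1],
          [0.7, 0.6, 0.2],
          [0.1, 0.0, 0.3]]
pixel_weights = gradient_weights(sample, toy_score)
```

  • In this toy setting, pixels outside the top-left patch receive a weight of zero because the score does not depend on them, which mirrors the idea that only pixels influencing the classification carry a large weight.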
  • The forward label pixels can be a subset of the pixels in the sample image, determined from all pixels in the sample image according to the weight values.
  • multiple pixels in the sample image can be filtered according to the preset filtering conditions of the pixels and the weight value of each pixel, and the forward label pixels in the sample image can be obtained. That is, the pixels that determine the sample label of the sample image are obtained, and the pixels in the sample image that play an important role in image classification are obtained based on the interpretable algorithm.
  • the filtering conditions for pixels may include the number of pixels to be filtered out, or the weight value of the pixels being greater than a set threshold, etc.
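  • The threshold-based filtering condition mentioned above could be sketched as follows. The function name `select_by_threshold` and the example threshold are illustrative assumptions, not values taken from the disclosure.

```python
from typing import List, Tuple


def select_by_threshold(weights: List[List[float]],
                        threshold: float) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose weight is strictly greater than threshold."""
    return [(r, c)
            for r, row in enumerate(weights)
            for c, w in enumerate(row)
            if w > threshold]


weights = [[0.9, 0.1],
           [0.4, 0.7]]
forward_pixels = select_by_threshold(weights, threshold=0.5)
```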
  • S140 Form an image segmentation sample corresponding to the image classification sample based on the forward label pixels and classification labels of the sample image.
  • Image segmentation samples can be understood as images marked with segmentation labels, and segmentation labels can be understood as classification labels that mark segmentation objects (for example, the zebra pattern in the aforementioned example image) in the image in units of pixels.
  • Image segmentation samples can generally be used as training samples for training image segmentation models.
  • the image segmentation model that has completed training has image segmentation capabilities, that is, it can identify objects with different attributes (classification labels) in the image.
  • the forward label pixels and the classification labels matching the forward label pixels can be used as image segmentation samples corresponding to the image classification samples, and the image segmentation model is then trained with the image segmentation samples so that the trained model can segment an input image. In this way, image segmentation samples are generated simply and conveniently from image classification samples.
  • the technical solution of the embodiment of the present disclosure obtains an image classification sample including a sample image and a classification label of the sample image, determines the weight value of each pixel of the sample image under the action of the image classification model through an interpretable algorithm, selects the forward label pixels in the sample image according to the weight value of each pixel, and forms an image segmentation sample corresponding to the image classification sample according to the forward label pixels and classification labels of the sample image.
  • By these technical means, the interpretable algorithm identifies the pixels in the sample image that play an important role in image classification. Corresponding classification labels can therefore be added to those pixels simply, conveniently and accurately to form an image segmentation sample.
  • a large number of image segmentation samples can be obtained at a very small cost, which improves the generation efficiency of image segmentation samples and improves the pre-training performance of the image segmentation model to a certain extent.
  • Figure 2a is a flow chart of a method for generating image segmentation samples provided by an embodiment of the present disclosure.
  • This embodiment details how the weight value of each pixel of the sample image under the action of the image classification model is determined through an interpretable algorithm.
  • the method includes the following operations:
  • Image classification samples include sample images and classification labels of the sample images.
  • the weight value is used to measure the importance of pixels in the classification process of sample images by the image classification model.
  • the model parameters may be configuration parameters of the image classification model, for example, weight coefficients.
  • the model parameters of the image classification model can be determined and an interpretable algorithm can be selected from known interpretable algorithms. The sample image and the model parameters of the image classification model are then input into the algorithm model that matches the selected interpretable algorithm. The algorithm model calculates the importance of each pixel in the image classification model's classification of the sample image, and this importance is quantified to obtain the weight value of each pixel in the sample image.
  • the action process of the image classification model can be explained, and a quantitative value of the degree of adjustment of the model parameters by the image classification model according to the pixels in the sample image is given, which facilitates rapid positioning in the classification process of the sample image. important pixels.
  • inputting the sample image and the model parameters of the image classification model into an algorithm model that matches the interpretable algorithm, and obtaining the weight value of each pixel in the sample image, may include: obtaining multiple image classification models; inputting the sample image and the model parameters of each image classification model into the algorithm model respectively, and obtaining the single model weight value of each pixel in the sample image under the action of each image classification model; and computing a weighted average of the multiple single model weight values at the same pixel position in the sample image to obtain the weight value of that pixel in the sample image.
  • the single model weight value can be the weight value of the pixel determined by the algorithm model of the interpretable algorithm based on a single image classification model and the sample image.
  • multiple trained image classification models can be obtained, and then the model parameters of each image classification model can be obtained, and the sample image and the model parameters matching the current image classification model can be input to the algorithm model of the interpretable algorithm. , obtain the single model weight value of each pixel of the sample image during the classification of the sample image by the current image classification model.
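  • The weighted averaging of single model weight values at the same pixel position could look like the sketch below. The per-model coefficients `model_weights` are an assumption; the text only states that a weighted average is taken, so equal coefficients are used by default here.

```python
from typing import List, Optional, Sequence

WeightMap = List[List[float]]


def average_weight_maps(maps: Sequence[WeightMap],
                        model_weights: Optional[Sequence[float]] = None) -> WeightMap:
    """Weighted average of per-model pixel weight maps of identical shape."""
    if not maps:
        raise ValueError("at least one weight map is required")
    if model_weights is None:
        model_weights = [1.0] * len(maps)  # assumption: equal weighting
    if len(model_weights) != len(maps):
        raise ValueError("one coefficient per weight map is required")
    total = sum(model_weights)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(mw * m[r][c] for mw, m in zip(model_weights, maps)) / total
             for c in range(cols)]
            for r in range(rows)]


map_a = [[1.0, 0.0], [0.5, 0.0]]  # single model weight values from model A
map_b = [[0.0, 0.0], [0.5, 1.0]]  # single model weight values from model B
averaged = average_weight_maps([map_a, map_b])
```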
  • selecting forward label pixels in the sample image based on the weight value of each pixel may include: determining the number of pixels to be selected based on the total number of pixels in the sample image and a preset selection ratio; and selecting, in descending order of weight value, forward label pixels matching the number of pixels to be selected.
  • the total number of pixels may be the total number of pixels included in the image.
  • the selection ratio can be preset, describing the ratio of filtering pixels from the pixels of the sample image.
  • the number of pixels to be selected is the number of pixels that need to be filtered from the pixels in the sample image; it can be determined based on the total number of pixels in the sample image and the selection ratio.
  • the total number of pixels of the sample image can be determined first, and the preset selection ratio can then be obtained. The number of pixels to be selected is determined from the product of the total number of pixels of the sample image and the selection ratio. The weight values of the pixels in the sample image are sorted in descending order, the pixels matching the number of pixels to be selected are taken in that order, and the selected pixels are used as forward label pixels.
  • the classification of an image can be determined from its key pixels. Selecting forward label pixels in descending order of weight value retains the pixels that have a greater impact on the image classification process, which reduces the data processing time for the subsequent forward label pixels.
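  • The ratio-based selection can be sketched as follows. Rounding the product down to an integer and the example ratio of 0.5 are assumptions; the disclosure only states that the number is determined from the product of the total pixel count and the preset selection ratio.

```python
from typing import List, Tuple


def select_top_ratio(weights: List[List[float]],
                     ratio: float) -> List[Tuple[int, int]]:
    """Select the highest-weight pixels, keeping total_pixels * ratio of them."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    flat = [(w, r, c)
            for r, row in enumerate(weights)
            for c, w in enumerate(row)]
    count = int(len(flat) * ratio)  # assumption: round down
    flat.sort(key=lambda item: item[0], reverse=True)  # descending weight
    return [(r, c) for _, r, c in flat[:count]]


weights = [[0.9, 0.1, 0.3],
           [0.4, 0.7, 0.2]]
forward_pixels = select_top_ratio(weights, ratio=0.5)
```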
  • S240 Form an image segmentation sample corresponding to the image classification sample based on the forward label pixels and classification labels of the sample image.
  • forming an image segmentation sample corresponding to the image classification sample based on the forward label pixels and the classification label of the sample image may include: using the classification label to label each forward label pixel in the sample image to form the image segmentation sample.
  • the classification label of the sample image can be obtained, and then each forward label pixel in the sample image is labeled with a classification label that matches the sample image to which the forward label pixel belongs, forming an image segmentation sample. Since forward label pixels are pixels that have a greater impact in the image classification process, using forward label pixels labeled with classification labels as image segmentation samples can minimize the amount of training data while ensuring the training effect. , improve the training speed of the model.
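  • A minimal sketch of this labeling step, assuming the segmentation label is stored as a per-pixel label map in which unlabeled pixels are `None`. The representation and the function name are illustrative choices, not part of the disclosure.

```python
from typing import Iterable, List, Optional, Tuple

LabelMap = List[List[Optional[str]]]


def make_segmentation_labels(height: int,
                             width: int,
                             forward_pixels: Iterable[Tuple[int, int]],
                             class_label: str) -> LabelMap:
    """Label every forward label pixel with the image-level classification label."""
    labels: LabelMap = [[None] * width for _ in range(height)]
    for r, c in forward_pixels:
        labels[r][c] = class_label
    return labels


label_map = make_segmentation_labels(2, 3, [(0, 0), (1, 1)], "zebra")
```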
  • the technical solution of the embodiment of the present disclosure obtains the image classification sample, inputs the sample image and the model parameters of the image classification model into an algorithm model that matches the interpretable algorithm to obtain the weight value of each pixel in the sample image, selects the forward label pixels in the sample image according to the weight value of each pixel, and forms an image segmentation sample corresponding to the image classification sample based on the forward label pixels and classification labels of the sample image. This makes it easy to quickly locate the pixels that are important in the classification of the sample image, and the key pixels and classification labels are used to generate the image segmentation sample.
  • By these technical means, the interpretable algorithm identifies the pixels in the sample image that play an important role in image classification, so corresponding classification labels can be added to those pixels simply, conveniently and with high accuracy to form image segmentation samples. A large number of image segmentation samples can thus be obtained at a very small cost, which improves the generation efficiency of image segmentation samples and improves the pre-training performance of the image segmentation model to a certain extent.
  • Figure 2b is a flow chart of another method for generating image segmentation samples provided by an embodiment of the present disclosure. As shown in Figure 2b, the method includes:
  • Image classification samples include sample images and classification labels of the sample images.
  • SS2110 may include:
  • the technical solution of the embodiment of the present disclosure obtains image classification samples, obtains multiple image classification models, and inputs the sample image and the model parameters of each image classification model into the algorithm model respectively to obtain the single model weight value of each pixel in the sample image under the action of each image classification model. The multiple single model weight values at the same pixel position are weighted and averaged to obtain the weight value of the pixel in the sample image. The number of pixels to be selected is then determined based on the total number of pixels in the sample image and the preset selection ratio, and forward label pixels matching that number are selected in descending order of weight value, so that image segmentation samples corresponding to the image classification samples are formed based on the forward label pixels and classification labels of the sample image.
  • By these technical means, the interpretable algorithm identifies the pixels in the sample image that play an important role in image classification, so corresponding classification labels can be added to those pixels simply, conveniently and with high accuracy to form image segmentation samples. A large number of image segmentation samples can thus be obtained at a very small cost, which improves the generation efficiency of image segmentation samples and improves the pre-training performance of the image segmentation model to a certain extent.
  • Figure 3 is a flow chart of a pre-training method for an image segmentation model provided by an embodiment of the present disclosure. This embodiment can be applied to the case of pre-training an image segmentation model. The method can be executed by a pre-training device for an image segmentation model, which can be implemented by at least one of software and hardware, and can generally be integrated into an electronic device.
  • the method includes the following operations:
  • the image classification sample set may be a collection of image classification samples.
  • multiple image classification samples in any scenario can be obtained to obtain an image classification sample set including multiple image classification samples.
  • S320 Use the image segmentation sample generation method to process each image classification sample in the image classification sample set, and generate an image segmentation sample set corresponding to the image classification sample set.
  • the image segmentation sample set may be a set of image segmentation samples generated according to the method for generating image segmentation samples in any of the above embodiments.
  • the forward label pixels and classification labels of the sample images of each image classification sample in the image classification sample set can be determined according to the method for generating image segmentation samples in any of the above embodiments, so that according to each The forward label pixels and classification labels of the sample images of the image classification samples respectively form image segmentation samples corresponding to the image classification samples, that is, the image segmentation sample set corresponding to the image classification sample set is obtained.
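  • Applying the generation method to a whole image classification sample set might be sketched as below. The pluggable `weight_fn` stands in for the interpretable algorithm, and `intensity_as_weight` is a purely hypothetical placeholder that reuses pixel intensity as the weight so that the example stays self-contained.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]
LabelMap = List[List[Optional[str]]]
ClassificationSample = Tuple[Image, str]
SegmentationSample = Tuple[Image, LabelMap]


def generate_segmentation_set(classification_set: List[ClassificationSample],
                              weight_fn: Callable[[Image], Image],
                              ratio: float) -> List[SegmentationSample]:
    """Turn every (image, class label) pair into an (image, per-pixel label map) pair."""
    segmentation_set: List[SegmentationSample] = []
    for image, label in classification_set:
        weights = weight_fn(image)
        ranked = sorted(((weights[r][c], r, c)
                         for r in range(len(image))
                         for c in range(len(image[0]))),
                        key=lambda item: item[0], reverse=True)
        keep = ranked[:int(len(ranked) * ratio)]  # forward label pixels
        label_map: LabelMap = [[None] * len(image[0]) for _ in image]
        for _, r, c in keep:
            label_map[r][c] = label
        segmentation_set.append((image, label_map))
    return segmentation_set


def intensity_as_weight(image: Image) -> Image:
    """Hypothetical placeholder for an interpretable algorithm."""
    return [list(row) for row in image]


classification_set = [([[0.9, 0.1], [0.2, 0.8]], "zebra"),
                      ([[0.1, 0.7], [0.6, 0.0]], "apple")]
segmentation_set = generate_segmentation_set(classification_set,
                                             intensity_as_weight, ratio=0.5)
```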
  • the pre-trained image segmentation model may be a model obtained by training a preset machine learning model through an image segmentation sample set.
  • a preset machine learning model can be obtained and all model parameters included in the preset machine learning model can be determined. The image segmentation sample set is then used to train all model parameters included in the preset machine learning model, the trained machine learning model is used as the pre-trained image segmentation model, and the pre-trained image segmentation model can then perform image segmentation on an input image.
  • the technical solution of the embodiment of the present disclosure is to obtain an image classification sample set and then use an image segmentation sample generation method to process each image classification sample in the image classification sample set to generate an image segmentation sample set corresponding to the image classification sample set. , based on the image segmentation sample set, train all model parameters included in the preset machine learning model to obtain a pre-trained image segmentation model.
  • Based on the image segmentation sample generation method in any of the above embodiments, embodiments of the present disclosure can use an interpretable algorithm to obtain the pixels in the sample image that have an important impact on image classification. Corresponding classification labels can therefore be added to those pixels simply, conveniently and with high accuracy to form image segmentation samples, thereby obtaining an image segmentation sample set.
  • a large image segmentation sample set can be obtained at a very small cost, which improves the generation efficiency of image segmentation samples, improves the pre-training performance of the image segmentation model, and improves the training effect of the pre-trained model.
  • Figure 4a is a flow chart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure.
  • the method includes the following operations:
  • S420 Use an image segmentation sample generation method to process each image classification sample in the image classification sample set, and generate an image segmentation sample set corresponding to the image classification sample set.
  • before training all model parameters included in the preset machine learning model according to the image segmentation sample set to obtain the pre-trained image segmentation model, the method may also include: selecting, based on all classification labels corresponding to the image classification sample set, a heterogeneous label that is different from all of the classification labels; and labeling with the heterogeneous label all pixels in the image segmentation samples of the image segmentation sample set that are not labeled with classification labels.
  • the heterogeneous label may be a label that does not exist in all classification labels corresponding to the image segmentation sample set and has nothing to do with the classification of the sample image.
  • heterogeneous labels can be used to identify image backgrounds.
  • all classification labels corresponding to the sample images in the image classification sample set can be obtained and parsed to determine a label that is different from all of them as the heterogeneous label. The pixels of each image segmentation sample in the image segmentation sample set other than the forward label pixels (that is, the pixels that are not labeled with a classification label) are then labeled with the heterogeneous label.
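  • A small sketch of choosing a heterogeneous label and applying it to unlabeled pixels. The candidate name `"background"` and the numeric-suffix scheme for avoiding collisions are assumptions made for illustration; the only requirement taken from the text is that the heterogeneous label differs from every classification label.

```python
from typing import List, Optional, Set


def pick_heterogeneous_label(class_labels: Set[str],
                             candidate: str = "background") -> str:
    """Return a label guaranteed not to collide with any classification label."""
    label = candidate
    suffix = 0
    while label in class_labels:
        suffix += 1
        label = f"{candidate}_{suffix}"
    return label


def fill_unlabeled(label_map: List[List[Optional[str]]],
                   heterogeneous_label: str) -> List[List[str]]:
    """Label every pixel that has no classification label with the heterogeneous label."""
    return [[heterogeneous_label if cell is None else cell for cell in row]
            for row in label_map]


class_labels = {"zebra", "apple", "background"}
hetero = pick_heterogeneous_label(class_labels)
filled = fill_unlabeled([["zebra", None], [None, "zebra"]], hetero)
```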
  • the image segmentation task scene may be the scene to which the picture for image segmentation belongs.
  • the standard image segmentation sample set may be an image segmentation sample set matching the image segmentation task scenario.
  • the image segmentation task scene can be determined first, and an image classification sample set matching the image segmentation task scene can then be obtained. According to the method for generating image segmentation samples in any of the above embodiments, a standard image segmentation sample set matching the image segmentation task scene can be generated.
  • S450 Use the standard image segmentation sample set to fine-tune the pre-trained image segmentation model to obtain a target image segmentation model that matches the image segmentation task scenario.
  • the target image segmentation model may be a model obtained by fine-tuning a pre-trained image segmentation model using a standard image segmentation sample set.
  • The pre-trained image segmentation model can be trained using a standard image segmentation sample set, thereby adjusting its model parameters to obtain a target image segmentation model that matches the image segmentation task scenario; that is, the target image segmentation model can perform high-precision image segmentation on images that match the image segmentation task scene.
  • Training the pre-trained image segmentation model on the standard image segmentation sample set gives it stronger image segmentation capabilities for images in specific image segmentation task scenarios while requiring only fine-tuning of the model parameters.
  • Image segmentation task scenarios may include at least one of the following: driving scenarios, medical imaging scenarios, robot perception scenarios, and remote sensing satellite image segmentation scenarios.
  • The image segmentation sample sets in different image segmentation task scenarios have image characteristics unique to each scenario, so using the image segmentation sample set of a given scenario to train the pre-trained image segmentation model gives that model stronger image segmentation capabilities for the specific image segmentation task scenario.
  • In the technical solution of the embodiment of the present disclosure, an image classification sample set is obtained, and the image segmentation sample generation method is used to process each image classification sample in it to generate the corresponding image segmentation sample set. All model parameters included in the preset machine learning model are then trained according to the image segmentation sample set to obtain a pre-trained image segmentation model, and a standard image segmentation sample set that matches the image segmentation task scenario is obtained.
  • Because an interpretable algorithm is used to obtain the pixels in the sample image that have an important impact on image classification, corresponding classification labels can be added simply, conveniently and accurately to those pixels to form image segmentation samples.
  • A large number of image segmentation samples can thus be obtained at very small cost, which improves the generation efficiency of image segmentation samples, improves the pre-training performance of the image segmentation model to a certain extent, and gives the pre-trained image segmentation model stronger image segmentation capabilities for images in specific image segmentation task scenarios with only fine-tuning of the model parameters.
  • Figure 4b is a flow chart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure.
  • the method includes the following operations:
  • S4160 Use the standard image segmentation sample set to fine-tune the pre-trained image segmentation model to obtain a target image segmentation model that matches the image segmentation task scenario.
  • In the technical solution of the embodiment of the present disclosure, an image classification sample set is obtained, and a heterogeneous label that is different from all classification labels is selected based on all classification labels corresponding to the image classification sample set, so that all pixels in the image segmentation samples of the image segmentation sample set that are not labeled with classification labels are labeled with the heterogeneous label.
  • The image segmentation sample generation method is used to process each image classification sample in the image classification sample set to generate the corresponding image segmentation sample set, and all model parameters included in the preset machine learning model are trained according to the image segmentation sample set to obtain a pre-trained image segmentation model.
  • Because an interpretable algorithm is used to obtain the pixels in the sample image that have an important impact on image classification, corresponding classification labels can be added simply, conveniently and accurately to those pixels to form image segmentation samples.
  • A large number of image segmentation samples can thus be obtained at very small cost, which improves the generation efficiency of image segmentation samples to a certain extent.
  • the complete training process of the image segmentation model in the embodiment of the present disclosure can be divided into two parts: pre-training and downstream task fine-tuning.
  • the pre-training of traditional image segmentation models is only performed on the image classification sample set, and only a part of the backbone of the model is trained.
  • Downstream task fine-tuning uses the pre-trained model to perform fine-tuning training on a specific image segmentation sample set (a standard image segmentation sample set that matches the image segmentation task scenario) to solve the image segmentation task in a specific image segmentation task scenario.
  • The complete training process of the image segmentation model is as follows:
  • Prepare an image classification sample set (such as ImageNet), an interpretability algorithm (such as an algorithm based on input gradients), and multiple trained image classification models.
  • Through the interpretable algorithm, the important pixels (forward label pixels) in the sample image input to the image classification model can be determined, and these important pixels are relatively consistent with the labels of image segmentation.
  • The interpretable algorithm acts on the image classification model: for the three primary color channels of each sample image, the gradient is calculated and its modulus is taken to obtain the weight value of each pixel.
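As a minimal sketch of the per-pixel weight computation, the example below assumes the input gradients have already been computed by some interpretability method and are given as one `(r, g, b)` tuple per pixel. Reading "modulus" as the Euclidean norm over the three channels is an assumption, and the function name is hypothetical.

```python
import math


def pixel_weights_from_gradients(grad):
    """grad[y][x] is an (r, g, b) tuple of input gradients.

    Returns a 2D map with the Euclidean norm of each pixel's gradient,
    used here as that pixel's weight value (an assumed reading of "modulus").
    """
    return [[math.sqrt(sum(c * c for c in px)) for px in row] for row in grad]


# Hypothetical 2x2 gradient map.
grad = [
    [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)],
    [(1.0, 2.0, 2.0), (0.0, -5.0, 12.0)],
]
weights = pixel_weights_from_gradients(grad)
```

Taking a norm discards the sign of the gradient, so a pixel counts as important whether changing it raises or lowers the classification score.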
  • The single model weight values obtained with the interpretable algorithm under multiple image classification models can be combined by a weighted average to obtain the average result, that is, the final weight value of the pixel, which removes a large amount of noise.
  • the logic diagram of pixel weight value denoising can be seen in Figure 5.
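The multi-model averaging step can be sketched as below. This is an illustrative example only, assuming each model yields a weight map of the same size; the optional `model_weights` parameter and the function name are hypothetical, and equal weights reduce the computation to a plain mean.

```python
def average_weight_maps(single_model_maps, model_weights=None):
    """Weighted average of K per-model weight maps at each pixel position."""
    k = len(single_model_maps)
    if model_weights is None:
        model_weights = [1.0] * k  # equal weights give a plain mean
    total = sum(model_weights)
    height = len(single_model_maps[0])
    width = len(single_model_maps[0][0])
    return [
        [
            sum(w * m[y][x] for w, m in zip(model_weights, single_model_maps)) / total
            for x in range(width)
        ]
        for y in range(height)
    ]


# Hypothetical weight maps from two image classification models.
maps = [
    [[1.0, 0.0], [0.0, 4.0]],
    [[3.0, 0.0], [2.0, 4.0]],
]
averaged = average_weight_maps(maps)
```

Pixels that only one model considers important are damped by the average, while pixels that all models agree on keep a high weight, which is the denoising effect the text describes.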
  • The average result is binarized, and the binarized result is used as an image segmentation pseudo-label; Figure 6 shows the result before binarization and Figure 7 the result after. The top 10% of pixels by weight value are selected as positive label pixels, and the remaining pixels (negative pixels) are labeled with heterogeneous labels.
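A minimal sketch of the binarization step follows, assuming the pixel count is obtained by flooring `total_pixels * ratio` and that ties are broken by pixel position; both rules, and the function name, are assumptions made for the example.

```python
def binarize_top_fraction(weight_map, ratio=0.10):
    """Mark the highest-weight pixels with 1 and all other pixels with 0."""
    flat = [(w, y, x) for y, row in enumerate(weight_map) for x, w in enumerate(row)]
    # Number of pixels to select: total pixel count times the preset ratio (floored).
    count = int(len(flat) * ratio)
    # Sort by weight, largest first; position breaks ties deterministically.
    flat.sort(key=lambda t: (-t[0], t[1], t[2]))
    mask = [[0] * len(row) for row in weight_map]
    for _, y, x in flat[:count]:
        mask[y][x] = 1
    return mask


# Hypothetical 2x5 averaged weight map (10 pixels, so 10% selects one pixel).
weight_map = [
    [0.1, 0.9, 0.2, 0.3, 0.05],
    [0.4, 0.8, 0.0, 0.6, 0.7],
]
mask = binarize_top_fraction(weight_map)
```

Selecting by rank rather than by a fixed threshold keeps the share of positive pixels constant across images, whatever the scale of the weight values.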
  • Pre-training of traditional image classification models uses the classification labels of sample images as supervisory information to train only part of the model parameters (i.e., the parameters of the backbone part of the image segmentation model architecture).
  • the pre-training method of the image segmentation model proposed in the embodiment of the present disclosure uses the image segmentation sample set as supervision information to train the entire image segmentation model.
  • The image segmentation labels in the image segmentation sample set come from two parts: image classification labels and binarized image segmentation pseudo-labels. All positive label pixels are assigned the classification label of the sample image, and all negative pixels are assigned a background category (the heterogeneous label). Because the image segmentation sample set is built from the image classification sample set, each image has both a corresponding classification label and a corresponding binarized image segmentation pseudo-label.
  • The calculation logic of (1)-(4) is: Input: an image classification sample set D, K deep image classification models f_k, and an interpretable algorithm A. S1. Based on A, calculate multiple single model weight values for each pixel in each sample image I_i in D. S2. For each sample image I_i in D, calculate the mean of the single model weight values of the K deep image classification models f_k to obtain the weight value of each pixel. S3.
  • The calculation logic of (5)-(7) is: Input: an image classification sample set D (with Nc classification label categories), an image segmentation sample set P corresponding to D, a preset machine learning model f, and an image segmentation task H. S1. For each sample image I_i in D, whose classification label is d_i, set the image segmentation label of each pixel corresponding to 1 in the binarization result to d_i, and set the image segmentation label of each pixel corresponding to 0 to the heterogeneous label. S2. Using the labels from S1 and a conventional deep learning optimization algorithm, train the preset machine learning model f to obtain the pre-trained image segmentation model f'. S3. Fine-tune f' obtained in S2 on the image segmentation task H. Output: the target image segmentation model trained on H.
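Step S1 of the logic above can be illustrated with a short sketch. It assumes classification labels are the integers 0 to Nc-1 and uses index Nc as the heterogeneous label, so the segmentation model would predict Nc+1 categories; this indexing, the value of Nc, and the function name are assumptions for the example.

```python
def build_segmentation_labels(binary_mask, class_label, heterogeneous_label):
    """Pixels marked 1 take the image's classification label d_i;
    pixels marked 0 take the heterogeneous (background) label."""
    return [
        [class_label if bit == 1 else heterogeneous_label for bit in row]
        for row in binary_mask
    ]


num_classes = 1000            # Nc, hypothetical value
heterogeneous = num_classes   # assumed background index, one past the last class
binary_mask = [[0, 1], [1, 1]]
d_i = 7                       # classification label of this sample image
seg_labels = build_segmentation_labels(binary_mask, d_i, heterogeneous)
```

The resulting label map has the same height and width as the sample image and can be used as the per-pixel supervision target in S2.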
  • Figure 8 is a schematic diagram of a device for generating image segmentation samples provided by the present disclosure.
  • The device for generating image segmentation samples includes a classification sample acquisition module 510, a weight value determination module 520, a forward label pixel filtering module 530, and an image segmentation sample generation module 540, wherein:
  • the classification sample acquisition module 510 is configured to obtain image classification samples, where the image classification samples include sample images and classification labels of the sample images; the weight value determination module 520 is configured to determine, through an interpretable algorithm, the weight value of each pixel of the sample image under the action of the image classification model; the forward label pixel filtering module 530 is configured to select the forward label pixels in the sample image according to the weight value of each pixel; and the image segmentation sample generation module 540 is configured to form, according to the forward label pixels and classification labels of the sample image, an image segmentation sample corresponding to the image classification sample.
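The way the four modules hand data to one another can be sketched as a small pipeline. This is an illustrative structure only: the class name, field names, and the stand-in callables are hypothetical, and pixels that are not forward label pixels are left as `None` because this device does not itself assign a heterogeneous label.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SegmentationSampleGenerator:
    # Stand-in for the weight value determination module (520).
    determine_weights: Callable[[list], list]
    # Stand-in for the forward label pixel filtering module (530).
    select_forward_pixels: Callable[[list], list]

    def generate(self, sample_image, classification_label):
        """Form an image segmentation sample (the role of module 540)."""
        weights = self.determine_weights(sample_image)
        mask = self.select_forward_pixels(weights)
        return [
            [classification_label if bit else None for bit in row]
            for row in mask
        ]


# Toy stand-ins: pixel intensity is used as the weight, and a fixed threshold
# replaces ratio-based selection, purely to keep the example short.
generator = SegmentationSampleGenerator(
    determine_weights=lambda image: image,
    select_forward_pixels=lambda w: [[1 if v > 0.5 else 0 for v in row] for row in w],
)
sample = generator.generate([[0.9, 0.1], [0.6, 0.2]], classification_label="cat")
```

Passing the weight computation and the pixel selection in as callables mirrors the modular split in the text, so either step can be swapped without touching the sample-forming logic.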
  • In the technical solution of the embodiment of the present disclosure, an image classification sample including a sample image and a classification label of the sample image is obtained; the weight value of each pixel of the sample image under the action of the image classification model is determined through an interpretable algorithm; the forward label pixels are selected in the sample image according to the weight value of each pixel; and an image segmentation sample corresponding to the image classification sample is formed according to the forward label pixels and classification labels of the sample image.
  • Because the interpretable algorithm obtains the pixels in the sample image that play an important role in image classification, corresponding classification labels can be added simply, conveniently and accurately to those pixels to form an image segmentation sample.
  • A large number of image segmentation samples can thus be obtained at very small cost, which improves the generation efficiency of image segmentation samples and improves the pre-training performance of the image segmentation model to a certain extent.
  • The weight value determination module 520 is configured to input the sample image and the model parameters of the image classification model into an algorithm model that matches the interpretable algorithm, and obtain the weight value of each pixel in the sample image, wherein the weight value is used to measure the importance of each pixel in the classification of the sample image by the image classification model.
  • The weight value determination module 520 is configured to: obtain multiple image classification models; input the sample image and the model parameters of each image classification model into the algorithm model respectively, to obtain the single model weight value of each pixel of the sample image under the action of each image classification model; and compute a weighted average of the multiple single model weight values at the same pixel position in the sample image to obtain the weight value of that pixel.
  • The forward label pixel filtering module 530 is configured to determine the number of pixels to be selected based on the total number of pixels in the sample image and the preset selection ratio, and, in descending order of weight value, to select that number of pixels as the forward label pixels.
  • the image segmentation sample generation module 540 is configured to use the classification label to label each forward label pixel in the sample image to form the image segmentation sample.
  • the above-mentioned device for generating image segmentation samples can execute the method for generating image segmentation samples provided by any embodiment of the present disclosure, and has corresponding functional modules and effects for executing the method for generating image segmentation samples.
  • Figure 9 is a schematic diagram of a pre-training device for an image segmentation model provided by the present disclosure.
  • The pre-training device for the image segmentation model includes a sample set acquisition module 610, an image segmentation sample set generation module 620 and a pre-trained image segmentation model acquisition module 630, wherein:
  • the sample set acquisition module 610 is configured to obtain an image classification sample set; the image segmentation sample set generation module 620 is configured to use the image segmentation sample generation method in any of the above embodiments to process each image classification sample in the image classification sample set, and generate an image segmentation sample set corresponding to the image classification sample set; and the pre-trained image segmentation model acquisition module 630 is configured to train all model parameters included in the preset machine learning model according to the image segmentation sample set to obtain a pre-trained image segmentation model.
  • In the technical solution of the embodiment of the present disclosure, an image classification sample set is obtained, the image segmentation sample generation method is used to process each image classification sample in it to generate the corresponding image segmentation sample set, and all model parameters included in the preset machine learning model are trained based on the image segmentation sample set to obtain a pre-trained image segmentation model.
  • Based on the image segmentation sample generation method in any of the above embodiments, embodiments of the present disclosure can use an interpretable algorithm to obtain the pixels in the sample image that have an important impact on image classification. Corresponding classification labels can therefore be added simply, conveniently and accurately to those pixels to form image segmentation samples, thereby obtaining an image segmentation sample set.
  • A large number of image segmentation samples can thus be obtained at very small cost, which improves the generation efficiency of image segmentation samples, improves the pre-training performance of the image segmentation model, and improves the training effect of the pre-trained model.
  • Figure 10 is a schematic diagram of another pre-training device for the image segmentation model provided by the present disclosure.
  • The pre-training device for the image segmentation model also includes a heterogeneous label labeling module 640.
  • The heterogeneous label labeling module 640 is configured to select, based on all classification labels corresponding to the image classification sample set, a heterogeneous label that is different from all of those classification labels, and to label with the heterogeneous label all pixels in the image segmentation samples of the image segmentation sample set that are not labeled with classification labels.
  • Figure 11 is a schematic diagram of another pre-training device for an image segmentation model provided by the present disclosure.
  • the pre-training device for the image segmentation model also includes a target image segmentation model 650.
  • The target image segmentation model 650 is configured to obtain a standard image segmentation sample set matching the image segmentation task scene, and to use the standard image segmentation sample set to fine-tune the pre-trained image segmentation model to obtain a target image segmentation model that matches the image segmentation task scenario.
  • the image segmentation task scene includes at least one of the following: driving scene, medical imaging scene, robot perception scene, and remote sensing satellite image segmentation scene.
  • the above-mentioned image segmentation model pre-training device can execute the image segmentation model pre-training method provided by any embodiment of the present disclosure, and has corresponding functional modules and effects for executing the image segmentation model pre-training method.
  • the present disclosure also provides an electronic device, a computer-readable storage medium, and a computer program product to implement the methods in the above-mentioned embodiments.
  • FIG 12 is a schematic block diagram of an electronic device provided by an embodiment of the present disclosure.
  • Electronic device 10 is intended to represent many forms of digital computers, including desktop computers, workstations, personal digital assistants, servers, mainframe computers, and other suitable computers.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a read-only memory (Read-Only Memory, ROM) 12, a random access memory (Random Access Memory, RAM) 13, etc., wherein the memory stores a computer program that can be executed by at least one processor.
  • The processor 11 can execute a variety of appropriate actions and processes according to the computer program stored in the ROM 12 or loaded from the storage unit 18 into the RAM 13.
  • Various programs and data required for the operation of the electronic device 10 can also be stored in the RAM 13.
  • the processor 11, the ROM 12 and the RAM 13 are connected to each other via the bus 14.
  • An input/output (I/O) interface 15 is also connected to the bus 14 .
  • Multiple components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard, a mouse, etc.; an output unit 17, such as various types of displays, speakers, etc.; a storage unit 18, such as a magnetic disk, an optical disk, etc.; and a communication unit 19, such as a network card, a modem, a wireless communication transceiver, etc.
  • the communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunications networks.
  • Processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the processor 11 include a central processing unit (CPU), a graphics processing unit (GPU), a variety of dedicated artificial intelligence (Artificial Intelligence, AI) computing chips, a variety of processors running machine learning model algorithms, a digital signal processor (Digital Signal Processor, DSP), and any appropriate processor, controller, microcontroller, etc.
  • The processor 11 performs the multiple methods and processes described above, such as the method for generating image segmentation samples given in any embodiment, or the method for pre-training the image segmentation model.
  • The method for generating image segmentation samples, or the method for pre-training the image segmentation model, can be implemented as a computer program that is tangibly included in a computer-readable storage medium, such as the storage unit 18.
  • Part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19.
  • Alternatively, the processor 11 may be configured in another suitable manner (e.g., by means of firmware) to perform the method of generating image segmentation samples, or the method of pre-training an image segmentation model.
  • Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Parts (ASSP), System on Chip (SOC), Complex Programmable Logic Device (CPLD), computer hardware, firmware, software, and/or combinations thereof.
  • Various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • Examples of machine-readable storage media include one or more wire-based electrical connections, portable computer disks, hard drives, RAM, ROM, Erasable Programmable Read-Only Memory (EPROM) or flash memory, optical fiber, portable compact disk read-only memory (Compact Disc Read-Only Memory, CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • The systems and techniques described herein may be implemented on a computer having: a display device (e.g., a cathode ray tube (CRT) or liquid crystal display (Liquid Crystal Display, LCD) monitor) configured to display information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball).
  • The feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
  • The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and technologies described herein), or a computing system that includes any combination of such back-end components, middleware components, or front-end components.
  • The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include: Local Area Network (LAN), Wide Area Network (WAN), and the Internet.
  • Computer systems may include clients and servers. Clients and servers are generally remote from each other and typically interact over a communications network. The relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • The server can be a cloud server, also known as a cloud computing server or cloud host. It is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical host and virtual private server (VPS) services.
  • the server can also be a distributed system server or a server combined with a blockchain.
  • Artificial intelligence is the study of using computers to simulate some human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, planning, etc.). It involves both hardware-level and software-level technology. Artificial intelligence hardware technology generally includes sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing and other technologies; artificial intelligence software technology mainly includes computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning technology, big data processing technology, knowledge graph technology and other major directions.
  • Cloud computing refers to accessing a flexible and scalable pool of shared physical or virtual resources through a network.
  • Resources can include servers, operating systems, networks, software, applications, storage devices, etc., and can be provided on demand and through self-service.
  • Using the various forms of the process shown above, operations can be reordered, added, or deleted.
  • For example, multiple operations recorded in this disclosure can be performed in parallel, sequentially, or in different orders; as long as the desired results of the technical solution provided by this disclosure can be achieved, there is no limitation here.

Abstract

Provided in the present disclosure are an image segmentation sample generation method and apparatus, a method and apparatus for pre-training an image segmentation model, and a device and a medium. The image segmentation sample generation method comprises: acquiring an image classification sample, wherein the image classification sample comprises a sample image and a classification label of the sample image; determining, by means of an interpretable algorithm, a weight value of each pixel point of the sample image under the action of an image classification model; selecting a forward label pixel point from the sample image according to the weight value of each pixel point; and forming, according to the forward label pixel point and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.

Description

Image segmentation sample generation method and device, image segmentation model pre-training method and device, equipment, and medium
This application claims priority to Chinese patent application No. 202210567293.6, filed with the China Patent Office on May 23, 2022, the entire content of which is incorporated into this application by reference.
Technical field
The present disclosure relates to the fields of computer vision and image segmentation, for example, to a method and device for generating image segmentation samples, a method and device for pre-training an image segmentation model, equipment, and a medium.
Background
Image semantic segmentation is a traditional task in the field of computer vision that aims to identify and classify every pixel in an image. Image segmentation technology can be applied to a variety of scenarios, such as driving scene understanding and medical image analysis, and mainly achieves semantic segmentation of images by training an image segmentation model.
Theoretically, an image segmentation model can be obtained by training on a large number of image segmentation samples. However, because labeling image segmentation samples is difficult, costly and time-consuming, related technologies mainly use existing image classification samples to pre-train the backbone parameters of a deep learning model, and then use image segmentation samples to further adjust the deep learning model to finally obtain the desired image segmentation model.
However, since the above pre-training process can only pre-train the backbone parameters of the model and cannot effectively pre-train all model parameters, the performance of the entire pre-training process is poor.
Summary of the invention
The present disclosure provides a method and device for generating image segmentation samples, a method and device for pre-training an image segmentation model, equipment, and a medium.
According to one aspect of the present disclosure, a method for generating image segmentation samples is provided, including:
obtaining an image classification sample, where the image classification sample includes a sample image and a classification label of the sample image;
determining, through an interpretable algorithm, the weight value of each pixel of the sample image under the action of an image classification model;
selecting forward label pixels in the sample image according to the weight value of each pixel;
forming, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
According to another aspect of the present disclosure, a pre-training method for an image segmentation model is provided, including:
obtaining an image classification sample set;
processing each image classification sample in the image classification sample set using the above method for generating image segmentation samples, to generate an image segmentation sample set corresponding to the image classification sample set;
training, according to the image segmentation sample set, all model parameters included in a preset machine learning model to obtain a pre-trained image segmentation model.
According to another aspect of the present disclosure, an apparatus for generating an image segmentation sample is provided, including:
a classification sample acquisition module, configured to obtain an image classification sample, where the image classification sample includes a sample image and a classification label of the sample image;
a weight value determination module, configured to determine, by means of an interpretable algorithm, a weight value of each pixel of the sample image under the action of an image classification model;
a forward label pixel screening module, configured to select forward label pixels in the sample image according to the weight value of each pixel; and
an image segmentation sample generation module, configured to form, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
According to another aspect of the present disclosure, an apparatus for pre-training an image segmentation model is provided, including:
a sample set acquisition module, configured to obtain an image classification sample set;
an image segmentation sample set generation module, configured to process each image classification sample in the image classification sample set by using the above method for generating an image segmentation sample, to generate an image segmentation sample set corresponding to the image classification sample set; and
a pre-trained image segmentation model acquisition module, configured to train, according to the image segmentation sample set, all model parameters included in a preset machine learning model, to obtain a pre-trained image segmentation model.
According to another aspect of the present disclosure, an electronic device is provided, including:
one or more processors; and
a storage apparatus, configured to store one or more programs;
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the above method for generating an image segmentation sample or the above method for pre-training an image segmentation model.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions are used to cause a computer to execute the above method for generating an image segmentation sample or the above method for pre-training an image segmentation model.
Brief Description of the Drawings
Figure 1 is a flowchart of a method for generating an image segmentation sample provided by an embodiment of the present disclosure;
Figure 2a is a flowchart of another method for generating an image segmentation sample provided by an embodiment of the present disclosure;
Figure 2b is a flowchart of another method for generating an image segmentation sample provided by an embodiment of the present disclosure;
Figure 3 is a flowchart of a method for pre-training an image segmentation model provided by an embodiment of the present disclosure;
Figure 4a is a flowchart of another method for pre-training an image segmentation model provided by an embodiment of the present disclosure;
Figure 4b is a flowchart of another method for pre-training an image segmentation model provided by an embodiment of the present disclosure;
Figure 5 is a logical schematic diagram of denoising pixel weight values provided by an embodiment of the present disclosure;
Figure 6 is a schematic diagram of an averaged result before binarization provided by an embodiment of the present disclosure;
Figure 7 is a schematic diagram of an averaged result after binarization provided by an embodiment of the present disclosure;
Figure 8 is a schematic diagram of an apparatus for generating an image segmentation sample provided by the present disclosure;
Figure 9 is a schematic diagram of an apparatus for pre-training an image segmentation model provided by the present disclosure;
Figure 10 is a schematic diagram of another apparatus for pre-training an image segmentation model provided by the present disclosure;
Figure 11 is a schematic diagram of another apparatus for pre-training an image segmentation model provided by the present disclosure;
Figure 12 is a schematic block diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. For clarity and conciseness, descriptions of well-known functions and structures, and of functions and structures of low relevance to the following embodiments, are omitted.
In one example, Figure 1 is a flowchart of a method for generating an image segmentation sample provided by an embodiment of the present disclosure. This embodiment is applicable to adding classification labels to pixels in a sample image. The method may be performed by an apparatus for generating image segmentation samples, which may be implemented in at least one of software and hardware and may generally be integrated in an electronic device. Accordingly, as shown in Figure 1, the method includes the following operations:
S110. Obtain an image classification sample.
An image classification sample can be understood as an image annotated with a classification label. Image classification samples are generally used as training samples for training an image classification model.
An image classification sample may include a sample image and a classification label of the sample image. The sample image may be an image of any scene. The classification label characterizes the category to which the sample image belongs. For example, the classification label of a sample image containing a zebra may be "zebra" or "animal", and the classification label of a sample image containing an apple may be "apple" or "fruit". The embodiments of the present disclosure do not limit the sample image or its classification label.
In the embodiments of the present disclosure, a sample image of any scene may be obtained and the classification label corresponding to the sample image determined, so that the sample image and its matching classification label together serve as an image classification sample.
In general, a large number of public image classification data sets covering many scenes are available on the Internet, and the required image classification samples can be obtained from these data sets.
S120. Determine, by means of an interpretable algorithm, the weight value of each pixel of the sample image under the action of an image classification model.
An interpretable algorithm may be an algorithm that makes a model's prediction results interpretable, and is used to establish an association between the model's input data and the model's prediction process. The image classification model may be any machine learning model capable of identifying the category to which an image belongs, and may be implemented based on, for example, a convolutional neural network (CNN) or a recurrent neural network (RNN).
In the embodiments of the present disclosure, an interpretable algorithm may be selected from known interpretable algorithms, for example a gradient-based interpretable algorithm or an integrated-gradients-based interpretable algorithm. After the sample image is input into the image classification model, the interpretable algorithm can then be used to determine the degree to which each pixel in the sample image contributes when the image classification model determines the category of the sample image, and this degree is quantified as the weight value of the pixel.
The higher the weight value of a pixel in the sample image, the more the image classification model relies on that pixel when determining the category (that is, the classification label) of the sample image, and therefore the greater the probability that the pixel belongs to the object corresponding to the classification label.
For example, in a sample image containing a zebra, the higher the weight value of a pixel, the greater the probability that the pixel is one of the pixels forming the zebra in the sample image.
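As a rough illustration of a gradient-based interpretable algorithm, the following sketch approximates the magnitude of the gradient of a class score with respect to each pixel by central finite differences and uses it as the pixel's weight value. The names `pixel_weights`, `zebra_score`, and `COEFFS` are hypothetical, and the linear score function is only a stand-in for a trained image classification model; the disclosure does not prescribe this particular computation.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def pixel_weights(score_fn: Callable[[Image], float], image: Image,
                  eps: float = 1e-4) -> Image:
    """Approximate |d score / d pixel| for every pixel by central differences."""
    weights = []
    for r, row in enumerate(image):
        weight_row = []
        for c, value in enumerate(row):
            # Copy the image twice and perturb a single pixel in each copy.
            plus = [list(x) for x in image]
            minus = [list(x) for x in image]
            plus[r][c] = value + eps
            minus[r][c] = value - eps
            grad = (score_fn(plus) - score_fn(minus)) / (2 * eps)
            weight_row.append(abs(grad))
        weights.append(weight_row)
    return weights


# Hypothetical stand-in classifier: a linear score for the labelled class.
COEFFS = [[0.0, 2.0], [-1.0, 0.5]]


def zebra_score(img: Image) -> float:
    return sum(COEFFS[r][c] * img[r][c] for r in range(2) for c in range(2))


weights = pixel_weights(zebra_score, [[0.1, 0.9], [0.4, 0.3]])
```

For this linear stand-in the weight of each pixel is simply the absolute value of its coefficient, so the pixel at row 0, column 1 receives the largest weight. With a real network, automatic differentiation would normally replace the finite-difference loop.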
S130. Select forward label pixels in the sample image according to the weight value of each pixel.
The forward label pixels may be a subset of pixels determined from all pixels of the sample image according to the magnitude of their weight values.
In the embodiments of the present disclosure, the pixels in the sample image may be screened according to a preset pixel screening condition and the weight value of each pixel, to obtain the forward label pixels in the sample image, that is, the pixels that determine the label of the sample image. In this way, the pixels that have an important influence on image classification are obtained from the sample image by means of the interpretable algorithm.
The pixel screening condition may specify the number of pixels to be selected, or may require that the weight value of a pixel be greater than a set threshold, among others.
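The threshold form of the screening condition could be sketched as follows. The function name `select_by_threshold` and the example threshold of 0.35 are illustrative assumptions, not values taken from the disclosure.

```python
from typing import List, Tuple


def select_by_threshold(weights: List[List[float]],
                        threshold: float) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose weight value exceeds the threshold."""
    return [(r, c)
            for r, row in enumerate(weights)
            for c, w in enumerate(row)
            if w > threshold]


# Hypothetical 2x2 weight map and threshold.
selected = select_by_threshold([[0.1, 0.9], [0.4, 0.3]], threshold=0.35)
```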
S140. Form, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
An image segmentation sample can be understood as an image annotated with a segmentation label, and a segmentation label can be understood as a classification label that marks, pixel by pixel, the segmentation object in the image (for example, the zebra in the example image above). Image segmentation samples are generally used as training samples for training an image segmentation model. The trained image segmentation model has image segmentation capability, that is, it can identify objects with different attributes (classification labels) in an image.
In the embodiments of the present disclosure, the forward label pixels and the classification label matching them may be used as the image segmentation sample corresponding to the image classification sample. An image segmentation model can then be trained on such image segmentation samples and used to segment input images. Image segmentation samples are thus generated simply and conveniently on the basis of image classification samples.
In the technical solution of the embodiments of the present disclosure, an image classification sample including a sample image and a classification label of the sample image is obtained; the weight value of each pixel of the sample image under the action of an image classification model is determined by means of an interpretable algorithm; forward label pixels are selected in the sample image according to the weight value of each pixel; and an image segmentation sample corresponding to the image classification sample is formed according to the forward label pixels and the classification label. Because the interpretable algorithm identifies the pixels that have an important influence on image classification, the corresponding classification label can be added to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, improves the pre-training performance of the image segmentation model.
In one example, Figure 2a is a flowchart of a method for generating an image segmentation sample provided by an embodiment of the present disclosure. This embodiment gives one implementation of determining, by means of an interpretable algorithm, the weight value of each pixel of the sample image under the action of an image classification model. Accordingly, as shown in Figure 2a, the method includes the following operations:
S210. Obtain an image classification sample.
The image classification sample includes a sample image and a classification label of the sample image.
S220. Input the sample image and the model parameters of the image classification model into an algorithm model matching the interpretable algorithm, and obtain the weight value of each pixel in the sample image.
The weight value measures the importance of a pixel in the image classification model's classification of the sample image. The model parameters may be configuration parameters of the image classification model, for example weight coefficients.
In the embodiments of the present disclosure, after the sample image is obtained, the model parameters of the image classification model may be determined and an interpretable algorithm may be selected from known interpretable algorithms. The sample image and the model parameters of the image classification model are then input into the algorithm model matching the selected interpretable algorithm, which calculates the importance of each pixel in the image classification model's classification of the sample image; this importance is quantified to obtain the weight value of the pixel. In other words, the algorithm model of the interpretable algorithm makes the operation of the image classification model interpretable and gives a quantified value of the degree to which the image classification model adjusts its model parameters according to the pixels in the sample image, which makes it easy to quickly locate the pixels that are important in classifying the sample image.
In one embodiment of the present disclosure, inputting the sample image and the model parameters of the image classification model into the algorithm model matching the interpretable algorithm and obtaining the weight value of each pixel in the sample image may include: obtaining multiple image classification models; inputting the sample image and the model parameters of each image classification model separately into the algorithm model, and obtaining a single-model weight value of each pixel of the sample image under the action of each image classification model; and computing a weighted average of the multiple single-model weight values at the same pixel position in the sample image to obtain the weight value of that pixel.
A single-model weight value may be the weight value of a pixel determined by the algorithm model of the interpretable algorithm from a single image classification model and the sample image.
In the embodiments of the present disclosure, multiple trained image classification models may be obtained, together with the model parameters of each. The sample image and the model parameters of the current image classification model are input into the algorithm model of the interpretable algorithm to obtain the single-model weight value of each pixel during the current image classification model's classification of the sample image. This is repeated for every image classification model. Once the single-model weight values of each pixel under every image classification model have been obtained, the multiple single-model weight values at the same pixel position, that is, those corresponding to the same pixel, are identified and combined by weighted averaging to obtain the weight value of that pixel in the sample image. Because a single image classification model may have poor accuracy, obtaining single-model weight values from multiple image classification models and taking their weighted average as the pixel's weight value avoids the large weight value errors that the limited accuracy of a single model could cause.
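The weighted averaging of single-model weight values might look like the sketch below. The function name `ensemble_weights` and the per-model coefficients `model_coeffs` are hypothetical; the disclosure does not state how the averaging coefficients are chosen, so equal coefficients are assumed by default.

```python
from typing import List, Optional

WeightMap = List[List[float]]


def ensemble_weights(per_model: List[WeightMap],
                     model_coeffs: Optional[List[float]] = None) -> WeightMap:
    """Weighted average of single-model weight maps at each pixel position."""
    if model_coeffs is None:
        # Assumption: every model contributes equally unless told otherwise.
        model_coeffs = [1.0] * len(per_model)
    total = sum(model_coeffs)
    rows, cols = len(per_model[0]), len(per_model[0][0])
    return [[sum(k * m[r][c] for k, m in zip(model_coeffs, per_model)) / total
             for c in range(cols)]
            for r in range(rows)]


# Two hypothetical single-model weight maps for a 1x2 image.
map_a = [[0.2, 0.8]]
map_b = [[0.6, 0.4]]
equal = ensemble_weights([map_a, map_b])
favour_a = ensemble_weights([map_a, map_b], model_coeffs=[3.0, 1.0])
```

With equal coefficients the two maps average to roughly 0.4 and 0.6; giving the first model three times the influence moves the result to roughly 0.3 and 0.7.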
S230. Select forward label pixels in the sample image according to the weight value of each pixel.
In one embodiment of the present disclosure, selecting forward label pixels in the sample image according to the weight value of each pixel may include: determining a pixel selection number according to the total number of pixels in the sample image and a preset selection ratio; and selecting, in descending order of weight value, forward label pixels matching the pixel selection number.
The total number of pixels is the number of pixels included in the image. The selection ratio may be preset and describes the proportion of pixels to be selected from the sample image. The pixel selection number is the number of pixels to be selected from the sample image, determined from the total number of pixels in the sample image and the selection ratio.
In the embodiments of the present disclosure, the total number of pixels of the sample image may first be determined and the preset selection ratio obtained; the pixel selection number is then determined as the product of the total number of pixels and the selection ratio. The weight values of the pixels in the sample image are sorted in descending order, and the pixels matching the pixel selection number are selected in that order and used as the forward label pixels. In actual image classification, the category of an image can be determined from its key pixels. Selecting the forward label pixels in descending order of weight value picks out the pixels that have the greatest influence on image classification and reduces the time needed for subsequent processing of the forward label pixels.
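The ratio-based selection can be sketched as follows. The name `select_forward_pixels` is hypothetical, and rounding the product down to an integer is an assumption, since the text does not say how a fractional pixel selection number is handled.

```python
from typing import List, Tuple


def select_forward_pixels(weights: List[List[float]],
                          ratio: float) -> List[Tuple[int, int]]:
    """Pick the top `ratio` share of pixels in descending order of weight."""
    flat = [(w, r, c)
            for r, row in enumerate(weights)
            for c, w in enumerate(row)]
    # Pixel selection number = total pixels x selection ratio (rounded down).
    count = int(len(flat) * ratio)
    flat.sort(key=lambda item: item[0], reverse=True)
    return [(r, c) for _, r, c in flat[:count]]


# Hypothetical 2x2 weight map; keep the top half of the pixels.
forward_pixels = select_forward_pixels([[0.1, 0.9], [0.4, 0.3]], ratio=0.5)
```

Here four pixels and a ratio of 0.5 give a pixel selection number of two, so the two highest-weight positions are returned.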
S240. Form, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
In one embodiment of the present disclosure, forming the image segmentation sample corresponding to the image classification sample according to the forward label pixels and the classification label of the sample image may include: annotating each forward label pixel in the sample image with the classification label to form the image segmentation sample.
In the embodiments of the present disclosure, the classification label of the sample image may be obtained, and each forward label pixel in the sample image may then be annotated with the classification label of the sample image to which it belongs, forming the image segmentation sample. Because the forward label pixels are the pixels with the greatest influence on image classification, using the labeled forward label pixels as the image segmentation sample minimizes the amount of training data while preserving the training effect, which speeds up model training.
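One possible way to represent the resulting image segmentation sample is a per-pixel label mask, as in the sketch below. The `IGNORE` marker for pixels that receive no label and the dictionary layout are assumptions made for illustration; the disclosure does not specify how unlabeled pixels are represented.

```python
from typing import Dict, List, Tuple

IGNORE = -1  # hypothetical marker for pixels that are not forward label pixels


def make_segmentation_sample(image: List[List[float]],
                             forward_pixels: List[Tuple[int, int]],
                             class_id: int) -> Dict[str, object]:
    """Annotate every forward label pixel with the image's classification label."""
    mask = [[IGNORE] * len(row) for row in image]
    for r, c in forward_pixels:
        mask[r][c] = class_id
    return {"image": image, "mask": mask}


# Hypothetical 2x2 sample image whose classification label has id 7.
sample = make_segmentation_sample([[0.1, 0.9], [0.4, 0.3]],
                                  [(0, 1), (1, 0)], class_id=7)
```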
In the technical solution of the embodiments of the present disclosure, an image classification sample is obtained; the sample image and the model parameters of the image classification model are input into an algorithm model matching the interpretable algorithm to obtain the weight value of each pixel in the sample image; forward label pixels are selected in the sample image according to the weight value of each pixel; and an image segmentation sample corresponding to the image classification sample is formed according to the forward label pixels and the classification label of the sample image. This makes it easy to quickly locate the pixels that are important in classifying the sample image and to generate the image segmentation sample from those key pixels and the classification label. Because the interpretable algorithm identifies the pixels that have an important influence on image classification, the corresponding classification label can be added to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, improves the pre-training performance of the image segmentation model.
In one example, Figure 2b is a flowchart of another method for generating an image segmentation sample provided by an embodiment of the present disclosure. As shown in Figure 2b, the method includes:
S2100. Obtain an image classification sample.
The image classification sample includes a sample image and a classification label of the sample image.
S2110. Input the sample image and the model parameters of the image classification model into an algorithm model matching the interpretable algorithm, and obtain the weight value of each pixel in the sample image.
In one embodiment of the present disclosure, S2110 may include:
S2111. Obtain multiple image classification models.
S2112. Input the sample image and the model parameters of each image classification model separately into the algorithm model, and obtain the single-model weight value of each pixel of the sample image under the action of each image classification model.
S2113. Compute a weighted average of the multiple single-model weight values at the same pixel position in the sample image to obtain the weight value of that pixel in the sample image.
S2120. Determine the pixel selection number according to the total number of pixels in the sample image and a preset selection ratio.
S2130. Select, in descending order of weight value, forward label pixels matching the pixel selection number.
S2140. Annotate each forward label pixel in the sample image with the classification label of the sample image to form an image segmentation sample.
In the technical solution of the embodiments of the present disclosure, an image classification sample and multiple image classification models are obtained; the sample image and the model parameters of each image classification model are input separately into the algorithm model to obtain the single-model weight value of each pixel under each image classification model; the multiple single-model weight values at the same pixel position are combined by weighted averaging to obtain the weight value of that pixel; the pixel selection number is determined according to the total number of pixels in the sample image and a preset selection ratio; forward label pixels matching the pixel selection number are selected in descending order of weight value; and an image segmentation sample corresponding to the image classification sample is formed according to the forward label pixels and the classification label of the sample image. Because the interpretable algorithm identifies the pixels that have an important influence on image classification, the corresponding classification label can be added to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, improves the pre-training performance of the image segmentation model.
在一个示例中,图3是本公开实施例提供的一种图像分割模型的预训练方法的流程图,本实施例可适用于对图像分割模型进行预训练的情况,该方法可以由图像分割模型的预训练装置来执行,该装置可以由软件和硬件中的至少一项的方式来实现,并一般可以集成在电子设备中。相应的,如图3所示,该方法包括如下操作:In one example, FIG. 3 is a flow chart of a pre-training method for an image segmentation model provided by an embodiment of the present disclosure. This embodiment can be applied to the case of pre-training an image segmentation model. The method can be composed of an image segmentation model. The pre-training device is implemented by at least one of software and hardware, and can generally be integrated into an electronic device. Correspondingly, as shown in Figure 3, the method includes the following operations:
S310、获取图像分类样本集。S310. Obtain the image classification sample set.
图像分类样本集可以是图像分类样本的集合。The image classification sample set may be a collection of image classification samples.
在本公开实施例中,可以获取任意场景下的多个图像分类样本,以得到包括多个图像分类样本的图像分类样本集合。In the embodiment of the present disclosure, multiple image classification samples in any scenario can be obtained to obtain an image classification sample set including multiple image classification samples.
S320. Process each image classification sample in the image classification sample set using the image segmentation sample generation method, and generate an image segmentation sample set corresponding to the image classification sample set.
The image segmentation sample set may be a set of image segmentation samples generated by the image segmentation sample generation method of any of the above embodiments.
In this embodiment of the present disclosure, the forward label pixels and the classification label of the sample image of each image classification sample in the image classification sample set may be determined using the image segmentation sample generation method of any of the above embodiments. An image segmentation sample is then formed for each image classification sample from the forward label pixels and the classification label of its sample image, which yields the image segmentation sample set corresponding to the image classification sample set.
S330. Train all model parameters of a preset machine learning model on the image segmentation sample set to obtain a pre-trained image segmentation model.
The pre-trained image segmentation model may be the model obtained by training the preset machine learning model on the image segmentation sample set.
In this embodiment of the present disclosure, a preset machine learning model may be obtained and all of its model parameters determined. All of those parameters are then trained on the image segmentation sample set, the trained machine learning model is taken as the pre-trained image segmentation model, and the pre-trained image segmentation model can then be used to segment input images.
In the technical solution of this embodiment of the present disclosure, an image classification sample set is obtained, each image classification sample in the set is processed using the image segmentation sample generation method to generate a corresponding image segmentation sample set, and all model parameters of a preset machine learning model are trained on the image segmentation sample set to obtain a pre-trained image segmentation model. Following the image segmentation sample generation method of any of the above embodiments, an interpretable algorithm identifies the pixels in each sample image that most influence image classification, so the corresponding classification label can be attached to those pixels simply, conveniently, and with high accuracy to form image segmentation samples and hence an image segmentation sample set. Large image segmentation sample sets can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples, improves the pre-training performance of the image segmentation model to a certain extent, and improves the training effect of the pre-trained model.
In one example, FIG. 4a is a flowchart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure. As shown in FIG. 4a, the method includes the following operations:
S410. Obtain an image classification sample set.
S420. Process each image classification sample in the image classification sample set using the image segmentation sample generation method, and generate an image segmentation sample set corresponding to the image classification sample set.
S430. Train all model parameters of a preset machine learning model on the image segmentation sample set to obtain a pre-trained image segmentation model.
In one embodiment of the present disclosure, before all model parameters of the preset machine learning model are trained on the image segmentation sample set to obtain the pre-trained image segmentation model, the method may further include: selecting, based on all classification labels corresponding to the image classification sample set, a heterogeneous label that differs from every one of those classification labels; and annotating with the heterogeneous label all pixels that carry no classification label in the image segmentation samples of the image segmentation sample set.
The heterogeneous label may be a label that does not occur among the classification labels corresponding to the image segmentation sample set and is unrelated to the classification of the sample image. For example, the heterogeneous label may be used to mark the image background.
In this embodiment of the present disclosure, all classification labels corresponding to the sample images in the image classification sample set may be obtained and parsed to determine a label that differs from all of them, which serves as the heterogeneous label. The pixels other than the forward label pixels in each image segmentation sample of the image segmentation sample set (that is, the at least one pixel that carries no classification label) are then annotated with the heterogeneous label. Training the machine learning model on image segmentation samples annotated with the heterogeneous label gives the model a better pre-training effect, so that the pre-trained image segmentation model can segment images accurately.
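The selection and annotation of the heterogeneous label described above can be sketched as follows. This is a minimal illustration, not the method as implemented in the disclosure: it assumes integer class labels, assumes unlabeled pixels are marked with a sentinel value of -1, and picks the smallest unused non-negative integer as the heterogeneous label. The function names `choose_heterogeneous_label` and `fill_unlabeled` are hypothetical.

```python
def choose_heterogeneous_label(class_labels):
    """Return an integer label that differs from every label in class_labels.

    Assumption: labels are integers; the smallest unused non-negative
    integer is chosen. Any other unused value would serve equally well.
    """
    used = set(class_labels)
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate


def fill_unlabeled(label_map, heterogeneous_label, unlabeled=-1):
    """Annotate every pixel without a classification label.

    label_map is a list of rows; pixels equal to the (assumed) sentinel
    `unlabeled` receive the heterogeneous label, all others are kept.
    """
    return [
        [heterogeneous_label if value == unlabeled else value for value in row]
        for row in label_map
    ]


# Example: three classes (0, 1, 2) and a 2x2 label map with two unlabeled pixels.
hetero = choose_heterogeneous_label([0, 1, 2])
filled = fill_unlabeled([[1, -1], [-1, 2]], hetero)
```

After this step every pixel carries either its image's classification label or the heterogeneous (for example, background) label.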
S440. Obtain a standard image segmentation sample set that matches the image segmentation task scenario.
The image segmentation task scenario may be the scenario to which the images to be segmented belong. The standard image segmentation sample set may be an image segmentation sample set that matches the image segmentation task scenario.
In this embodiment of the present disclosure, the image segmentation task scenario may be determined first, and an image classification sample set matching that scenario obtained. The standard image segmentation sample set matching the image segmentation task scenario is then generated using the image segmentation sample generation method of any of the above embodiments.
S450. Fine-tune the pre-trained image segmentation model on the standard image segmentation sample set to obtain a target image segmentation model that matches the image segmentation task scenario.
The target image segmentation model may be the model obtained by fine-tuning the pre-trained image segmentation model on the standard image segmentation sample set.
In this embodiment of the present disclosure, the pre-trained image segmentation model may be trained on the standard image segmentation sample set so that its model parameters are adjusted, yielding a target image segmentation model that matches the image segmentation task scenario. That is, the target image segmentation model can segment images that match the image segmentation task scenario with high accuracy.
Training the pre-trained image segmentation model on the standard image segmentation sample set gives the model stronger segmentation capability for images in a specific image segmentation task scenario while requiring only fine-tuning of the model parameters.
The image segmentation task scenario may include at least one of the following: a driving scenario, a medical imaging scenario, a robot perception scenario, and a remote sensing satellite image segmentation scenario. The image segmentation sample sets for different task scenarios have image features specific to each scenario, so training the pre-trained image segmentation model on the sample set for a given scenario strengthens its segmentation capability for that scenario.
In the technical solution of this embodiment of the present disclosure, an image classification sample set is obtained, each image classification sample in the set is processed using the image segmentation sample generation method to generate a corresponding image segmentation sample set, all model parameters of a preset machine learning model are trained on the image segmentation sample set to obtain a pre-trained image segmentation model, and a standard image segmentation sample set matching the image segmentation task scenario is obtained. With the technical means of any of the above embodiments for forming an image segmentation sample corresponding to an image classification sample, an interpretable algorithm identifies the pixels in the sample image that most influence image classification, so the corresponding classification label can be attached to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, the pre-training performance of the image segmentation model. In addition, with only fine-tuning of the model parameters, the pre-trained image segmentation model gains stronger segmentation capability for images in a specific image segmentation task scenario.
In one example, FIG. 4b is a flowchart of another pre-training method for an image segmentation model provided by an embodiment of the present disclosure. As shown in FIG. 4b, the method includes the following operations:
S4100. Obtain an image classification sample set.
S4110. Based on all classification labels corresponding to the image classification sample set, select a heterogeneous label that differs from every one of those classification labels.
S4120. Annotate with the heterogeneous label all pixels that carry no classification label in the image segmentation samples of the image segmentation sample set.
S4130. Process each image classification sample in the image classification sample set using the image segmentation sample generation method, and generate an image segmentation sample set corresponding to the image classification sample set.
S4140. Train all model parameters of a preset machine learning model on the image segmentation sample set to obtain a pre-trained image segmentation model.
S4150. Obtain a standard image segmentation sample set that matches the image segmentation task scenario.
S4160. Fine-tune the pre-trained image segmentation model on the standard image segmentation sample set to obtain a target image segmentation model that matches the image segmentation task scenario.
In the technical solution of this embodiment of the present disclosure, an image classification sample set is obtained; a heterogeneous label that differs from all classification labels corresponding to the image classification sample set is selected; all pixels that carry no classification label in the image segmentation samples of the image segmentation sample set are annotated with the heterogeneous label; each image classification sample in the set is processed using the image segmentation sample generation method to generate the corresponding image segmentation sample set; and all model parameters of a preset machine learning model are trained on the image segmentation sample set to obtain a pre-trained image segmentation model. After the pre-trained image segmentation model is obtained, a standard image segmentation sample set matching the image segmentation task scenario is obtained and used to fine-tune the pre-trained image segmentation model, yielding a target image segmentation model that matches the image segmentation task scenario. With the technical means of any of the above embodiments for forming an image segmentation sample corresponding to an image classification sample, an interpretable algorithm identifies the pixels in the sample image that most influence image classification, so the corresponding classification label can be attached to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, the pre-training performance of the image segmentation model. In addition, with only fine-tuning of the model parameters, the pre-trained image segmentation model gains stronger segmentation capability for images in a specific image segmentation task scenario.
The complete training process of the image segmentation model in the embodiments of the present disclosure can be divided into two parts: pre-training and downstream-task fine-tuning. Traditional pre-training of an image segmentation model is performed only on an image classification sample set, and only one part of the model, the backbone, is trained. Downstream-task fine-tuning takes the pre-trained model and fine-tunes it on a specific image segmentation sample set (the standard image segmentation sample set matching the image segmentation task scenario) in order to solve the segmentation task in that scenario. The complete training process of the image segmentation model is as follows:
(1) Select an image classification sample set (such as ImageNet), an interpretable algorithm (such as an input-gradient-based algorithm), and multiple trained image classification models. The interpretable algorithm identifies the important pixels (the forward label pixels) in a sample image input to an image classification model, and these important pixels agree fairly well with image segmentation labels. For each image classification model (such as a deep learning model), the interpretable algorithm is applied to the model: for the three primary-color channels of each sample image, the gradient is computed and its magnitude is taken, which gives the weight value of each pixel.
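The step of taking the gradient over the three color channels and reducing it to one magnitude per pixel can be sketched as follows. This is only an illustration under stated assumptions: a toy linear class score stands in for a trained image classification model so that the input gradient is known analytically, and the Euclidean norm across channels is assumed as the "magnitude", which the text does not specify. The function name `input_gradient_saliency` is hypothetical.

```python
import numpy as np


def input_gradient_saliency(image, class_weights):
    """Per-pixel weight value from the input gradient of a toy linear score.

    image:         float array of shape (H, W, 3), RGB channels.
    class_weights: float array of shape (H, W, 3). The toy class score is
                   sum(image * class_weights), so its gradient with respect
                   to the image equals class_weights exactly.
    Returns an (H, W) array holding one magnitude per pixel.
    """
    if image.shape != class_weights.shape:
        raise ValueError("image and class_weights must have the same shape")
    grad = class_weights  # analytic gradient of the linear score
    # Collapse the three color channels into a single magnitude per pixel
    # (assumption: Euclidean norm across the channel axis).
    return np.linalg.norm(grad, axis=-1)


# Example: only the top-left pixel influences the score.
image = np.zeros((2, 2, 3))
weights = np.zeros((2, 2, 3))
weights[0, 0] = [3.0, 4.0, 0.0]
saliency = input_gradient_saliency(image, weights)
```

With a real deep model the gradient would come from automatic differentiation rather than a closed form; the channel reduction would be the same.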
(2) To reduce noise in the generated pixel weight values, the single-model weight values that the interpretable algorithm produces under the multiple image classification models are combined by weighted averaging. The averaged result is the final pixel weight value, and averaging removes a large amount of noise. FIG. 5 shows a logic diagram of this denoising of pixel weight values.
(3) The averaged result serves as the image segmentation pseudo-label. For efficiency of use, however, the averaged result is binarized and the binarized result is used as the image segmentation pseudo-label; see FIG. 6 for the result before binarization and FIG. 7 for the result after. The top 10% of pixels by weight value are selected as forward label pixels, and the remaining (negative) pixels are annotated with the heterogeneous label.
(4) Apply the image segmentation sample generation method described above to all sample images in the image classification sample set to obtain the corresponding image segmentation sample set.
(5) Traditional pre-training with an image classification model uses the classification labels of the sample images as supervision and trains only part of the model parameters (namely the parameters of the backbone portion of the image segmentation model). In contrast, the pre-training method proposed in the embodiments of the present disclosure uses the image segmentation sample set as supervision to train the entire image segmentation model. The image segmentation labels in the image segmentation sample set come from two sources: the image classification labels and the binarized image segmentation pseudo-labels. Every forward label pixel is assigned the classification label of its sample image, and every negative pixel is assigned a background category (the heterogeneous label). Because the image segmentation sample set is built from the image classification sample set, each image has both a classification label and a binarized image segmentation pseudo-label.
(6) Pre-train the image segmentation model using the image segmentation labels described above as supervision. Once trained, this model can also be used directly as an image segmentation model, but with limited effect.
(7) Fine-tune the pre-trained model on the downstream task. The difference from traditional practice is that a traditionally pre-trained model provides initialization for the backbone parameters only, whereas the method of the embodiments of the present disclosure effectively initializes the model parameters of the entire image segmentation model.
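The difference in initialization between the two approaches can be illustrated with a small sketch. It is not tied to any particular deep learning framework: parameters are represented as a plain dictionary, and the `backbone.` and `head.` name prefixes, like the function name `init_parameters`, are hypothetical conventions introduced only for this example.

```python
def init_parameters(fresh, pretrained, backbone_only):
    """Initialize a segmentation model's parameters before fine-tuning.

    fresh:         dict of parameter name -> freshly (e.g. randomly) set value.
    pretrained:    dict of parameter name -> value learned in pre-training.
    backbone_only: if True, copy only parameters whose (assumed) name starts
                   with "backbone.", mimicking traditional pre-training; if
                   False, copy every matching parameter, as in the approach
                   that pre-trains the whole segmentation model.
    """
    initialized = dict(fresh)
    for name, value in pretrained.items():
        if name not in initialized:
            continue  # ignore parameters the target model does not have
        if backbone_only and not name.startswith("backbone."):
            continue  # traditional case: non-backbone parts stay fresh
        initialized[name] = value
    return initialized


fresh = {"backbone.conv": 0.0, "head.conv": 0.0}
pretrained = {"backbone.conv": 1.5, "head.conv": 2.5}
traditional = init_parameters(fresh, pretrained, backbone_only=True)
whole_model = init_parameters(fresh, pretrained, backbone_only=False)
```

In the traditional case the non-backbone parameters start fine-tuning from their fresh values; in the whole-model case every parameter starts from a pre-trained value.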
The computation logic of (1)-(4) is as follows.
Input: an image classification sample set D, K deep image classification models f_k, and an interpretable algorithm A.
S1. Using A, compute the multiple single-model weight values of each pixel in each sample image I_i in D.
S2. For each sample image I_i in D, compute the mean of the single-model weight values over the K deep image classification models f_k to obtain the weight value of each pixel.
S3. For each sample image I_i in D, compute the threshold above which a weight value lies in the top 10% of all weight values, use this threshold to screen the pixel weight values, and binarize them (for example, set weight values greater than or equal to the threshold to 1 and weight values below the threshold to 0). The binarized result is the image segmentation pseudo-label.
Output: the image segmentation sample set corresponding to the image classification sample set.
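Steps S2 and S3 above can be sketched in a few lines of NumPy. The sketch assumes the K single-model weight maps from S1 are already available as arrays of equal shape, uses a plain (unweighted) mean as in S2, and takes the 90th percentile as the top-10% threshold; the function name `make_pseudo_label` and the `top_fraction` parameter are illustrative assumptions.

```python
import numpy as np


def make_pseudo_label(weight_maps, top_fraction=0.10):
    """Average K single-model weight maps and binarize the result.

    weight_maps:  sequence of K arrays of shape (H, W), one per model.
    top_fraction: fraction of pixels to keep as forward label pixels.
    Returns a uint8 array of shape (H, W) with 1 for pixels whose averaged
    weight is at or above the threshold and 0 otherwise.
    """
    stacked = np.stack(weight_maps, axis=0)        # shape (K, H, W)
    mean_map = stacked.mean(axis=0)                # S2: mean over the K models
    threshold = np.quantile(mean_map, 1.0 - top_fraction)  # S3: top-10% cut
    return (mean_map >= threshold).astype(np.uint8)


# Example: two identical 10x10 maps whose weights are 0..99.
base = np.arange(100, dtype=float).reshape(10, 10)
mask = make_pseudo_label([base, base])
```

With ties at the threshold, slightly more than the requested fraction of pixels can be marked 1; a count-based selection avoids that if an exact number is required.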
The computation logic of (5)-(7) is as follows.
Input: an image classification sample set D (number of classification label categories: N_c), the image segmentation sample set P corresponding to D, a preset machine learning model f, and an image segmentation task H.
S1. For each sample image I_i in D with classification label d_i, set the image segmentation label of every pixel whose binarized value is 1 to d_i, and set the image segmentation label of every pixel whose binarized value is 0 to the heterogeneous label.
S2. Using the labels from S1 and a conventional deep learning optimization algorithm, train the preset machine learning model f to obtain the pre-trained image segmentation model f'.
S3. Fine-tune the model f' obtained in S2 on the image segmentation task H.
Output: the target image segmentation model trained on H.
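Step S1 above, which turns a binarized pseudo-label and an image-level classification label into a per-pixel segmentation label map, can be sketched as follows. The example assumes class labels are the integers 0 to N_c - 1 and uses N_c itself as the heterogeneous (background) label; that numbering is an assumption made for illustration, as is the function name `build_segmentation_label`.

```python
import numpy as np


def build_segmentation_label(binary_mask, class_label, background_label):
    """Assign d_i to pixels marked 1 and the heterogeneous label to the rest.

    binary_mask:      (H, W) array of 0/1 values (the binarized pseudo-label).
    class_label:      integer classification label d_i of the sample image.
    background_label: integer heterogeneous label, distinct from all classes.
    """
    return np.where(binary_mask == 1, class_label, background_label)


# Example: N_c = 1000 classes (0..999), so 1000 is free for the background.
num_classes = 1000
pseudo_label = np.array([[1, 0], [0, 1]])
seg_label = build_segmentation_label(pseudo_label, class_label=7,
                                     background_label=num_classes)
```

The resulting label map has the same height and width as the sample image and can serve as the supervision target in S2.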
FIG. 8 is a schematic diagram of an apparatus for generating image segmentation samples provided by the present disclosure. As shown in FIG. 8, the apparatus includes a classification sample acquisition module 510, a weight value determination module 520, a forward label pixel screening module 530, and an image segmentation sample generation module 540, wherein:
the classification sample acquisition module 510 is configured to obtain an image classification sample, where the image classification sample includes a sample image and a classification label of the sample image; the weight value determination module 520 is configured to determine, by means of an interpretable algorithm, the weight value of each pixel of the sample image under the action of an image classification model; the forward label pixel screening module 530 is configured to select forward label pixels in the sample image according to the weight value of each pixel; and the image segmentation sample generation module 540 is configured to form an image segmentation sample corresponding to the image classification sample from the forward label pixels and the classification label of the sample image.
In the technical solution of this embodiment of the present disclosure, an image classification sample that includes a sample image and a classification label of the sample image is obtained; the weight value of each pixel of the sample image under the action of an image classification model is determined by means of an interpretable algorithm; forward label pixels are selected in the sample image according to the weight value of each pixel; and an image segmentation sample corresponding to the image classification sample is formed from the forward label pixels and the classification label of the sample image. Because an interpretable algorithm identifies the pixels in the sample image that most influence image classification, the corresponding classification label can be attached to those pixels simply, conveniently, and with high accuracy to form an image segmentation sample. A large number of image segmentation samples can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples and, to a certain extent, the pre-training performance of the image segmentation model.
In one embodiment, the weight value determination module 520 is configured to input the sample image and the model parameters of the image classification model into an algorithm model that matches the interpretable algorithm, and to obtain the weight value of each pixel in the sample image, where the weight value measures how important each pixel is in the image classification model's classification of the sample image.
In one embodiment, the weight value determination module 520 is configured to: obtain multiple image classification models; input the sample image and the model parameters of each image classification model into the algorithm model to obtain, for each image classification model, a single-model weight value for every pixel of the sample image; and compute a weighted average of the single-model weight values at the same pixel position in the sample image to obtain the weight value of that pixel.
In one embodiment, the forward label pixel screening module 530 is configured to: determine the number of pixels to select from the total number of pixels in the sample image and a preset selection ratio; and select, in descending order of weight value, forward label pixels matching that number.
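The count-based selection performed by the forward label pixel screening module can be sketched as follows. The sketch assumes the pixel weight values are given as a 2-D NumPy array and that the selection count is the total pixel count times the ratio, truncated to an integer; the rounding rule is not stated in the text, so truncation is an assumption, as is the function name `select_forward_label_pixels`.

```python
import numpy as np


def select_forward_label_pixels(weight_map, ratio):
    """Select exactly floor(total * ratio) pixels with the largest weights.

    weight_map: (H, W) array of per-pixel weight values.
    ratio:      preset selection ratio in [0, 1].
    Returns a boolean (H, W) mask that is True at the selected pixels.
    """
    total = weight_map.size
    count = int(total * ratio)            # assumption: truncate toward zero
    flat = weight_map.ravel()
    # Indices sorted by descending weight; a stable sort keeps the original
    # order among tied weights, so the result is deterministic.
    order = np.argsort(-flat, kind="stable")
    mask = np.zeros(total, dtype=bool)
    mask[order[:count]] = True
    return mask.reshape(weight_map.shape)


# Example: 16 pixels with weights 0..15 and a selection ratio of 25%.
weights = np.arange(16, dtype=float).reshape(4, 4)
selected = select_forward_label_pixels(weights, ratio=0.25)
```

Unlike a threshold comparison, this always returns the requested number of pixels even when several pixels share the same weight value.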
In one embodiment, the image segmentation sample generation module 540 is configured to annotate each forward label pixel in the sample image with the classification label to form the image segmentation sample.
The above apparatus for generating image segmentation samples can perform the image segmentation sample generation method provided by any embodiment of the present disclosure, and has the functional modules and effects corresponding to performing that method.
FIG. 9 is a schematic diagram of a pre-training apparatus for an image segmentation model provided by the present disclosure. As shown in FIG. 9, the apparatus includes a sample set acquisition module 610, an image segmentation sample set generation module 620, and a pre-trained image segmentation model acquisition module 630, wherein:
the sample set acquisition module 610 is configured to obtain an image classification sample set; the image segmentation sample set generation module 620 is configured to process each image classification sample in the image classification sample set using the image segmentation sample generation method of any of the above embodiments, and to generate an image segmentation sample set corresponding to the image classification sample set; and the pre-trained image segmentation model acquisition module 630 is configured to train all model parameters of a preset machine learning model on the image segmentation sample set to obtain a pre-trained image segmentation model.
In the technical solution of this embodiment of the present disclosure, an image classification sample set is obtained, each image classification sample in the set is processed using the image segmentation sample generation method to generate a corresponding image segmentation sample set, and all model parameters of a preset machine learning model are trained on the image segmentation sample set to obtain a pre-trained image segmentation model. Following the image segmentation sample generation method of any of the above embodiments, an interpretable algorithm identifies the pixels in each sample image that most influence image classification, so the corresponding classification label can be attached to those pixels simply, conveniently, and with high accuracy to form image segmentation samples and hence an image segmentation sample set. Large image segmentation sample sets can therefore be obtained at very low cost, which improves the efficiency of generating image segmentation samples, improves the pre-training performance of the image segmentation model to a certain extent, and improves the training effect of the pre-trained model.
FIG. 10 is a schematic diagram of another pre-training apparatus for an image segmentation model provided by the present disclosure. The pre-training apparatus further includes a heterogeneous label labeling module 640, which is configured to: select, according to all classification labels corresponding to the image classification sample set, a heterogeneous label that differs from all of those classification labels; and label, with the heterogeneous label, all pixels that have no classification label in the image segmentation samples of the image segmentation sample set.
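The heterogeneous labeling described above can be sketched as follows. This is an illustrative sketch only, assuming that classification labels are non-negative integers and that unlabeled pixels are marked with a sentinel value; the names `pick_heterogeneous_label`, `fill_unlabeled`, and `UNLABELED` are hypothetical.

```python
import numpy as np

UNLABELED = -1  # hypothetical sentinel marking pixels without a class label


def pick_heterogeneous_label(class_labels):
    """Return an integer label that differs from every label in class_labels."""
    return max(class_labels) + 1


def fill_unlabeled(mask, hetero_label, unlabeled=UNLABELED):
    """Label every still-unlabeled pixel with the heterogeneous label."""
    mask = np.asarray(mask)
    return np.where(mask == unlabeled, hetero_label, mask)


all_labels = {0, 1, 2, 3}  # labels present in the classification sample set
hetero = pick_heterogeneous_label(all_labels)
mask = np.array([[3, UNLABELED], [UNLABELED, 3]])
dense_mask = fill_unlabeled(mask, hetero)
```

After this step every pixel carries a label, so the sample can be fed to a standard per-pixel segmentation loss.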
FIG. 11 is a schematic diagram of another pre-training apparatus for an image segmentation model provided by the present disclosure. The pre-training apparatus further includes a target image segmentation model 650, which is configured to: acquire a standard image segmentation sample set matching an image segmentation task scenario; and fine-tune the pre-trained image segmentation model using the standard image segmentation sample set, to obtain a target image segmentation model matching the image segmentation task scenario.
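The pre-train-then-fine-tune flow described above can be illustrated with a toy example. This is a sketch under strong assumptions: a linear per-pixel softmax classifier stands in for the image segmentation model, the data are synthetic, and the names `train`, `softmax_loss`, and all hyperparameters are hypothetical. It only shows that fine-tuning continues training from the pre-trained parameters on the standard sample set, updating all parameters.

```python
import numpy as np


def softmax_loss(params, feats, labels):
    """Mean per-pixel cross-entropy of a linear softmax classifier."""
    w, b = params
    logits = feats @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def train(params, feats, labels, lr, steps):
    """Plain gradient descent that updates all parameters (w and b)."""
    w, b = params
    onehot = np.eye(w.shape[1])[labels]
    for _ in range(steps):
        logits = feats @ w + b
        logits = logits - logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs = probs / probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(labels)
        w = w - lr * (feats.T @ grad)
        b = b - lr * grad.sum(axis=0)
    return w, b


rng = np.random.default_rng(0)
# Synthetic stand-ins: rows are pixels, columns are per-pixel features.
pseudo_feats = rng.normal(size=(400, 3))
pseudo_labels = (pseudo_feats[:, 0] + pseudo_feats[:, 1] > 0).astype(int)
std_feats = rng.normal(size=(60, 3))
std_labels = (std_feats[:, 0] + 0.5 * std_feats[:, 1] > 0.2).astype(int)

init = (np.zeros((3, 2)), np.zeros(2))
# Pre-training on the large generated (pseudo-labeled) sample set.
pretrained = train(init, pseudo_feats, pseudo_labels, lr=0.5, steps=200)
# Fine-tuning on the small standard sample set, starting from pretrained.
finetuned = train(pretrained, std_feats, std_labels, lr=0.1, steps=100)
```

In practice the model would be a segmentation network and the standard sample set would come from the target task scenario; the smaller learning rate for fine-tuning is a common convention, not a requirement of the disclosure.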
In an embodiment, the image segmentation task scenario includes at least one of the following: a driving scenario, a medical imaging scenario, a robot perception scenario, and a remote-sensing satellite image segmentation scenario.
The above pre-training apparatus for an image segmentation model can execute the pre-training method for an image segmentation model provided by any embodiment of the present disclosure, and has the functional modules and effects corresponding to executing that method.
In the technical solution of the present disclosure, the acquisition, storage, and application of the various data involved comply with relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure further provides an electronic device, a computer-readable storage medium, and a computer program product for implementing the methods of the above embodiments.
FIG. 12 is a schematic block diagram of an electronic device provided by an embodiment of the present disclosure. The electronic device 10 is intended to represent various forms of digital computers, such as desktop computers, workstations, personal digital assistants, servers, mainframe computers, and other suitable computers. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the disclosure described and/or claimed herein.
As shown in FIG. 12, the electronic device 10 includes at least one processor 11 and memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 can perform various appropriate actions and processing according to a computer program stored in the ROM 12 or loaded from a storage unit 18 into the RAM 13. The RAM 13 can also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to one another via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
Multiple components of the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard or a mouse; an output unit 17, such as various types of displays and speakers; a storage unit 18, such as a magnetic disk or an optical disc; and a communication unit 19, such as a network card, a modem, or a wireless communication transceiver. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The processor 11 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The processor 11 performs the methods and processing described above, for example the image segmentation sample generation method or the image segmentation model pre-training method given in any embodiment. In some embodiments, the image segmentation sample generation method or the image segmentation model pre-training method may be implemented as a computer program that is tangibly embodied in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more operations of the image segmentation sample generation method or the image segmentation model pre-training method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured in any other appropriate manner (for example, by means of firmware) to perform the image segmentation sample generation method or the image segmentation model pre-training method.
Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (for example, a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) configured to display information to the user; and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices may also be configured to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic, speech, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and virtual private server (VPS) services. The server may also be a server of a distributed system, or a server combined with a blockchain.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing. Artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
Cloud computing refers to a technical system in which an elastic, scalable pool of shared physical or virtual resources is accessed over a network, where the resources may include servers, operating systems, networks, software, applications, and storage devices, and in which the resources can be deployed and managed on demand and in a self-service manner. Cloud computing technology can provide efficient and powerful data processing capabilities for technical applications such as artificial intelligence and blockchain, and for model training.
Operations may be reordered, added, or deleted using the various forms of flows shown above. For example, the operations recorded in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solution provided by the present disclosure can be achieved; no limitation is imposed herein.

Claims (14)

  1. A method for generating an image segmentation sample, comprising:
    acquiring an image classification sample, wherein the image classification sample includes a sample image and a classification label of the sample image;
    determining, by means of an interpretable algorithm, a weight value of each pixel of the sample image under the action of an image classification model;
    selecting forward label pixels in the sample image according to the weight value of each pixel; and
    forming, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
  2. The method according to claim 1, wherein determining, by means of an interpretable algorithm, the weight value of each pixel of the sample image under the action of the image classification model comprises:
    inputting the sample image and model parameters of the image classification model into an algorithm model matching the interpretable algorithm, to obtain the weight value of each pixel in the sample image;
    wherein the weight value measures the degree of importance of each pixel in the process by which the image classification model classifies the sample image.
  3. The method according to claim 2, wherein inputting the sample image and the model parameters of the image classification model into the algorithm model matching the interpretable algorithm, to obtain the weight value of each pixel in the sample image, comprises:
    acquiring a plurality of image classification models;
    inputting the sample image and the model parameters of each image classification model separately into the algorithm model, to obtain a single-model weight value of each pixel of the sample image under the action of each image classification model; and
    performing a weighted average of the plurality of single-model weight values at a same pixel position in the sample image, to obtain the weight value of the pixel at that position in the sample image.
  4. The method according to claim 1, wherein selecting forward label pixels in the sample image according to the weight value of each pixel comprises:
    determining a pixel selection quantity according to the total number of pixels in the sample image and a preset selection ratio; and
    selecting, in descending order of weight value, forward label pixels matching the pixel selection quantity.
  5. The method according to claim 1, wherein forming, according to the forward label pixels and the classification label of the sample image, the image segmentation sample corresponding to the image classification sample comprises:
    labeling each forward label pixel in the sample image with the classification label, to form the image segmentation sample.
  6. A method for pre-training an image segmentation model, comprising:
    acquiring an image classification sample set;
    processing each image classification sample in the image classification sample set using the method for generating an image segmentation sample according to any one of claims 1 to 5, to generate an image segmentation sample set corresponding to the image classification sample set; and
    training, according to the image segmentation sample set, all model parameters included in a preset machine learning model, to obtain a pre-trained image segmentation model.
  7. The method according to claim 6, further comprising, before training, according to the image segmentation sample set, all model parameters included in the preset machine learning model to obtain the pre-trained image segmentation model:
    selecting, according to all classification labels corresponding to the image classification sample set, a heterogeneous label that differs from all of the classification labels; and
    labeling, with the heterogeneous label, all pixels that have no classification label in the image segmentation samples of the image segmentation sample set.
  8. The method according to claim 7, further comprising, after training, according to the image segmentation sample set, all model parameters included in the preset machine learning model to obtain the pre-trained image segmentation model:
    acquiring a standard image segmentation sample set matching an image segmentation task scenario; and
    fine-tuning the pre-trained image segmentation model using the standard image segmentation sample set, to obtain a target image segmentation model matching the image segmentation task scenario.
  9. The method according to claim 8, wherein the image segmentation task scenario includes at least one of the following: a driving scenario, a medical imaging scenario, a robot perception scenario, and a remote-sensing satellite image segmentation scenario.
  10. An apparatus for generating an image segmentation sample, comprising:
    a classification sample acquisition module, configured to acquire an image classification sample, wherein the image classification sample includes a sample image and a classification label of the sample image;
    a weight value determination module, configured to determine, by means of an interpretable algorithm, a weight value of each pixel of the sample image under the action of an image classification model;
    a forward label pixel screening module, configured to select forward label pixels in the sample image according to the weight value of each pixel; and
    an image segmentation sample generation module, configured to form, according to the forward label pixels and the classification label of the sample image, an image segmentation sample corresponding to the image classification sample.
  11. An apparatus for pre-training an image segmentation model, comprising:
    a sample set acquisition module, configured to acquire an image classification sample set;
    an image segmentation sample set generation module, configured to process each image classification sample in the image classification sample set using the method for generating an image segmentation sample according to any one of claims 1 to 5, to generate an image segmentation sample set corresponding to the image classification sample set; and
    a pre-trained image segmentation model acquisition module, configured to train, according to the image segmentation sample set, all model parameters included in a preset machine learning model, to obtain a pre-trained image segmentation model.
  12. An electronic device, comprising:
    at least one processor; and
    a storage apparatus configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for generating an image segmentation sample according to any one of claims 1 to 5, or the method for pre-training an image segmentation model according to any one of claims 6 to 9.
  13. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method for generating an image segmentation sample according to any one of claims 1 to 5, or the method for pre-training an image segmentation model according to any one of claims 6 to 9.
  14. A computer program product, comprising a computer program which, when executed by a processor, implements the method for generating an image segmentation sample according to any one of claims 1 to 5, or the method for pre-training an image segmentation model according to any one of claims 6 to 9.
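
For illustration only, the multi-model weighted averaging recited in claim 3 can be sketched as follows. This is a minimal sketch, assuming each image classification model has already yielded a single-model weight map of the same shape; the function name `ensemble_weight_map` and the per-model weights are hypothetical and are not specified by the claims.

```python
import numpy as np


def ensemble_weight_map(single_model_maps, model_weights=None):
    """Weighted average, per pixel position, of several single-model weight maps."""
    maps = np.stack([np.asarray(m, dtype=float) for m in single_model_maps])
    # With model_weights=None, np.average falls back to a plain mean.
    return np.average(maps, axis=0, weights=model_weights)


map_a = np.array([[1.0, 0.0], [0.5, 0.5]])  # weight map from a first model
map_b = np.array([[0.0, 1.0], [0.5, 1.5]])  # weight map from a second model
combined = ensemble_weight_map([map_a, map_b], model_weights=[3, 1])
```

Here the first model's map counts three times as much as the second's; passing no `model_weights` gives every model equal influence.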
PCT/CN2023/087460 2022-05-23 2023-04-11 Image segmentation sample generation method and apparatus, method and apparatus for pre-training image segmentation model, and device and medium WO2023226606A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210567293.6A CN114882315B (en) 2022-05-23 2022-05-23 Sample generation method, model training method, device, equipment and medium
CN202210567293.6 2022-05-23

Publications (1)

Publication Number Publication Date
WO2023226606A1 true WO2023226606A1 (en) 2023-11-30

Family

ID=82677570

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087460 WO2023226606A1 (en) 2022-05-23 2023-04-11 Image segmentation sample generation method and apparatus, method and apparatus for pre-training image segmentation model, and device and medium

Country Status (2)

Country Link
CN (1) CN114882315B (en)
WO (1) WO2023226606A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114882315B (en) * 2022-05-23 2023-09-01 北京百度网讯科技有限公司 Sample generation method, model training method, device, equipment and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204111A1 (en) * 2013-02-28 2018-07-19 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
CN110378438A (en) * 2019-08-07 2019-10-25 清华大学 Training method, device and the relevant device of Image Segmentation Model under label is fault-tolerant
CN112465754A (en) * 2020-11-17 2021-03-09 云润大数据服务有限公司 3D medical image segmentation method and device based on layered perception fusion and storage medium
CN112699858A (en) * 2021-03-24 2021-04-23 中国人民解放军国防科技大学 Unmanned platform smoke fog sensing method and system, computer equipment and storage medium
US20210174497A1 (en) * 2019-12-09 2021-06-10 Siemens Healthcare Gmbh Saliency mapping by feature reduction and perturbation modeling in medical imaging
US20210248748A1 (en) * 2020-02-12 2021-08-12 Adobe Inc. Multi-object image parsing using neural network pipeline
CN113935227A (en) * 2021-06-23 2022-01-14 中国人民解放军战略支援部队航天工程大学 Optical satellite intelligent task planning method based on real-time meteorological cloud picture
CN114882315A (en) * 2022-05-23 2022-08-09 北京百度网讯科技有限公司 Sample generation method, model training method, device, equipment and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11914674B2 (en) * 2011-09-24 2024-02-27 Z Advanced Computing, Inc. System and method for extremely efficient image and pattern recognition and artificial intelligence platform
CN112801883A (en) * 2019-11-14 2021-05-14 北京三星通信技术研究有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN112529042B (en) * 2020-11-18 2024-04-05 南京航空航天大学 Medical image classification method based on dual-attention multi-example deep learning
CN113205176B (en) * 2021-04-19 2022-09-06 重庆创通联达智能技术有限公司 Method, device and equipment for training defect classification detection model and storage medium
CN113421259B (en) * 2021-08-20 2021-11-16 北京工业大学 OCTA image analysis method based on classification network

Also Published As

Publication number Publication date
CN114882315A (en) 2022-08-09
CN114882315B (en) 2023-09-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23810681

Country of ref document: EP

Kind code of ref document: A1