CN113554068A - Semi-automatic labeling method and device for instance segmentation data set and readable medium - Google Patents


Info

Publication number
CN113554068A
CN113554068A (application CN202110758660.6A)
Authority
CN
China
Prior art keywords
data set
image
prediction result
instance
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110758660.6A
Other languages
Chinese (zh)
Other versions
CN113554068B (en)
Inventor
计天晨
房怀英
李建涛
杨建红
陈强
杨天成
陈伟鑫
林柏宏
杨宇轩
Current Assignee
Huaqiao University
Original Assignee
Huaqiao University
Priority date
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority claimed from application CN202110758660.6A
Publication of CN113554068A
Application granted
Publication of CN113554068B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a semi-automatic labeling method and device for an instance segmentation data set, and a readable medium. An instance segmentation model and an image classification model are trained separately. A first data set within the image data set is predicted with the trained instance segmentation model to obtain a first prediction result; based on the first prediction result, each image is cropped into images each containing a single object; the single-object images are input into the trained image classification model to obtain a second prediction result; the first and second prediction results for each single object are compared, and according to the comparison the first class and mask of each object are corrected manually, yielding a pseudo-label instance segmentation data set. The manual instance segmentation data set and the pseudo-label instance segmentation data set are then mixed into an instance segmentation data set, and the instance segmentation model is retrained on it to obtain the final instance segmentation model. The method requires only a small amount of manually labeled data, so the cost is low and the accuracy comparatively high.

Description

Semi-automatic labeling method and device for instance segmentation data set and readable medium
Technical Field
The invention relates to the field of data annotation, in particular to a semi-automatic annotation method and device for an instance segmentation data set and a readable medium.
Background
With the continuous development of artificial intelligence and big data technology, information data of all kinds grow at an exponential rate. Against this background, computer vision based on deep learning has matured considerably. Computer vision uses a computer and associated equipment to simulate biological vision; its main task is to process captured pictures or videos to obtain three-dimensional information about the corresponding scene. Within computer vision, the main applications of neural networks today are image recognition, object localization and detection, semantic segmentation, and instance segmentation.
Object detection or localization refines a digital image progressively from coarse to fine: it provides not only the class of an image object but also its location, given as a bounding box or a center point. Semantic segmentation goes further by predicting a label for every pixel in the input image, each pixel being labeled according to the object class it belongs to. Going one step further still, instance segmentation assigns distinct labels to individual instances of objects belonging to the same class. Instance segmentation can therefore be defined as a technique that solves the object detection problem and the semantic segmentation problem at the same time.
Instance segmentation is one of the more difficult and complex directions within object detection. Deep-learning instance segmentation models such as Mask R-CNN can achieve a good segmentation effect, but training such deep models requires a large amount of accurately labeled image data; manually labeling a custom data set in particular consumes a great deal of labor, and a semi-automatic labeling method built on Mask R-CNN alone easily falls into situations where the model's predictions become self-contradictory or the cost of manual correction is large. How to label unlabeled data sets at low cost and high efficiency has therefore become a difficulty.
Disclosure of Invention
To address the problem of labeling unlabeled data sets at low cost and high efficiency during instance segmentation, embodiments of the present application provide a semi-automatic labeling method and device for an instance segmentation data set, and a readable medium, featuring low up-front manual labeling cost, high automatic labeling accuracy, fast labeling, and continuously improving automatic labeling accuracy, so as to solve the technical problems mentioned in the Background.
In a first aspect, an embodiment of the present application provides a semi-automatic labeling method for an instance segmentation data set, including the following steps:
S1, obtaining a manual instance segmentation data set based on the image data set, and training an instance segmentation model using the manual instance segmentation data set;
S2, determining an image classification data set based on the manual instance segmentation data set, and training an image classification model using the image classification data set;
S3, predicting a first data set within the image data set using the trained instance segmentation model to obtain a first prediction result;
S4, determining, based on the first prediction result, images each containing a single object; inputting the single-object images into the trained image classification model to obtain a second prediction result; comparing the first prediction result with the second prediction result; and manually correcting the first prediction result of each single object according to the comparison, yielding a pseudo-label instance segmentation data set; and
S5, mixing the manual instance segmentation data set and the pseudo-label instance segmentation data set into an instance segmentation data set and retraining the instance segmentation model with it; if the required prediction accuracy is reached, outputting the final instance segmentation model and the final instance segmentation data set, otherwise repeating steps S2-S5.
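The steps above can be condensed into control flow. The following is a minimal, runnable Python sketch; the stand-in `train_model` (a dummy whose "confidence" simply grows with the amount of training data) and the accuracy metric are hypothetical placeholders for the Mask R-CNN / ResNeSt training described in the embodiments, and only the S1-S5 loop structure mirrors the method.

```python
# Runnable sketch of the S1-S5 loop. train_model is a dummy stand-in,
# NOT the patent's Mask R-CNN / ResNeSt models.

def train_model(dataset):
    conf = min(0.5 + 0.01 * len(dataset), 0.95)
    return lambda images: [{"image": im, "confidence": conf} for im in images]

def semi_automatic_labeling(unlabeled, manual_set, target_ap50=0.9, max_rounds=10):
    for _ in range(max_rounds):                        # repeat S2-S5
        seg_model = train_model(manual_set)            # S1/S2: train on the seed set
        preds = seg_model(unlabeled)                   # S3: first prediction
        # S4: confident predictions become pseudo-labels (manual correction omitted)
        pseudo = [p["image"] for p in preds if p["confidence"] >= 0.5]
        mixed = manual_set + pseudo                    # S5: mix the two data sets
        seg_model = train_model(mixed)                 # S5: retrain
        ap50 = min(0.5 + 0.01 * len(mixed), 0.95)      # stand-in accuracy metric
        if ap50 > target_ap50:
            return mixed, ap50
        manual_set = mixed                             # mixed set seeds the next round
    return mixed, ap50

dataset, ap50 = semi_automatic_labeling(list(range(20)), list(range(5)))
print(len(dataset), round(ap50, 2))                    # prints: 45 0.95
```

Each pass through the loop enlarges the training set with pseudo-labels, which is why the stopping condition can eventually be met without labeling every image by hand.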
In some embodiments, the first prediction result includes a first class, a first confidence, and a mask for each object in each image, and between steps S3 and S4 the method further includes: computing an average confidence for each image in the first data set from the first confidences of its objects; if the average confidence is below a second threshold, manually correcting the first class of the image and transferring the image from the first prediction result to the manual instance segmentation data set.
In some embodiments, determining the single-object images based on the first prediction result in step S4 specifically includes: obtaining the minimum bounding rectangle of each object in each image of the first prediction result from that object's mask, and cropping each image into images each containing a single object according to those rectangles.
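The cropping step can be sketched as follows. This is a plain-NumPy illustration of taking the minimum bounding rectangle of a binary mask (a real pipeline might use OpenCV's `cv2.boundingRect` instead); the function and variable names are chosen here for illustration only.

```python
# Crop each instance out of an image along its mask's minimum bounding rectangle.
import numpy as np

def crop_single_objects(image, masks):
    """image: HxWxC array; masks: list of HxW boolean arrays, one per object."""
    crops = []
    for mask in masks:
        ys, xs = np.where(mask)
        if ys.size == 0:                   # empty mask: nothing to crop
            continue
        y0, y1 = ys.min(), ys.max() + 1    # minimum bounding rectangle
        x0, x1 = xs.min(), xs.max() + 1
        crops.append(image[y0:y1, x0:x1])
    return crops

img = np.zeros((8, 8, 3), dtype=np.uint8)
m = np.zeros((8, 8), dtype=bool)
m[2:5, 3:7] = True                         # a 3x4 object
crop = crop_single_objects(img, [m])[0]
print(crop.shape)                          # prints: (3, 4, 3)
```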
In some embodiments, determining the image classification data set based on the manual instance segmentation data set in step S2 specifically includes: cropping out each object of each image in the manual instance segmentation data set and combining the crops with the manually labeled class labels to form the image classification data set.
In some embodiments, the manual instance segmentation data set is obtained by selecting a second data set from the image data set for manual annotation, and when steps S2-S5 are repeated, the mixed instance segmentation data set is used as the manual instance segmentation data set.
In some embodiments, the second prediction result includes a second class and a second confidence, and comparing the first prediction result with the second prediction result in step S4 specifically includes: comparing the first class and first confidence of each single object with its second class and second confidence; if the first class differs from the second class, or the difference between the first and second confidences exceeds a first threshold, manually correcting the first class and mask of that object.
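A minimal sketch of this comparison rule, assuming the 0.2 threshold given later in the embodiments; the function name is hypothetical:

```python
# Flag an object for manual correction when the two models disagree on class,
# or when their confidences differ by more than the first threshold.

def needs_manual_correction(first_class, first_conf, second_class, second_conf,
                            threshold=0.2):
    return first_class != second_class or abs(first_conf - second_conf) > threshold

print(needs_manual_correction("bottle", 0.91, "bottle", 0.88))  # agree -> False
print(needs_manual_correction("bottle", 0.91, "brick", 0.90))   # class differs -> True
print(needs_manual_correction("bottle", 0.95, "bottle", 0.60))  # conf gap > 0.2 -> True
```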
In some embodiments, the instance segmentation model comprises Mask R-CNN and the image classification model comprises ResNeSt.
In a second aspect, an embodiment of the present application provides a semi-automatic labeling apparatus for an instance segmentation data set, including:
an example segmentation model training module configured to obtain an artificial example segmentation data set based on the image data set, and train an example segmentation model using the artificial example segmentation data set;
an image classification model training module configured to determine an image classification dataset based on the artificial instance segmentation dataset, train an image classification model using the image classification dataset;
the first prediction module is configured to predict a first data set in the image data set by using the trained example segmentation model to obtain a first prediction result;
the second prediction module is configured to determine, based on the first prediction result, images each containing a single object, input the single-object images into the trained image classification model to obtain a second prediction result, compare the first prediction result with the second prediction result, and manually correct the first prediction result of each single object according to the comparison to obtain a pseudo-label instance segmentation data set; and
the instance segmentation model retraining module is configured to mix the manual instance segmentation data set and the pseudo-label instance segmentation data set into an instance segmentation data set and retrain the instance segmentation model with it; when the required prediction accuracy is reached, the final instance segmentation model and data set are output, otherwise execution repeats from the image classification model training module through the instance segmentation model retraining module.
In a third aspect, embodiments of the present application provide an electronic device comprising one or more processors; storage means for storing one or more programs which, when executed by one or more processors, cause the one or more processors to carry out a method as described in any one of the implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the method as described in any of the implementations of the first aspect.
Compared with the prior art, the beneficial effects of the present application include:
1. Using only a small manually labeled instance segmentation data set, an instance segmentation model and an image classification model are trained and their predictions compared against each other to semi-automatically label a large amount of unlabeled data, which markedly reduces manual labeling cost while achieving a very good practical segmentation effect;
2. The mask prediction and the object-class recognition of the instance segmentation are relatively independent, so both high mask accuracy and high class-recognition accuracy can be ensured;
3. The resulting instance segmentation data set and model can be used for semi-automatic labeling as well as in practical applications, and can satisfy various detection requirements such as instance segmentation, semantic segmentation, and object detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is an exemplary device architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flowchart of a semi-automatic labeling method for an instance segmentation data set according to an embodiment of the present application;
FIG. 3 is an overall logic diagram of the semi-automatic labeling method for an instance segmentation data set according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of the 1st training round of the semi-automatic labeling method for an instance segmentation data set according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of the 2nd to n-th training rounds of the semi-automatic labeling method for an instance segmentation data set according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a semi-automatic labeling apparatus for an instance segmentation data set according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a computer device suitable for implementing an electronic apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary device architecture 100 to which a semi-automatic annotation method for an example segmented data set or a semi-automatic annotation device for an example segmented data set of embodiments of the present application may be applied.
As shown in fig. 1, the apparatus architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various applications, such as data processing type applications, file processing type applications, etc., may be installed on the terminal apparatuses 101, 102, 103.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 105 may be a server that provides various services, such as a background data processing server that processes files or data uploaded by the terminal devices 101, 102, 103. The background data processing server can process the acquired file or data to generate a processing result.
It should be noted that, the semi-automatic annotation method for the example-divided data set provided in the embodiment of the present application may be executed by the server 105, or may also be executed by the terminal devices 101, 102, and 103, and accordingly, the semi-automatic annotation device for the example-divided data set may be disposed in the server 105, or may also be disposed in the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In the case where the processed data does not need to be acquired from a remote location, the above device architecture may not include a network, but only a server or a terminal device.
FIG. 2 illustrates a semi-automatic labeling method for an example segmented data set according to an embodiment of the present application, including the following steps:
and S1, acquiring an artificial example segmentation data set based on the image data set, and training an example segmentation model by using the artificial example segmentation data set.
In one embodiment, a second data set is selected from the image data set for manual annotation, giving the manual instance segmentation data set. The instance segmentation model comprises Mask R-CNN; specifically, a Mask R-CNN series model under the Baidu PaddlePaddle framework is used, namely mask_rcnn_dcn_r50_vd_fpn_2x. The embodiments of the present application do not modify the neural network structure, so it is not described in detail here; in alternative embodiments, other instance segmentation models may be selected. Taking solid waste as an example, a solid waste data set serves as the image data set, and 500 images are randomly selected from the large solid waste data set as the second data set. If there are m = 20 classes of objects to be identified and each class has at least k = 50 objects, the 500 images contain at least 1000 objects. These 500 images are labeled manually, yielding a 500-image manual instance segmentation data set N_manual. N_manual is input into the standard instance segmentation network Mask R-CNN to train the instance segmentation model, giving a well-trained instance segmentation model. The resulting data set and model can be used both for semi-automatic labeling and in practical applications, and can serve various detection requirements such as instance segmentation, semantic segmentation, and object detection. Because only a small amount of data is labeled manually, the cost is reduced.
S2, determining an image classification dataset based on the artificial instance segmentation dataset, and training an image classification model using the image classification dataset.
In one embodiment, determining the image classification data set based on the manual instance segmentation data set in step S2 specifically includes: cropping out each object of each image in the manual instance segmentation data set N_manual and combining the crops with the manually labeled class labels to form the image classification data set. The image classification model comprises ResNeSt; the embodiments of the present application do not modify the neural network structure, so it is not described in detail here, and other image classification models may be selected in other embodiments. Cropping the 500 images of N_manual produces at least 1000 labeled classification images, which are input into the high-accuracy ResNeSt network to train the image classification model.
S3, the first data set in the image data set is predicted by using the trained example segmentation model, and a first prediction result is obtained.
In one embodiment, the first prediction result includes the first class, first confidence, and mask of each object in each image. Between steps S3 and S4, the method further includes: computing the average confidence of each image in the first data set from the first confidences of its objects; if the average confidence is below the second threshold, manually correcting the first class of the image and transferring the image from the first prediction result to the manual instance segmentation data set.
In one embodiment, 2000 unlabeled images are selected as the first data set and predicted with the trained Mask R-CNN instance segmentation model to obtain the first prediction result R(t, η, mask), where t_j^i is the first class predicted for the i-th object in the j-th image, η_j^i is the first confidence of the i-th object in the j-th image, and mask_j^i is the contour mask of the i-th object in the j-th image. For each image P_j, the confidences η_j^i of its objects o_j^i are collected, and the average confidence Δη_j of the image is computed as

Δη_j = (1 / n_j) · Σ_{i=1}^{n_j} η_j^i,

where n_j is the number of objects detected in image P_j. The second threshold is set to 0.8: if the average confidence Δη_j of an image falls below 0.8, the image P_j is corrected manually, removed from the first prediction result R, and added to the manual instance segmentation data set N_manual.
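The average-confidence filter described above can be sketched as follows; the data layout (a dict of per-object confidence lists) and function name are illustrative assumptions:

```python
# Route each image: keep it in the first prediction result R if its objects'
# average confidence meets the second threshold, otherwise send it for full
# manual labeling (i.e. into the manual data set N_manual).

def split_by_average_confidence(predictions, second_threshold=0.8):
    """predictions: {image_id: [first confidence of each detected object]}."""
    keep, to_manual = {}, []
    for image_id, confidences in predictions.items():
        avg = sum(confidences) / len(confidences)
        if avg < second_threshold:
            to_manual.append(image_id)     # goes to N_manual after correction
        else:
            keep[image_id] = confidences   # stays in the first prediction result
    return keep, to_manual

preds = {"img1": [0.9, 0.95, 0.88], "img2": [0.6, 0.7]}
keep, to_manual = split_by_average_confidence(preds)
print(sorted(keep), to_manual)             # prints: ['img1'] ['img2']
```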
S4, images each containing a single object are determined based on the first prediction result; the single-object images are input into the trained image classification model to obtain a second prediction result; the first prediction result is compared with the second prediction result; and the first prediction result of each single object is corrected manually according to the comparison, yielding a pseudo-label instance segmentation data set.
In one embodiment, determining the single-object images based on the first prediction result specifically includes: obtaining the minimum bounding rectangle of each object in each image from its mask, and cropping each image into single-object images along those rectangles. The second prediction result includes a second class and a second confidence. The first class and first confidence of each single object are compared with its second class and second confidence; if the first class differs from the second class, or the difference between the two confidences exceeds the first threshold, the first class and mask of that object are corrected manually, yielding the pseudo-label instance segmentation data set. Specifically, for each object o_j^i of each image P_j in the first prediction result R, the minimum bounding rectangle is obtained from its mask mask_j^i and the object is cropped out as a single-object image. The crops are input into the trained ResNeSt image classification model to obtain the second prediction result, which comprises the second class t'_j^i and the second confidence η'_j^i of each object o_j^i. The first threshold is set to 0.2: when the first class differs from the second class (t_j^i ≠ t'_j^i), or the difference between the first and second confidences exceeds 0.2 (|η_j^i − η'_j^i| > 0.2), the first class and mask of object o_j^i are corrected manually. Once every object of an image P_j has been checked, the corrected image P_j together with its first classes is added to the pseudo-label instance segmentation data set N_pseudo. In this way, a small manually labeled instance segmentation data set plus the two trained models (instance segmentation and image classification) suffice to semi-automatically label a large amount of unlabeled data. Moreover, because mask prediction and object-class recognition are relatively independent, both high mask accuracy and high class-recognition accuracy can be ensured.
S5, the manual instance segmentation data set and the pseudo-label instance segmentation data set are mixed into an instance segmentation data set, and the instance segmentation model is retrained with it; if the required prediction accuracy is reached, the final instance segmentation model and the final instance segmentation data set are output, otherwise steps S2-S5 are repeated.
In one embodiment, the prediction accuracy metric includes AP50. A number 0.1 × (N_manual + N_pseudo) of images is randomly drawn from the manual instance segmentation data set N_manual as a mixed instance segmentation validation set; the remaining manual instance segmentation data set N_manual is mixed with the pseudo-label instance segmentation data set N_pseudo and input as the mixed instance segmentation training set into the Mask R-CNN instance segmentation network for training. When the trained instance segmentation model reaches AP50 > 0.9, the last obtained instance segmentation model is output; otherwise, the combined instance segmentation data set is treated as the manual instance segmentation data set and steps S2-S5 are repeated.
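The split-and-mix step can be sketched as follows, under the reading that the validation images are drawn from the manually labeled set; the sizes follow the embodiment (500 manual images, and an assumed 1500 pseudo-labeled images):

```python
# Hold out 0.1 * (|N_manual| + |N_pseudo|) images from the manual set as a
# validation set; train on the remaining manual images mixed with the
# pseudo-labeled images. Random selection mirrors "randomly" in the text.
import random

def make_split(manual_set, pseudo_set, val_fraction=0.1, seed=0):
    n_val = max(1, int(val_fraction * (len(manual_set) + len(pseudo_set))))
    rng = random.Random(seed)
    val = rng.sample(list(manual_set), n_val)   # drawn from manual labels only
    val_set = set(val)
    train = [x for x in manual_set if x not in val_set] + list(pseudo_set)
    return train, val

manual = [f"m{i}" for i in range(500)]
pseudo = [f"p{i}" for i in range(1500)]
train, val = make_split(manual, pseudo)
print(len(train), len(val))                     # prints: 1800 200
```

Validating only on human-verified labels keeps the AP50 stopping criterion honest even when most of the training set is pseudo-labeled.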
Fig. 3 is an overall logic diagram of the semi-automatic labeling method for an instance segmentation data set according to an embodiment of the present application; Fig. 4 shows the steps of the 1st training round; and Fig. 5 shows the steps of the 2nd to n-th (repeated) training rounds.
With further reference to fig. 6, as an implementation of the method shown in the above-mentioned figures, the present application provides an embodiment of a semi-automatic labeling apparatus for example segmented data sets, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied in various electronic devices.
The embodiment of the application provides a semi-automatic labeling device for instance segmentation data sets, which comprises:
an example segmentation model training module 1 configured to acquire an artificial example segmentation data set based on the image data set, and train an example segmentation model using the artificial example segmentation data set;
an image classification model training module 2 configured to determine an image classification dataset based on the artificial instance segmentation dataset, and train an image classification model using the image classification dataset;
the first prediction module 3 is configured to predict a first data set in the image data set by using the trained instance segmentation model to obtain a first prediction result;
the second prediction module 4 is configured to determine, based on the first prediction result, images each containing a single object, input the single-object images into the trained image classification model to obtain a second prediction result, compare the first prediction result with the second prediction result, and manually correct the first prediction result of each single object according to the comparison to obtain a pseudo-label instance segmentation data set; and
the instance segmentation model retraining module 5 is configured to mix the manual instance segmentation data set and the pseudo-label instance segmentation data set into an instance segmentation data set and retrain the instance segmentation model with it; when the required prediction accuracy is reached, the final instance segmentation model and data set are output, otherwise execution repeats from the image classification model training module through the instance segmentation model retraining module.
In one embodiment, a second data set is selected from the image data set for manual annotation, giving the manual instance segmentation data set. The instance segmentation model comprises Mask R-CNN; specifically, a Mask R-CNN series model under the Baidu PaddlePaddle framework is used, namely mask_rcnn_dcn_r50_vd_fpn_2x. The embodiments of the present application do not modify the neural network structure, so it is not described in detail here; in alternative embodiments, other instance segmentation models may be selected. Taking solid waste as an example, a solid waste data set serves as the image data set, and 500 images are randomly selected from the large solid waste data set as the second data set. If there are m = 20 classes of objects to be identified and each class has at least k = 50 objects, the 500 images contain at least 1000 objects. These 500 images are labeled manually, yielding a 500-image manual instance segmentation data set N_manual. N_manual is input into the standard instance segmentation network Mask R-CNN to train the instance segmentation model, giving a well-trained instance segmentation model.
In one embodiment, determining the image classification data set based on the manual instance segmentation data set in the image classification model training module 2 specifically includes: cropping each object of each image in the manual instance segmentation data set N_manual and combining it with its manually annotated label to form the image classification data set. The image classification model includes ResNeSt. The structure of the neural network model is not modified in the embodiments of this application, so its details are not repeated here. In other alternative embodiments, other image classification models may be selected. From the 500-image manual instance segmentation data set N_manual, at least 1000 labeled single-object images are cropped to form the image classification data set, which is input into the high-accuracy ResNeSt classification network to train the image classification model.
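The cropping step above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the annotation structure (a boolean mask plus a label per object) and field names are assumptions:

```python
import numpy as np

def crop_objects(image, annotations):
    """Crop each annotated object out of an image to build
    single-object classification samples (label kept alongside)."""
    samples = []
    for ann in annotations:
        mask = ann["mask"]                 # boolean HxW array
        ys, xs = np.nonzero(mask)
        y0, y1 = ys.min(), ys.max() + 1    # tight bounding box of the mask
        x0, x1 = xs.min(), xs.max() + 1
        crop = image[y0:y1, x0:x1]
        samples.append((crop, ann["label"]))
    return samples

# toy 6x6 image with one labeled object covering rows 1-2, cols 2-4
img = np.arange(36).reshape(6, 6)
m = np.zeros((6, 6), dtype=bool)
m[1:3, 2:5] = True
out = crop_objects(img, [{"mask": m, "label": "brick"}])
```

In practice each `(crop, label)` pair would be written out as one training sample for the classification network.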
In one embodiment, the first prediction result includes a first class, a first confidence and a mask corresponding to each single object in each image. Between the first prediction module 3 and the second prediction module 4, the apparatus further calculates the average confidence of each image in the first data set from the first confidences of its single objects; if the average confidence is smaller than a second threshold, the first class of the image is corrected manually and the image is transferred from the first prediction result to the manual instance segmentation data set.
In one embodiment, 2000 unlabeled images are selected as the first data set, and the trained Mask_RCNN instance segmentation model predicts them to obtain a first prediction result R(t, η, mask), where t_i^j is the first class predicted for the i-th object in the j-th image, η_i^j is the first confidence of the i-th object in the j-th image, and mask_i^j is the contour mask of the i-th object in the j-th image. For each image P_j, the confidences η_i^j of its objects o_i^j are collected, and the average confidence Δη_j of the image is computed as

Δη_j = (1 / n_j) · Σ_{i=1}^{n_j} η_i^j,

where n_j is the number of objects detected in image P_j. The second threshold is set to 0.8: if an image has an average confidence Δη_j below 0.8, the image P_j is corrected manually, removed from the first prediction result R, and added to the manual instance segmentation data set N_manual.
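The confidence-based routing just described can be sketched as follows. It is a minimal illustration; the prediction structure (a dict of per-image object lists) and field names are assumptions, not taken from the patent:

```python
def route_by_confidence(predictions, threshold=0.8):
    """Split first-round predictions by per-image mean confidence.
    Images whose mean confidence falls below the threshold are routed
    to the manual-correction queue; the rest stay in the result R."""
    keep, to_manual = [], []
    for image_id, objects in predictions.items():
        confidences = [obj["confidence"] for obj in objects]
        mean_conf = sum(confidences) / len(confidences)
        (keep if mean_conf >= threshold else to_manual).append(image_id)
    return keep, to_manual

preds = {
    "img1": [{"confidence": 0.95}, {"confidence": 0.90}],  # mean 0.925
    "img2": [{"confidence": 0.60}, {"confidence": 0.70}],  # mean 0.65
}
keep, manual = route_by_confidence(preds)
```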
In one embodiment, the determining, based on the first prediction result, of the images containing a single object specifically comprises: obtaining the minimum bounding rectangle of each object of each image in the first prediction result from the object's mask, and cropping each image into images each containing a single object according to the minimum bounding rectangle. The second prediction result includes a second class and a second confidence. The first class and first confidence of a single object are compared with the second class and second confidence; if the first class differs from the second class, or the difference between the first confidence and the second confidence exceeds a first threshold, the first class and mask of the single object in each image are corrected manually, yielding the pseudo-label instance segmentation data set. Specifically, for each object o_i^j of each image P_j in the first prediction result R, the minimum bounding rectangle is obtained from its mask mask_i^j, and the object is cropped out as a single-object image o_i^j, which is input into the trained ResNeSt image classification model to obtain the second prediction result, comprising for each object o_i^j a second class c_i^j and a second confidence ε_i^j. The first threshold is set to 0.2: when the first class differs from the second class (t_i^j ≠ c_i^j), or the difference between the first and second confidences is greater than 0.2 (|η_i^j − ε_i^j| > 0.2), the first class and mask of the object o_i^j are corrected manually. After every object of an image P_j has been corrected, the corrected image P_j and its first classes are made into pseudo-label annotations and added to the pseudo-label instance segmentation data set N_pseudo. By training two models, an instance segmentation model and an image classification model, from a small manually labeled instance segmentation data set, the device provided by this application can semi-automatically label a large amount of unlabeled data. Moreover, because the instance-segmented mask and the object class recognition are relatively independent, both high mask precision and high class recognition precision can be ensured.
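The disagreement test applied to each object can be sketched as follows. This is an illustration of the comparison rule only; the dict fields (`cls`, `conf`) are assumed names, and the full pipeline would additionally crop the object by its mask's bounding rectangle before classification:

```python
def needs_correction(first, second, conf_gap=0.2):
    """Flag an object for manual correction when the segmentation
    model's class disagrees with the classifier's class, or when
    their confidences differ by more than conf_gap."""
    if first["cls"] != second["cls"]:
        return True
    return abs(first["conf"] - second["conf"]) > conf_gap

a = {"cls": "glass", "conf": 0.92}   # first prediction (Mask_RCNN)
b = {"cls": "glass", "conf": 0.88}   # second prediction agrees
c = {"cls": "wood", "conf": 0.55}    # second prediction disagrees
```

Objects for which `needs_correction` is false are accepted as pseudo labels without human review, which is where the labeling effort is saved.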
In one embodiment, the prediction precision includes AP50. A validation subset of 0.1 × (|N_manual| + |N_pseudo|) images is randomly drawn from the manual instance segmentation data set N_manual to serve as the mixed instance segmentation validation set; the remaining manual instance segmentation data set N_manual is mixed with the pseudo-label instance segmentation data set N_pseudo as the mixed instance segmentation training set and input into the Mask_RCNN instance segmentation network for training. When the retrained instance segmentation model reaches an AP50 greater than 0.9, the instance segmentation model obtained in the last round is output; otherwise the instance segmentation data set is used as the manual instance segmentation data set, and the modules from the image classification model training module through the instance segmentation model retraining module are executed again.
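The train/validation split just described can be sketched as follows, assuming the 500-image manual set and 2000-image pseudo-label set from the earlier embodiments (identifiers and counts are illustrative):

```python
import random

def split_for_retraining(manual_ids, pseudo_ids, seed=0):
    """Build the mixed split: the validation set is
    0.1 * (|manual| + |pseudo|) images drawn from the manual set only;
    the remaining manual images plus all pseudo-label images form the
    training set."""
    rng = random.Random(seed)
    n_val = int(0.1 * (len(manual_ids) + len(pseudo_ids)))
    val = rng.sample(manual_ids, n_val)
    val_set = set(val)
    train = [i for i in manual_ids if i not in val_set] + list(pseudo_ids)
    return train, val

manual = [f"m{i}" for i in range(500)]
pseudo = [f"p{i}" for i in range(2000)]
train, val = split_for_retraining(manual, pseudo)
```

Drawing the validation set exclusively from manually labeled images keeps the AP50 estimate free of pseudo-label noise, which is presumably why the patent validates only against N_manual.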
Referring now to fig. 7, a schematic diagram of a computer device 700 suitable for use in implementing an electronic device (e.g., the server or terminal device shown in fig. 1) according to an embodiment of the present application is shown. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, the computer apparatus 700 includes a Central Processing Unit (CPU) 701 and a Graphics Processing Unit (GPU) 702, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 703 or a program loaded from a storage section 709 into a Random Access Memory (RAM) 704. The RAM 704 also stores various programs and data necessary for the operation of the apparatus 700. The CPU 701, GPU 702, ROM 703, and RAM 704 are connected to each other via a bus 705. An input/output (I/O) interface 706 is also connected to the bus 705.
The following components are connected to the I/O interface 706: an input portion 707 including a keyboard, a mouse, and the like; an output section 708 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 709 including a hard disk and the like; and a communication section 710 including a network interface card such as a LAN card or a modem. The communication section 710 performs communication processing via a network such as the Internet. A drive 711 may also be connected to the I/O interface 706 as needed. A removable medium 712, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 711 as necessary, so that a computer program read out therefrom is installed into the storage section 709 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 710, and/or installed from the removable media 712. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU)701 and a Graphics Processing Unit (GPU) 702.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented by software or hardware. The modules described may also be provided in a processor.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an artificial example segmentation data set based on the image data set, and training an example segmentation model by using the artificial example segmentation data set; determining an image classification dataset based on the artificial instance segmentation dataset, and training an image classification model by using the image classification dataset; predicting a first data set in the image data set by using the trained example segmentation model to obtain a first prediction result; determining an image containing a single object in each image based on the first prediction result, inputting the image of the single object into a trained image classification model to obtain a second prediction result, comparing the first prediction result with the second prediction result, and manually correcting the first prediction result of the single object in each image according to the comparison result to obtain a pseudo label example segmentation data set; and mixing the manual instance segmentation data set and the pseudo label instance segmentation data set to serve as an instance segmentation data set, retraining the instance segmentation model by using the instance segmentation data set until the required prediction precision is achieved, outputting the final instance segmentation model and the final instance segmentation data set, and otherwise, repeating the steps.
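The iterate-until-precise control flow that the steps above describe can be sketched at a high level. This is a schematic loop only; `train_and_eval` is a hypothetical stand-in for one full round of steps S2–S5 (relabel, retrain, evaluate) and returns the AP50 of the retrained model:

```python
def semi_auto_label(train_and_eval, rounds=5, ap50_target=0.9):
    """Repeat the relabel-and-retrain round until the retrained
    instance segmentation model exceeds the target AP50, or the
    round budget is exhausted."""
    for r in range(1, rounds + 1):
        ap50 = train_and_eval(r)
        if ap50 > ap50_target:
            return r, ap50            # converged: output final model
    return rounds, ap50               # budget exhausted

# stub: precision improves each round as pseudo labels accumulate
history = [0.72, 0.85, 0.93]
rounds_used, final_ap = semi_auto_label(lambda r: history[r - 1])
```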
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A semi-automatic labeling method for an instance segmentation data set is characterized by comprising the following steps:
s1, acquiring an artificial example segmentation data set based on the image data set, and training an example segmentation model by using the artificial example segmentation data set;
s2, determining an image classification dataset based on the artificial instance segmentation dataset, and training an image classification model by using the image classification dataset;
s3, predicting a first data set in the image data set by using the trained example segmentation model to obtain a first prediction result;
s4, determining that each image contains an image of a single object based on the first prediction result, inputting the image of the single object into a trained image classification model to obtain a second prediction result, comparing the first prediction result with the second prediction result, and manually correcting the first prediction result of the single object in each image according to the comparison result to obtain a pseudo label example segmentation data set; and
s5, mixing the manual instance segmentation data set and the pseudo label instance segmentation data set to serve as an instance segmentation data set, retraining the instance segmentation model by using the instance segmentation data set until the required prediction precision is achieved, outputting the final instance segmentation model and the instance segmentation data set, and otherwise, repeating the steps S2-S5.
2. The method for semi-automatic annotation of an instance segmentation data set according to claim 1, wherein the first prediction result comprises a first class, a first confidence level and a mask corresponding to a single object in each image, and further comprising between steps S3 and S4: calculating an average confidence of each image according to the first confidence of a single object in each image in the first data set, if the average confidence is smaller than a second threshold, manually correcting the first kind of the image, and transferring the image from a first prediction result to the manual instance segmentation data set.
3. The method for semi-automatic annotation of an example segmented data set according to claim 1, wherein said determining in step S4 an image containing a single object based on said first prediction result specifically comprises: and obtaining a minimum bounding rectangle of each object of each image in the first prediction result according to the mask of each object, and cutting each image into the image containing the single object according to the minimum bounding rectangle.
4. The method for semi-automatic annotation of instance segmented data sets according to claim 1, wherein said determining in step S2 an image classification data set based on said manual instance segmented data set specifically comprises: and cutting each object of each image in the manual example segmentation data set and combining the manually marked label to form the image classification data set.
5. The method for semi-automatic annotation of instance segmented data sets according to claim 1, wherein said manual instance segmented data set is obtained by selecting a second data set from an image data set for manual annotation, said instance segmented data set being used as said manual instance segmented data set when repeating steps S2-S5.
6. The method for semi-automatic labeling of an instance segmented data set according to claim 2, wherein the second prediction result comprises a second category and a second confidence level, the comparing the first prediction result with the second prediction result in step S4 specifically comprises comparing the first category and the first confidence level with the second category and the second confidence level of a single object, and if the first category is different from the second category or the difference between the first confidence level and the second confidence level exceeds a first threshold, manually correcting the first category and the mask of the single object in each image.
7. A method for semi-automatic annotation of instance segmented data sets according to any of claims 1-6, wherein said instance segmentation model comprises Mask_RCNN and said image classification model comprises ResNeSt.
8. A semi-automatic annotation apparatus for instance segmented data sets, comprising:
an instance segmentation model training module configured to obtain an artificial instance segmentation dataset based on an image dataset, train an instance segmentation model using the artificial instance segmentation dataset;
an image classification model training module configured to determine an image classification dataset based on the artificial instance segmentation dataset, train an image classification model using the image classification dataset;
the first prediction module is configured to predict a first data set in the image data set by using a trained instance segmentation model to obtain a first prediction result;
the second prediction module is configured to determine an image containing a single object in each image based on the first prediction result, input the image of the single object into a trained image classification model to obtain a second prediction result, compare the first prediction result with the second prediction result, and manually correct the first prediction result of the single object in each image according to the comparison result to obtain a pseudo label example segmentation data set; and
and the example segmentation model retraining module is configured to mix the artificial example segmentation data set and the pseudo label example segmentation data set to serve as an example segmentation data set, retrain the example segmentation model by using the example segmentation data set until the required prediction precision is achieved, output the final example segmentation model and the example segmentation data set, and otherwise, repeatedly execute the image classification model training module to the example segmentation model retraining module.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110758660.6A 2021-07-05 2021-07-05 Semi-automatic labeling method, device and readable medium for instance segmentation data set Active CN113554068B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110758660.6A CN113554068B (en) 2021-07-05 2021-07-05 Semi-automatic labeling method, device and readable medium for instance segmentation data set


Publications (2)

Publication Number Publication Date
CN113554068A true CN113554068A (en) 2021-10-26
CN113554068B CN113554068B (en) 2023-10-31

Family

ID=78102692


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110648364A (en) * 2019-09-17 2020-01-03 华侨大学 Multi-dimensional space solid waste visual detection positioning and identification method and system
US20200202166A1 (en) * 2018-12-21 2020-06-25 Osaro Instance Segmentation by Instance Label Factorization
CN112163634A (en) * 2020-10-14 2021-01-01 平安科技(深圳)有限公司 Example segmentation model sample screening method and device, computer equipment and medium
CN112489050A (en) * 2020-12-13 2021-03-12 成都易书桥科技有限公司 Semi-supervised instance segmentation algorithm based on feature migration
CN113033573A (en) * 2021-03-16 2021-06-25 佛山市南海区广工大数控装备协同创新研究院 Method for improving detection performance of instance segmentation model based on data enhancement


Non-Patent Citations (1)

Title
Zhan Qiliang; Chen Shengyong; Hu Haigen; Li Xiaoxin; Zhou Qianwei: "An instance segmentation scheme combining multiple image segmentation algorithms", Journal of Chinese Computer Systems *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant