CN110058756B - Image sample labeling method and device - Google Patents


Info

Publication number
CN110058756B
CN110058756B (application CN201910319246.8A)
Authority
CN
China
Prior art keywords
frame
images
frame selection
image
position information
Prior art date
Legal status
Active
Application number
CN201910319246.8A
Other languages
Chinese (zh)
Other versions
CN110058756A (en)
Inventor
牟永奇
许欢庆
李洁
汤劲武
Current Assignee
Beijing Lenztech Co ltd
Original Assignee
Beijing Lenztech Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Lenztech Co ltd filed Critical Beijing Lenztech Co ltd
Priority to CN201910319246.8A priority Critical patent/CN110058756B/en
Publication of CN110058756A publication Critical patent/CN110058756A/en
Application granted granted Critical
Publication of CN110058756B publication Critical patent/CN110058756B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/16File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162Delete operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image sample labeling method and device. The position information of each annotation box is acquired first; for any preset class, the images that do not belong to the class are then deleted from the framed images selected by the annotation boxes, and the remaining framed images of that class, together with their corresponding position information, are used as the labeled data of the class. Acquiring annotation boxes is thereby decoupled from screening framed images. Compared with manually entering the class of each framed image right after drawing its annotation box, many framed images can be deleted in batch, and the step of acquiring box positions and the step of screening framed images can run in parallel at a given point in time, so efficiency is markedly improved over the traditional labeling approach.

Description

Image sample labeling method and device
Technical Field
The application relates to the field of artificial intelligence, in particular to an image sample labeling method and device.
Background
Image target detection based on deep learning is maturing rapidly and has been applied in fields such as smart retail, intelligent surveillance, autonomous driving, and smart healthcare, where it has shown strong practical value. A deep-learning target detection model must be trained on a large number of labeled image samples, so a great deal of manual effort has to be invested in labeling them.
Taking FIG. 1 as an example, in a conventional labeling tool the operator manually frames a target on the image sample with an annotation box and the tool records the box's position information; once the target is framed, a list of category options pops up and the operator manually selects the category the target belongs to, completing the annotation of that target (the characters in FIG. 1 are tool buttons of existing labeling software and are not described further here). This labeling process has to be repeated for every target in FIG. 1.
How to improve labeling efficiency is therefore an urgent problem.
Disclosure of Invention
The application provides an image sample labeling method and device, aiming to improve the efficiency of image sample labeling.
In order to achieve the above object, the present application provides the following technical solutions:
A method for labeling an image sample, comprising:
acquiring position information of annotation boxes, each annotation box framing a target on the image sample to obtain a framed image;
for any preset class, deleting the images that do not belong to the class from the framed images to obtain the framed images of the class;
and using the framed images of the class and their corresponding position information as the labeled data of the class, wherein the position information corresponding to any framed image is the position information of the annotation box from which that framed image was obtained.
Optionally, acquiring the position information of the annotation boxes comprises:
obtaining position information of annotation boxes output by a pre-trained framing model, the framing model being used to frame targets on the image sample;
or obtaining the position information of the annotation boxes based on manual adjustment of the position information of reference annotation boxes output by the framing model.
Optionally, the training process of the framing model comprises:
obtaining manually framed annotation boxes based on manual framing operations on a first batch of image samples;
training a preset framing model with the manually framed annotation boxes;
feeding a new image sample into the trained framing model to obtain the annotation boxes that the model outputs for the new sample;
when an annotation box output by the framing model is manually modified, training the framing model with the manually modified box;
and ending the training process when the number of manually modified boxes among the boxes output by the framing model does not exceed a preset threshold.
Optionally, acquiring the position information of the annotation boxes comprises:
acquiring the position information of the annotation boxes with a first process;
for any preset class, deleting the images that do not belong to the class from the framed images to obtain the framed images of the class comprises:
deleting the images that do not belong to the class from the framed images with a second process;
and the method further comprises:
once at least one framed image has been obtained with the first process, obtaining at least one class's framed images with the second process while the first process acquires the position information of new annotation boxes.
Optionally, deleting the images that do not belong to the class from the framed images to obtain the framed images of the class comprises:
creating a folder for the class;
placing the framed images into the class's folder;
and, based on manual deletion operations on the images in the class's folder, deleting the images that do not belong to the class to obtain the framed images of the class.
An apparatus for annotating an image sample, comprising:
an annotation box acquisition module, configured to acquire position information of annotation boxes, each annotation box framing a target on the image sample to obtain a framed image;
a deletion module, configured to, for any preset class, delete the images that do not belong to the class from the framed images to obtain the framed images of the class;
and a labeled data acquisition module, configured to, for any preset class, use the framed images of the class and their corresponding position information as the labeled data of the class, wherein the position information corresponding to any framed image is the position information of the annotation box from which that framed image was obtained.
Optionally, the annotation box acquisition module is specifically configured to obtain position information of annotation boxes output by a pre-trained framing model, the framing model being used to frame targets on the image sample; or to obtain the position information of the annotation boxes based on manual adjustment of the position information of reference annotation boxes output by the framing model.
Optionally, the apparatus further comprises:
a model training module, configured to: obtain manually framed annotation boxes based on manual framing operations on a first batch of image samples; train a preset framing model with the manually framed boxes; feed a new image sample into the trained framing model to obtain the annotation boxes the model outputs for it; when an output box is manually modified, train the framing model with the modified box; and end the training process when the number of manually modified boxes among the model's output boxes does not exceed a preset threshold.
Optionally, the annotation box acquisition module is specifically configured to acquire the position information of the annotation boxes with a first process;
the deletion module is specifically configured to delete, with a second process, the images that do not belong to the class from the framed images to obtain the framed images of the class;
and the apparatus further comprises:
a scheduling module, configured to, once at least one framed image has been obtained with the first process, obtain at least one class's framed images with the second process while the first process acquires the position information of new annotation boxes.
Optionally, the deletion module is specifically configured to create a folder for each class, place the framed images into the class's folder, and, based on manual deletion operations on the images in the class's folder, delete the images that do not belong to the class to obtain the framed images of the class.
With the image sample labeling method and device of the application, the position information of the annotation boxes is acquired; for any preset class, the images that do not belong to the class are deleted from the framed images selected by the annotation boxes; and the remaining framed images of the class, together with their corresponding position information, are used as the labeled data of the class. Acquiring annotation boxes is thus decoupled from deleting framed images: compared with manually entering the class of each framed image right after drawing its box, many framed images can be deleted in batch, and the step of acquiring box positions and the step of screening framed images can run in parallel at a given point in time, so efficiency is markedly improved over the traditional labeling approach.
Drawings
To describe the embodiments of the present application and the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below are obviously only some embodiments of the application; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram illustrating an example of obtaining annotations of an image sample in the prior art;
FIG. 2 is a flowchart of an image sample annotation method disclosed in an embodiment of the present application;
FIG. 3 is a diagram illustrating an example of an annotation method for an image sample disclosed in an embodiment of the present application;
FIG. 4 is a flowchart of another method for annotating an image sample according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an image sample annotation device disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the application; all other embodiments that those skilled in the art can derive from them without creative effort fall within the protection scope of the application.
Fig. 2 shows an image sample labeling method disclosed in an embodiment of the present application, comprising the following steps:
S201: display the image sample.
S202: obtain framed images based on manual framing of targets on the image sample.
Taking FIG. 1 as an example, the user can frame targets on the image sample with annotation boxes using a mouse or another tool; typically the area of each annotation box contains one target. Each annotation box yields one framed image, i.e., the region the box frames (box included).
S203: and deleting the images which do not belong to any preset classification in the frame selection images to obtain the frame selection images of the classification.
Specifically, folders may be created, any folder is named as a category, and the frame selection images are placed in the folders, where it should be noted that all the obtained frame selection images are placed in any folder.
The user can open any folder and delete the frame image which does not belong to the category pointed by the folder name (namely the category corresponding to the folder). Namely, based on manual deletion operation, deleting images which do not belong to the classification corresponding to the folder in the frame selection images of any folder to obtain the frame selection images of the classification.
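The folder-based screening in S203 can be sketched in a few lines. This is a hypothetical illustration, not the patent's implementation; all file and folder names are invented for the example:

```python
# Sketch of S203: one folder per preset class, every framed image placed in
# every folder, and the annotator's manual deletion removing the images
# that do not belong to that folder's class. Names are illustrative.
import os
import tempfile

classes = ["bottle", "bag"]
crops = ["crop_0.png", "crop_1.png"]  # framed images obtained in S202

root = tempfile.mkdtemp()
for cls in classes:
    os.makedirs(os.path.join(root, cls))
    for name in crops:
        # empty placeholder file standing in for the cropped image
        open(os.path.join(root, cls, name), "w").close()

# The annotator opens the "bottle" folder and deletes crop_1 (it is a bag).
os.remove(os.path.join(root, "bottle", "crop_1.png"))

bottle_crops = sorted(os.listdir(os.path.join(root, "bottle")))
# bottle_crops now holds only the framed images of the "bottle" class
```

Deletion in a file manager is all the per-class labeling interface the method needs, which is what keeps this step simple for annotators.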
S204: and regarding any preset classification, taking the frame selection image of the classification and the corresponding position information as the labeling data of the classification.
The position information corresponding to any one of the frame selection images is as follows: and obtaining the position information of the marking frame of the frame selection image. Generally, the position information of a label box includes the coordinates of the upper left corner and the lower right corner of the label box.
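As a hedged illustration of how a framed image relates to its annotation box's position information (upper-left and lower-right corners), the following sketch crops the boxed region from a toy image; `crop_box` and the coordinate layout are assumptions, not code from the patent:

```python
# Illustrative sketch: the position information of an annotation box is
# stored as (x1, y1, x2, y2), the upper-left and lower-right corners, and
# the framed image is the region those corners enclose.

def crop_box(image, box):
    """Crop the region framed by `box` from an image given as a list of rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

# Toy 4x4 "image" whose pixel value encodes its position.
img = [[r * 4 + c for c in range(4)] for r in range(4)]

positions = {"crop_0.png": (1, 1, 3, 3)}  # framed-image name -> box corners
crop = crop_box(img, positions["crop_0.png"])
# crop is the 2x2 framed image [[5, 6], [9, 10]]
```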
FIG. 3 is an example of obtaining annotation data using the process shown in FIG. 2:
An image sample is displayed; the targets in it include bottles and bags. The user frames annotation boxes on the image sample with existing tools (shown in the left column of FIG. 3 and not described here), each annotation box framing one target.
The framed images are placed into folders created in advance; in FIG. 3, one folder is named "bottle" and the other "bag", each name denoting the folder's class. Note that framed images can be placed into the folders as soon as any one (or a part) of them is obtained, or only after all framed images have been obtained.
The user deletes the framed images that do not belong to bottles (i.e., the bags) from the "bottle" folder, obtaining the framed images of bottles. After the deletion is finished, a correspondence is established between each bottle framed image and the position information of the annotation box from which it was obtained, and the pairs are used as the labeled data of the bottle class.
Likewise, the user deletes the framed images that do not belong to bags (i.e., the bottles) from the "bag" folder, obtaining the framed images of bags. After the deletion is finished, a correspondence is established between each bag framed image and the position information of its annotation box, and the pairs are used as the labeled data of the bag class.
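The pairing of surviving framed images with their recorded box positions in the bottle/bag example can be sketched as follows, a minimal illustration with invented names and coordinates:

```python
# After manual deletion, each class folder holds only its own framed images;
# joining them with the recorded annotation-box coordinates yields the
# labeled data of S204. All names and numbers are illustrative.

positions = {                       # framed-image name -> box coordinates
    "crop_0.png": (10, 20, 60, 90),
    "crop_1.png": (100, 40, 150, 110),
}
surviving = {                       # folder contents after deletion
    "bottle": ["crop_0.png"],
    "bag": ["crop_1.png"],
}

labeled = {
    cls: [(name, positions[name]) for name in names]
    for cls, names in surviving.items()
}
```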
It can be seen that in the image sample labeling method of this embodiment, targets are first framed and the framed images are then screened uniformly; that is, framing and class labeling are decoupled, so they can be performed by different people and can run in parallel at a given point in time.
Screening the framed images by deletion is also simple and easy to implement.
FIG. 2 obtains the annotation boxes by manual framing. Besides manual framing, annotation boxes can also be obtained in other ways. FIG. 4 shows another image sample labeling method disclosed in an embodiment of the application; its main difference from FIG. 2 is that a framing model is trained with manually set annotation boxes and the trained model then produces annotation boxes automatically. FIG. 4 comprises the following steps:
s401: and obtaining the marking frame and the framing image based on manual framing operation on the first batch of image samples.
The specific implementation manner of obtaining the labeling frame is as follows: the foreground displays the marking frame and the background records the position information of the marking frame.
S402: and obtaining the labeling data of each classification in the first image samples according to the steps of S203-S204 by using the frame selected images in the first image samples.
S403: and training a frame selection model by using the labeled frames of the first batch of manually selected image samples.
S404: and inputting the new image sample into the trained frame selection model to obtain the labeling frame of the new image sample.
After obtaining the labeling frame output by the model, manually determining whether the labeling frame meets the requirement, if so, triggering to execute S406, otherwise, manually modifying the labeling frame, for example: and if the marking frame marked on the new image sample by the trained frame selection model does not completely frame the target, manually stretching the marking frame to enable the marking frame to completely frame the target. After the annotation box is manually modified, S405 is performed.
Further, an interactive interface may be displayed to prompt the user whether modification is required, and receive an instruction that is not required to be modified as the trigger instruction of S406, or receive the position information of the label box modified by the user.
S405: and taking the manually modified annotation frame as an annotation frame of a new image sample.
S406: and obtaining the labeling data of each classification in the new image sample by using the frame selection image corresponding to the labeling frame of the new image sample according to the steps of S203-S204.
The frame selection image corresponding to the labeling frame is an area (including the labeling frame) framed by the labeling frame in the image sample.
It should be noted that, in the case of executing S405, the new annotation frame of the image sample is the manually modified annotation frame, that is, the annotation frame output by the model is used as the reference annotation frame, and the new annotation frame of the image sample is obtained after the reference standard frame is manually modified. If S405 is not executed, the annotation box of the new image sample is the annotation box output by the framing model with the new image sample as input.
S407: and taking the manually modified marking box as incremental training data to train a box selection model.
S408: and judging whether the number of the manually modified marking frames in the marking frames output by the model is not more than a preset value, if so, executing S409, otherwise, returning to execute S404 and subsequent processes when a new image sample needs to be marked, namely, automatically outputting the model and selecting the marking frames in a manner of combining manual assistance, and continuing to train the frame selection model.
S409: and finishing the training process of the frame selection model.
From then on, annotation boxes on subsequent new image samples can be produced automatically by the framing model, without manual work.
Note that the size of the first batch of image samples can be preset; since its annotation boxes are framed manually, the first batch usually contains multiple image samples.
A new image sample is one that has not yet been labeled and is to be labeled, as opposed to the samples already labeled.
As the flow in FIG. 4 shows, manually framed annotation boxes are used to train the framing model, the training iterates through rounds of manual correction, and once training is complete the framing model outputs annotation boxes automatically, which improves labeling efficiency and thus the efficiency of acquiring labeled data.
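The S401-S409 loop can be summarized in schematic form. This is a hedged sketch: `train`, `predict`, and `manually_correct` stand in for the real framing model, its inference step, and the annotator's corrections, none of which the patent specifies as code:

```python
# Schematic of FIG. 4: train on manual boxes, let the model propose boxes on
# new samples, retrain on the manually corrected boxes, and stop once the
# number of corrections no longer exceeds the preset threshold (S408/S409).

def train_until_converged(first_batch_boxes, new_samples,
                          train, predict, manually_correct, threshold):
    model = train(None, first_batch_boxes)       # S403: initial training
    for sample in new_samples:
        proposed = predict(model, sample)        # S404: model outputs boxes
        corrected = manually_correct(proposed)   # boxes the annotator fixed
        if len(corrected) <= threshold:          # S408: few enough fixes
            return model                         # S409: training ends
        model = train(model, corrected)          # S407: incremental training
    return model
```

With stub functions in which each training round leaves fewer boxes to correct, the loop returns as soon as the annotator's corrections drop to the threshold.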
Further, based on the decoupling principle of the method (acquiring annotation boxes separately from screening the framed images of each class), a first process can be used to acquire annotation boxes while a second process deletes the images that do not belong to a class, obtaining the framed images of the class. In this case, the second process obtains at least one class's framed images while the first process obtains at least one more framed image, so the acquisition of annotation boxes and the screening of each class's framed images run in parallel at a given point in time, further improving the efficiency of acquiring labeled data.
Specifically, for the flow shown in FIG. 4, the first process can execute S401 or S404, the second process can execute S402 or S406, and a third process can execute the model-training steps S403, S405, S407, and S408. The three processes are linked yet independent of one another and can run in parallel at a given point in time.
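The first/second process decoupling can be sketched with a producer-consumer queue. Threads stand in for the patent's processes here, and all names are illustrative; this is a sketch under stated assumptions, not the patent's scheduler:

```python
# Sketch of the process decoupling: the first process emits a framed image
# as soon as its annotation box is recorded, while the second process
# screens framed images concurrently. A sentinel (None) marks the end.
import queue
import threading

framed_q = queue.Queue()

def first_process(samples):
    for s in samples:                       # records a box, emits the crop
        framed_q.put(("crop_%d" % s, s))
    framed_q.put(None)                      # no more framed images

def second_process(belongs_to_class, kept):
    while True:
        item = framed_q.get()
        if item is None:
            break
        name, s = item
        if belongs_to_class(s):             # annotator keeps or deletes
            kept.append(name)

kept = []
t1 = threading.Thread(target=first_process, args=([0, 1, 2],))
t2 = threading.Thread(target=second_process,
                      args=(lambda s: s % 2 == 0, kept))
t1.start(); t2.start()
t1.join(); t2.join()
# kept == ["crop_0", "crop_2"]
```

The queue lets screening start as soon as the first framed image exists, which is exactly the parallelism the scheduling module provides.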
Fig. 5 shows an image sample labeling apparatus disclosed in an embodiment of the present application, comprising an annotation box acquisition module, a deletion module, and a labeled data acquisition module, and optionally a model training module and a scheduling module.
The annotation box acquisition module acquires position information of annotation boxes, each annotation box framing a target on the image sample to obtain a framed image. The deletion module deletes, for any preset class, the images that do not belong to the class from the framed images, obtaining the framed images of the class. The labeled data acquisition module, for any preset class, uses the framed images of the class and their corresponding position information as the labeled data of the class, where the position information corresponding to any framed image is the position information of the annotation box from which that framed image was obtained.
Further, the annotation box acquisition module can obtain the position information either from a pre-trained framing model, which frames targets on the image sample, or based on manual adjustment of the position information of reference annotation boxes output by the framing model.
The deletion module deletes out-of-class images by creating a folder for each class, placing the framed images into the class's folder, and, based on manual deletion operations on the images in the folder, removing the images that do not belong to the class, obtaining the framed images of the class.
The model training module obtains manually framed annotation boxes based on manual framing operations on a first batch of image samples; trains a preset framing model with them; feeds a new image sample into the trained model to obtain its annotation boxes; trains the framing model with any box that is manually modified; and ends the training process when the number of manually modified boxes among the model's output boxes does not exceed a preset threshold.
Further, the annotation box acquisition module can acquire the position information of the annotation boxes with a first process, and the deletion module can delete out-of-class images with a second process to obtain each class's framed images. In this case, the scheduling module is configured so that, once at least one framed image has been obtained with the first process, the second process obtains at least one class's framed images while the first process acquires the position information of new annotation boxes, so that obtaining framed images and screening them proceed in parallel within a given time range.
The apparatus shown in fig. 5 is capable of performing a sample labeling process in stages, and the two stages can be performed in parallel at a certain time. The working link of the split category labeling is simplified into the process of screening the target small graphs one by one, and the labeling difficulty is reduced. The two stages can realize flexible configuration on the skill requirements and the number of the people to be labeled, reduce the labeling cost and improve the labeling efficiency.
In addition, a general target detection model is introduced into the labeling process. Through iterations of manual data labeling and model training, the model automatically generates preselected target coordinate frames, which assists manual labeling and reduces the difficulty of the work. Once the detection precision of the model reaches a certain level, it can fully replace manual work in labeling the remaining data.
In summary, the apparatus shown in fig. 5 labels image samples more efficiently.
The functions described in the methods of the embodiments of the present application, if implemented as software functional units and sold or used as independent products, may be stored in a computing-device-readable storage medium. Based on this understanding, the part of the embodiments of the present application that contributes over the prior art, or a part of the technical solution, may be embodied as a software product stored in a storage medium and including several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the methods described in the embodiments. The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The embodiments are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts may be cross-referenced among the embodiments.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An image sample labeling method, characterized by comprising:
acquiring position information of an annotation frame, wherein the annotation frame is used for framing a target on the image sample to obtain a framed image; and, for any preset classification, deleting, based on a manual operation, images that do not belong to the classification from the framed images to obtain the framed images of that classification, thereby realizing batch deletion of a plurality of framed images;
for any preset classification, taking, based on a manual operation, the classified framed images and the corresponding position information as the labeled data of that classification, wherein the position information corresponding to any framed image is the position information of the annotation frame from which that framed image was obtained;
wherein the acquisition of annotation frames and the deletion of framed images are decoupled, so that frame selection and category labeling can be executed in parallel, improving the efficiency of manually labeling image samples.
2. The method of claim 1, wherein acquiring the position information of the annotation frame comprises:
acquiring position information of an annotation frame output by a pre-trained frame selection model, wherein the frame selection model is used for framing a target on the image sample;
or acquiring the position information of the annotation frame based on a manual adjustment operation on the position information of a reference annotation frame output by the frame selection model.
3. The method of claim 2, wherein the training process of the frame selection model comprises:
obtaining manually framed annotation frames based on a manual frame selection operation on a first batch of image samples;
training a preset frame selection model with the manually framed annotation frames;
taking a new image sample as input to the trained frame selection model to obtain the annotation frame of the new image sample output by the frame selection model;
where an annotation frame output by the frame selection model is manually modified, training the frame selection model with the manually modified annotation frame;
and ending the training process of the frame selection model when the number of manually modified annotation frames among the annotation frames output by the frame selection model does not exceed a preset threshold.
4. The method of claim 1, wherein acquiring the position information of the annotation frame comprises:
acquiring the position information of the annotation frame using a first process;
wherein, for any preset classification, deleting, based on a manual operation, images that do not belong to the classification from the framed images to obtain the framed images of that classification comprises:
deleting, using a second process, images that do not belong to the classification from the framed images to obtain the framed images of that classification;
the method further comprising:
where at least one framed image has been obtained using the first process, obtaining at least one classified framed image using the second process while obtaining the position information of a new annotation frame using the first process.
5. The method according to claim 1 or 4, wherein deleting images that do not belong to the classification from the framed images to obtain the framed images of that classification comprises:
creating a folder for the classification;
placing the framed images into that folder;
and deleting, based on a manual deletion operation on the images in the folder, images that do not belong to the classification from the framed images, to obtain the framed images of that classification.
6. An apparatus for annotating an image sample, comprising:
an annotation frame acquisition module, configured to acquire position information of an annotation frame, wherein the annotation frame is used for framing a target on the image sample to obtain a framed image;
a deleting module, configured to, for any preset classification, delete, based on a manual operation, images that do not belong to the classification from the framed images to obtain the framed images of that classification, thereby realizing batch deletion of a plurality of framed images;
and a labeled data acquisition module, configured to, for any preset classification, take, based on a manual operation, the classified framed images and the corresponding position information as the labeled data of that classification, wherein the position information corresponding to any framed image is the position information of the annotation frame from which that framed image was obtained;
wherein the acquisition of annotation frames and the deletion of framed images are decoupled, so that frame selection and category labeling can be executed in parallel, improving the efficiency of manually labeling image samples.
7. The apparatus of claim 6, wherein the annotation frame acquisition module being configured to acquire the position information of the annotation frame comprises:
the annotation frame acquisition module being specifically configured to acquire position information of an annotation frame output by a pre-trained frame selection model, the frame selection model being used for framing a target on the image sample; or to acquire the position information of the annotation frame based on a manual adjustment operation on the position information of a reference annotation frame output by the frame selection model.
8. The apparatus of claim 7, further comprising:
a model training module, configured to obtain manually framed annotation frames based on a manual frame selection operation on a first batch of image samples; train a preset frame selection model with the manually framed annotation frames; take a new image sample as input to the trained frame selection model to obtain the annotation frame of the new image sample output by the frame selection model; where an annotation frame output by the frame selection model is manually modified, train the frame selection model with the manually modified annotation frame; and end the training process of the frame selection model when the number of manually modified annotation frames among the annotation frames output by the frame selection model does not exceed a preset threshold.
9. The apparatus of claim 6, wherein the annotation frame acquisition module being configured to acquire the position information of the annotation frame comprises:
the annotation frame acquisition module being specifically configured to acquire the position information of the annotation frame using a first process;
wherein the deleting module being configured to, for any preset classification, delete, based on a manual operation, images that do not belong to the classification from the framed images to obtain the framed images of that classification comprises:
the deleting module being specifically configured to delete, using a second process, images that do not belong to the classification from the framed images to obtain the framed images of that classification;
the apparatus further comprising:
a scheduling module, configured to, where at least one framed image has been obtained using the first process, obtain at least one classified framed image using the second process while obtaining the position information of a new annotation frame using the first process.
10. The apparatus according to any one of claims 6 to 9, wherein the deleting module being configured to, for any preset classification, delete images that do not belong to the classification from the framed images to obtain the framed images of that classification comprises:
the deleting module being specifically configured to create a folder for the classification; place the framed images into that folder; and delete, based on a manual deletion operation on the images in the folder, images that do not belong to the classification from the framed images, to obtain the framed images of that classification.
CN201910319246.8A 2019-04-19 2019-04-19 Image sample labeling method and device Active CN110058756B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910319246.8A CN110058756B (en) 2019-04-19 2019-04-19 Image sample labeling method and device


Publications (2)

Publication Number Publication Date
CN110058756A CN110058756A (en) 2019-07-26
CN110058756B true CN110058756B (en) 2021-03-02

Family

ID=67319787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910319246.8A Active CN110058756B (en) 2019-04-19 2019-04-19 Image sample labeling method and device

Country Status (1)

Country Link
CN (1) CN110058756B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399514B (en) * 2019-07-29 2022-03-29 中国工商银行股份有限公司 Method and device for classifying and labeling images
CN111783635A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Image annotation method, device, equipment and storage medium
CN111914822B (en) * 2020-07-23 2023-11-17 腾讯科技(深圳)有限公司 Text image labeling method, device, computer readable storage medium and equipment
CN113160209A (en) * 2021-05-10 2021-07-23 上海市建筑科学研究院有限公司 Target marking method and target identification method for building facade damage detection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008151326A2 (en) * 2007-06-08 2008-12-11 Microsoft Corporation Face annotation framework with partial clustering and interactive labeling
CN105678322A (en) * 2015-12-31 2016-06-15 百度在线网络技术(北京)有限公司 Sample labeling method and apparatus
CN108921204A (en) * 2018-06-14 2018-11-30 平安科技(深圳)有限公司 Electronic device, picture sample set creation method and computer readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537240A (en) * 2017-03-01 2018-09-14 华东师范大学 Commodity image semanteme marking method based on domain body
CN108920711B (en) * 2018-07-25 2021-09-24 中国人民解放军国防科技大学 Deep learning label data generation method oriented to unmanned aerial vehicle take-off and landing guide
CN109446369B (en) * 2018-09-28 2021-10-08 武汉中海庭数据技术有限公司 Interaction method and system for semi-automatic image annotation


Also Published As

Publication number Publication date
CN110058756A (en) 2019-07-26

Similar Documents

Publication Publication Date Title
CN110058756B (en) Image sample labeling method and device
EP3001333A1 (en) Object search method and apparatus
CN107153838A (en) A kind of photo automatic grading method and device
CN111144215B (en) Image processing method, device, electronic equipment and storage medium
CN109670065A (en) Question and answer processing method, device, equipment and storage medium based on image recognition
CN109829092B (en) Method for directionally monitoring webpage
CN110097616B (en) Combined drawing method and device, terminal equipment and readable storage medium
US20220147769A1 (en) Systems and Methods for Artificial Facial Image Generation Conditioned On Demographic Information
CN111709941B (en) Lightweight automatic deep learning system and method for pathological image
CN108665244B (en) 61850 model-based constant value list automatic generation method and storage medium
CN114049631A (en) Data labeling method and device, computer equipment and storage medium
CN109120994A (en) A kind of automatic editing method, apparatus of video file and computer-readable medium
CN110851630A (en) Management system and method for deep learning labeled samples
CN110134812A (en) A kind of face searching method and its device
CN111783881A (en) Scene adaptation learning method and system based on pre-training model
CN111078724A (en) Method, device and equipment for searching test questions in learning system and storage medium
CN111241891A (en) Face image cutting method and device and computer readable storage medium
CN112286879B (en) Metadata-based data asset construction method and device
CN114120057A (en) Confusion matrix generation method based on Paddledetection
US9280253B2 (en) Application coordinating system, application coordinating method, and application coordinating program
CN110889012A (en) Method for generating empty mirror label system based on frame extraction picture
CN109118070A (en) test method and device
CN111695556A (en) Processing method, system, equipment and storage medium for matching webpage
CN114663414B (en) Rock and ore recognition and extraction system and method based on UNET convolutional neural network
CN115757823B (en) Data processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant