WO2022160736A1

WO2022160736A1 - Image annotation method and apparatus, electronic device, storage medium and program

Info

Publication number: WO2022160736A1
Application number: PCT/CN2021/118580
Authority: WO
Inventors: 李嘉辉
Original assignee: 上海商汤智能科技有限公司
Priority date: 2021-01-28
Filing date: 2021-09-15
Publication date: 2022-08-04
Also published as: CN112925938B; CN112925938A

Abstract

An image annotation method and apparatus, an electronic device, a storage medium, and a program. The image annotation method comprises: obtaining a target image to be annotated (S101); partitioning the target image to obtain a plurality of image blocks (S102); performing encoding processing on each of the plurality of image blocks to obtain encoding information corresponding to each image block (S103); performing category annotation on some of the plurality of encoded image blocks to obtain annotated image blocks (S104); determining category information corresponding to each image block on the basis of the encoding information corresponding to each image block and the encoding information of the annotated image block (S105); and annotating the target image on the basis of the annotation mode corresponding to the target image and the category information corresponding to each image block (S106). According to the image annotation method, a lesion area in an organ image of a patient can be quickly annotated in the medical field, such that a user is effectively assisted in clinical diagnosis.

Description

Image annotation method, device, electronic device, storage medium and program

CROSS-REFERENCE TO RELATED APPLICATIONS

The present disclosure is based on, and claims priority to, Chinese patent application No. 202110116990.5, filed on January 28, 2021 and titled "An Image Annotation Method, Device, Electronic Device and Storage Medium". The entire contents of that Chinese patent application are hereby incorporated into the present disclosure by reference.

Technical field

The present disclosure relates to, but is not limited to, the technical field of image recognition, and in particular to an image labeling method, apparatus, electronic device, storage medium and program.

Background

With the continued deepening of research on artificial neural networks, their fields of application have expanded greatly. For example, in the field of image recognition, a neural network can be trained for target detection, so that it marks the target object in an image with a detection frame; as another example, a neural network can be trained for semantic segmentation, so that it determines the contour of the target object in an image.

When training a neural network for image recognition, a large number of training samples need to be obtained, and each training sample needs to carry accurate annotation information to meet the training requirements of the neural network. The annotation information includes detection frame annotation data of the target object, contour annotation data of the target object, and so on. Producing this annotation information usually requires a large amount of manual annotation work.

SUMMARY OF THE INVENTION

Embodiments of the present disclosure provide an image labeling method, apparatus, electronic device, storage medium, and program.

The technical solutions of the embodiments of the present disclosure are implemented as follows:

An embodiment of the present disclosure provides an image labeling method, the method comprising:

obtaining a target image to be annotated;

partitioning the target image into blocks to obtain a plurality of image blocks;

performing encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block;

performing category labeling on some image blocks in the plurality of image blocks after encoding processing, to obtain labeled image blocks;

determining the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks; and

labeling the target image based on the labeling manner corresponding to the target image and the category information corresponding to each image block.

In some embodiments of the present disclosure, performing encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block includes:

inputting each image block into a neural network, and extracting the image features contained in each image block through the neural network; and

determining, through the neural network, the encoding information corresponding to each image block based on the image features of each image block.

In some embodiments of the present disclosure, the labeling of the target image based on the labeling manner corresponding to the target image and the category information corresponding to each image block includes:

In the case where the labeling manner is rectangular frame labeling, obtaining image blocks whose corresponding category information is the same; and

labeling the target image according to the smallest enclosing rectangular frame of the image blocks whose corresponding category information is the same.

In the case where the labeling manner is contour labeling, determining the category information corresponding to each pixel point in the target image based on the category information corresponding to each image block and the pixel points included in each image block;

obtaining the attribute feature of each pixel point in the target image, and determining a target pixel point based on the category information and the attribute feature corresponding to each pixel point, wherein the target pixel point is a pixel point whose category information is to be adjusted;

in the case where the target pixel point exists, adjusting the category information corresponding to the target pixel point based on the category information of pixel points having the same attribute feature as the target pixel point, to obtain target category information corresponding to each pixel point in the target image; and

based on the target category information corresponding to each pixel point in the target image, labeling the contour of a target object formed by pixel points belonging to the same target category.

In some embodiments of the present disclosure, determining the target pixel based on the category information and attribute features corresponding to each pixel includes:

Selecting at least one of the respective pixel points as the first pixel point;

Based on the attribute feature of the first pixel point, the difference value of the attribute feature of the first pixel point and the second pixel point is determined; wherein, the second pixel point is adjacent to the first pixel point;

In the case that the difference value is greater than the preset threshold, at least one third pixel is selected; wherein, the third pixel is a pixel with the same attribute feature as the first pixel;

When the category information corresponding to the first pixel point is inconsistent with the category information corresponding to the third pixel point, the first pixel point is used as the target pixel point.

In the case that the labeling method is to perform classification labeling, based on the category information corresponding to each image block, determine the number of image blocks corresponding to each category information;

The category information of the target image is determined based on the number of image blocks corresponding to each category of information.

In some embodiments of the present disclosure, the image labeling method further includes:

After the target image is marked, in response to the marking category update instruction for the target image block, update the category information of the marked image block;

Return to perform the step of determining the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the marked image block.

In some embodiments of the present disclosure, the neural network is obtained by:

Acquire multiple sample images, and divide each sample image into blocks to obtain multiple sample image blocks and position information of each sample image block;

Based on the position information of each sample image block, a first sample image block pair and a second sample image block pair are determined; wherein the distance between the two sample image blocks in the first sample image block pair is less than or equal to a set threshold, and the distance between the two sample image blocks in the second sample image block pair is greater than the set threshold;

respectively inputting the first sample image block pair and the second sample image block pair into the neural network to be trained, to obtain predictive coding information corresponding to each sample image block;

Based on the predictive coding information corresponding to each sample image block, the network parameters of the neural network to be trained are adjusted to obtain a trained neural network for coding.

The embodiment of the present disclosure also provides an image labeling device, including:

The image acquisition module is configured to: acquire the target image to be marked;

The image segmentation module is configured to: block the target image to obtain a plurality of image blocks;

The image encoding module is configured to: perform encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block;

The first labeling module is configured to: perform category labeling on some image blocks in the plurality of image blocks after encoding processing, to obtain labelled image blocks;

The category determination module is configured to: determine the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the marked image block;

The second labeling module is configured to: label the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block.

Embodiments of the present disclosure further provide an electronic device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate with each other through the bus, and when the machine-readable instructions are executed by the processor, the image labeling method described in any of the foregoing embodiments is performed.

An embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the image labeling method described in any of the foregoing embodiments is performed.

The embodiments of the present disclosure also provide a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor of the electronic device executes the image labeling method described in any of the foregoing embodiments.

With the image labeling method provided by the embodiments of the present disclosure, each target image to be labeled can be partitioned into a plurality of image blocks, and the encoding information corresponding to each image block can be determined. Based on the encoding information corresponding to each image block and the encoding information of a small number of labeled image blocks whose categories have been labeled in advance, the category information corresponding to each of the plurality of image blocks can then be determined, and the target image is labeled based on the labeling manner corresponding to the target image and the category information corresponding to each image block. That is to say, the labeling of the entire target image can be completed from a small number of labeled image blocks carrying category information, which greatly saves labeling time and improves labeling efficiency. Exemplarily, with this method, the lesion area in an organ image of a patient can be quickly labeled in the medical field, thereby effectively assisting the user in clinical diagnosis.

In order to make the above-mentioned objects, features and advantages of the present disclosure more obvious and easy to understand, the preferred embodiments are exemplified below, and are described in detail as follows in conjunction with the accompanying drawings.

Description of drawings

In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the following briefly introduces the accompanying drawings required in the embodiments, which are incorporated into the specification and constitute a part of the specification. The drawings illustrate embodiments consistent with the present disclosure, and together with the description serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings only show some embodiments of the present disclosure, and therefore should not be regarded as limiting the scope. Other related drawings can also be obtained from these drawings.

FIG. 1 is a flowchart of an image labeling method provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of determining encoding information according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a method for training a neural network provided by an embodiment of the present disclosure;

FIG. 4 is a flowchart of a first specific image labeling method provided by an embodiment of the present disclosure;

FIG. 5 is a flowchart of a second specific image labeling method provided by an embodiment of the present disclosure;

FIG. 6 is a flowchart of a third specific image labeling method provided by an embodiment of the present disclosure;

FIG. 7a is a schematic diagram of a lung image provided by an embodiment of the present disclosure;

FIG. 7b is a schematic diagram of annotated image blocks included in a lung image provided by an embodiment of the present disclosure;

FIG. 7c is a schematic diagram of labeling a lung image provided by an embodiment of the present disclosure;

FIG. 8 is a schematic structural diagram of an image labeling apparatus provided by an embodiment of the present disclosure;

FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed description

In order to make the purposes, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, but not all, of the embodiments of the present disclosure. The components of the disclosed embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure as claimed, but is merely representative of selected embodiments of the disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative work fall within the protection scope of the present disclosure.

It should be noted that like numerals and letters refer to like items in the following figures, so once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

The term "and/or" herein only describes an association relationship and indicates that three relationships are possible; for example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone. In addition, the term "at least one" herein refers to any one of a plurality or any combination of at least two of the plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.

The training process of a neural network for image recognition requires a large number of labeled image samples. For example, training a neural network for image detection relies on a large number of image samples carrying detection frame annotation information, and training a neural network for instance segmentation relies on a large number of image samples carrying instance contour annotation information. Therefore, training a high-precision neural network requires a large number of image samples with annotation information, and this annotation information has to be produced manually, which means that a lot of time is spent labeling each target object in the image samples. As a result, the image labeling method in the related art is time-consuming and inefficient.

Based on the above problems, the present disclosure provides an image labeling method. Each target image to be labeled can be partitioned into a plurality of image blocks, and the encoding information corresponding to each image block is then determined by a pre-trained neural network for encoding. Based on the encoding information corresponding to each image block and the encoding information of a small number of labeled image blocks carrying pre-labeled category information, the category information corresponding to each of the plurality of image blocks can be determined. Then, based on the labeling manner corresponding to the target image and the category information corresponding to each image block, the labeling of the target image can be completed. That is to say, with the image labeling method provided by the embodiments of the present disclosure, the labeling of the entire target image can be completed from a small number of category-labeled image blocks, which greatly saves labeling time and improves labeling efficiency.

The image labeling method provided by the embodiments of the present disclosure can be applied to electronic devices. Exemplarily, the electronic device may be a computer device with a certain computing capability, and the computer device may be any of a terminal device, a server, and other processing devices. For example, the terminal device may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.

In some embodiments of the present disclosure, the image annotation method may be implemented by a processor invoking computer-readable instructions stored in a memory.

FIG. 1 is a flowchart of an image labeling method provided by an embodiment of the present disclosure. As shown in FIG. 1, the method may include steps S101 to S106:

S101: Acquire a target image to be marked.

Exemplarily, the target image may be a sample image used for training a neural network for image recognition. For example, when training a neural network for pedestrian detection, the target image may be one of a large number of pre-collected images containing pedestrians; when training a neural network for lung lesion identification, the target image may be a pre-acquired lung image.

S102: Divide the target image into blocks to obtain multiple image blocks.

Exemplarily, the target image can be divided according to a set number of rows and columns to obtain multiple image blocks of the same size, or divided according to a preset block size to obtain multiple image blocks of that preset size. Image blocks of the same size each contain the same number of pixels.
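Exemplarily, the preset-size partitioning of S102 can be sketched as follows. This is only a minimal illustration: the function name `split_into_blocks`, the use of NumPy arrays, and the choice to drop rows and columns that do not fill a whole block are assumptions made for the example and are not specified by the present disclosure.

```python
import numpy as np

def split_into_blocks(image, block_h, block_w):
    """Split an H x W (x C) array into non-overlapping blocks of a preset size.

    Returns the blocks in row-major order together with the (row, col) grid
    index of each block. Trailing rows/columns that do not fill a whole block
    are dropped (an assumption of this sketch).
    """
    rows = image.shape[0] // block_h
    cols = image.shape[1] // block_w
    blocks, grid = [], []
    for r in range(rows):
        for c in range(cols):
            top, left = r * block_h, c * block_w
            blocks.append(image[top:top + block_h, left:left + block_w])
            grid.append((r, c))
    return blocks, grid

# A toy 6 x 8 "image" split into 3 x 4 blocks gives a 2 x 2 grid of blocks.
image = np.arange(6 * 8).reshape(6, 8)
blocks, grid = split_into_blocks(image, 3, 4)
```

Keeping the grid index of each block alongside the block itself makes it straightforward to map block-level category information back to positions in the target image later.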

S103: Perform encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block.

Exemplarily, each image block can be encoded by a pre-trained neural network for encoding, which predicts the encoding information corresponding to the image block. Exemplarily, the encoding information corresponding to any two image blocks can be used to represent the similarity and/or distance of the image features between those two image blocks. In this way, whether two image blocks belong to the same category can be determined according to their encoding information.

Exemplarily, encoding the image blocks with the neural network for encoding constructs a mapping between the image features of an image block and an encoding space. The closer the encoding information of two image blocks is in the encoding space, the more likely the image features of the two image blocks are to be similar; conversely, image blocks with similar image features have encoding information that is relatively close in the encoding space.

Exemplarily, the image features corresponding to the image blocks may include at least one of the features that can represent the image content, such as texture features, spectral features, and color features of the image blocks.

S104: Perform category labeling on some image blocks in the plurality of image blocks after encoding processing, to obtain labeled image blocks.

S105: Determine the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks.

Exemplarily, for target images in different usage scenarios, the category information contained in the target image may be predetermined. For example, for a lung image used for lesion detection, the category information may include a benign category and a lesion category; for a target image in which pedestrians are to be detected, the category information may include a pedestrian category and a non-pedestrian category; for a target image whose instances are to be labeled, the category information may be the category corresponding to each instance, so that, for a target image of a road scene, the category information may include a vehicle category, a pedestrian category, and a road category.

Exemplarily, among the multiple image blocks included in the target image, some image blocks may be category-labeled in advance to obtain labeled image blocks with known category information. The encoding information of each image block can then be used to determine which image blocks belong to the same category as a labeled image block, and image blocks of the same category share the same category information. In this way, the category information corresponding to each of the multiple image blocks included in the target image can be determined.

Exemplarily, taking a lung image as the target image, some of the image blocks corresponding to the lung image can be category-labeled. For example, labeling several image blocks whose category information is the lesion category yields several labeled image blocks representing the lesion category, and labeling several image blocks whose category information is the benign category yields several labeled image blocks representing the benign category. Then, based on the encoding information corresponding to each image block contained in the lung image, the category information of each image block is determined; that is, the image blocks in the lung image whose category information is the lesion category and those whose category information is the benign category are obtained.
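Exemplarily, one possible way to carry out S105 is sketched below. The present disclosure does not fix the rule by which encoding information is compared; this sketch assumes that each image block takes the category of the labeled image block whose encoding vector is nearest in cosine similarity. The function name `propagate_labels` and the toy two-dimensional encodings are hypothetical.

```python
import numpy as np

def propagate_labels(codes, labeled):
    """Assign a category to every block from a few labeled blocks.

    codes:   (N, D) array, one encoding vector per image block.
    labeled: dict mapping block index -> category for the labeled blocks.
    Rule assumed here: nearest labeled block by cosine similarity.
    """
    unit = codes / np.linalg.norm(codes, axis=1, keepdims=True)
    labeled_idx = list(labeled)
    similarity = unit @ unit[labeled_idx].T      # shape (N, number of labeled blocks)
    nearest = similarity.argmax(axis=1)          # most similar labeled block per block
    return [labeled[labeled_idx[j]] for j in nearest]

# Four blocks, of which block 0 is labeled benign and block 2 is labeled lesion.
codes = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]])
categories = propagate_labels(codes, {0: "benign", 2: "lesion"})
```

Under this assumed rule a labeled block always keeps its own category, because its similarity with itself is the maximum possible value.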

S106: Label the target image based on the labeling manner corresponding to the target image and the category information corresponding to each image block.

Exemplarily, the labeling manner may be determined in advance according to the usage scenario of the target image. For example, where a rectangular frame of the target object in the target image needs to be marked, the labeling manner may be rectangular frame labeling; where the contour of the target object in the target image needs to be marked, the labeling manner may be contour labeling; and where the category of the target image needs to be marked, the labeling manner may be classification labeling.

After the category information corresponding to each image block included in the target image is determined according to the above method, the target image can be labeled according to the corresponding labeling method of the target image.
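Exemplarily, two of the labeling manners can be sketched from block-level category information as follows. The sketch assumes that each image block is identified by a (row, col) grid index and that all blocks share one preset size; for classification labeling it assumes that the category with the largest number of image blocks is chosen, which is one possible reading of determining the category from the block counts. The function names are hypothetical.

```python
from collections import Counter

def bounding_rectangle(grid, categories, target, block_h, block_w):
    """Smallest pixel rectangle (top, left, bottom, right) enclosing all
    blocks whose category equals `target`; None if no such block exists."""
    cells = [rc for rc, cat in zip(grid, categories) if cat == target]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (min(rows) * block_h, min(cols) * block_w,
            (max(rows) + 1) * block_h, (max(cols) + 1) * block_w)

def classify_image(categories):
    """Image-level category: the category with the most blocks (assumed rule)."""
    return Counter(categories).most_common(1)[0][0]

# A 3 x 2 grid of 16 x 16 blocks with two lesion blocks in the right column.
grid = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
categories = ["benign", "lesion", "benign", "lesion", "benign", "benign"]
rect = bounding_rectangle(grid, categories, "lesion", 16, 16)
image_category = classify_image(categories)
```

Contour labeling is not covered by this sketch, because it additionally depends on per-pixel attribute features.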

In the image labeling method provided by the embodiment of the present disclosure, each target image to be labeled may be partitioned into a plurality of image blocks, and the encoding information corresponding to each image block is determined. Based on the encoding information corresponding to each image block and the encoding information of a small number of labeled image blocks whose categories have been labeled in advance, the category information corresponding to each of the plurality of image blocks can then be determined, and the target image is labeled based on the labeling manner corresponding to the target image and the category information corresponding to each image block. That is to say, the labeling of the entire target image can be completed from a small number of labeled image blocks carrying category information, thereby greatly saving labeling time and improving labeling efficiency. Exemplarily, with this method, the lesion area in an organ image of a patient can be quickly labeled in the medical field, so that clinical diagnosis can be efficiently aided.

The foregoing S101 to S106 will be described below with reference to specific embodiments.

In this embodiment of the present disclosure, performing encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block may be implemented through S1031 to S1032 as shown in FIG. 2:

S1031: Input each image block into a neural network, and extract the image features contained in each image block through the neural network.

S1032: Determine, through the neural network, the encoding information corresponding to each image block based on the image features of each image block.

After receiving an image block, the pre-trained neural network for encoding can extract the image content contained in the image block; for example, the image content can be represented by image features. Because the neural network constructs a mapping between the image features of an image block and the encoding space, once the image features of the image block have been extracted, the encoding information corresponding to those image features can be obtained according to the pre-trained mapping, which yields the encoding information corresponding to the image block.

In the embodiment of the present disclosure, image blocks are processed by the pre-trained neural network for encoding, in which the mapping between image features and encoding information is established in advance, so that once the image features of an image block have been extracted, the encoding information corresponding to the image block can be determined quickly.

In this embodiment of the present disclosure, the pre-trained neural network for encoding may be obtained through S301 to S304 as shown in FIG. 3:

S301: Acquire multiple sample images, and divide each sample image into blocks to obtain multiple sample image blocks and position information of each sample image block.

The process of segmenting multiple sample images to obtain multiple sample image blocks is the same as the above-mentioned method of obtaining multiple image blocks included in the target image, and will not be repeated here.

Exemplarily, the position information of each sample image block may be represented by the position coordinates of the center point of the sample image block in the image coordinate system corresponding to the sample image.

S302: Determine a first sample image block pair and a second sample image block pair based on the position information of each sample image block.

The distance between the two sample image blocks in the first sample image block pair is less than or equal to the set threshold; the distance between the two sample image blocks in the second sample image block pair is greater than the set threshold.

Exemplarily, within a sample image, sample image blocks that are close to each other in space tend to have similar image features and a high probability of belonging to the same category, whereas sample image blocks that are far apart in space may have dissimilar image features and a low probability of belonging to the same category. The first sample image block pair and the second sample image block pair can therefore be constructed based on the distance between sample image blocks. For example, two sample image blocks that belong to the same sample image and whose distance is less than or equal to the set threshold are taken as a first sample image block pair, while two sample image blocks that belong to the same sample image and whose distance is greater than the set threshold, or two sample image blocks that belong to different sample images, are taken as a second sample image block pair.
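Exemplarily, the construction of the two kinds of sample image block pairs can be sketched as follows. The sketch assumes that the position of each sample image block is given by the coordinates of its center point and that the distance is Euclidean; the present disclosure does not specify the distance measure, and the function name `build_pairs` is hypothetical.

```python
import itertools
import math

def build_pairs(centers, threshold):
    """Split all block pairs into first pairs (near) and second pairs (far).

    centers: list of (image_id, x, y), the center point of each sample block.
    A pair is a first pair when both blocks come from the same sample image
    and their center distance is <= threshold; every other pair, including
    pairs taken from different sample images, is a second pair.
    """
    first, second = [], []
    for i, j in itertools.combinations(range(len(centers)), 2):
        img_i, xi, yi = centers[i]
        img_j, xj, yj = centers[j]
        same_image = img_i == img_j
        if same_image and math.hypot(xi - xj, yi - yj) <= threshold:
            first.append((i, j))
        else:
            second.append((i, j))
    return first, second

# Three blocks from sample image 0 and one block from sample image 1.
centers = [(0, 8, 8), (0, 24, 8), (0, 8, 56), (1, 8, 8)]
first_pairs, second_pairs = build_pairs(centers, threshold=16)
```

No category labels are needed to build these pairs, since spatial position alone decides which pairs are treated as similar.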

S303, respectively inputting the first sample image block pair and the second sample image block pair into the neural network to be trained, to obtain predictive coding information corresponding to each sample image block.

Exemplarily, after inputting the two sample image blocks included in the first sample image block pair into the neural network to be trained, the image features of the two sample image blocks can be obtained respectively, and based on the two sample image blocks By compressing and encoding the two sample image blocks corresponding to the corresponding image features respectively, prediction coding information corresponding to the two sample image blocks can be obtained.

S304, based on the predictive coding information corresponding to each sample image block, adjust the network parameters of the neural network to be trained to obtain a trained neural network for coding.

Exemplarily, in the process of adjusting the network parameters, an unsupervised clustering algorithm can be introduced to train the neural network for encoding. Exemplarily, during training, the network parameters are adjusted so that the distance between the predictive coding information corresponding to the two sample image blocks included in the first sample image block pair is gradually reduced, and the distance between the predictive coding information corresponding to the two sample image blocks included in the second sample image block pair is gradually enlarged. When the predictive coding information corresponding to the two sample image blocks included in the first sample image block pair or the second sample image block pair is adjusted, the image features corresponding to each image block can be adjusted at the same time. After a set number of training iterations is reached, or the loss value corresponding to the loss function is less than the set loss value, the obtained mapping between image features and the encoding space is such that the encoding information corresponding to similar image blocks is close in the encoding space and the encoding information corresponding to dissimilar image blocks is far apart in the encoding space; at this point, the trained neural network for encoding is obtained.

Exemplarily, the encoded information may be represented by a vector, and the distance between the encoded information corresponding to two image blocks may be determined by a cosine distance formula.
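Exemplarily, the cosine distance and a training objective that pulls first pairs together and pushes second pairs apart may be sketched as follows. The disclosure does not give a loss function; the margin-based form of `pair_loss` and the value `margin=0.5` are assumptions made only for illustration, and the coding vectors are assumed to be non-zero.

```python
from math import sqrt

def cosine_distance(u, v):
    """Cosine distance between two non-zero coding vectors: 1 - cos(angle)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def pair_loss(code_a, code_b, is_first_pair, margin=0.5):
    """Hypothetical per-pair loss (an assumption, not taken from the disclosure).

    First pairs are penalised by their distance, so training reduces it.
    Second pairs are penalised only while closer than `margin`, so training
    enlarges their distance until the margin is reached.
    """
    distance = cosine_distance(code_a, code_b)
    if is_first_pair:
        return distance
    return max(0.0, margin - distance)
```

With this form, a second pair whose codes are already orthogonal contributes no loss, while a second pair with identical codes contributes the full margin.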

In the embodiment of the present disclosure, in the process of training the neural network for encoding, first sample image block pairs whose mutual distance is less than or equal to the set threshold and second sample image block pairs whose mutual distance is greater than the set threshold are introduced to train the neural network. During training, the network parameters are adjusted based on the training criterion that sample image blocks that are spatially close have highly similar image features, so that a neural network that accurately determines the encoding information corresponding to an image block based on the image block can be obtained.

In the embodiment of the present disclosure, the category information corresponding to each image block can be determined by introducing a semi-supervised learning algorithm. Exemplarily, given the category information and encoding information of the labeled image blocks, together with the encoding space distribution information, the category information corresponding to each of the multiple image blocks contained in the target image can be obtained.

Exemplarily, the coding space distribution information reflects the position distribution, in the coding space, of the coding information corresponding to the multiple image blocks included in the target image: the higher the similarity of the image features of two image blocks, the smaller the corresponding distance in the coding space, that is, the more likely the coding information corresponding to the two image blocks is to be close. Alternatively, the coding space distribution can be a cluster distribution obtained by clustering the coding information corresponding to the multiple image blocks. In that case, when the target image contains target objects of various categories, the coding space distribution corresponding to the target image may contain multiple clusters, and the image blocks contained in each cluster have a high probability of belonging to the same category. After the category information corresponding to one piece of coding information contained in a cluster is determined, the category information corresponding to the other coding information contained in that cluster can be determined, so that the category information corresponding to each of the multiple image blocks contained in the target image can be obtained.
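Exemplarily, determining the category information of unlabeled image blocks from a few labeled image blocks may be sketched as follows. The disclosure does not name a specific semi-supervised algorithm; the nearest-labeled-neighbour rule used here, and the names `propagate_labels`, `codes` and `labeled`, are assumptions chosen for brevity.

```python
from math import sqrt

def cosine_distance(u, v):
    """Cosine distance between two non-zero coding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def propagate_labels(codes, labeled):
    """Give every image block the category of its nearest labeled block.

    codes:   list of coding vectors, one per image block.
    labeled: dict mapping block index -> category for the labeled blocks.
    """
    categories = []
    for i, code in enumerate(codes):
        if i in labeled:
            categories.append(labeled[i])
            continue
        nearest = min(labeled, key=lambda j: cosine_distance(code, codes[j]))
        categories.append(labeled[nearest])
    return categories

codes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
categories = propagate_labels(codes, {0: "lesion", 2: "benign"})
```

Blocks 1 and 3 are unlabeled and take the category of the labeled block whose coding information is closest in the coding space.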

In this embodiment of the present disclosure, marking the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block can be implemented through S1061 to S1062 as shown in FIG. 4 :

S1061, when the labeling method is to label a rectangular frame, obtain image blocks with the same corresponding category information;

S1062: Mark the target image according to the minimum peripheral rectangular frame of the image blocks corresponding to the same category information.

Exemplarily, there may be multiple image blocks with the same category information. When annotating the target image, the image blocks with the same category information may be searched by means of a single connected domain search to obtain the minimum surrounding rectangle corresponding to the image blocks with the same category information, and the target image is then annotated according to that minimum surrounding rectangle.
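Exemplarily, S1061 to S1062 may be sketched as a connected-domain search over the grid of image block categories. This is a minimal sketch, assuming 4-connectivity and rectangles expressed in block coordinates `(top, left, bottom, right)`; the function name `bounding_boxes` is hypothetical.

```python
from collections import deque

def bounding_boxes(grid, category):
    """Return one minimum surrounding rectangle per connected region.

    grid: 2D list holding the category of each image block.
    """
    rows, cols = len(grid), len(grid[0])
    seen, boxes = set(), []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != category or (r, c) in seen:
                continue
            # Breadth-first search over one connected domain.
            queue = deque([(r, c)])
            seen.add((r, c))
            top = bottom = r
            left = right = c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and grid[ny][nx] == category):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

grid = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]
boxes = bounding_boxes(grid, category=1)
```

Multiplying the block coordinates by the block size would give the rectangle in pixel coordinates of the target image.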

Exemplarily, in the case where the target image is a lung image, in the above manner, an image block in the lung image whose category information is a lesion can be marked to obtain a lesion area in the lung image.

In the embodiment of the present disclosure, by using the category information corresponding to each image block included in the identified target image, the labeling of the image blocks with the same category information included in the target image can be completed. This process does not require the user to manually mark the rectangle frame of each target object, which can greatly save the marking time and improve the marking efficiency.

In this embodiment of the present disclosure, marking the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block can be implemented through S1063 to S1066 as shown in FIG. 5 :

S1063, when the labeling method is contour labeling, determine the category information corresponding to each pixel in the target image based on the category information corresponding to each image block and the pixels included in each image block.

Exemplarily, when the labeling method is to perform contour labeling, in order to improve the labeling accuracy, category information corresponding to each pixel included in the target image may be determined.

Exemplarily, when the image blocks obtained by segmentation are small, the category information of an image block can be used as the category information corresponding to each pixel included in the image block. For example, if the category information corresponding to an image block is the lesion category, the category information corresponding to all pixels included in the image block may be the lesion category. However, when the image block lies on the edge of the target object, the category information of the pixels included in the image block may differ.

Exemplarily, when the category information corresponding to the image block is used as the category information corresponding to each pixel included in the image block, the category information of some pixels, especially the pixels on the edge of the target object, needs to be further adjusted in order to improve the accuracy of the pixel category information, as described later.
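Exemplarily, using the category information of each image block as the initial category information of its pixels may be sketched as follows, assuming square, non-overlapping image blocks of side `block_size`; the function name `blocks_to_pixels` is hypothetical.

```python
def blocks_to_pixels(block_labels, block_size):
    """Expand a grid of block categories into a grid of pixel categories."""
    pixel_labels = []
    for block_row in block_labels:
        # Repeat each block category block_size times horizontally ...
        expanded = [label for label in block_row for _ in range(block_size)]
        # ... and repeat the resulting pixel row block_size times vertically.
        pixel_labels.extend(list(expanded) for _ in range(block_size))
    return pixel_labels

pixel_labels = blocks_to_pixels([["lesion", "benign"]], block_size=2)
```

The result is only an initial estimate; pixels in blocks that straddle the edge of the target object may still carry the wrong category.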

S1064: Acquire attribute features of each pixel in the target image, and determine the target pixel based on category information and attribute features corresponding to each pixel.

The target pixel is the pixel to be adjusted for category information.

Exemplarily, in the process of determining whether there is a target pixel to be adjusted for category information, a conditional random field model may be introduced. The conditional random field model is used to determine whether there is a target pixel to be adjusted for category information and to adjust the category information of that target pixel.

Exemplarily, the attribute features of each pixel included in the target image can be extracted. The attribute features of different pixels can be used to represent the difference information between these pixels, such as texture differences, gradient differences, and grayscale value differences. Because the pixels with inaccurate category information are likely to be pixels on the edge of the target object in the target image, the pixels on the edge of the target object can be filtered out based on the attribute features corresponding to each pixel.

Exemplarily, the category information corresponding to two pixels with the same attribute feature is likely to be consistent. If there is a pixel that has the same attribute feature as a certain pixel on the edge of the target object but whose category information is inconsistent with it, that edge pixel can be considered a target pixel to be adjusted for category information.

S1065, in the case that the target pixel exists, adjust the category information corresponding to the target pixel based on the category information of the pixels with the same attribute feature as the target pixel, to obtain the target category information corresponding to each pixel in the target image.

Exemplarily, since the category information corresponding to two pixels with the same attribute feature is highly likely to be the same, the category information of the target pixel can be adjusted based on the category information of other pixels with the same attribute feature as the target pixel. After the category information corresponding to the target pixel is adjusted, the target category information corresponding to each pixel in the target image is obtained.
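Exemplarily, the adjustment in S1065 may be sketched as follows. The disclosure mentions a conditional random field model for this step; the sketch below deliberately replaces it with a much simpler majority vote among pixels that share the target pixel's attribute feature, which is an assumption made only for illustration. Pixels are given as flat lists and the name `adjust_targets` is hypothetical.

```python
from collections import Counter

def adjust_targets(labels, attributes, targets):
    """Relabel each target pixel with the most common category among
    the other pixels that have the same attribute feature."""
    adjusted = list(labels)
    for t in targets:
        peers = [labels[i] for i, a in enumerate(attributes)
                 if a == attributes[t] and i != t]
        if peers:  # leave the pixel unchanged if it has no peers
            adjusted[t] = Counter(peers).most_common(1)[0][0]
    return adjusted

labels = ["lesion", "benign", "benign", "lesion"]
attributes = [7, 7, 7, 3]  # e.g. quantised grayscale values
adjusted = adjust_targets(labels, attributes, targets=[0])
```

Pixel 0 shares its attribute feature with two benign pixels, so its category is changed from lesion to benign; the other pixels are untouched.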

S1066, based on the target category information corresponding to each pixel in the target image, annotate the contour of the target object formed by the pixels belonging to the same target category.

Exemplarily, the contour of the region formed by pixels belonging to the same category information can be annotated through the target category information corresponding to each pixel in the target image, so that the contour annotation of the target object contained in the target image can be completed.
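Exemplarily, extracting the contour in S1066 may be sketched as follows. The disclosure does not specify how the contour is traced; the sketch assumes that a contour pixel is any pixel of the target category with at least one 4-neighbour that lies outside the image or belongs to another category, and the name `contour_pixels` is hypothetical.

```python
def contour_pixels(labels, category):
    """Return the (row, col) positions of boundary pixels of `category`."""
    rows, cols = len(labels), len(labels[0])
    contour = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != category:
                continue
            for y, x in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= y < rows and 0 <= x < cols
                if not inside or labels[y][x] != category:
                    contour.append((r, c))
                    break
    return contour

labels = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
contour = contour_pixels(labels, category=1)
```

For the 3x3 region above, every pixel except the centre one is reported as part of the contour.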

Exemplarily, after the above step S1064, if it is determined that there is no target pixel, based on the category information corresponding to each pixel in the target image, the outline of the target object formed by pixels belonging to the same target category is marked.

This situation indicates that the prediction of the category information of the pixels on the edge of the target object is relatively accurate and does not need to be adjusted. The labeling process for the target object in this situation is similar to that described above and will not be repeated here.

In the embodiment of the present disclosure, when it is determined that there are target pixels with inaccurate category information, the target category information corresponding to each pixel can be accurately determined based on the category information and attribute features corresponding to each pixel included in the target image, so that segmentation and annotation results with higher accuracy are obtained.

Exemplarily, for S1064, determining the target pixel based on the category information and attribute features corresponding to each pixel may be implemented through S10641 to S10643:

S10641: Select at least one of the pixel points as a first pixel point; based on the attribute feature of the first pixel point, determine the difference value between the attribute features of the first pixel point and a second pixel point.

Wherein, the second pixel is adjacent to the first pixel.

Exemplarily, the first pixel point and the second pixel point refer to pixel points whose pixel coordinates in the target image are adjacent.

Exemplarily, the difference value of the attribute feature of each first pixel point and the second pixel point can be determined through the conditional random field model mentioned above.

Exemplarily, in the method for determining the category information of each pixel point proposed in the embodiment of the present disclosure, the category information of some pixel points on the edge of the target object may be adjusted. In the adjustment process, the pixel points on the edge of the target object can be determined based on whether the difference value of the attribute features of adjacent pixel points is greater than a preset threshold. Exemplarily, the preset threshold can be set in advance, or can be determined by the conditional random field model when the pixel points on the edge of the target object are determined.

Exemplarily, in order to improve the accuracy of the category information of the pixels, it can be determined, for each pixel, whether the difference value of the attribute features of the pixel and at least one adjacent pixel is greater than the preset threshold, so that more target pixels that need adjustment can be found. Of course, in order to improve the labeling efficiency, the number of adjacent pixels whose attribute features are compared with those of each pixel may also be set, which is not limited in this embodiment of the present disclosure.

S10642, in the case that the difference value is greater than the preset threshold, select at least one third pixel point.

Wherein, the third pixel point is a pixel point having the same attribute characteristics as the first pixel point.

Exemplarily, when it is determined that the difference value of the attribute features of the first pixel point and the adjacent second pixel point is greater than the preset threshold, the first pixel point can be used as a candidate pixel point to be adjusted for category information; these candidate pixel points are likely to be pixel points on the edge of the target object.

Based on the foregoing description, the category information corresponding to pixel points with the same attribute features is likely to be consistent. Therefore, when it is determined that the difference value of the attribute features of the first pixel point and the adjacent second pixel point is greater than the preset threshold, it can be further determined whether the category information corresponding to the first pixel point is the same as the category information of a third pixel point having the same attribute feature as the first pixel point.

S10643: In the case that the category information corresponding to the first pixel point is inconsistent with the category information corresponding to the third pixel point, the first pixel point is used as the target pixel point.

Exemplarily, if it is determined that the category information corresponding to the first pixel point A is the lesion category, while the category information of several third pixel points that have the same attribute feature as the first pixel point A is the benign category, then the category information of the first pixel point A has a relatively high probability of actually being the benign category, so the first pixel point A can be used as a target pixel point to be adjusted for category information.
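Exemplarily, S10641 to S10643 may be sketched as follows. The sketch makes several assumptions that the disclosure leaves open: the attribute feature is a single number per pixel (for example a grayscale value), adjacency means 4-neighbourhood, every pixel is tried as the first pixel point, and "inconsistent with the third pixel points" is interpreted as differing from the most common category among them. The name `find_target_pixels` is hypothetical.

```python
from collections import Counter

def find_target_pixels(attrs, labels, threshold):
    """Return (row, col) positions of pixels whose category should be adjusted."""
    rows, cols = len(attrs), len(attrs[0])
    targets = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [(r + dr, c + dc)
                          for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0))
                          if 0 <= r + dr < rows and 0 <= c + dc < cols]
            # S10641: difference value against each adjacent second pixel.
            is_candidate = any(abs(attrs[r][c] - attrs[y][x]) > threshold
                               for y, x in neighbours)
            if not is_candidate:
                continue
            # S10642: third pixels have the same attribute feature.
            third = [labels[y][x]
                     for y in range(rows) for x in range(cols)
                     if (y, x) != (r, c) and attrs[y][x] == attrs[r][c]]
            # S10643: inconsistent category -> target pixel.
            if third and Counter(third).most_common(1)[0][0] != labels[r][c]:
                targets.append((r, c))
    return targets

attrs = [[10, 10, 90],
         [10, 10, 90]]
labels = [["benign", "lesion", "lesion"],
          ["benign", "benign", "lesion"]]
targets = find_target_pixels(attrs, labels, threshold=50)
```

In this example the pixel at row 0, column 1 sits next to the edge, shares its attribute value with three benign pixels, and is itself labeled lesion, so it is the only target pixel found.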

In the embodiment of the present disclosure, the target pixel to be adjusted for category information can be quickly determined through the attribute features and category information of adjacent pixels. If the difference between the attribute features of a pixel and an adjacent pixel is large, and the category information of the pixel differs from that of other pixels with the same attribute features, it can be determined that the category information of the pixel is wrong with a high probability; the target pixel to be adjusted for category information can therefore be quickly determined in this way.

For the above S106, in this embodiment of the present disclosure, marking the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block can be implemented through S1067-S1068 as shown in FIG. 6 :

S1067, when the labeling method is classification labeling, determine the number of image blocks corresponding to each category of information based on the category information corresponding to each image block.

S1068: Determine the category information of the target image based on the number of image blocks corresponding to each category of information.

Exemplarily, when the category information of the entire target image needs to be determined, the image blocks contained in the target image are counted by category to obtain the number of image blocks corresponding to each category of information, and the category information with the largest number of image blocks can then be used as the category information of the target image.

Exemplarily, in order to improve the accuracy of the determined category information of the target image, a number threshold condition can be added in addition to determining the number of image blocks corresponding to each category of information. For example, when the number of image blocks corresponding to a given category information reaches the set number threshold and is the largest among all category information, that category information can be used as the category information of the target image.
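Exemplarily, S1067 to S1068 together with the optional number threshold may be sketched as follows. The function name `classify_image`, the parameter name `min_count`, and returning `None` when the threshold is not reached are assumptions; the disclosure does not say what happens in that case.

```python
from collections import Counter

def classify_image(block_categories, min_count=1):
    """Return the most frequent block category, or None if its count
    does not reach the set number threshold `min_count`."""
    if not block_categories:
        return None
    category, count = Counter(block_categories).most_common(1)[0]
    return category if count >= min_count else None

result = classify_image(["lesion", "benign", "lesion"])
guarded = classify_image(["lesion", "benign", "lesion"], min_count=3)
```

In the first call the lesion category wins with two image blocks; in the second call the same majority is rejected because it does not reach the threshold of three image blocks.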

In the embodiment of the present disclosure, by determining the number of image blocks corresponding to each category of information contained in the target image, the category information of the target image can be quickly determined.

The image labeling method provided by the embodiment of the present disclosure may further include the following operations:

After the target image is annotated, the category information of the annotated image block is updated in response to an annotation category update instruction for the target image block; the method then returns to the above S105, that is, the step of determining the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the annotated image block.

Exemplarily, updating the category information of annotated image blocks may include the following situations:

(1) Change the category information of annotated image blocks that have been pre-annotated with category information, such as changing the lesion category to benign category;

(2) Newly labeled image blocks, such as labeling categories for image blocks without category information;

(3) Modify the category information of the labeled image blocks that have been pre-labeled with category information, and add new labeled image blocks.

Exemplarily, for the above situation (1), after the target image is annotated, the annotation results can be fine-tuned to obtain more accurate annotation results. After returning to step S105, more accurate category information corresponding to each image block can be obtained, so that the target image can be annotated based on the more accurate category information.

Exemplarily, for the above situation (2), suppose the target image contains multiple target objects and the initial labeling only labels the image blocks included in one of the target objects. The labeling result obtained for the target image then only contains the category information for that target object, that is, the labeling result is incomplete. In this case, labeled image blocks corresponding to the other target objects can be added before returning to the above step S105, so that more complete category information of the image blocks, and thus a more complete annotation result corresponding to the target image, can be obtained.

Exemplarily, for the above situation (3), the specific method is a combination of the above two methods, and it can improve the accuracy and completeness of the target image annotation at the same time.
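Exemplarily, the three update situations can be handled uniformly by merging the update into the set of labeled image blocks and then re-running S105. In the sketch below, `update_and_redetermine` and its parameters are hypothetical names, and `determine_categories` stands for whatever routine implements S105; a trivial stand-in is used here only so that the example runs.

```python
def update_and_redetermine(labeled, update, determine_categories):
    """Apply an annotation category update and repeat the S105 step.

    labeled: dict block index -> category before the update.
    update:  dict block index -> category; existing indices are changed
             (situation 1), new indices are added (situation 2), and a
             mix of both corresponds to situation 3.
    """
    merged = dict(labeled)
    merged.update(update)
    return merged, determine_categories(merged)

def s105_stand_in(labeled_blocks):
    # Placeholder for S105: simply echoes the labeled blocks in index order.
    return sorted(labeled_blocks.items())

merged, categories = update_and_redetermine(
    {0: "lesion"}, {0: "benign", 3: "lesion"}, s105_stand_in)
```

The original dictionary is left unchanged, so the previous annotation state remains available if the update needs to be reverted.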

In the embodiment of the present disclosure, the category information of the annotated image block may be updated based on the annotation result of the target image, thereby improving the accuracy and/or completeness of the annotation result of the target image.

In the embodiment of the present disclosure, after the target image is marked, the category information corresponding to the image block can be readjusted based on the marking result, so as to obtain a more accurate marking result.

The image annotation results provided by the present disclosure will be described below with a specific embodiment:

As shown in Fig. 7a, the target image is a lung image of a patient. The lung image is divided into blocks to obtain multiple lung image blocks (not shown in Fig. 7a). The dark rectangular area in Fig. 7b is a labeled image block whose category was manually labeled; for example, the category information corresponding to the labeled image block is the lesion category. Then, according to the above-mentioned image annotation method, the image blocks belonging to the same category as the labeled image block can be determined, and the lung image can be further labeled based on the category information corresponding to each image block in the lung image to obtain the labeling result for the target object. For example, the labeling result shown in Fig. 7c is obtained, which is a contour map of the lesion area.

Those skilled in the art can understand that, in the above method of the specific implementation, the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same technical concept, the embodiment of the present disclosure also provides an image labeling device corresponding to the image labeling method. For its implementation, reference may be made to the implementation of the method, and repeated descriptions are omitted.

Referring to FIG. 8 , which is a schematic diagram of an image labeling apparatus 800 provided by an embodiment of the present disclosure, the image labeling apparatus 800 includes:

The image acquisition module 801 is configured to: acquire the target image to be marked;

The image segmentation module 802 is configured to: segment the target image to obtain multiple image blocks;

The image encoding module 803 is configured to: perform encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block;

The first labeling module 804 is configured to: perform category labeling on part of the image blocks in the plurality of image blocks after encoding processing, to obtain labelled image blocks;

The category determining module 805 is configured to: determine the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks;

The second labeling module 806 is configured to label the target image based on the labeling manner corresponding to the target image and the category information corresponding to each image block.

In one embodiment, the image encoding module 803 is configured to: input each image block into a neural network, and extract the image features included in each image block through the neural network; and compress and encode the image features to obtain the encoding information corresponding to each image block.

In one embodiment, the second labeling module 806 is configured to: when the labeling method is to label a rectangular frame, obtain the image blocks with the same corresponding category information; and annotate the target image according to the minimum peripheral rectangular frame of the image blocks with the same category information.

In one embodiment, the second labeling module 806 is configured to: when the labeling method is contour labeling, determine the category information corresponding to each pixel in the target image based on the category information corresponding to each image block and the pixels contained in each image block; obtain the attribute features of each pixel in the target image, and determine the target pixel based on the category information and attribute features corresponding to each pixel, wherein the target pixel is the pixel to be adjusted for category information; when the target pixel exists, adjust the category information corresponding to the target pixel based on the category information of the pixels with the same attribute features as the target pixel, to obtain the target category information corresponding to each pixel in the target image; and, based on the target category information corresponding to each pixel in the target image, annotate the contour of the target object composed of pixels belonging to the same target category.

In one embodiment, the second labeling module 806 is configured to: select at least one of the pixel points as a first pixel point; based on the attribute feature of the first pixel point, determine the difference value between the attribute features of the first pixel point and a second pixel point, wherein the second pixel point is adjacent to the first pixel point; when the difference value is greater than the preset threshold, select at least one third pixel point, wherein the third pixel point is a pixel point with the same attribute feature as the first pixel point; and, when the category information corresponding to the first pixel point is inconsistent with the category information corresponding to the third pixel point, use the first pixel point as the target pixel point.

In one embodiment, the second labeling module 806 is configured to: when the labeling method is classification labeling, determine the number of image blocks corresponding to each category of information based on the category information corresponding to each image block; and determine the category information of the target image based on the number of image blocks corresponding to each category of information.

In one embodiment, the category determination module 805 is configured to: after the target image is annotated, in response to an annotation category update instruction for the target image block, update the category information of the annotated image block; and return to the step of determining the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the annotated image block.

In a possible implementation, the image labeling apparatus 800 further includes a network training module 807, and the network training module 807 is configured to: acquire multiple sample images, and divide each sample image into blocks to obtain multiple sample image blocks and position information of each sample image block; based on the position information of each sample image block, determine the first sample image block pair and the second sample image block pair, wherein the distance between the two sample image blocks in the first sample image block pair is less than or equal to the set threshold, and the distance between the two sample image blocks in the second sample image block pair is greater than the set threshold; input the first sample image block pair and the second sample image block pair respectively into the neural network to be trained, to obtain the predictive coding information corresponding to each sample image block; and, based on the predictive coding information corresponding to each sample image block, adjust the network parameters of the neural network to be trained, to obtain the trained neural network for encoding.

For the description of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the foregoing method embodiments, which will not be described in detail here.

Corresponding to the image labeling method in FIG. 1 , an embodiment of the present disclosure further provides an electronic device 900 . As shown in FIG. 9 , the schematic structural diagram of the electronic device 900 provided by the embodiment of the present disclosure includes:

A processor 91, a memory 92, and a bus 93. The memory 92 is used to store execution instructions and includes an internal memory 921 and an external memory 922. The internal memory 921 is used to temporarily store operation data in the processor 91 and data exchanged with the external memory 922, such as a hard disk; the processor 91 exchanges data with the external memory 922 through the internal memory 921. When the electronic device 900 is running, the processor 91 and the memory 92 communicate through the bus 93, so that the processor 91 executes the following instructions: obtain the target image to be marked; divide the target image into blocks to obtain multiple image blocks; perform encoding processing on each of the multiple image blocks to obtain the encoding information corresponding to each image block; perform category labeling on part of the encoded image blocks to obtain labeled image blocks; determine the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks; and annotate the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block.

Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the image labeling method described in the foregoing method embodiments are executed. Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present disclosure further provides a computer program product; the computer program product includes computer code, and when the computer code runs in an electronic device, the processor of the electronic device executes the image labeling method described in any of the preceding embodiments. For details, reference may be made to the foregoing method embodiments, which will not be repeated here.

Wherein, the above-mentioned computer program product can be specifically implemented by means of hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), and so on.

Those skilled in the art can clearly understand that, for the convenience and brevity of description, for the specific working process of the system and device described above, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. The apparatus embodiments described above are only illustrative. For example, the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be indirect coupling or communication connection of devices or units through some communication interfaces, and may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure in essence, or the parts that contribute to the prior art, or parts of the technical solutions, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that the above-mentioned embodiments are only specific implementations of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art can still, within the technical scope disclosed in the present disclosure, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Industrial Applicability

The embodiments of the present disclosure disclose an image labeling method, device, electronic device, storage medium and program. The method includes: acquiring a target image to be labeled; dividing the target image into blocks to obtain a plurality of image blocks; performing encoding processing on each of the plurality of image blocks to obtain encoding information corresponding to each image block; performing category labeling on some of the encoded image blocks to obtain labeled image blocks; determining category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks; and labeling the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block. With the image labeling method provided by the embodiments of the present disclosure, a lesion area in an organ image of a patient can be quickly labeled in the medical field, thereby effectively assisting the user in clinical diagnosis.
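
The flow of steps summarized above can be illustrated with a minimal sketch. It is not the disclosed implementation: the encoder here is a stand-in that uses the mean intensity of a block (the disclosure uses a neural network), and category information is propagated by a nearest-neighbour rule in code space, which is an assumption made only for illustration. All function names (`split_into_blocks`, `encode`, `propagate`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]
Block = Tuple[int, int]  # (block_row, block_col) index in the block grid


def split_into_blocks(image: Image, size: int) -> Dict[Block, List[float]]:
    """Partition the image into non-overlapping size x size blocks (edge blocks are clipped)."""
    blocks: Dict[Block, List[float]] = {}
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            pixels = [v for row in image[top:top + size] for v in row[left:left + size]]
            blocks[(top // size, left // size)] = pixels
    return blocks


def encode(pixels: Sequence[float]) -> Tuple[float, ...]:
    """Stand-in encoder (assumption): the mean intensity as a one-dimensional code."""
    return (sum(pixels) / len(pixels),)


def propagate(codes: Dict[Block, Tuple[float, ...]], labelled: Dict[Block, str]) -> Dict[Block, str]:
    """Assign each block the category of the labelled block whose code is closest."""
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return {
        block: labelled[min(labelled, key=lambda lb: dist(code, codes[lb]))]
        for block, code in codes.items()
    }


image = [
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.1, 0.1],
]
blocks = split_into_blocks(image, size=2)
codes = {block: encode(pixels) for block, pixels in blocks.items()}
# The user labels only two of the four blocks; the rest are inferred.
categories = propagate(codes, {(0, 0): "normal", (0, 1): "lesion"})
```

Only two blocks are labeled by hand here; the two remaining blocks receive the category of whichever labeled block has the closest code.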

Claims

An image labeling method, the method comprising:

obtaining a target image to be labeled;

dividing the target image into blocks to obtain a plurality of image blocks;

performing encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block;

performing category labeling on some image blocks in the plurality of image blocks after encoding processing, to obtain labeled image blocks;

determining category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks; and

labeling the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block.
The image labeling method according to claim 1, wherein the performing encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block comprises:

inputting the image blocks into a neural network, and extracting image features contained in the image blocks through the neural network; and

determining, through the neural network, the encoding information corresponding to each image block based on the image features of each image block.
The image labeling method according to claim 1 or 2, wherein the labeling of the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block comprises:

in the case where the labeling method is rectangular frame labeling, obtaining image blocks having the same category information; and

labeling the target image according to the smallest enclosing rectangular frame of the image blocks having the same category information.
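
The smallest enclosing rectangle of the blocks sharing a category can be sketched as follows. This is an illustrative assumption rather than the claimed implementation: blocks are taken to be square with side `block_size`, identified by hypothetical `(block_row, block_col)` grid indices, and one rectangle is produced per category without separating disconnected regions.

```python
from typing import Dict, Tuple

Block = Tuple[int, int]          # (block_row, block_col)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels; right/bottom exclusive


def bounding_boxes(categories: Dict[Block, str], block_size: int) -> Dict[str, Box]:
    """Return, per category, the smallest axis-aligned box covering all of its blocks."""
    boxes: Dict[str, Box] = {}
    for (row, col), category in categories.items():
        left, top = col * block_size, row * block_size
        right, bottom = left + block_size, top + block_size
        if category in boxes:
            l, t, r, b = boxes[category]
            left, top = min(l, left), min(t, top)
            right, bottom = max(r, right), max(b, bottom)
        boxes[category] = (left, top, right, bottom)
    return boxes


categories = {(0, 0): "normal", (0, 1): "lesion", (1, 0): "normal", (1, 1): "lesion"}
boxes = bounding_boxes(categories, block_size=16)
```

With 16-pixel blocks, the two "lesion" blocks in the right-hand column are covered by a single box spanning pixel columns 16 to 32 and rows 0 to 32.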
The image labeling method according to claim 1 or 2, wherein the labeling of the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block comprises:

in the case where the labeling method is contour labeling, determining category information corresponding to each pixel in the target image based on the category information corresponding to each image block and the pixels included in each image block;

acquiring attribute features of each pixel in the target image, and determining a target pixel based on the category information and attribute features corresponding to each pixel, wherein the target pixel is a pixel whose category information is to be adjusted;

in the case where the target pixel exists, adjusting the category information corresponding to the target pixel based on the category information of pixels having the same attribute feature as the target pixel, to obtain target category information corresponding to each pixel in the target image; and

labeling, based on the target category information corresponding to each pixel in the target image, an outline of a target object formed by pixels belonging to the same target category.
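
A minimal sketch of the first and last steps of contour labeling (block categories expanded to pixels, then an outline extracted) is given below. It deliberately omits the per-pixel adjustment step, and it makes assumptions not stated in the claim: blocks are square, adjacency is the 4-neighbourhood, and the outline is the set of pixels of the target category that touch a pixel of another category or the image border. The names `pixel_labels` and `contour` are hypothetical.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (block_row, block_col)


def pixel_labels(categories: Dict[Block, str], block_size: int,
                 height: int, width: int) -> List[List[str]]:
    """Give every pixel the category of the block that contains it."""
    return [
        [categories[(r // block_size, c // block_size)] for c in range(width)]
        for r in range(height)
    ]


def contour(labels: List[List[str]], target: str) -> List[Tuple[int, int]]:
    """Collect pixels of `target` that border another category or the image edge."""
    h, w = len(labels), len(labels[0])
    outline = []
    for r in range(h):
        for c in range(w):
            if labels[r][c] != target:
                continue
            around = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(
                not (0 <= rr < h and 0 <= cc < w) or labels[rr][cc] != target
                for rr, cc in around
            ):
                outline.append((r, c))
    return outline


# A 3 x 3 grid of 3-pixel blocks with a single "lesion" block in the centre.
categories = {(br, bc): "normal" for br in range(3) for bc in range(3)}
categories[(1, 1)] = "lesion"
labels = pixel_labels(categories, block_size=3, height=9, width=9)
outline = contour(labels, "lesion")
```

For the single central block, the outline consists of the ring of pixels around its interior pixel at row 4, column 4.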
The image labeling method according to claim 4, wherein the determining the target pixel based on the category information and attribute features corresponding to each pixel comprises:

selecting at least one of the pixels as a first pixel;

determining, based on the attribute feature of the first pixel, a difference value between the attribute features of the first pixel and a second pixel, wherein the second pixel is adjacent to the first pixel;

in the case where the difference value is greater than a preset threshold, selecting at least one third pixel, wherein the third pixel is a pixel having the same attribute feature as the first pixel; and

in the case where the category information corresponding to the first pixel is inconsistent with the category information corresponding to the third pixel, taking the first pixel as the target pixel.
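
The target-pixel test described above can be sketched as follows, under several assumptions that the claim leaves open: the attribute feature is taken to be a grayscale intensity, adjacent pixels are the 4-neighbours, "the same attribute feature" means equal within a small tolerance `tol`, and third pixels are drawn from the neighbours of the first pixel. The function names are hypothetical, and the sketch performs a single pass only.

```python
from typing import Iterator, List, Tuple


def neighbours(r: int, c: int, h: int, w: int) -> Iterator[Tuple[int, int]]:
    """Yield the in-bounds 4-neighbours of pixel (r, c)."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            yield rr, cc


def find_target_pixels(feat: List[List[float]], labels: List[List[str]],
                       threshold: float, tol: float = 1e-6) -> List[Tuple[int, int, str]]:
    """Return (row, col, suggested_category) for pixels whose category should be adjusted."""
    h, w = len(feat), len(feat[0])
    targets = []
    for r in range(h):
        for c in range(w):
            near = list(neighbours(r, c, h, w))
            # Second pixel: an adjacent pixel whose feature differs by more than the threshold.
            if not any(abs(feat[r][c] - feat[rr][cc]) > threshold for rr, cc in near):
                continue
            # Third pixels: adjacent pixels with the same feature as the first pixel.
            third = [(rr, cc) for rr, cc in near if abs(feat[r][c] - feat[rr][cc]) <= tol]
            differing = [labels[rr][cc] for rr, cc in third if labels[rr][cc] != labels[r][c]]
            if differing:
                targets.append((r, c, differing[0]))
    return targets


feat = [[0.1, 0.1, 0.9, 0.9],
        [0.1, 0.1, 0.9, 0.9]]
# The block boundary sits one pixel too far left: column 1 is dark but labelled "lesion".
labels = [["normal", "lesion", "lesion", "lesion"],
          ["normal", "lesion", "lesion", "lesion"]]
targets = find_target_pixels(feat, labels, threshold=0.5)
for r, c, category in targets:
    labels[r][c] = category  # adjust using the category of same-feature pixels
```

In this example the category boundary inherited from the blocks is shifted by one pixel relative to the intensity edge, and the single pass moves it back onto the edge.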
The image labeling method according to claim 1 or 2, wherein the labeling of the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block comprises:

in the case where the labeling method is classification labeling, determining, based on the category information corresponding to each image block, the number of image blocks corresponding to each category; and

determining the category information of the target image based on the number of image blocks corresponding to each category.
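
One simple way to derive an image-level category from per-category block counts is a majority vote, sketched below. The claim only requires that the decision be based on the counts; choosing the most frequent category is an assumption made for illustration, and `classify_image` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Tuple

Block = Tuple[int, int]  # (block_row, block_col)


def classify_image(categories: Dict[Block, str]) -> Tuple[str, Dict[str, int]]:
    """Count the blocks per category and return the most frequent one with the counts."""
    counts = Counter(categories.values())
    winner, _ = counts.most_common(1)[0]
    return winner, dict(counts)


categories = {(0, 0): "normal", (0, 1): "lesion", (1, 0): "normal", (1, 1): "normal"}
image_category, counts = classify_image(categories)
```

Other count-based rules fit the same interface, for example reporting a category whenever its block count exceeds a fixed minimum.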
The image labeling method according to any one of claims 1 to 6, wherein the method further comprises:

after the target image is labeled, updating the category information of the labeled image blocks in response to a labeling category update instruction for a target image block; and

returning to the step of determining the category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks.
The image labeling method according to claim 2, wherein the neural network is obtained by:

acquiring multiple sample images, and dividing each sample image into blocks to obtain multiple sample image blocks and position information of each sample image block;

determining a first sample image block pair and a second sample image block pair based on the position information of each sample image block, wherein the distance between the two sample image blocks in the first sample image block pair is less than or equal to a set threshold, and the distance between the two sample image blocks in the second sample image block pair is greater than the set threshold;

inputting the first sample image block pair and the second sample image block pair respectively into the neural network to be trained, to obtain predicted encoding information corresponding to each sample image block; and

adjusting network parameters of the neural network to be trained based on the predicted encoding information corresponding to each sample image block, to obtain a trained neural network for encoding.
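
The construction of the two kinds of sample pairs can be sketched as follows. The distance measure is not fixed by the claim; Euclidean distance between hypothetical `(row, col)` block positions is assumed here, and the training step itself (network, loss, parameter update) is left out. `math.dist` requires Python 3.8 or later.

```python
from itertools import combinations
from math import dist
from typing import List, Tuple

Position = Tuple[int, int]  # (row, col) position of a sample image block
Pair = Tuple[Position, Position]


def build_pairs(positions: List[Position], threshold: float) -> Tuple[List[Pair], List[Pair]]:
    """Split all block pairs into first pairs (distance <= threshold) and second pairs."""
    first_pairs: List[Pair] = []
    second_pairs: List[Pair] = []
    for a, b in combinations(positions, 2):
        if dist(a, b) <= threshold:
            first_pairs.append((a, b))
        else:
            second_pairs.append((a, b))
    return first_pairs, second_pairs


positions = [(0, 0), (0, 1), (3, 4)]
first_pairs, second_pairs = build_pairs(positions, threshold=1.5)
```

Here the two adjacent blocks form the only first pair, and each of them forms a second pair with the distant block at (3, 4).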
An image labeling device, comprising:

an image acquisition module, configured to acquire a target image to be labeled;

an image segmentation module, configured to divide the target image into blocks to obtain a plurality of image blocks;

an image encoding module, configured to perform encoding processing on each image block in the plurality of image blocks to obtain encoding information corresponding to each image block;

a first labeling module, configured to perform category labeling on some image blocks in the plurality of image blocks after encoding processing, to obtain labeled image blocks;

a category determination module, configured to determine category information corresponding to each image block based on the encoding information corresponding to each image block and the encoding information of the labeled image blocks; and

a second labeling module, configured to label the target image based on the labeling method corresponding to the target image and the category information corresponding to each image block.
An electronic device, comprising: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the image labeling method according to any one of claims 1 to 8 is executed.
A computer-readable storage medium, on which a computer program is stored, wherein when the computer program is run by a processor, the image labeling method according to any one of claims 1 to 8 is executed.
A computer program, comprising computer-readable code, wherein when the computer-readable code is executed in an electronic device, a processor of the electronic device executes the image labeling method according to any one of claims 1 to 8.