CN111292341A - Image annotation method, image annotation device and computer storage medium


Info

Publication number
CN111292341A
CN111292341A
Authority
CN
China
Prior art keywords
image
segmentation
current
local area
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010078873.XA
Other languages
Chinese (zh)
Other versions
CN111292341B (en)
Inventor
刘杰辰
陈佃文
曹琼
郝玉峰
黄宇凯
李科
Current Assignee
Beijing Speechocean Technology Co ltd
Original Assignee
Beijing Speechocean Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Speechocean Technology Co ltd
Priority to CN202010078873.XA
Publication of CN111292341A
Application granted
Publication of CN111292341B
Legal status: Active
Anticipated expiration

Classifications

    • G06T7/11 — Image analysis; Segmentation; Edge detection; Region-based segmentation
    • G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T7/136 — Segmentation; Edge detection involving thresholding
    • G06T2207/10004 — Image acquisition modality; Still image; Photographic image
    • G06T2207/20221 — Special algorithmic details; Image combination; Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of computer vision, and provides an image annotation method, an image annotation device, and a computer storage medium. The image annotation method comprises the following steps: acquiring an image to be annotated; determining the minimum size of a segmentation region based on the image to be annotated; performing superpixel segmentation on the image to be annotated based on the minimum size to obtain a segmented image; determining a current threshold based on the current segmented image; fusing the segmentation regions of the segmented image by region fusion according to the current threshold to obtain a current fused image; annotating each local region that contains only one target image; and judging whether the current fused image contains an unannotated local region, and completing the annotation of the image to be annotated according to the result. The image annotation method provided by the disclosure reduces the time cost of manual annotation and improves annotation efficiency without compromising annotation accuracy.

Description

Image annotation method, image annotation device and computer storage medium
Technical Field
The present invention relates generally to the field of computer vision technology, and more particularly, to an image annotation method, an image annotation apparatus, and a computer storage medium.
Background
In the field of computer vision, an increasing number of projects involve semantic segmentation, such as autonomous driving and video understanding. As demand for semantic segmentation grows, so do the annotation requirements; full-pixel semantic segmentation annotation, for example, is an annotation task with extremely high added value.
In the related art, because annotation must be accurate to the pixel level, methods such as the magnetic lasso are used; alternatively, the edges of the object to be segmented are traced manually by placing points along them with tools such as Photoshop. Magnetic-lasso-style methods are prone to large errors, while manual tracing is very time-consuming: the accuracy of the final result is proportional to the number of manually placed points, and the annotation accuracy often fails to meet project requirements.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an image annotation method, an image annotation apparatus, and a computer storage medium.
In a first aspect, an embodiment of the present invention provides an image annotation method, including: acquiring an image to be annotated, wherein the image to be annotated comprises one or more target images; determining the minimum size of a segmentation region based on the image to be annotated; performing superpixel segmentation on the image to be annotated based on the minimum size to obtain a segmented image; determining a current threshold based on the current segmented image; fusing the segmentation regions of the segmented image by region fusion according to the current threshold to obtain a current fused image, wherein the fused image comprises one or more local regions; annotating each local region that contains only one target image; and judging whether the current fused image contains an unannotated local region: if so, taking the unannotated local region as the new current segmented image and returning to the step of determining the current threshold based on the current segmented image; if not, completing the annotation of the image to be annotated.
In an embodiment, fusing the segmentation regions of the segmented image by region fusion according to the current threshold to obtain the current fused image includes: acquiring a first average color value of the pixels in each segmentation region of the segmented image; and, according to the current threshold and the first average color values, fusing any two segmentation regions whose first average color values differ by less than the current threshold, to obtain the current fused image.
In one embodiment, determining the current threshold based on the current segmented image comprises: acquiring a second average color value of the pixels of the unannotated region in the current segmented image; and determining the current threshold based on the second average color value.
In another embodiment, annotating a local region that contains only one target image includes: annotating the local region according to the category of that target image.
In one embodiment, determining the minimum size of the segmentation region based on the image to be annotated includes: determining coordinate information of the target image in the image to be annotated through a target detection model; and determining the minimum size of the segmentation region based on the coordinate information.
In another embodiment, the image annotation method further comprises: determining category information of the target image in the image to be annotated through the target detection model; and determining a center coordinate of the target image based on the coordinate information and the category information, wherein the center coordinate corresponds to the category information. Annotating a local region that contains only one target image according to the category of the target image then includes: if the local region contains the center coordinate of exactly one target image, annotating the local region according to the category information corresponding to that center coordinate.
In a second aspect, an embodiment of the present invention provides an image annotation apparatus, including: an acquisition module, configured to acquire an image to be annotated, wherein the image to be annotated comprises one or more target images; a threshold confirmation module, configured to determine the minimum size of a segmentation region based on the image to be annotated and to determine a current threshold based on the current segmented image; an image segmentation module, configured to perform superpixel segmentation on the image to be annotated based on the minimum size to obtain a segmented image; an image fusion module, configured to fuse the segmentation regions of the segmented image by region fusion according to the current threshold to obtain a current fused image, the fused image comprising one or more local regions; an annotation module, configured to annotate each local region that contains only one target image; and a judging module, configured to judge whether the current fused image contains an unannotated local region; when it does, the unannotated local region is taken as the new current segmented image and the threshold confirmation module again performs the step of determining the current threshold based on the current segmented image; when it does not, the annotation of the image to be annotated is complete.
In an embodiment, the image fusion module fuses the segmentation regions of the segmented image by region fusion according to the current threshold in the following manner: acquiring a first average color value of the pixels in each segmentation region of the segmented image; and, according to the current threshold and the first average color values, fusing any two segmentation regions whose first average color values differ by less than the current threshold, to obtain the current fused image.
In one embodiment, the threshold confirmation module determines the current threshold based on the current segmented image in the following manner: acquiring a second average color value of the pixels of the unannotated region in the current segmented image; and determining the current threshold based on the second average color value.
In another embodiment, the annotation module annotates a local region that contains only one target image according to the category of that target image.
In one embodiment, the threshold confirmation module determines the minimum size of the segmentation region based on the image to be annotated by: determining coordinate information of the target image in the image to be annotated through a target detection model; and determining the minimum size of the segmentation region based on the coordinate information.
In another embodiment, the acquisition module is further configured to: determine category information of the target image in the image to be annotated through the target detection model; and determine a center coordinate of the target image based on the coordinate information and the category information, the center coordinate corresponding to the category information. The annotation module then annotates a local region that contains only one target image as follows: if the local region contains the center coordinate of exactly one target image, the local region is annotated according to the category information corresponding to that center coordinate.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory configured to store instructions; and a processor configured to call the instructions stored in the memory to perform any one of the image annotation methods described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, perform any one of the image annotation methods described above.
The image annotation method, image annotation apparatus, and computer storage medium provided by the invention segment the image to be annotated by superpixel segmentation and use the resulting segmented image to assist in annotating the target images it contains. During annotation, multiple rounds of region fusion are performed, with the fusion threshold for the segmentation regions adjusted according to the current segmented image, so that the local regions to be annotated are progressively refined until annotation is complete. This reduces the time cost of manual annotation and improves annotation efficiency without compromising annotation accuracy.
Drawings
The above and other objects, features and advantages of embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 is a schematic diagram illustrating an image annotation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a segmented image provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a fused image provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of another fused image provided by an embodiment of the invention;
FIG. 5 is a schematic diagram illustrating another image annotation method provided by an embodiment of the invention;
FIG. 6 is a schematic diagram illustrating an apparatus for image annotation according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an electronic device provided by an embodiment of the invention;
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way.
It should be noted that although the expressions "first", "second", etc. are used herein to describe different modules, steps, data, etc. of the embodiments of the present invention, the expressions "first", "second", etc. are merely used to distinguish between different modules, steps, data, etc. and do not indicate a particular order or degree of importance. Indeed, the terms "first," "second," and the like are fully interchangeable.
FIG. 1 is a diagram illustrating an image annotation process according to an exemplary embodiment. As shown in fig. 1, the image annotation method 10 includes the following steps S11 to S17.
In step S11, an image to be annotated is acquired.
In the embodiment of the present disclosure, the image to be annotated is an image that needs annotation. It may be acquired by an image acquisition device, or obtained from a local database, the cloud, or a video; the present disclosure does not limit the source. The image to be annotated includes one or more target images to be annotated, and a target image may depict: people, animals, cars, street lights, plants, signs, and the like.
In step S12, the minimum size of the segmentation region is determined based on the image to be annotated.
In the embodiment of the present disclosure, to annotate the target images in the image to be annotated, the image is first segmented into a plurality of segmentation regions of different sizes by superpixel segmentation, and effective information, such as texture and color information, is then extracted from the contents of the segmentation regions. When the superpixel segmentation regions are too small, their number becomes huge, fragmented, and difficult to work with; when they are too large, each block is intact but details are easily lost and the regions cannot fit the edges of the target objects. Therefore, to segment the image to be annotated reasonably, the minimum size of the segmentation region is determined from factors such as the size of the image to be annotated and the sizes of the target images within it, and superpixel segmentation is then performed accordingly.
In an embodiment, the minimum size of the segmentation region is determined from the size each target image occupies in the image to be annotated. In one example, a target detection model yields the coordinate information of the rectangular region containing each target image, from which the target image's size is obtained. Determining the minimum segmentation size from the target image size helps distinguish multiple target images in the image to be annotated and keeps the number of superpixel segmentation regions under control, so the target images can be annotated quickly. In another example, the superpixel segmentation size is set manually according to project requirements and any one target image in the image to be annotated, allowing the image to be segmented quickly.
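As a rough illustration of this step, the sketch below derives a minimum segmentation size from detected bounding boxes. The function name, the box format, and the `fraction` heuristic are assumptions for illustration; the patent does not specify an exact formula.

```python
def min_segment_size(boxes, fraction=0.05):
    """Estimate a minimum superpixel size (in pixels) from detected
    target bounding boxes given as (x0, y0, x1, y1) tuples.

    Scaling the smallest target's area by `fraction` (a hypothetical
    tuning knob) keeps segments small enough to follow that target's
    edges while bounding the total number of regions."""
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    return max(1, int(min(areas) * fraction))
```

For example, with targets of 10×10 and 100×100 pixels, this sketch yields a minimum size of 5 pixels.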
In step S13, the image to be annotated is subjected to superpixel segmentation based on the minimum size, to obtain a segmented image.
In the embodiment of the present disclosure, a superpixel segmentation algorithm such as Simple Linear Iterative Clustering (SLIC) or the Felzenszwalb algorithm may be used to perform superpixel segmentation on the image to be annotated. According to the determined minimum size, superpixel segmentation divides the image into a plurality of segmentation regions of different sizes, yielding the segmented image. This narrows the annotation range for the target images and allows them to be annotated quickly during region fusion. In one implementation scenario, shown in fig. 2, the Felzenszwalb algorithm, a graph-based superpixel segmentation method, is used; during segmentation it preserves edge information and target-image details in the image to be annotated, which helps improve annotation accuracy.
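In practice this step would call an off-the-shelf implementation such as `skimage.segmentation.slic` or `skimage.segmentation.felzenszwalb`. The dependency-free stand-in below merely tiles the image into blocks of side `min_size` to show the shape of the output (a per-pixel label map); it is not a real superpixel algorithm, only an illustration of the interface assumed here.

```python
def block_segments(height, width, min_size):
    """Toy stand-in for superpixel segmentation: tile the image into
    square blocks of side `min_size` and return a label map as a list
    of lists, one integer region label per pixel. Real code would use
    skimage.segmentation.slic or skimage.segmentation.felzenszwalb."""
    labels = [[0] * width for _ in range(height)]
    blocks_per_row = (width + min_size - 1) // min_size
    for y in range(height):
        for x in range(width):
            labels[y][x] = (y // min_size) * blocks_per_row + (x // min_size)
    return labels
```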
In step S14, a current threshold is determined based on the current segmented image.
In the embodiment of the disclosure, the choice of the current threshold is the main factor in segmentation-region fusion: when the current threshold is too large, segmentation regions containing target-image details are easily lost; when it is too small, the time cost of annotation is not reduced. The method therefore uses multiple rounds of region fusion: the current threshold is determined from the current segmented image, and the threshold for the next round is determined from the annotation result of the fused image, until all target images in the image to be annotated are annotated. Determining the current fusion threshold from the differing pixel values of the segmentation regions in the current segmented image helps distinguish the positions of different target images and thus facilitates their annotation.
In an embodiment, a relatively high value may be used as the current threshold for the first round of region fusion, fusing larger segmentation regions with similar pixel colors. In subsequent rounds, the current threshold is gradually reduced so that smaller segmentation regions are fused and the details of the target images are refined, until the annotation of the image to be annotated is complete; this improves annotation accuracy.
In another embodiment, the current threshold may be determined from a second average color value, i.e., the average color value of all pixels in the unannotated region of the current segmented image; this facilitates the rapid fusion of segmentation regions with similar pixels.
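One way to realize the shrinking per-round threshold is sketched below. The `base_scale` and `decay` knobs are invented for illustration; the patent only requires that the threshold be derived from the unannotated pixels' mean color and decrease over rounds.

```python
def current_threshold(unlabeled_pixels, round_index, base_scale=0.5, decay=0.5):
    """Derive this round's fusion threshold from the mean color of the
    still-unannotated pixels (the patent's 'second average color value').

    The threshold starts high and shrinks geometrically with each
    fusion round, so coarse regions merge first and fine detail is
    refined later. `base_scale` and `decay` are hypothetical knobs."""
    mean = sum(unlabeled_pixels) / len(unlabeled_pixels)
    return mean * base_scale * decay ** round_index
```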
In step S15, the segmented regions of the segmented image are fused by region fusion according to the current threshold, so as to obtain a current fused image.
In the embodiment of the present disclosure, the fused image includes one or more local regions; each local region is formed by fusing several segmentation regions and may contain one or more target images. As shown in fig. 3, fusing the segmentation regions of the current segmented image by region fusion according to the current threshold produces a fused image whose local regions differ clearly in pixel color, which makes the target images convenient to annotate.
In one embodiment, a first average color value, i.e., the average color value of all pixels in a segmentation region, is obtained for each segmentation region of the segmented image. According to the determined current threshold and these first average color values, any two segmentation regions whose first average color values differ by less than the current threshold are fused, forming one or more local regions and yielding the current fused image.
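The fusion rule just described can be sketched as a union-find merge over pairs of adjacent segmentation regions. The adjacency list, scalar color values, and function names here are illustrative assumptions, not the patent's exact implementation.

```python
def fuse_regions(region_means, adjacency, threshold):
    """Merge segmentation regions whose average color values (the
    'first average color values') differ by less than `threshold`.

    region_means: one mean color per region, indexed by region id.
    adjacency: pairs (a, b) of region ids that touch in the image.
    Returns the root region id for every region after fusion."""
    parent = list(range(len(region_means)))

    def find(i):
        # Path-halving union-find lookup.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in adjacency:
        if abs(region_means[a] - region_means[b]) < threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra
    return [find(i) for i in range(len(region_means))]
```

Regions sharing a root id form one local region of the fused image.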
In step S16, each local region that contains only one target image is annotated.
In the embodiment of the disclosure, to improve annotation accuracy, the fused image is annotated according to the number of target images each local region contains: a local region containing only one target image is annotated, while a local region containing two or more target images is left unannotated to preserve accuracy. In an embodiment, an annotation style is preset for each category, for example, a different color per category; when a local region is annotated, the style corresponding to the category of its target image is used.
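A minimal sketch of this labeling rule, assuming each detected target contributes one (center pixel, category) pair and that a per-pixel region-label map is available; the names and data shapes are hypothetical.

```python
from collections import defaultdict

def label_regions(region_of_pixel, target_centers):
    """Annotate only fused local regions containing exactly one target.

    region_of_pixel: label map, region_of_pixel[y][x] -> region id.
    target_centers: list of ((y, x), category) detections.
    Returns {region id: category} for single-target regions; regions
    with zero or multiple targets stay unannotated for the next round."""
    hits = defaultdict(list)
    for (y, x), category in target_centers:
        hits[region_of_pixel[y][x]].append(category)
    return {r: cats[0] for r, cats in hits.items() if len(cats) == 1}
```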
In step S17, it is determined whether the current fused image contains an unannotated local region.
In the embodiment of the present disclosure, after the local regions are annotated, whether to perform the next round of region fusion is decided according to whether the current fused image still contains an unannotated local region, and the annotation of the image to be annotated is completed accordingly. If the current fused image contains an unannotated local region, that region is taken as the new current segmented image, the method returns to the step of determining the current threshold based on the current segmented image, and the next round of region fusion refines the annotation details, yielding the clear fused target-image schematic shown in fig. 4. In one example, the unannotated local region used as the new current segmented image may be selected manually. If the current fused image contains no unannotated local region, the annotation of the image to be annotated is complete.
According to the above embodiment, the image to be annotated is segmented by superpixel segmentation, and the resulting segmented image assists in annotating the target images it contains. During annotation, multiple rounds of region fusion progressively refine the local regions to be annotated until annotation is complete, which reduces the time cost of manual annotation and improves annotation efficiency without compromising accuracy. The method is convenient to operate, simplifies the training and work of project personnel, and reduces overall annotation cost.
FIG. 5 is a schematic diagram illustrating another image annotation process according to an exemplary embodiment. As shown in fig. 5, the image annotation method 20 includes the following steps S21 to S29.
In the embodiment of the present disclosure, step S21 and steps S25 to S29 are implemented in the same way as in the image annotation method 10, and are not described again here.
In step S21, an image to be annotated is acquired.
In step S22, the category information of the target image in the image to be annotated is determined by the target detection model.
In the embodiment of the disclosure, to determine the target images in the image to be annotated conveniently and quickly, the image may be passed through a target detection model, which determines the target images and their category information. This identifies the target images to be annotated more efficiently and thus improves the annotation efficiency of the image to be annotated.
In step S23, the coordinate information of the target image in the image to be annotated is determined by the target detection model, and the minimum size of the segmentation region is determined based on the coordinate information.
In the embodiment of the present disclosure, to reduce manual operations during annotation, the minimum size of the segmentation region is determined through the target detection model. Target detection yields the coordinate information of the rectangular region containing each target image, and the minimum segmentation size is then determined from this coordinate information. This helps distinguish multiple target images in the image to be annotated, keeps the number of superpixel segmentation regions reasonable, and makes the target images convenient to annotate quickly.
In step S24, the center coordinates of the target image are determined based on the coordinate information and the category information.
In the embodiment of the disclosure, the center coordinate of each target image is determined from the coordinate information of its rectangular region, and the correspondence between that center coordinate and the target image's category information is recorded. The annotation style corresponding to the category information can then be looked up from the center coordinate, so the target image can be annotated quickly and annotation efficiency improves.
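Steps S22 to S24 can be sketched as follows, assuming the detector returns hypothetical (x0, y0, x1, y1, category) tuples; the center of each rectangle is paired with its category for the later lookup.

```python
def target_centers(detections):
    """Compute each detected target's center coordinate and pair it
    with its category. `detections` are hypothetical
    (x0, y0, x1, y1, category) tuples from a target detection model;
    the midpoint of the rectangle serves as the center coordinate."""
    return [(((x0 + x1) / 2, (y0 + y1) / 2), cat)
            for x0, y0, x1, y1, cat in detections]
```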
In step S25, the image to be annotated is subjected to superpixel segmentation based on the minimum size, to obtain a segmented image.
In step S26, a current threshold is determined based on the current segmented image.
In step S27, the segmented regions of the segmented image are fused by region fusion according to the current threshold, so as to obtain a current fused image.
In step S28, if a local region contains the center coordinate of exactly one target image, the local region is annotated according to the category information corresponding to that center coordinate.
In the embodiment of the present disclosure, during the first round of local-region annotation, when a local region contains the center coordinate of exactly one target image, it contains target images of only one category and is annotated according to the category information corresponding to that center coordinate; when it contains the center coordinates of two or more target images, it contains multiple categories of target images and is left unannotated to preserve accuracy. Performing the first round of local-region annotation from center coordinates and category information makes it convenient to distinguish each target image in the image to be annotated quickly. When a local region, whether in the first round or a later one, contains no target-object center coordinate, it may be annotated as in step S16 of the image annotation method 10, improving annotation accuracy.
In step S29, it is determined whether the current fused image contains an unannotated local region.
Through the above embodiment, the target detection model supplies the target images in the image to be annotated together with their category and coordinate information, reducing manual intervention and saving manual annotation time; annotation efficiency improves while the accuracy of local-region annotation is unaffected.
FIG. 6 is a schematic diagram illustrating an image annotation device in accordance with an exemplary embodiment. As shown in fig. 6, the image annotation apparatus 100 includes the following modules:
an obtaining module 110, configured to obtain an image to be annotated, where the image to be annotated includes one or more target images;
a threshold confirmation module 120, configured to determine a minimum size of the segmentation regions based on the image to be annotated, and to determine a current threshold based on the current segmentation image;
an image segmentation module 130, configured to perform superpixel segmentation on the image to be annotated based on the minimum size, to obtain a segmentation image;
an image fusion module 140, configured to fuse the segmentation regions of the segmentation image in a region fusion manner according to the current threshold, to obtain a current fused image, where the fused image includes one or more local areas;
a labeling module 150, configured to label each local area that includes only one target image;
a judging module 160, configured to judge whether the current fused image includes an unmarked local area;
when the current fused image includes an unmarked local area, the unmarked local area is taken as a new current segmentation image, and processing returns to the threshold confirmation module 120 to execute the step of determining the current threshold based on the current segmentation image;
and when the current fused image does not include an unmarked local area, the labeling of the image to be labeled is finished.
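The cooperation of the modules above (segment once, then alternate threshold determination, fusion, and labeling until no unmarked local area remains) can be sketched as the following control loop. This is a minimal illustration, not the patented implementation; every callable passed in is a stand-in assumption, and the `max_rounds` guard is added here purely to bound the loop.

```python
def annotate(image, segment, threshold_of, fuse, label_one_target, max_rounds=10):
    """Control loop mirroring modules 110-160: segment the image once,
    then repeatedly determine a per-round threshold, fuse regions, label
    the regions that can be labeled, and re-enter with the still-unlabeled
    area until none remains (or max_rounds is reached)."""
    current = segment(image)                      # module 130: superpixel segmentation
    annotations = {}
    for _ in range(max_rounds):
        t = threshold_of(current)                 # module 120: per-round threshold
        fused = fuse(current, t)                  # module 140: region fusion
        annotations.update(label_one_target(fused))  # module 150: labeling
        unlabeled = [r for r in fused if r not in annotations]
        if not unlabeled:                         # module 160: judgment
            break
        current = unlabeled                       # unlabeled area becomes new input
    return annotations
```

Because the threshold is recomputed from the current (shrinking) unlabeled area on every pass, each round adapts to the remaining content rather than reusing a single global threshold.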
In an embodiment, the image fusion module 140 fuses the segmentation regions of the segmentation image in a region fusion manner according to the current threshold, to obtain the current fused image, in the following manner: acquiring a first average color value of the pixels in each segmentation region of the segmentation image; and fusing, according to the current threshold and the first average color values, any two corresponding segmentation regions whose first average color values differ by less than the current threshold, to obtain the current fused image.
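A minimal sketch of such threshold-based region fusion follows, assuming grayscale mean color values and an explicit adjacency list (both are representation assumptions; the patent does not fix a data layout). A union-find structure groups regions transitively once neighboring means differ by less than the threshold.

```python
def fuse_regions(region_means, adjacency, threshold):
    """region_means: dict region_id -> first average color value (float).
    adjacency: iterable of (id_a, id_b) neighboring-region pairs.
    Merges neighbors whose mean-color difference is below threshold and
    returns the fused groups as lists of original region ids."""
    parent = {r: r for r in region_means}

    def find(r):
        # union-find lookup with path halving
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in adjacency:
        if abs(region_means[a] - region_means[b]) < threshold:
            parent[find(a)] = find(b)  # fuse the two regions

    groups = {}
    for r in region_means:
        groups.setdefault(find(r), []).append(r)
    return list(groups.values())
```

For example, two regions with means 10.0 and 12.0 fuse under a threshold of 5.0, while a third region with mean 50.0 stays separate.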
In one embodiment, the threshold confirmation module 120 determines the current threshold based on the current segmentation image in the following manner: acquiring a second average color value of the pixels of the unmarked area in the current segmentation image; and determining the current threshold based on the second average color value.
In another embodiment, the labeling module 150 labels each local area that includes only one target image in the following manner: labeling the local area according to the category of the target image it contains.
In one embodiment, the threshold confirmation module 120 determines the minimum size of the segmentation regions based on the image to be annotated in the following manner: determining coordinate information of the target images in the image to be annotated through a target detection model; and determining the minimum size of the segmentation regions based on the coordinate information.
In another embodiment, the obtaining module 110 is further configured to: determine category information of the target images in the image to be annotated through the target detection model; and determine the center coordinate of each target image based on the coordinate information and the category information, where the center coordinate corresponds to the category information. The labeling module labels each local area that includes only one target image, according to the category of the target image, in the following manner: if the local area includes the center coordinate of only one target image, labeling the local area according to the category information corresponding to that center coordinate.
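The two detector-driven steps above (deriving the minimum segmentation size from coordinate information, and deriving category-tagged center coordinates) can be sketched together. The patent does not state the exact mapping from bounding boxes to the minimum size; taking the smallest box side, so that no target spans fewer than one segment, is one plausible reading and is an assumption here, as are the function name and box format.

```python
def seed_from_detections(boxes):
    """boxes: list of (x1, y1, x2, y2, category) tuples from the detector.
    Returns (min_size, centers): min_size is the smallest box side
    (an assumed heuristic for the minimum segmentation-region size),
    and centers pairs each box midpoint with its category."""
    min_size = min(min(x2 - x1, y2 - y1) for x1, y1, x2, y2, _ in boxes)
    centers = [(((x1 + x2) / 2, (y1 + y2) / 2), cat)
               for x1, y1, x2, y2, cat in boxes]
    return min_size, centers
```

The returned centers are exactly the per-category seed points consumed by the first-round labeling rule of step S28.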
The functions implemented by the modules of the apparatus correspond to the steps of the method described above; for specific implementations and technical effects, refer to the description of the method steps above, which is not repeated here.
Referring to FIG. 7, an embodiment of the invention provides an electronic device 200. The electronic device 200 includes a memory 210, a processor 220, and an input/output (I/O) interface 230. The memory 210 is used for storing instructions, and the processor 220 is used for calling the instructions stored in the memory 210 to execute the image annotation method of the embodiment of the invention. The processor 220 is connected to the memory 210 and the I/O interface 230, respectively, for example via a bus system and/or another connection mechanism (not shown). The memory 210 may be used to store programs and data, including a program for image annotation according to an embodiment of the present invention; the processor 220 implements various functional applications and data processing of the electronic device 200 by executing the programs stored in the memory 210.
In an embodiment of the present invention, the processor 220 may be implemented in at least one hardware form of a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA), and the processor 220 may be a central processing unit (CPU), or a combination of one or more other processing units with data processing capability and/or instruction execution capability.
The memory 210 in embodiments of the present invention may comprise one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
In the embodiment of the present invention, the I/O interface 230 may be used to receive input instructions (e.g., numeric or character information) and generate key-signal inputs related to user settings and function control of the electronic device 200, and may also output various information (e.g., images or sounds) to the outside. The I/O interface 230 may include one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a mouse, a joystick, a trackball, a microphone, a speaker, a touch panel, and the like.
In some embodiments, the invention provides a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, perform any of the methods described above.
Although operations may be depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in that particular order or in sequential order, or that all depicted operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous.
The methods and apparatus of the present invention can be implemented with standard programming techniques, using rule-based logic or other logic to accomplish the various method steps. It should also be noted that the words "means" and "module," as used herein and in the claims, are intended to encompass implementations using one or more lines of software code, and/or hardware implementations, and/or equipment for receiving inputs.
Any of the steps, operations, or processes described herein may be performed or implemented using one or more hardware or software modules, alone or in combination with other devices. In one embodiment, a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor to perform any or all of the described steps, operations, or processes.
The foregoing description of the implementation of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. The embodiments were chosen and described in order to explain the principles of the invention and its practical application to enable one skilled in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated.

Claims (14)

1. An image annotation method, characterized in that the method comprises:
acquiring an image to be annotated, wherein the image to be annotated comprises one or more target images;
determining the minimum size of a segmentation region based on the image to be annotated;
performing super-pixel segmentation on the image to be marked based on the minimum size to obtain a segmented image;
determining a current threshold based on the current segmented image;
fusing the segmentation areas of the segmentation image in an area fusion mode according to the current threshold value to obtain a current fusion image, wherein the fusion image comprises one or more local areas;
labeling a local area which only comprises one target image in the local area;
judging whether the current fused image comprises an unmarked local area or not, if the current fused image comprises the unmarked local area, taking the unmarked local area as a new current segmentation image, and returning to the step of determining the current threshold value based on the current segmentation image; and if the current fusion image does not comprise the unmarked local area, finishing the marking of the image to be marked.
2. The method according to claim 1, wherein the fusing the segmentation regions of the segmentation image by using a region fusion method according to the current threshold to obtain a current fused image comprises:
acquiring a first average color value of pixels in each segmentation area in the segmentation image;
and fusing, in a region fusion manner according to the current threshold and the first average color values, two corresponding segmentation areas of the segmentation image whose first average color values differ by less than the current threshold, to obtain a current fused image.
3. The method of claim 1, wherein determining the current threshold based on the current segmented image comprises:
acquiring a second average color value of pixels of an unmarked area in the current segmentation image;
determining the current threshold based on the second average color value.
4. The method according to any one of claims 1 to 3,
the labeling of the local region including only one target image in the local region includes:
and labeling the local area which only comprises one target image in the local area according to the category of the target image.
5. The method according to claim 4, wherein the determining the minimum size of the segmentation region based on the image to be annotated comprises:
determining coordinate information of a target image in the image to be marked through a target detection model;
based on the coordinate information, a minimum size of the segmented region is determined.
6. The method of claim 5,
the method further comprises the following steps:
determining the category information of the target image in the image to be annotated through the target detection model;
determining center coordinates of the target image based on the coordinate information and the category information, wherein the center coordinates correspond to the category information;
the labeling, according to the category of the target image, a local area of the local area that only includes one target image includes:
if the local area only comprises the center coordinate of one target image, labeling the local area according to the category information corresponding to the center coordinate.
7. An image annotation apparatus, characterized in that the apparatus comprises:
the system comprises an acquisition module, a display module and a display module, wherein the acquisition module is used for acquiring an image to be annotated, and the image to be annotated comprises one or more target images;
the threshold confirming module is used for determining the minimum size of the segmentation region based on the image to be annotated and determining the current threshold based on the current segmentation image;
the image segmentation module is used for performing super-pixel segmentation on the image to be marked based on the minimum size to obtain a segmented image;
the image fusion module is used for fusing the segmentation areas of the segmentation image in an area fusion mode according to the current threshold value to obtain a current fusion image, and the fusion image comprises one or more local areas;
the labeling module is used for labeling a local area which only comprises one target image in the local area;
the judging module is used for judging whether the current fusion image comprises an unmarked local area;
when the current fusion image comprises an unmarked local area, taking the unmarked local area as a new current segmentation image, and returning to a threshold confirmation module to execute the step of determining the current threshold based on the current segmentation image;
and when the current fused image does not comprise the unmarked local area, finishing the marking of the image to be marked.
8. The apparatus according to claim 7, wherein the image fusion module fuses the segmentation regions of the segmentation image in a region fusion manner according to the current threshold to obtain a current fusion image in the following manner:
acquiring a first average color value of pixels in each segmentation area in the segmentation image;
and fusing, in a region fusion manner according to the current threshold and the first average color values, two corresponding segmentation areas of the segmentation image whose first average color values differ by less than the current threshold, to obtain a current fused image.
9. The apparatus of claim 7, wherein the threshold confirmation module determines the current threshold based on the current segmented image by:
acquiring a second average color value of pixels of an unmarked area in the current segmentation image;
determining the current threshold based on the second average color value.
10. The apparatus according to any one of claims 7 to 9,
the labeling module labels the local area which only comprises one target image in the local area in the following mode:
and labeling the local area which only comprises one target image in the local area according to the category of the target image.
11. The apparatus of claim 10, wherein the threshold confirmation module determines the minimum size of the segmented region based on the image to be annotated by:
determining coordinate information of a target image in the image to be marked through a target detection model;
based on the coordinate information, a minimum size of the segmented region is determined.
12. The apparatus of claim 11,
the acquisition module is further configured to:
determining the category information of the target image in the image to be annotated through the target detection model; and determining center coordinates of the target image based on the coordinate information and the category information, wherein the center coordinates correspond to the category information;
the labeling module labels the local area which only comprises one target image in the local area according to the category of the target image by adopting the following mode:
if the local area only comprises the center coordinate of one target image, labeling the local area according to the category information corresponding to the center coordinate.
13. An electronic device, wherein the electronic device comprises:
a memory to store instructions; and
a processor for invoking the instructions stored by the memory to perform the image annotation method of any one of claims 1 to 6.
14. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions that, when executed by a processor, perform the image annotation method of any one of claims 1 to 6.
CN202010078873.XA 2020-02-03 2020-02-03 Image annotation method, image annotation device and computer storage medium Active CN111292341B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010078873.XA CN111292341B (en) 2020-02-03 2020-02-03 Image annotation method, image annotation device and computer storage medium


Publications (2)

Publication Number Publication Date
CN111292341A true CN111292341A (en) 2020-06-16
CN111292341B CN111292341B (en) 2023-01-03

Family

ID=71020019



Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829449A (en) * 2019-03-08 2019-05-31 北京工业大学 A kind of RGB-D indoor scene mask method based on super-pixel space-time context
US10321728B1 (en) * 2018-04-20 2019-06-18 Bodygram, Inc. Systems and methods for full body measurements extraction
CN110570352A (en) * 2019-08-26 2019-12-13 腾讯科技(深圳)有限公司 image labeling method, device and system and cell labeling method


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112654999A (en) * 2020-07-21 2021-04-13 华为技术有限公司 Method and device for determining labeling information
CN112654999B (en) * 2020-07-21 2022-01-28 华为技术有限公司 Method and device for determining labeling information
CN113838061A (en) * 2021-07-28 2021-12-24 中科云谷科技有限公司 Method and device for image annotation and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant