WO2021082507A1 - Semi-automated image segmentation and data annotation method, electronic device and storage medium - Google Patents
- Publication number
- WO2021082507A1 (PCT/CN2020/100347)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- target area
- outermost
- coordinate
- edge
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/181—Segmentation; Edge detection involving edge growing; involving edge linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Definitions
- the present invention relates to the technical field of image processing, in particular to a semi-automatic image segmentation data labeling method, electronic device and storage medium.
- Deep learning-based image segmentation algorithms are widely used in applications that require precise identification of the category and location of objects, such as garbage classification systems, autonomous driving, and processing defect detection systems. However, deep learning-based image segmentation algorithms require a large amount of manually labeled data for training. At present, the main method of producing image segmentation data is to observe the edge of the target object with the naked eye, judge it manually, and click and label point by point with the mouse.
- the present invention aims to solve at least one of the technical problems existing in the prior art. For this reason, the present invention provides a semi-automatic image segmentation data labeling method, electronic device and storage medium, which can reduce the number of manual mouse clicks during image labeling and improve labeling efficiency.
- an embodiment of the present invention provides a semi-automatic image segmentation data labeling method, including the following steps:
- Step S1 Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
- Step S2 Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
- Step S3 Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
- Step S4 Establish a target seed gray value histogram based on the target area seed point set;
- Step S5 Establish a background seed gray value histogram based on the background area seed point set;
- Step S6 Establish an undirected graph representing the image to be labeled, construct an energy function, perform image segmentation on the image to be labeled based on the minimum cut algorithm, and obtain a binarized image after the image to be labeled is segmented,
- where the binarized image includes a predicted target area and a predicted background area;
- Step S7 Obtain the outermost coordinate points of the predicted target area on the binarized image based on an edge tracking algorithm, and generate an edge coordinate set from a plurality of the outermost coordinate points;
- Step S8 Generate a contour of the prediction target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
- Step S9 Determine whether a complete selection instruction is received, if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
- a semi-automatic image segmentation data labeling method has at least the following beneficial effects: based on the multiple first pixel coordinates located in the real target area and the multiple second pixel coordinates located in the real background area, both selected by the user on the image to be labeled, the method uses the energy function to judge whether each pixel in the image to be labeled belongs to the real target area or the real background area, outputs the outermost coordinates of the predicted target area, and then lets the user input the category information corresponding to the predicted target area to complete the image labeling task. Therefore, the semi-automatic image segmentation data labeling method provided in this embodiment greatly reduces the number of mouse clicks during image labeling, lowers the cost of manual labeling, and speeds up manual labeling.
- step S7 "generating a plurality of the outermost coordinate points into an edge coordinate set" in the step S7 further includes the following steps:
- Step S7.1 Create a set A of all the outermost coordinate points, establish a set A', add any one of the outermost coordinate points p 0 in the set A to the set A', and add all the outermost coordinate points in the set A except
- the outermost coordinate points other than p 0 establish a set A 2 , and the first marked coordinate point p is set to p 0 ;
- Step S7.2 Determine whether the number of elements in the set A 2 is zero, if not, execute step S7.2a, if yes, execute step S7.2b;
- Step S7.2a Calculate the distance d between all the outermost coordinate points in the set A 2 and the first mark coordinate point p, and set the first mark coordinate point p to the minimum value of the distance d in the set A 2
- the outermost coordinate point, the outermost coordinate point corresponding to the minimum distance d in the set A 2 is added to the set A'and deleted from the set A 2 , and the step S7.2 is returned;
- Step S7.2b Sort the outermost coordinate points according to the order in which the outermost coordinate points are added to the set A';
- Step S7.3 Establish an edge coordinate set, add p0 to the edge coordinate set and delete it from the set A', and set the second marked coordinate point p' to p0;
- Step S7.4 Determine whether the number of elements in the set A' is one; if not, execute step S7.4a, if yes, execute step S7.4b;
- Step S7.4a Determine whether the second marked coordinate point p' and the two outermost coordinate points ranked in the first two positions in the set A' are collinear; if yes, execute step S7.4a1, if not, execute step S7.4a2;
- Step S7.4a1 Delete the outermost coordinate point ranked first in the set A' from the set A', and return to step S7.4;
- Step S7.4a2 Set the second marked coordinate point p' to the outermost coordinate point ranked first in the set A', add that outermost coordinate point to the edge coordinate set and delete it from the set A', and return to step S7.4;
- Step S7.4b Add the outermost coordinate points in the set A' to the edge coordinate set, and output the edge coordinate set.
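The two parts of the procedure above, nearest-neighbor ordering (steps S7.1 to S7.2b) and collinear middle-point removal (steps S7.3 to S7.4b), can be sketched as follows. This is a simplified reading of the steps, not the claimed implementation; `order_and_simplify` and its choice of starting point are illustrative assumptions.

```python
import math

def order_and_simplify(points):
    """Order boundary points by repeated nearest-neighbor search
    (steps S7.1-S7.2b), then drop points collinear with their
    neighbors (steps S7.3-S7.4b)."""
    # Part 1: nearest-neighbor ordering starting from an arbitrary p0
    # (here simply the first point in the input list).
    remaining = list(points)
    p = remaining.pop(0)            # p0
    ordered = [p]
    while remaining:
        nxt = min(remaining, key=lambda q: math.dist(p, q))
        remaining.remove(nxt)
        ordered.append(nxt)
        p = nxt
    # Part 2: remove the middle point of every collinear triple,
    # keeping only the first and last points of each straight run.
    simplified = [ordered[0]]
    for q in ordered[1:]:
        if len(simplified) >= 2:
            a, b = simplified[-2], simplified[-1]
            # cross product of (b - a) and (q - a); zero means collinear
            cross = (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0])
            if cross == 0:
                simplified.pop()    # a, b, q are collinear: drop b
        simplified.append(q)
    return simplified
```

For a square traced by eight edge pixels, the straight-run midpoints are dropped, leaving the corners (plus the final point, which the procedure adds without a collinearity check).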
- step S8 further includes:
- Step S8.4 Perform shadow processing on the prediction target area on the image to be labeled.
- an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and capable of running on the processor.
- when the processor executes the program, it implements the semi-automatic image segmentation data labeling method described in any one of the first aspect of the present invention.
- since the electronic device of an embodiment of the present invention executes the semi-automatic image segmentation data labeling method according to any one of the first aspect of the present invention, it has all the beneficial effects of the first aspect of the present invention.
- an embodiment of the present invention provides a computer-readable storage medium that stores computer-executable instructions, wherein the computer-executable instructions are used to execute the semi-automatic image segmentation data labeling method according to any one of the first aspect of the present invention.
- since the computer-readable storage medium of the embodiment of the present invention stores computer-executable instructions for executing the semi-automated image segmentation data labeling method according to any one of the first aspect of the present invention, it has all the beneficial effects of the first aspect of the present invention.
- Fig. 1 is a flowchart of a method according to a second embodiment of the present invention.
- FIG. 2 is an effect diagram of a semi-automatic image segmentation data labeling method according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of the principle of the graph segmentation algorithm according to the second embodiment of the present invention.
- FIG. 4 is a schematic diagram of the structure of the electronic device according to the first embodiment of the present invention.
- the electronic device 100 includes the processor 101 and the memory 102.
- the existing data labeling method is to use open source image segmentation data labeling tools (such as Labelme, LabelImg, etc.) to mark and plot the edges of the target image (clicking the mouse to draw the edge of the target object in the image; dotting and plotting a complex target object may require hundreds of mouse clicks), then separate the target image from the background, that is, segment the image to obtain the target image, and finally perform data annotation on the obtained target image. If the shape of the target image is complex, it may be necessary to plot hundreds of points, that is, the number of mouse clicks may be as high as hundreds of times, which fatigues the eyes of the data annotator.
- the hand of the data annotator will also experience fatigue and soreness due to long-time mouse clicks.
- Such a data labeling method is inefficient.
- the image segmentation time will be very long, which will seriously affect the speed of data labeling, resulting in low data labeling efficiency.
- the present invention provides a semi-automatic image segmentation data labeling method, electronic device and storage medium.
- the user only needs to use the mouse to draw two strokes on the image to be labeled.
- the present invention can reduce the number of mouse clicks when the annotator marks data, and allows the annotator to choose whether to accept the result of each automatic segmentation by inputting a "full selection instruction", which not only improves the efficiency of annotation but also ensures the precision of annotation.
- an electronic device 100 provided by the first embodiment of the present invention includes a memory 102 and a processor 101.
- a processor 101 and a memory 102 are taken as an example.
- the processor and the memory may be connected through a bus or in other ways.
- the connection through a bus is taken as an example.
- the memory 102 can be used to store non-transitory software programs and non-transitory computer-executable programs.
- the memory 102 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
- the memory 102 may optionally include memory remotely located with respect to the processor 101, and these remote memories may be connected to the electronic device 100 via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
- the device structure shown in FIG. 4 does not constitute a limitation on the electronic device 100, and may include more or fewer components than shown in the figure, or a combination of certain components, or a different component arrangement.
- the processor 101 in the electronic device 100 can be used to call the semi-automatic image segmentation data labeling method stored in the memory 102, And perform the following steps:
- Step S1 Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
- Step S2 Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
- Step S3 Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
- Step S4 Establish a target seed gray value histogram based on the target area seed point set;
- Step S5 Establish a background seed gray value histogram based on the background area seed point set;
- Step S6 Establish an undirected graph representing the image to be labeled, construct an energy function, perform image segmentation on the image to be labeled based on the minimum cut algorithm, and obtain a binarized image after the image to be labeled is segmented,
- where the binarized image includes a predicted target area and a predicted background area;
- Step S7 Obtain the outermost coordinate points of the predicted target area on the binarized image based on an edge tracking algorithm, and generate an edge coordinate set from a plurality of the outermost coordinate points;
- Step S8 Generate a contour of the prediction target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
- Step S9 Determine whether a complete selection instruction is received, if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
- as shown in FIG. 1, a semi-automatic image segmentation data labeling method according to a second embodiment of the present invention includes the following steps:
- Step S1 Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
- Step S2 Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
- Step S3 Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
- Step S4 Establish a target seed gray value histogram based on the target area seed point set;
- Step S5 Establish a background seed gray value histogram based on the background area seed point set;
- Step S6 Establish an undirected graph representing the image to be labeled, construct an energy function, perform image segmentation on the image to be labeled based on the minimum cut algorithm, and obtain a binarized image after the image to be labeled is segmented,
- where the binarized image includes a predicted target area and a predicted background area;
- Step S7 Obtain the outermost coordinate points of the predicted target area on the binarized image based on an edge tracking algorithm, and generate an edge coordinate set from a plurality of the outermost coordinate points;
- Step S8 Generate a contour of the prediction target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
- Step S9 Determine whether a complete selection instruction is received, if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
- the image to be annotated can be displayed to the annotator through a PC computer screen or a tablet computer screen, and the image to be annotated includes a real target area and a real background area.
- the annotator judges that the football belongs to the real target area and the lawn belongs to the real background area.
- the method of obtaining the first pixel coordinates is for the labeler to press and hold the left mouse button on the PC and drag within the real target area; the pixels of the real target area that the mouse drags through are taken as the first pixel coordinates.
- the first pixel coordinates can be displayed in the real target area. After the first pixel coordinates are obtained, the second pixel coordinates are obtained in the real background area in the same manner.
- the graph here is slightly different from an ordinary graph.
- An ordinary graph is composed of vertices and edges. If the edges are directional, such a graph is called a directed graph; otherwise it is an undirected graph. The edges are weighted, and different edges can have different weights, respectively representing different physical meanings.
- the graph in this embodiment has two more vertices on the basis of the ordinary graph, and these two vertices are represented by the symbols "S" and "T", and are collectively referred to as terminal vertices. All other vertices must be connected to these 2 vertices to form part of the edge set. Therefore, there are two types of vertices and two types of edges in the graph used to represent the image to be annotated in this embodiment.
- the first type of vertices and edges are: the first type of ordinary vertices corresponds to each pixel in the image.
- the connection of every two neighboring vertices (corresponding to every two neighboring pixels in the image) is an edge. This kind of edge is also called n-links.
- the second type of vertices and edges are: In addition to image pixels, there are two other terminal vertices, called S and T respectively. There is a connection between each ordinary vertex and these 2 terminal vertices, forming the second type of edge. Such edges are also called t-links.
- Figure 3 shows the s-t graph corresponding to the image, each pixel corresponds to a corresponding vertex in the graph, and there are two vertices s and t.
- the graph in Figure 3 has two kinds of edges: one kind represents the n-links that connect every two ordinary vertices in the neighborhood, and the other represents the t-links that connect every ordinary vertex to s and t.
- s generally represents the foreground target (ie, the real target area)
- t generally represents the background (ie, the real background area).
- Each edge of the graph in Figure 3 has a non-negative weight w_e, which can also be understood as a cost.
- a cut is a subset C of the edge set E in the graph, and the cost of the cut is the sum of the weights of all the edges in C.
- This embodiment needs to find such a set of edges: the set includes the above two types of edges, and disconnecting all the edges in the set separates the remaining graph into an "S" part and a "T" part, so this set of edges is called a "cut". The cut whose edge weights sum to the smallest value is called the minimum cut, which is the result of the graph cut.
- the Ford-Fulkerson theorem shows that the maximum flow (max flow) of the network is equal to the minimum cut (min cut). Therefore, the max-flow/min-cut algorithm proposed by Boykov and Kolmogorov can be used to obtain the minimum cut of the s-t graph.
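To illustrate the theorem, here is a small Edmonds-Karp max-flow sketch (a generic augmenting-path method, not the Boykov-Kolmogorov algorithm itself) run on a toy s-t graph with two pixel vertices; the returned max-flow value equals the capacity of the minimum cut.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on an adjacency-matrix graph. By the
    Ford-Fulkerson theorem the returned value equals the min-cut cost."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:      # no augmenting path left: done
            break
        # Find the bottleneck capacity along the path.
        v, bottleneck = t, float("inf")
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v] - flow[u][v])
            v = u
        # Push the bottleneck flow along the path.
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        total += bottleneck
    return total
```

With t-link capacities s→p = 3, s→q = 1, p→t = 1, q→t = 3 and n-links p↔q = 2 (vertices ordered s, p, q, t), the max flow, and hence the min-cut cost, is 4.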
- the weight of the edge in the graph determines the final segmentation result, and the weight of the edge can be determined by the energy function.
- Image segmentation can be regarded as a pixel labeling problem.
- the label of the target (s-node) is set to 1
- the label of the background (t-node) is set to 0. This labeling can be obtained by finding the cut that minimizes the energy function. Obviously, the cut we want occurs at the boundary between the target and the background (equivalent to cutting off the connections between the background and the target in the image, thereby separating them), and the energy is also minimal at this cut.
- R(L) is the regional term
- B(L) is the boundary term
- a is a weighting factor between the regional term and the boundary term, which determines their relative influence on the energy: E(L) = a·R(L) + B(L). If a is 0, only the boundary factor is considered and the regional factor is ignored.
- E(L) represents the weight, that is, the loss function, also called the energy function. The goal of the graph cut is to optimize the energy function to minimize its value.
- R p (l p ) represents the penalty for assigning label l p to pixel p
- the weight of the energy term R_p(l_p) can be obtained by comparing the gray level of pixel p with the gray-value histograms of the given target and background,
- that is, the probability that the pixel p belongs to the label l_p.
- we want the energy to be the smallest, so we generally take the negative logarithm of the probability; the weights of the t-links are therefore:
- R_p(1) = -ln Pr(I_p | 'obj'); R_p(0) = -ln Pr(I_p | 'bkg')
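A hedged sketch of how the regional-term weight could be computed from a seed gray-value histogram; the bin count and the smoothing constant `eps` are illustrative assumptions, not taken from the patent.

```python
import math

def regional_term(gray, seed_grays, bins=8, levels=256):
    """R_p(l) = -ln Pr(I_p | seeds): the penalty for giving pixel p a label
    whose seed histogram makes this gray value unlikely."""
    # Build a normalized gray-value histogram from the seed pixels.
    hist = [0] * bins
    width = levels // bins
    for g in seed_grays:
        hist[g // width] += 1
    pr = hist[gray // width] / len(seed_grays)
    eps = 1e-9  # avoid log(0) for gray levels unseen in the seeds
    return -math.log(max(pr, eps))
```

A gray value that matches the seeds perfectly gets penalty 0; a gray value the seeds never produced gets a very large penalty.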
- p and q are neighborhood pixels, and the boundary term mainly reflects the boundary attributes of segmenting L.
- B ⁇ p,q> can be parsed as a penalty for discontinuity between pixels p and q.
- B<p,q> is larger if the pixels p and q are more similar (for example, in their gray scale); if they are very different, then B<p,q> is close to 0.
- if the difference between two neighboring pixels is very small, they very likely belong to the same target or the same background; if they differ greatly, the two pixels are likely to lie on the boundary between the target and the background,
- and the edge between them is more likely to be cut. Therefore, the larger the difference between two neighboring pixels, the smaller B<p,q>, that is, the smaller the energy of cutting there.
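The patent does not give an explicit formula for B<p,q>; a common choice with exactly the behavior described above (large for similar gray levels, near zero across a strong edge) is a Gaussian of the gray-level difference, sketched here with an assumed sigma:

```python
import math

def boundary_term(ip, iq, sigma=10.0):
    """B<p,q>: expensive to cut between similar neighbors (value near 1),
    cheap to cut across a strong edge (value near 0). Gaussian form assumed."""
    return math.exp(-((ip - iq) ** 2) / (2 * sigma ** 2))
```

Identical neighbors give a penalty of exactly 1.0; a gray-level jump of 100 with sigma = 10 gives a penalty that is essentially zero.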
- an image is divided into two disjoint parts, the target and the background, and image segmentation technology is used to achieve this.
- the graph is composed of vertices and edges, and edges have weights.
- this graph has two types of vertices, two types of edges and two types of weights.
- An ordinary vertex is composed of each pixel of the image, and then there is an edge between every two neighboring pixels, and its weight is determined by the "boundary term" mentioned above.
- the weight of the edge connecting each common vertex to s is determined by the "regional energy term" R_p(1), and the weight of the edge connecting each common vertex to t is determined by the "regional energy term" R_p(0). In this way, the weights of all edges can be determined, that is, the graph is determined.
- the min cut (minimum cut) algorithm can then be used to find the minimum cut: the set of edges whose weight sum is smallest. Disconnecting these edges exactly separates the target from the background, that is, the min cut corresponds to the minimization of the energy.
- the min cut (minimum cut) and the max flow (maximum flow) of the graph are equivalent, so the min cut of the s-t graph can be found through the max flow algorithm.
- a graph structure is used to represent the image to be labeled, an energy function is constructed, and the image to be labeled is segmented based on the minimum cut/maximum flow algorithm. After segmentation, the image is divided into two parts, namely the predicted target area and the predicted background area. All pixels in the predicted target area are then set to black and all pixels in the predicted background area are set to white, so every pixel of the image to be labeled takes one of two values, black or white; this is the binarized image. Based on these data, an edge tracking algorithm is applied to the binarized image to obtain the outermost coordinate points of the predicted target area on the binarized image.
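As a minimal stand-in for the edge-tracking step (the patent does not name a specific algorithm), the outermost coordinate points can be approximated as foreground pixels of the binarized image that have at least one background 4-neighbor or lie on the image border:

```python
def outermost_points(binary):
    """Collect foreground pixels (value 1) that touch the background or the
    image border, a simplification of proper contour tracing."""
    h, w = len(binary), len(binary[0])
    edge = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbors = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            # Boundary pixel: some 4-neighbor is outside the image or background.
            if any(not (0 <= ny < h and 0 <= nx < w) or binary[ny][nx] == 0
                   for ny, nx in neighbors):
                edge.append((x, y))
    return edge
```

For a 3x3 foreground block, the eight ring pixels are returned and the fully interior center pixel is excluded.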
- the graph segmentation algorithm is used in this embodiment to automatically obtain the outermost coordinate points of the predicted target area. In order to make it easy for the annotator to judge the accuracy of the outermost coordinate points obtained by this graph segmentation, the contour of the predicted target area is generated based on the edge coordinate set and highlighted on the image to be labeled, which is convenient for the labeler to compare.
- if the annotator thinks that the predicted target area obtained this time is ideal, he can send a "full selection instruction" to the electronic device in the first embodiment.
- the instruction can be sent by hitting the enter key on the keyboard.
- the display screen will display an interface for inputting category information.
- the labeler inputs category information corresponding to the predicted target area through the keyboard, such as football, and then the edge coordinate set and the category information are saved as a json file to complete the semi-automatic labeling.
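The patent only states that the edge coordinate set and the category information are saved as a json file; one plausible (hypothetical) layout pairing the two is:

```python
import json

# Hypothetical schema: the patent specifies only that the edge coordinate
# set and the category information are saved together as a json file.
annotation = {
    "category": "football",
    "edge_coordinates": [[12, 30], [45, 30], [45, 70], [12, 70]],
}
payload = json.dumps(annotation, indent=2)
# In the labeling tool this string would be written out to a file,
# e.g. one json per annotated image.
```

Reading the file back with `json.loads` recovers the category string and the list of [x, y] edge coordinates for training-set generation.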
- if the annotator thinks that the predicted target area obtained this time is not ideal, he can send an "incomplete selection instruction" to the electronic device in the first embodiment.
- the annotator can press the space bar on the keyboard to send the instruction to the electronic device, and after receiving the instruction, the electronic device executes step S1 again. Therefore, the semi-automatic image segmentation data labeling method provided in this embodiment improves labeling efficiency while also allowing labelers to control the accuracy of labeling.
- the step S7 of "generating an edge coordinate set from a plurality of the outermost coordinate points" further includes the following steps :
- Step S7.1 Create a set A of all the outermost coordinate points, establish a set A', add any one of the outermost coordinate points p 0 in the set A to the set A', except for the set A
- the outermost coordinate points other than p 0 establish a set A 2 , and the first marked coordinate point p is set to p 0 ;
- Step S7.2 Determine whether the number of elements in the set A 2 is zero, if not, execute step S7.2a, if yes, execute step S7.2b;
- Step S7.2a Calculate the distance d between all the outermost coordinate points in the set A 2 and the first mark coordinate point p, and set the first mark coordinate point p to the minimum value of the distance d in the set A 2
- the outermost coordinate point, the outermost coordinate point corresponding to the minimum distance d in the set A 2 is added to the set A'and deleted from the set A 2 , and the step S7.2 is returned;
- Step S7.2b Sort the outermost coordinate points according to the order in which the outermost coordinate points are added to the set A';
- Step S7.3 Establish an edge coordinate set, add p 0 to the edge coordinate set and delete it from the set A', and set the second mark coordinate point p'to p 0 ;
- Step S7.4 Determine whether the number of elements in the set A'is one, if not, execute step S7.4a, if yes, execute step S7.4b;
- Step S7.4a Determine whether the second mark coordinate point p'and the two outermost coordinate points in the top two positions in the set A'are three-point collinear, if yes, perform step S7.4a1, if not, Step S7.4a2 is executed;
- Step S7.4a1 delete the outermost coordinate point that is ranked first in the set A'from the set A', and return to step S7.4;
- Step S7.4a2 Set the second marking coordinate point p'as the outermost coordinate point ranked at the top in the set A', and add the outermost coordinate point ranked at the top in the set A' To the edge coordinate set and delete from the set A', return to step S7.4;
- Step S7.4b Add the outermost coordinate points in the set A'to the edge coordinate set, and output the edge coordinate set.
- In purely manual labeling, when part of the outline of the real target area is a straight line, for example when the real target area is a square, the annotator generally only clicks the four vertices of the square with the mouse, and a straight line is drawn between each pair of adjacent vertices. Therefore, only four pixel coordinates are needed to represent the square, which greatly reduces the amount of data.
- In the semi-automatic labeling method, since the edge coordinates of the predicted target area are obtained by the edge tracking algorithm, they consist of a series of neighboring pixels, resulting in a large amount of data.
- this embodiment provides a simplified algorithm for obtaining the outermost coordinate points of the predicted target area.
- the algorithm includes two parts.
- the first part is steps S7.1 to S7.2b, which sort the outermost coordinate points of the predicted target area by the order in which they are added to the set A'. Traversing the outermost coordinate points in this order exactly traces the contour of the predicted target area. The second part, steps S7.3 to S7.4b, checks, in the same order, whether three adjacent points on the contour are collinear; if they are collinear, the middle point is removed and only the first and last points are kept, which achieves the effect of manual labeling and reduces the amount of data generated by semi-automatic labeling.
- the semi-automatic image segmentation data labeling method of the fourth embodiment of the present invention is based on the second embodiment and the third embodiment.
- step S8 "generate the contour of the prediction target area based on the edge coordinate set, And highlighting the outline on the image to be annotated," also includes the following steps:
- Step S8.1 On the image to be annotated, the two outermost coordinate points that are added to the edge coordinate set in an adjacent order are connected by a straight line;
- Step S8.2 On the image to be annotated, the outermost coordinate point added to the edge coordinate set in the last order is connected with p 0 by a straight line;
- Step S8.3 Generate the contour from the pixels passing by the straight line on the image to be annotated, and highlight the pixels corresponding to the straight line.
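Steps S8.1 to S8.3 amount to rasterizing a straight segment between each pair of consecutive edge coordinates and closing the polygon back to p0. A sketch using Bresenham's line algorithm (one possible rasterizer; the patent does not prescribe one):

```python
def line_pixels(p0, p1):
    """Bresenham rasterization: the pixels a straight segment between two
    consecutive edge coordinates passes through (steps S8.1-S8.2)."""
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return pixels

def contour_pixels(edge_coords):
    """Connect consecutive edge coordinates, closing the last point back to
    p0 (step S8.2); the result is the set of pixels to highlight (S8.3)."""
    pts = set()
    for a, b in zip(edge_coords, edge_coords[1:] + edge_coords[:1]):
        pts.update(line_pixels(a, b))
    return pts
```

For the four corners of a 4x4 square, the closed contour covers exactly the twelve perimeter pixels.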
- Based on the third embodiment, it can be known that traversing the outermost coordinate points in the order in which they were added to the set A' exactly encloses the contour of the predicted target area. Therefore, while the third embodiment reduces the amount of semi-automatic labeling data, it also facilitates generating the contour of the predicted target area, reduces the calculation time when generating the contour, and improves the efficiency of the algorithm. At the same time, the pixels of the contour are brightened and colored white, which makes it easy for the user to identify the edge of the currently selected area.
- the semi-automated image segmentation data labeling method of the fifth embodiment of the present invention is based on the fourth embodiment, and step S8 further includes:
- Step S8.4: Perform shadow processing on the predicted target area on the image to be labeled.
- shadow processing darkens the pixels of the predicted target area, outputting a locally darkened image.
- its function is to make it easy for the user to identify the currently selected local area of the target object.
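The patent does not specify how the shadow is produced; a minimal sketch of one plausible implementation, darkening a grayscale region selected by a binary mask (the function name and the darkening factor are assumptions):

```python
def shade_region(image, mask, factor=0.5):
    """Darken a grayscale image wherever mask is 1.

    image: list of rows of gray values (0-255); mask: same shape, with 1
    inside the predicted target area. Returns a new, darkened image.
    """
    return [
        [int(px * factor) if m else px for px, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]

img = [[200, 200], [200, 200]]
mask = [[1, 0], [0, 1]]
print(shade_region(img, mask))  # → [[100, 200], [200, 100]]
```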
- the computer-readable storage medium of the fifth embodiment of the present invention stores computer-executable instructions for executing the semi-automated image segmentation data labeling method according to any one of the second to fifth embodiments above.
- FIG. 2 is an effect diagram of processing an image to be labeled using the semi-automated image segmentation data labeling method of an embodiment of the present invention.
- the image to be labeled is displayed on the computer screen.
- the real target area on the image to be labeled is a football, and the real background area is a lawn.
- the annotator moves the mouse onto the football, presses the left mouse button, and drags to draw a stroke on the football;
- the annotator then moves the mouse onto the lawn, presses the left mouse button, and drags to draw a stroke on the lawn;
- the fourth step is to automatically obtain the contour coordinates of the football through image segmentation and apply shadow processing to the football;
- the contour coordinate points of the football automatically obtained by graph segmentation that lie on the same straight line are merged, thereby reducing the amount of data.
Abstract
A semi-automated image segmentation and data annotation method, an electronic device and a storage medium. The method comprises: on the basis of multiple first pixel coordinates located in a real target area and multiple second pixel coordinates located in a real background area, both selected by a user on an image to be annotated, and on the basis of an energy function, determining whether each pixel in the image belongs to the real target area or the real background area; outputting the outermost coordinates of a predicted target area; the user then inputs category information corresponding to the predicted target area, completing the image annotation task. The method reduces the number of mouse clicks during image annotation, reduces the cost of manual annotation, and improves the efficiency of manual annotation.
Description
The present invention relates to the technical field of image processing, and in particular to a semi-automated image segmentation data labeling method, an electronic device and a storage medium.
Image segmentation algorithms based on deep learning are widely used in applications that require precise identification of the category and location of objects, such as garbage classification systems, autonomous driving, and machining defect detection systems. However, such algorithms require a large amount of manually labeled data for training. At present, the main image segmentation data labeling method is to observe the edge of the target object with the naked eye, judge it manually, and click and label it point by point with the mouse.
Therefore, in the prior art, in order to meet the demand for image annotation data in training deep-learning-based image segmentation networks, technical crowdsourcing platforms have emerged, and some companies recruit tens of thousands of data annotators. However, because such data labeling methods rely on visual observation and manual judgment, labeling a single image requires dozens or even hundreds of manual mouse clicks, which is inefficient.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, the present invention provides a semi-automated image segmentation data labeling method, an electronic device and a storage medium, which can reduce the number of manual mouse clicks during image labeling and improve labeling efficiency.
In a first aspect, an embodiment of the present invention provides a semi-automated image segmentation data labeling method, including the following steps:
Step S1: Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
Step S2: Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
Step S3: Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
Step S4: Establish a target seed gray value histogram based on the target area seed point set;
Step S5: Establish a background seed gray value histogram based on the background area seed point set;
Step S6: Establish an undirected graph representing the image to be annotated, construct an energy function, perform image segmentation on the image to be annotated based on the minimum cut algorithm, and obtain a binarized picture of the segmented image, where the binarized picture includes a predicted target area and a predicted background area;
Step S7: Obtain the outermost coordinate points of the predicted target area on the binarized picture based on an edge tracking algorithm, and generate an edge coordinate set from the plurality of outermost coordinate points;
Step S8: Generate the contour of the predicted target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
Step S9: Determine whether a complete-selection instruction has been received; if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
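The json layout of step S9 is not specified beyond "a json file"; a minimal sketch of one plausible record format (the field names are assumptions, not from the patent):

```python
import json

def save_annotation(edge_coords, category, path):
    """Save the edge coordinate set and the category information as json."""
    record = {
        "label": category,                         # category entered by the annotator
        "points": [list(p) for p in edge_coords],  # outermost contour points, in order
    }
    with open(path, "w") as f:
        json.dump(record, f)
    return record

rec = save_annotation([(0, 0), (3, 0), (3, 2)], "football", "annotation.json")
print(rec["label"])  # → football
```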
The semi-automated image segmentation data labeling method according to an embodiment of the present invention has at least the following beneficial effects. Based on the plurality of first pixel coordinates in the real target area and the plurality of second pixel coordinates in the real background area selected by the user on the image to be annotated, the method uses an energy function to judge whether each pixel of the image belongs to the real target area or the real background area, and outputs the outermost coordinates of the predicted target area; the user then only needs to input the category information corresponding to the predicted target area to complete the annotation task. The semi-automated image segmentation data labeling method provided in this embodiment therefore greatly reduces the number of mouse clicks during image annotation, reduces the cost of manual annotation, and speeds up manual annotation.
In another specific embodiment of the present invention, "generating an edge coordinate set from the plurality of outermost coordinate points" in step S7 further includes the following steps:
Step S7.1: Build a set A from all the outermost coordinate points and create a set A'. Add any one outermost coordinate point p0 from A to A', build a set A2 from the outermost coordinate points in A other than p0, and set the first marked coordinate point p to p0;
Step S7.2: Determine whether the number of elements in the set A2 is zero; if not, execute step S7.2a; if yes, execute step S7.2b;
Step S7.2a: Compute the distance d between each outermost coordinate point in the set A2 and the first marked coordinate point p. Set p to the outermost coordinate point in A2 with the minimum distance d, add that point to A' and delete it from A2, and return to step S7.2;
Step S7.2b: Sort the outermost coordinate points in the order in which they were added to the set A';
Step S7.3: Create an edge coordinate set, add p0 to the edge coordinate set and delete it from the set A', and set the second marked coordinate point p' to p0;
Step S7.4: Determine whether the number of elements in the set A' is one; if not, execute step S7.4a; if yes, execute step S7.4b;
Step S7.4a: Determine whether the second marked coordinate point p' and the first two outermost coordinate points in the ordering of A' are collinear; if yes, execute step S7.4a1; if not, execute step S7.4a2;
Step S7.4a1: Delete the first outermost coordinate point in the ordering of A' from A', and return to step S7.4;
Step S7.4a2: Set p' to the first outermost coordinate point in the ordering of A', add that point to the edge coordinate set and delete it from A', and return to step S7.4;
Step S7.4b: Add the remaining outermost coordinate point in A' to the edge coordinate set, and output the edge coordinate set.
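Steps S7.1 to S7.2b amount to a nearest-neighbor ordering of the unordered outermost points. A minimal Python sketch (illustrative only; Euclidean distance is assumed for d):

```python
import math

def order_contour_points(points):
    """Order unordered outermost coordinate points by starting at an
    arbitrary point p0 and repeatedly moving to the nearest point not yet
    visited, as in steps S7.1 to S7.2b."""
    remaining = list(points)
    ordered = [remaining.pop(0)]   # p0: any point of the set A
    while remaining:               # loop until the set A2 is empty
        p = ordered[-1]            # first marked coordinate point
        nearest = min(remaining, key=lambda q: math.dist(p, q))
        remaining.remove(nearest)
        ordered.append(nearest)    # append in visiting order (the set A')
    return ordered

# Four corners of a unit square, given in scrambled order, come back as a
# path around the square.
print(order_contour_points([(0, 0), (1, 1), (0, 1), (1, 0)]))
# → [(0, 0), (0, 1), (1, 1), (1, 0)]
```

For dense edge-tracked pixels the nearest unvisited point is always an adjacent contour pixel, so the visiting order traces the contour, as the third embodiment states.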
In another specific embodiment of the present invention, step S8 further includes:
Step S8.4: Perform shadow processing on the predicted target area on the image to be labeled.
In a second aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements the semi-automated image segmentation data labeling method according to any one of the first aspect of the present invention.
Since an electronic device according to an embodiment of the present invention executes the semi-automated image segmentation data labeling method according to any one of the first aspect of the present invention, it has all the beneficial effects of the first aspect of the present invention.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the semi-automated image segmentation data labeling method according to any one of the first aspect of the present invention.
Since the computer-readable storage medium of the embodiment of the present invention stores computer-executable instructions for executing the semi-automated image segmentation data labeling method according to any one of the first aspect of the present invention, it has all the beneficial effects of the first aspect of the present invention.
Additional aspects and advantages of the present invention will be partly given in the following description, and will partly become obvious from the description or be understood through practice of the present invention.
The above and/or additional aspects and advantages of the present invention will become obvious and easy to understand from the description of the embodiments in conjunction with the following drawings, in which:
FIG. 1 is a flowchart of the method of the second embodiment of the present invention;
FIG. 2 is an effect diagram of the semi-automated image segmentation data labeling method of an embodiment of the present invention;
FIG. 3 is a schematic diagram of the principle of the graph segmentation algorithm of the second embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of the electronic device of the first embodiment of the present invention;
Reference signs:
electronic device 100, processor 101, memory 102.
The embodiments of the present invention are described in detail below. The embodiments described with reference to the accompanying drawings are exemplary and are only used to explain the present invention; they should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, "multiple" means two or more, and "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including the stated number. References to "first" and "second" are only for the purpose of distinguishing technical features, and shall not be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
The existing data labeling method uses open-source image segmentation data labeling tools (such as Labelme, Labelimg, etc.) to plot points along the edge of the target: the annotator traces the edge of the target object in the image with mouse clicks, which for an object with a complex shape may require hundreds of clicks. The target is then separated from the background, i.e., the image is segmented to obtain the target, and the obtained target is annotated. If the target shape is complex, hundreds of points may need to be plotted, i.e., the mouse may need to be clicked hundreds of times, which fatigues the annotator's eyes; the annotator's hand likewise becomes fatigued and sore from clicking the mouse for a long time. Such a data labeling method is inefficient: when many target images need to be annotated, or the target shape is complex, segmentation takes a very long time, seriously affecting the speed of data labeling and resulting in low labeling efficiency.
Based on this, the present invention provides a semi-automated image segmentation data labeling method, an electronic device and a storage medium. The image to be labeled is displayed on an electronic screen, and the user selects with the mouse, in two passes, a plurality of first pixel coordinates in the real target area and a plurality of second pixel coordinates in the real background area of the image. An energy function is constructed and the image is segmented based on the minimum cut algorithm to obtain the outermost coordinate points of a predicted target area corresponding to the real target area. Finally, the annotator judges the difference between the predicted target area and the real target area and decides whether to issue a "complete-selection instruction", upon which the computer saves the edge coordinate set and the category information as a json file, completing the annotation. The present invention can therefore reduce the number of mouse clicks required of the annotator and allows the annotator to decide, by issuing the "complete-selection instruction" or not, whether to accept the result of the automatic segmentation, which both improves annotation efficiency and guarantees annotation precision.
Many different embodiments or examples are provided below to implement different solutions of the present invention. It should be understood that the following descriptions are only exemplary and are not specific limitations of the invention.
Referring to FIG. 4, an electronic device 100 provided by the first embodiment of the present invention includes a memory 102 and a processor 101; FIG. 4 takes one processor 101 and one memory 102 as an example.
The processor and the memory may be connected through a bus or in other ways; FIG. 4 takes a bus connection as an example.
As a non-transitory computer-readable storage medium, the memory 102 can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory 102 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 102 may optionally include memory remotely located with respect to the processor, and such remote memory may be connected to the electronic device 100 via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
Those skilled in the art can understand that the device structure shown in FIG. 4 does not constitute a limitation on the electronic device 100, which may include more or fewer components than shown, a combination of certain components, or a different arrangement of components.
Referring to FIG. 1 and FIG. 4, in the first embodiment of the present invention, the processor 101 in the electronic device 100 can be used to call the semi-automated image segmentation data labeling method stored in the memory 102 and perform the following steps:
Step S1: Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
Step S2: Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
Step S3: Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
Step S4: Establish a target seed gray value histogram based on the target area seed point set;
Step S5: Establish a background seed gray value histogram based on the background area seed point set;
Step S6: Establish an undirected graph representing the image to be annotated, construct an energy function, perform image segmentation on the image to be annotated based on the minimum cut algorithm, and obtain a binarized picture of the segmented image, where the binarized picture includes a predicted target area and a predicted background area;
Step S7: Obtain the outermost coordinate points of the predicted target area on the binarized picture based on an edge tracking algorithm, and generate an edge coordinate set from the plurality of outermost coordinate points;
Step S8: Generate the contour of the predicted target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
Step S9: Determine whether a complete-selection instruction has been received; if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
Based on the hardware structure of the above electronic device, various embodiments of the semi-automated image segmentation data labeling method of the present invention are proposed.
Referring to FIG. 1, the semi-automated image segmentation data labeling method of the second embodiment of the present invention includes the following steps:
Step S1: Display an image to be annotated, where the image to be annotated includes a real target area and a real background area;
Step S2: Obtain a plurality of first pixel coordinates located in the real target area, and generate a target area seed point set from the plurality of first pixel coordinates;
Step S3: Obtain a plurality of second pixel coordinates located in the real background area, and generate a background area seed point set from the plurality of second pixel coordinates;
Step S4: Establish a target seed gray value histogram based on the target area seed point set;
Step S5: Establish a background seed gray value histogram based on the background area seed point set;
Step S6: Establish an undirected graph representing the image to be annotated, construct an energy function, perform image segmentation on the image to be annotated based on the minimum cut algorithm, and obtain a binarized picture of the segmented image, where the binarized picture includes a predicted target area and a predicted background area;
Step S7: Obtain the outermost coordinate points of the predicted target area on the binarized picture based on an edge tracking algorithm, and generate an edge coordinate set from the plurality of outermost coordinate points;
Step S8: Generate the contour of the predicted target area based on the edge coordinate set, and highlight the contour on the image to be annotated;
Step S9: Determine whether a complete-selection instruction has been received; if yes, obtain the category information of the target area, and save the edge coordinate set and the category information as a json file; if not, return to step S1.
In this embodiment, the image to be annotated can be displayed to the annotator on a PC or tablet screen; it includes a real target area and a real background area. For example, if the image to be annotated contains a lawn and a football, the annotator judges that the football belongs to the real target area and the lawn to the real background area. The first pixel coordinates are obtained as follows: the annotator holds down the left mouse button and drags within the real target area, and the pixels of the real target area that the mouse passes over are taken as the first pixel coordinates; for feedback to the annotator, these first pixels can be displayed in the real target area. After the first pixel coordinates are obtained, the second pixel coordinates are obtained in the real background area in the same way.
In this embodiment, an undirected graph G = <V, E> is first used to represent the image to be segmented (i.e., the image to be annotated), where V and E are the sets of vertices and edges, respectively. This graph differs slightly from an ordinary graph. An ordinary graph consists of vertices and edges; if the edges are directional the graph is called directed, otherwise undirected, and the edges carry weights, where different edges may have different weights representing different physical meanings. The graph in this embodiment has two additional vertices, denoted by the symbols "S" and "T" and collectively called terminal vertices. All other vertices must be connected to these two vertices to form part of the edge set. The graph used here to represent the image to be annotated therefore has two kinds of vertices and two kinds of edges.
The first kind of vertex and edge: ordinary vertices correspond to the pixels of the image, and the connection between every two neighboring vertices (corresponding to every two neighboring pixels in the image) is an edge. Such edges are called n-links.
The second kind of vertex and edge: besides the image pixels, there are two terminal vertices, called S and T. There is a connection between every ordinary vertex and these two terminal vertices, forming the second kind of edge. Such edges are called t-links.
FIG. 3 shows the s-t graph corresponding to the image: each pixel corresponds to one vertex of the graph, and there are additionally the two vertices s and t. The graph in FIG. 3 has two kinds of edges: n-links connecting every two neighboring ordinary vertices, and t-links connecting every ordinary vertex to s and t. In foreground/background segmentation, s generally represents the foreground target (i.e., the real target area) and t the background (i.e., the real background area).
Every edge of the graph in FIG. 3 has a non-negative weight w_e, which can also be understood as a cost. A cut is a subset C of the edge set E, and the cost of the cut (denoted |C|) is the sum of the weights of all edges in C.
This embodiment needs to find a set of edges, containing both kinds of edges above, whose removal separates the graph into an "S" part and a "T" part; such a set of edges is called a cut. The cut whose total edge weight is smallest is called the minimum cut, and it is the result of the graph cut. The Ford-Fulkerson theorem states that the maximum flow of a network equals its minimum cut, so the max-flow/min-cut algorithm invented by Boykov and Kolmogorov can be used to obtain the minimum cut of the s-t graph. This minimum cut divides the vertices of the graph into two disjoint subsets S and T, with s ∈ S, t ∈ T and S ∪ T = V. These two subsets correspond to the foreground pixel set (i.e., the real target area) and the background pixel set (i.e., the real background area) of the image, which is equivalent to completing the image segmentation.
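The max-flow/min-cut equivalence invoked here can be illustrated with a small, self-contained Edmonds-Karp implementation (a generic BFS-based max-flow sketch; the patent cites the Boykov-Kolmogorov algorithm, which is a different, faster method):

```python
from collections import defaultdict, deque

def max_flow(capacity, s, t):
    """Edmonds-Karp max flow; by the Ford-Fulkerson theorem the returned
    value equals the weight of the minimum s-t cut."""
    flow = 0
    residual = defaultdict(lambda: defaultdict(int))
    for u in capacity:
        for v, c in capacity[u].items():
            residual[u][v] += c
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:          # no augmenting path left: flow is maximal
            return flow
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

# Two "pixel" vertices p and q between the terminal vertices S and T.
caps = {"S": {"p": 3, "q": 2}, "p": {"q": 1, "T": 2}, "q": {"T": 3}}
print(max_flow(caps, "S", "T"))  # → 5
```

In this toy graph the minimum cut severs the two t-links leaving S (weights 3 + 2 = 5), matching the computed maximum flow.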
图中边的权值就决定了最后的分割结果,边的权值可以通过能量函数来确定。图像分割可以看成像素标记问题,目标(s-node)的label设为1,背景(t-node)的label设为0,这个过程可以通过最小化图割来最小化能量函数得到。那很明显,发生在目标和背景的边界处的cut就是我们想要的(相当于把图像中背景和目标连接的地方割开,那就相当于把其分割了)。同时,这时候能量也应该是最小的。假设整幅图像的每个像素的标签为L={l
1,l
2,,,l
p},其中l
i为0(表 示该像素点位于真实背景区域)或者1(表示该像素点位于真实目标区域)。那假设图像的分割为L时,图像的能量可以表示为:
The weight of the edge in the graph determines the final segmentation result, and the weight of the edge can be determined by the energy function. Image segmentation can be regarded as a pixel labeling problem. The label of the target (s-node) is set to 1, and the label of the background (t-node) is set to 0. This process can be obtained by minimizing the graph cut to minimize the energy function. It is obvious that the cut that occurs at the boundary between the target and the background is what we want (equivalent to cutting off the connection between the background and the target in the image, which is equivalent to dividing it). At the same time, the energy should also be minimal at this time. Suppose the label of each pixel of the entire image is L={l 1 ,l 2 ,,,l p }, where l i is 0 (indicating that the pixel is located in the real background area) or 1 (indicating that the pixel is located in the real background area) target area). Assuming that the segmentation of the image is L, the energy of the image can be expressed as:
E(L) = aR(L) + B(L)
Here R(L) is the region term, B(L) is the boundary term, and a is the weighting factor between the region term and the boundary term, which determines their relative influence on the energy. If a is 0, only the boundary factor is considered and the region factor is ignored. E(L) represents the total weight, i.e., the loss function, also called the energy function; the goal of the graph cut is to optimize the energy function so that its value reaches a minimum.
For the region term,

R(L) = Σp Rp(lp)

where Rp(lp) denotes the penalty for assigning label lp to pixel p. The weight of the energy term Rp(lp) can be obtained by comparing the gray level of pixel p with the given gray-level histograms of the target and the background; in other words, it reflects the probability that pixel p belongs to label lp. We want pixel p to be assigned the label lp with the highest probability, and at the same time we want the energy to be minimal, so the negative logarithm of the probability is generally taken. The weights of the t-links are therefore:
Rp(1) = -ln Pr(Ip | 'obj');  Rp(0) = -ln Pr(Ip | 'bkg')
It can be seen from the above two formulas that when the probability Pr(Ip | 'obj') that the gray value of pixel p belongs to the target is greater than the probability Pr(Ip | 'bkg') that it belongs to the background, Rp(1) is less than Rp(0); that is, when pixel p is more likely to belong to the target, classifying p as target makes the energy R(L) smaller. Hence, if all pixels are correctly classified as target or background, the energy is minimal.
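A minimal sketch of how such t-link weights could be computed from the two seed gray-level histograms. The histogram values and the epsilon guard are assumptions for illustration, not taken from the original:

```python
import math

def tlink_weights(gray, obj_hist, bkg_hist):
    """Return (Rp(1), Rp(0)) for a pixel with the given gray value.

    obj_hist / bkg_hist map a gray level to its normalized frequency among
    the target / background seed pixels. A small epsilon avoids log(0)
    for gray levels never observed among the seeds.
    """
    eps = 1e-9
    r_obj = -math.log(obj_hist.get(gray, 0.0) + eps)  # Rp(1) = -ln Pr(Ip|'obj')
    r_bkg = -math.log(bkg_hist.get(gray, 0.0) + eps)  # Rp(0) = -ln Pr(Ip|'bkg')
    return r_obj, r_bkg

# Toy histograms: the target seeds are mostly dark, the background bright.
obj_hist = {10: 0.7, 20: 0.3}
bkg_hist = {200: 0.8, 210: 0.2}

r1, r0 = tlink_weights(10, obj_hist, bkg_hist)
# A dark pixel is far more probable under 'obj', so Rp(1) < Rp(0),
# and cutting its link to s is expensive: the pixel stays with the target.
```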
For the boundary term,

B(L) = Σ{p,q}∈N B&lt;p,q&gt; · δ(lp, lq)

where N is the set of neighboring pixel pairs and δ(lp, lq) is 1 when lp ≠ lq and 0 otherwise,
where p and q are neighboring pixels. The boundary term mainly reflects the boundary properties of the segmentation L, and B&lt;p,q&gt; can be interpreted as the penalty for a discontinuity between pixels p and q. Generally speaking, the more similar p and q are (for example in gray level), the larger B&lt;p,q&gt;; if they are very different, B&lt;p,q&gt; is close to 0. In other words, if two neighboring pixels differ very little, they are very likely to belong to the same target or the same background; if they differ greatly, the two pixels are likely to lie on the edge between target and background and are more likely to be separated. So the larger the difference between two neighboring pixels, the smaller B&lt;p,q&gt;, i.e., the smaller the energy.
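A common concrete choice for B&lt;p,q&gt; — an assumption for illustration, since the original does not fix the formula — is a Gaussian on the gray-level difference, B&lt;p,q&gt; = exp(-(Ip - Iq)² / (2σ²)):

```python
import math

def boundary_weight(ip, iq, sigma=10.0):
    """Penalty for cutting between neighboring pixels with gray values ip, iq.

    Similar pixels -> weight near 1 (expensive to cut between them);
    very different pixels -> weight near 0 (cheap to cut: a likely edge).
    """
    return math.exp(-((ip - iq) ** 2) / (2.0 * sigma ** 2))
```

With this choice the n-link weights behave exactly as the paragraph above describes: identical neighbors give weight 1, strongly contrasting neighbors give a weight near 0.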
In this embodiment, dividing an image into two disjoint parts, target and background, is realized with graph-cut techniques. A graph consists of vertices and edges, and the edges carry weights. We therefore need to build a graph with two kinds of vertices, two kinds of edges and two kinds of weights. The ordinary vertices are the pixels of the image, and there is an edge between every two neighboring pixels whose weight is determined by the "boundary term" described above. There are also two terminal vertices, s (target) and t (background). Every ordinary vertex is connected to s by an edge whose weight is determined by the region energy term Rp(1), and the edge connecting every ordinary vertex to t has a weight determined by the region energy term Rp(0). In this way the weights of all edges, and hence the graph itself, are determined. The minimum cut can then be found with a min-cut algorithm: the min cut is the set of edges with the smallest total weight whose disconnection exactly separates the target from the background, so the min cut corresponds to the minimization of the energy. Since the min cut is equivalent to the max flow (maximum flow) of the graph, the min cut of the s-t graph can be found with a max-flow algorithm.
This embodiment uses a graph structure to represent the image to be labeled, constructs an energy function, and segments the image based on the minimum-cut or maximum-flow algorithm. After segmentation the image is divided into two parts, the predicted target area and the predicted background area. All pixels of the predicted target area are then set to black and all pixels of the predicted background area to white, so every pixel of the image to be labeled takes one of two values, black or white; this is the binarized picture. An edge tracking algorithm is then applied to the binarized picture to obtain the outermost coordinate points of the predicted target area on it.
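One simple way to realize such edge tracking on the binarized picture — a sketch, not necessarily the exact algorithm used in the embodiment — is to keep every foreground pixel that has at least one background pixel among its 4-neighbors:

```python
import numpy as np

def outermost_points(binary):
    """Return the outermost coordinates of the foreground (value 1) region.

    A foreground pixel is on the outline if any of its 4-neighbors is
    background. `binary` is a 2-D array of 0s (background) and 1s (target).
    """
    padded = np.pad(binary, 1, constant_values=0)
    core = padded[1:-1, 1:-1]
    has_bg_neighbour = (
        (padded[:-2, 1:-1] == 0) | (padded[2:, 1:-1] == 0) |
        (padded[1:-1, :-2] == 0) | (padded[1:-1, 2:] == 0)
    )
    ys, xs = np.nonzero((core == 1) & has_bg_neighbour)
    return list(zip(ys.tolist(), xs.tolist()))

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1            # a 3x3 predicted target area
pts = outermost_points(mask)  # 8 outline pixels; only (2, 2) is interior
```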
Since this embodiment uses a graph-cut algorithm to automatically obtain the outermost coordinate points of the predicted target area, the contour of the predicted target area is generated based on the edge coordinate set and highlighted on the image to be labeled, so that the annotator can conveniently check the accuracy of the outermost coordinate points obtained by this graph cut against the image.
When the annotator considers the predicted target area obtained this time to be satisfactory, a "fully selected instruction" can be sent to the electronic device of the first embodiment, for example by pressing the Enter key on the keyboard. The display then shows an interface for entering category information; the annotator types the category information corresponding to the predicted target area, for example "football", and the edge coordinate set and the category information are saved as a json file, completing the semi-automated labeling. Otherwise, when the annotator considers the predicted target area unsatisfactory, a "not fully selected instruction" can be sent to the electronic device, for example by pressing the space bar on the keyboard; upon receiving this instruction, the electronic device executes step S1 again. The semi-automated image segmentation data labeling method provided by this embodiment therefore improves labeling efficiency while still allowing the annotator to control the labeling accuracy.
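Saving the edge coordinate set together with the category information as a json file could look like the following sketch. The file name and field names are assumptions; the patent does not specify a schema:

```python
import json

def save_annotation(path, category, edge_coords):
    """Persist one labeled region: its category and outline coordinates."""
    annotation = {"category": category,
                  "edge_coordinates": [list(p) for p in edge_coords]}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(annotation, f, ensure_ascii=False)

save_annotation("label_0001.json", "football",
                [(120, 85), (240, 85), (240, 205), (120, 205)])
```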
In the semi-automated image segmentation data labeling method of the third embodiment of the present invention, based on the second embodiment, "generating an edge coordinate set from the plurality of outermost coordinate points" in step S7 further includes the following steps:
Step S7.1: build a set A from all of the outermost coordinate points; create a set A'; add any one outermost coordinate point p0 of set A to set A'; build a set A2 from the outermost coordinate points of set A other than p0; set the first marked coordinate point p to p0;
Step S7.2: determine whether the number of elements in set A2 is zero; if not, execute step S7.2a; if so, execute step S7.2b;
Step S7.2a: compute the distance d between every outermost coordinate point in set A2 and the first marked coordinate point p; set the first marked coordinate point p to the outermost coordinate point of set A2 with the minimum distance d; add that outermost coordinate point to set A' and delete it from set A2; return to step S7.2;
Step S7.2b: sort the outermost coordinate points according to the order in which they were added to set A';
Step S7.3: create an edge coordinate set; add p0 to the edge coordinate set and delete it from set A'; set the second marked coordinate point p' to p0;
Step S7.4: determine whether the number of elements in set A' is one; if not, execute step S7.4a; if so, execute step S7.4b;
Step S7.4a: determine whether the second marked coordinate point p' and the first two outermost coordinate points in the ordering of set A' are collinear; if so, execute step S7.4a1; if not, execute step S7.4a2;
Step S7.4a1: delete the first outermost coordinate point in the ordering of set A' from set A'; return to step S7.4;
Step S7.4a2: set the second marked coordinate point p' to the first outermost coordinate point in the ordering of set A'; add that outermost coordinate point to the edge coordinate set and delete it from set A'; return to step S7.4;
Step S7.4b: add the outermost coordinate points remaining in set A' to the edge coordinate set, and output the edge coordinate set.
With purely manual labeling, when part of the outline of the real target area is a straight line — for example, when the real target area is a square — the annotator generally just clicks the four vertices of the square with the mouse and draws a straight line between each pair of adjacent vertices. Only four pixel points are therefore needed to represent the coordinates of the square, which greatly reduces the amount of data. With the semi-automated labeling method, however, the edge coordinates of the predicted target area are obtained by the edge tracking algorithm and consist of a series of mutually neighboring pixels, resulting in a large amount of data.
Based on this, this embodiment provides an algorithm that simplifies the outermost coordinate points obtained for the predicted target area. The algorithm has two parts. The first part, steps S7.1 to S7.2, sorts the obtained outermost coordinate points of the predicted target area according to the order in which they were added to set A'; visiting the outermost coordinate points in that order traces exactly the contour of the predicted target area. The second part, steps S7.3 to S7.4, then checks, in that same order, whether each three adjacent points on the contour are collinear; if they are, the middle point is removed and only the first and last points are kept. This reproduces the effect of manual labeling and reduces the amount of data produced by semi-automated labeling.
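The two parts can be sketched as follows — a condensed illustration of the same idea (nearest-neighbor ordering, then collinearity pruning), not a line-by-line transcription of steps S7.1 to S7.4:

```python
def order_outline(points):
    """Sort outline pixels by repeatedly jumping to the nearest unvisited
    one (the role of sets A, A' and A2 in steps S7.1-S7.2)."""
    remaining = list(points)
    ordered = [remaining.pop(0)]          # p0: any starting outline pixel
    while remaining:
        p = ordered[-1]
        nxt = min(remaining, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered

def drop_collinear(ordered):
    """Keep only corner points: drop every point whose neighbors on the
    closed contour are collinear with it (the idea of steps S7.3-S7.4)."""
    n = len(ordered)
    kept = []
    for i in range(n):
        a, b, c = ordered[i - 1], ordered[i], ordered[(i + 1) % n]
        cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0])
        if cross != 0:   # a, b, c not on one straight line -> b is a corner
            kept.append(b)
    return kept

# Outline pixels of a 3x3 square, in raster order (as edge tracking might emit):
square = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]
corners = drop_collinear(order_outline(square))  # only the 4 corners survive
```

On the square example this collapses eight outline pixels to the four vertices, which is exactly the data reduction the paragraph above describes.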
In the semi-automated image segmentation data labeling method of the fourth embodiment of the present invention, based on the second and third embodiments, "generating the contour of the predicted target area based on the edge coordinate set, and highlighting the contour on the image to be labeled" in step S8 further includes the following steps:
Step S8.1: on the image to be labeled, connect with a straight line each pair of outermost coordinate points that were added to the edge coordinate set in adjacent order;
Step S8.2: on the image to be labeled, connect with a straight line the outermost coordinate point added last to the edge coordinate set and p0;
Step S8.3: generate the contour from the pixels that these straight lines pass through on the image to be labeled, and highlight the pixels corresponding to the lines.
From the third embodiment it is known that visiting the outermost coordinate points in the order in which they were added to set A' traces exactly the contour of the predicted target area. Therefore, while the third embodiment reduces the amount of semi-automated labeling data, it also makes it easy to generate the contour of the predicted target area, reducing the computation time for contour generation and improving the efficiency of the algorithm. At the same time, the brightness of the contour pixels is increased and their color whitened, making it easy for the user to identify the edge of the currently selected area.
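The pixels that a straight line passes through between two outermost coordinate points (steps S8.1 to S8.3) can be enumerated with Bresenham's line algorithm. This is a sketch, not necessarily the routine used in the embodiment:

```python
def line_pixels(p, q):
    """All integer pixels on the straight segment from p to q (Bresenham)."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pts = []
    while True:
        pts.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return pts

def contour_pixels(corners):
    """Close the polygon: each corner to the next, and the last back to p0."""
    pixels = []
    for i, p in enumerate(corners):
        q = corners[(i + 1) % len(corners)]   # wraps back to p0 (step S8.2)
        pixels.extend(line_pixels(p, q)[:-1])  # drop q; the next segment adds it
    return pixels
```

Highlighting then amounts to brightening exactly the pixels returned by `contour_pixels` on the displayed image.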
In the semi-automated image segmentation data labeling method of the fifth embodiment of the present invention, based on the fourth embodiment, step S8 further includes:
Step S8.4: apply shadow processing to the predicted target area on the image to be labeled.
In this embodiment, the predicted target area is shadow-processed and output as an image in which that local area is darkened, making it easy for the user to identify the selected local area of the target object.
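Such shadow processing can be sketched as a simple per-pixel darkening of the predicted target area; the 0.5 darkening factor is an arbitrary assumption:

```python
import numpy as np

def shade_region(image, region_mask, factor=0.5):
    """Darken the pixels of the predicted target area (mask value True)."""
    shaded = image.astype(np.float32)
    shaded[region_mask] *= factor   # only the selected region is dimmed
    return shaded.astype(image.dtype)

img = np.full((4, 4), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True               # predicted target area
out = shade_region(img, mask)       # 100 inside the mask, 200 outside
```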
The computer-readable storage medium of the sixth embodiment of the present invention stores computer-executable instructions for executing the semi-automated image segmentation data labeling method of any one of the second to fifth embodiments above.
Referring to FIG. 2, it is an effect diagram of processing an image to be labeled with the semi-automated image segmentation data labeling method of an embodiment of the present invention.
In the first step, the image to be labeled is displayed on the computer screen; the real target area on the image is a football, and the real background area is a lawn.
In the second step, the annotator moves the mouse onto the football, holds down the left mouse button and drags to draw a stroke on the football;
In the third step, the annotator moves the mouse onto the lawn, holds down the left mouse button and drags to draw a stroke on the lawn;
In the fourth step, the contour coordinates of the football are obtained automatically through graph cut and the football is shadow-processed;
In the fifth step, by executing the simplification algorithm of the third embodiment of the present invention, the points of the automatically obtained football contour coordinates that lie on the same straight line are merged, reducing the amount of data.
The embodiments of the present invention have been described above in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; various changes can be made within the scope of knowledge possessed by a person of ordinary skill in the art without departing from the purpose of the present invention.
Claims (5)
- A semi-automated image segmentation data labeling method, characterized by comprising the following steps:
Step S1: displaying an image to be labeled, the image to be labeled comprising a real target area and a real background area;
Step S2: obtaining a plurality of first pixel coordinates located in the real target area, and generating a target area seed point set from the plurality of first pixel coordinates;
Step S3: obtaining a plurality of second pixel coordinates located in the real background area, and generating a background area seed point set from the plurality of second pixel coordinates;
Step S4: establishing a target seed gray value histogram based on the target area seed point set;
Step S5: establishing a background seed gray value histogram based on the background area seed point set;
Step S6: establishing an undirected graph representing the image to be labeled, constructing an energy function, performing image segmentation on the image to be labeled based on the minimum cut algorithm, and obtaining a binarized picture of the segmented image to be labeled, the binarized picture comprising a predicted target area and a predicted background area;
Step S7: obtaining the outermost coordinate points of the predicted target area on the binarized picture based on an edge tracking algorithm, and generating an edge coordinate set from the plurality of outermost coordinate points;
Step S8: generating a contour of the predicted target area based on the edge coordinate set, and highlighting the contour on the image to be labeled;
Step S9: determining whether a fully selected instruction is received; if so, obtaining the category information of the target area, and saving the edge coordinate set and the category information as a json file; if not, returning to step S1.
- The semi-automated image segmentation data labeling method according to claim 1, characterized in that "generating an edge coordinate set from the plurality of outermost coordinate points" in step S7 further comprises the following steps:
Step S7.1: building a set A from all of the outermost coordinate points; creating a set A'; adding any one outermost coordinate point p0 of set A to set A'; building a set A2 from the outermost coordinate points of set A other than p0; setting the first marked coordinate point p to p0;
Step S7.2: determining whether the number of elements in set A2 is zero; if not, executing step S7.2a; if so, executing step S7.2b;
Step S7.2a: computing the distance d between every outermost coordinate point in set A2 and the first marked coordinate point p; setting the first marked coordinate point p to the outermost coordinate point of set A2 with the minimum distance d; adding that outermost coordinate point to set A' and deleting it from set A2; returning to step S7.2;
Step S7.2b: sorting the outermost coordinate points according to the order in which they were added to set A';
Step S7.3: creating an edge coordinate set; adding p0 to the edge coordinate set and deleting it from set A'; setting the second marked coordinate point p' to p0;
Step S7.4: determining whether the number of elements in set A' is one; if not, executing step S7.4a; if so, executing step S7.4b;
Step S7.4a: determining whether the second marked coordinate point p' and the first two outermost coordinate points in the ordering of set A' are collinear; if so, executing step S7.4a1; if not, executing step S7.4a2;
Step S7.4a1: deleting the first outermost coordinate point in the ordering of set A' from set A'; returning to step S7.4;
Step S7.4a2: setting the second marked coordinate point p' to the first outermost coordinate point in the ordering of set A'; adding that outermost coordinate point to the edge coordinate set and deleting it from set A'; returning to step S7.4;
Step S7.4b: adding the outermost coordinate points remaining in set A' to the edge coordinate set, and outputting the edge coordinate set.
- The semi-automated image segmentation data labeling method according to claim 3, characterized in that step S8 further comprises:
Step S8.4: applying shadow processing to the predicted target area on the image to be labeled.
- An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the semi-automated image segmentation data labeling method according to any one of claims 1 to 3.
- A computer-readable storage medium storing computer-executable instructions, characterized in that the computer-executable instructions are used to execute the semi-automated image segmentation data labeling method according to any one of claims 1 to 3.
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201911052198.7 | 2019-10-31 | |
CN201911052198.7A | 2019-10-31 | 2019-10-31 | Semi-automatic image segmentation data annotation method, electronic device and storage medium
Publications (1)

Publication Number | Publication Date
---|---
WO2021082507A1 | 2021-05-06
Family (ID=69814888)

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/CN2020/100347 | Semi-automated image segmentation and data annotation method, electronic device and storage medium | 2019-10-31 | 2020-07-06
Also Published As

Publication Number | Publication Date
---|---
CN110910401A | 2020-03-24
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20882859; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20882859; Country of ref document: EP; Kind code of ref document: A1