CN114723940A - Method, device and storage medium for labeling picture data based on rules - Google Patents

Method, device and storage medium for labeling picture data based on rules

Info

Publication number
CN114723940A
Authority
CN
China
Prior art keywords
frame
parent
child
labeling
frames
Prior art date
Legal status
Granted
Application number
CN202210427984.6A
Other languages
Chinese (zh)
Other versions
CN114723940B (en)
Inventor
彭进华
韩旭
Current Assignee
Guangzhou Weride Technology Co Ltd
Original Assignee
Guangzhou Weride Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Weride Technology Co Ltd filed Critical Guangzhou Weride Technology Co Ltd
Priority to CN202210427984.6A priority Critical patent/CN114723940B/en
Publication of CN114723940A publication Critical patent/CN114723940A/en
Application granted granted Critical
Publication of CN114723940B publication Critical patent/CN114723940B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application belongs to the technical field of data labeling and discloses a method, a device and a storage medium for labeling picture data based on rules. The method comprises: acquiring picture annotation data; judging whether original data exists in the annotation data, the original data comprising existing annotation information; if no original data exists, marking all labeling frames on the annotation data and selecting one labeling frame as a parent frame; labeling child frames according to the degree of association between the parent frame and the other surrounding labeling frames, and outputting a labeling result; if original data exists, classifying the annotation data according to the original relationship of the labeling frames in the annotation information; calculating the degree of association between the labeling frames and establishing a bipartite graph for matching according to the relationship between the labeling frames; and outputting a labeling result according to the bipartite graph matching result, thereby improving labeling efficiency and precision.

Description

Method, device and storage medium for labeling picture data based on rules
Technical Field
The present application relates to the field of data annotation technologies, and in particular, to a method, an apparatus, and a storage medium for annotating picture data based on rules.
Background
Data annotation is the use of automated tools to capture and collect data from the internet, including text, pictures, voice, etc., and then sort and annotate the captured data.
Data annotation is basic work in the artificial intelligence industry: a large number of data annotation specialists are needed to meet the demand for artificial intelligence training data, and the accuracy of the annotated data affects the quality of artificial intelligence training. As the volume of annotated data and the number of annotation scenes grow, the ever larger data base tends to degrade annotation quality, while the sheer amount of data makes re-labeling prohibitively expensive.
Disclosure of Invention
Therefore, the embodiments of the present application provide a method, a device and a storage medium for labeling picture data based on rules, which can solve the technical problems of low efficiency and low precision in existing labeling methods. The specific technical solutions are as follows:
in a first aspect, an embodiment of the present application provides a method for labeling picture data based on a rule, where the method includes:
acquiring image annotation data;
judging whether the original data exists in the marked data, wherein the original data comprises the existing marked information;
if no original data exists, marking all marking frames on the marking data, and selecting one marking frame as a parent frame;
labeling the child frame according to the association degree of the parent frame and other surrounding labeling frames, and outputting a labeling result;
if the original data exists, classifying the labeled data according to the original relation of the labeling frame in the labeling information;
calculating the association degree between the labeling frames, and establishing a bipartite graph for matching according to the relationship between the labeling frames;
and outputting a labeling result according to the matching result of the bipartite graph.
Further, the formula for calculating the degree of association is:
F = (IoU - A·d^(1/2) - B·p^(1/3)) × k
wherein F is the parent-child relationship degree, IoU is the intersection-over-union of the parent frame and the child frame, d represents the ratio of the area of the child frame to the area of the parent frame, p represents the ratio of the volume of the child frame cube to the volume of the parent frame cube, k represents the ratio of the physical distance between the vehicle corresponding to the parent frame and the vehicle corresponding to the child frame, and A and B are preset weights.
Further, if corresponding point cloud data exists, p is obtained from the point cloud data;
if there is no corresponding point cloud data, p is determined by the ratio of the diagonal distance of the child frame on the picture data to the diagonal distance of the parent frame on the picture.
Further, the method further comprises:
after the parent-child relationship between parent frames and child frames is established, if at least two parent frames overlap, calculating the number of child frames of the same class that have a parent-child relationship with each parent frame, and comparing the number of child frames of that class with the preset value of the corresponding class;
if the number of child frames of the same class is greater than the preset value of the corresponding class, recalculating the relative positions between the child frames of that class and the at least two parent frames, and re-determining the parent-child relationship between the child frames and the parent frames.
Further, the method further comprises:
acquiring a trigger signal, and judging whether the trigger signal belongs to a child frame or a parent frame;
if the trigger signal belongs to the child frame, the parent-child relationship between the child frame and the parent frame is released;
and if the trigger signal belongs to the parent frame, removing the parent-child relationship between the parent frame and all child frames belonging to the parent frame.
Further, an ID is set for the parent frame, the ID of the parent frame is set in the child frame, and the parent-child relationship between the parent frame and the child frame is established.
Further, the setting the ID of the parent frame to the child frame, and the establishing the parent-child relationship between the parent frame and the child frame includes:
receiving a parent frame selection signal and displaying child frames within a predetermined distance of the selected parent frame;
and setting the ID of the parent frame to any one or two or more of the child frames, and establishing a parent-child relationship between the parent frame and the corresponding child frame.
Further, the maximum weight matching algorithm of the bipartite graph adopts a KM algorithm or a minimum cost maximum flow algorithm.
In a second aspect, an embodiment of the present application provides an apparatus for labeling picture data based on rules, the apparatus including:
the data acquisition module is used for acquiring the image annotation data;
the judging module is used for judging whether the labeled data has original data or not, and the original data comprises the existing labeled information;
the marking module is used for marking all marking frames on the marking data if the original data does not exist, and selecting one marking frame as a parent frame;
the first output module is used for labeling the child frame according to the association degree of the parent frame and other surrounding labeling frames and outputting a labeling result;
the classification module classifies the marked data according to the original relation of the marked frame in the marked information if the original data exists;
the bipartite graph matching module is used for calculating the association degree between the labeling frames and establishing a bipartite graph for matching according to the relation between the labeling frames;
and the second output module is used for outputting the labeling result according to the bipartite graph matching result.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for labeling picture data based on rules in any one of the preceding claims.
In summary, compared with the prior art, the beneficial effects brought by the technical scheme provided by the embodiment of the present application at least include:
when labeling is carried out, whether original data exists in the annotation data is first distinguished; if original data exists, labeling is performed using the annotation information in the original data, the association degree of the labeling frames is calculated, and a bipartite graph is used for matching, so that labeling accuracy can be improved; for annotation data without original data, the child frames are automatically screened according to the association degree after a parent frame is selected, which improves labeling precision.
Drawings
Fig. 1 is a flowchart illustrating a method for labeling picture data based on rules according to an embodiment of the present disclosure.
Detailed Description
The present embodiments are only for explaining the present application and are not intended to limit it. After reading this specification, those skilled in the art can make modifications to the embodiments as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present application.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the term "and/or" in the present application is only one kind of association relationship describing the associated object, and means that three kinds of relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, in the present application, the character "/" indicates that the preceding and following related objects are in an "or" relationship, unless otherwise specified.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
The term "at least one" in this application means one or more, and the meaning of "a plurality" means three or more, e.g. a plurality of first locations means three or more first locations.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
Referring to fig. 1, in an embodiment of the present application, a method for labeling picture data based on rules is provided, and main steps of the method are described as follows:
s1, acquiring image annotation data;
s2, judging whether the annotation data has original data, wherein the original data comprises existing annotation information;
s3, if no original data exists, marking all the marking frames on the marking data, and selecting one marking frame as a parent frame;
s4, labeling the child box according to the relevance degree of the father box and other labeling boxes around, and outputting a labeling result;
s5, if the original data exists, classifying the labeled data according to the original relation of the labeled box in the labeled information;
s6, calculating the association degree between the labeling frames, and establishing a bipartite graph for matching according to the relationship between the labeling frames;
and S7, outputting a labeling result according to the matching result of the bipartite graph.
Specifically, in this embodiment, the type of the annotation data is picture data, and the picture data is acquired by a high-speed camera. Current high-speed camera technology is mature, the influence of camera distortion of different types of high-speed cameras on pictures is smaller and smaller, and the precision of the pictures is higher and higher, so more picture data can be used for data labeling. For picture data annotation, a point cloud annotation result is usually obtained first from the point cloud data annotation side, and the picture data is then annotated based on that point cloud annotation result, because the information in the point cloud annotation result can improve annotation accuracy. However, there is also picture data without point cloud data, and such picture data is generally annotated by means of its own information.
In this embodiment, if the obtained annotation data has been previously labeled, that is, original data exists, the annotation information includes the labeling frames, the IDs of the labeling frames, and the like. The annotation data is classified according to the labeling frames in the annotation information, where the classification is an attribution-relationship classification, that is, the labeling frames of each frame in the annotation data are classified by type, such as wheels, lamps, windows and the like. After the labeling frames are classified, the association degree between each parent frame and the child frames near it is calculated, a bipartite graph is established, and the labeling prediction result is obtained from the bipartite graph.
If the original data does not exist, selecting one marking frame as a parent frame, and calculating the association degree of the surrounding marking frames and the parent frame so as to mark the child frame of the parent frame.
In this embodiment, the association degree may be calculated as follows: preset distances are set respectively according to the classes of possible child frames of the parent frame; whether the distance between the parent frame and a surrounding labeling frame or child frame falls within the range of the preset distance plus or minus a preset error is checked, and if so, the smaller the deviation of the distance from the preset distance, the higher the association degree and the more likely the frame is to be labeled as a child frame; the number of child frames is then counted, and if it exceeds the preset number for the class corresponding to the preset distance, the preset number of labeling frames with the highest association degree are selected as child frames.
For example, a child frame that has a fixed distance relationship with a parent frame has a parent-child relationship with that parent frame, and the fixed distance relationship has an error range. Suppose the fixed distance is 60 cm with an error range of plus or minus 10 cm, and the criterion is that the smaller the difference from the fixed distance, the higher the relationship degree. For a rearview mirror accessory, if the distance between the midpoint of the vehicle corresponding to parent frame A and the midpoint of the rearview mirror is 62.4 cm, and the distance between the midpoint of the vehicle corresponding to parent frame B and the midpoint of the rearview mirror is 68.4 cm, then the rearview mirror has a parent-child relationship with parent frame A.
In other embodiments, the association degree may also be calculated in other manners, for example, the association degree is calculated according to the relative proportion and distance between the parent frame and the surrounding child frames, which is not described herein again.
Through the arrangement of the present application, when labeling is carried out, whether original data exists in the annotation data is distinguished; if original data exists, labeling is performed using the annotation information in the original data, the association degree of the labeling frames is calculated, and a bipartite graph is used for matching, so that labeling accuracy can be improved.
Optionally, in another embodiment, the formula for calculating the association degree is:
F = (IoU - A·d^(1/2) - B·p^(1/3)) × k
wherein F is the parent-child relationship degree, IoU is the intersection-over-union of the parent frame and the child frame, d represents the area ratio of the child frame to the picture data, p represents the ratio of the volume of the child frame cube to the volume of the parent frame cube, k represents the ratio of the physical distance between the vehicle corresponding to the parent frame and the vehicle corresponding to the child frame, and A and B are preset weights.
Further, for picture data with three-dimensional information (that is, point cloud data exists), A and B take 0.2 and 0.4 respectively; for data without three-dimensional information, A and B take 0.4 and 0.2 respectively. In other embodiments, the weights A and B may be adjusted according to business needs.
Annotation of picture data is usually not real-time, so when labeling parent-child relationships in picture data, a parent frame needs to be selected and labeled first; when a child frame is labeled, the relationship degree between the child frame and all current parent frames needs to be calculated, and the parent-child relationship is established with the parent frame of the highest association degree.
Further, in another embodiment, if the annotation data is formed from annotated point cloud data, then p is obtained from the point cloud data;
and if the annotation data does not have corresponding point cloud data, determining p by the ratio of the diagonal distance of the child frame on the picture data to the diagonal distance of the parent frame on the picture.
Furthermore, the bipartite graph maximum-weight matching algorithm adopts the KM algorithm or the minimum-cost maximum-flow algorithm. Once the parent-child relationship degree of every pair of frames is defined, an adjacency matrix of relationship degrees between each child frame and each parent frame can be obtained, and the problem is converted into finding a matching between child frames and parent frames that maximizes the sum of the similarities of all connections, which is solved with the KM algorithm or the minimum-cost maximum-flow algorithm.
Optionally, in another embodiment, the method further includes:
s8: after the parent-child relationship between parent frames and child frames is established, if at least two parent frames are overlapped, calculating the number of child frames of the same class having the parent-child relationship with the parent frames, and comparing the number of child frames of the same class with preset values of corresponding classes;
s9: if the number of the children frames in the same class is more than the preset value of the corresponding class, recalculating the relative positions between the children frames in the class and the at least two father frames, and re-determining the father-son relationship between the children frames and the father frames.
Specifically, in this embodiment, if parent frames overlap, the child frames between the two parent frames may be associated incorrectly. If the number of child frames of a certain class attached to a parent frame exceeds the preset value, at least one of those child frames is associated incorrectly. For example, a car has only two rearview mirrors, so the preset value for child frames of the rearview-mirror class is 2; if the number of child frames corresponding to rearview mirrors is 3, at least one of the 3 child frames is associated incorrectly. In that case, the relative positions between those child frames and the parent frames are calculated, for example the distance between each child frame and the vehicle-body parent frame and/or the relative positional relationship between the child frames; for instance, if there is another child frame corresponding to a rearview mirror to the left of the left rearview mirror, that child frame is declared to be associated incorrectly.
Optionally, in another embodiment, an ID is set for the parent frame, the ID of the parent frame is set for the child frame, and a parent-child relationship between the parent frame and the child frame is established.
The parent-child relationship between a parent frame and a child frame is determined according to the association degree: the determined ID is assigned to the parent frame, the ID of the parent frame is copied to the child frame having the parent-child relationship with it, and the parent-child relationship between the parent frame and the child frame is thereby generated.
Further, the setting the ID of the parent frame to the child frame, and the establishing the parent-child relationship between the parent frame and the child frame includes:
1. receiving a parent frame selection signal and displaying child frames within a predetermined distance of the selected parent frame;
2. and setting the ID of the parent frame to any one or two or more of the child frames, and establishing a parent-child relationship between the parent frame and the corresponding child frame.
Specifically, in the present embodiment, when a parent frame is selected in the picture data, the parent frame selection signal is triggered, and the child frames within a predetermined distance around the parent frame are displayed. The predetermined distance is set according to the type of the parent frame, such as a car or a large truck, and the scale between the picture data and the real scene. The predetermined distance represents the farthest distance at which a child frame of the parent frame may exist.
The ID of the parent frame is copied to the child frames displayed within the preset distance around the parent frame, so that the parent-child relationship can be quickly established.
Optionally, in another embodiment, the method further includes:
s10: acquiring a trigger signal, and judging whether the trigger signal belongs to a child frame or a father frame;
s11: if the trigger signal belongs to the child frame, the parent-child relationship between the child frame and the parent frame is released;
s12: and if the trigger signal belongs to the parent frame, removing the parent-child relationship between the parent frame and all child frames belonging to the parent frame.
The trigger signal may be generated by the user clicking the mouse, or by moving the mouse onto the corresponding labeling frame; in this embodiment, the user moves the mouse onto the corresponding labeling frame and clicks the right mouse button to generate it. Whether two frames form a parent-child pair is expressed by the assigned ID. After data annotation is completed, a worker can use a computer device, such as a computer, to move the mouse onto an incorrectly associated frame in the annotated data; whether that frame is a parent frame or a child frame can be judged, and a right click of the mouse cancels the parent-child relationship of the two frames. A left click on the annotation data selects a parent frame; the view then switches to the child frames near the just-selected parent frame that have not yet established a parent-child relationship, and a left click selects a child frame, so that a specific identical ID can be given to the two frames to represent their parent-child relationship.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment of the present application, a device for labeling picture data based on rules is provided, where the device for labeling picture data based on rules corresponds to the method for labeling picture data based on rules in the foregoing embodiments one to one. The device for labeling the picture data based on the rules comprises:
the data acquisition module is used for acquiring the image annotation data;
the judging module is used for judging whether the labeled data has original data or not, and the original data comprises the existing labeled information;
the marking module is used for marking all marking frames on the marking data if the original data does not exist, and selecting one marking frame as a parent frame;
the first output module is used for labeling the child frames according to the association degree of the parent frame and other labeling frames around the parent frame and outputting labeling results;
the classification module classifies the marked data according to the original relation of the marked frame in the marked information if the original data exists;
the bipartite graph matching module is used for calculating the association degree between the labeling frames and establishing a bipartite graph for matching according to the relation between the labeling frames;
and the second output module is used for outputting the labeling result according to the bipartite graph matching result.
Further, in another embodiment, the formula for calculating the association degree is:
F = (IoU - A·d^(1/2) - B·p^(1/3)) × k
wherein F is the parent-child relationship degree, IoU is the intersection-over-union of the parent frame and the child frame, d represents the ratio of the area of the child frame to the area of the parent frame, p represents the ratio of the volume of the child frame cube to the volume of the parent frame cube, k represents the ratio of the physical distance between the vehicle corresponding to the parent frame and the vehicle corresponding to the child frame, and A and B are preset weights.
Further, in another embodiment, if there is corresponding point cloud data, p is obtained from the point cloud data;
if there is no corresponding point cloud data, p is determined by the ratio of the diagonal distance of the child frame on the picture data to the diagonal distance of the parent frame on the picture.
Further, in another embodiment, the apparatus further comprises:
the comparison module is used for calculating the number of children frames of the same type having a parent-child relationship with the parent frame if at least two parent frames are overlapped after the parent-child relationship between the parent frame and the children frames is established, and comparing the number of children frames of the same type with the preset numerical values of the corresponding classification;
and the relationship establishing module is used for recalculating the relative positions between the child frame and the at least two parent frames if the number of the child frames in the same class is more than the preset numerical value of the corresponding class, and re-determining the parent-child relationship between the child frame and the parent frame.
Further, in another embodiment, the apparatus further comprises:
the relation releasing module is used for acquiring the trigger signal and judging whether the trigger signal belongs to the child frame or the father frame; if the trigger signal belongs to the child frame, releasing the parent-child relationship between the child frame and the parent frame; and if the trigger signal belongs to the parent frame, removing the parent-child relationship between the parent frame and all child frames belonging to the parent frame.
Further, in another embodiment, an ID is set to the parent frame, the ID of the parent frame is set to the child frame, and the parent-child relationship between the parent frame and the child frame is established.
Further, in another embodiment, the setting the ID of the parent frame to the child frame, and the establishing the parent-child relationship between the parent frame and the child frame includes:
receiving a parent frame selection signal, and displaying child frames within a predetermined distance of the selected parent frame;
and setting the ID of the parent frame to any one or two or more of the child frames, and establishing a parent-child relationship between the parent frame and the corresponding child frame.
Further, in another embodiment, the maximum weight matching algorithm of the bipartite graph adopts a KM algorithm or a minimum cost maximum flow algorithm.
All or part of the modules of the device for labeling the picture data based on the rules can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In an embodiment of the present application, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the method steps for labeling picture data based on rules as described in the above embodiment. The computer-readable storage medium includes a ROM (Read-Only Memory), a RAM (Random-Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic disk, a floppy disk, and the like.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of each functional unit or module is illustrated, and in practical applications, the above-mentioned function may be distributed as different functional units or modules as required, that is, the internal structure of the apparatus described in this application may be divided into different functional units or modules to implement all or part of the above-mentioned functions.

Claims (10)

1. A method for labeling picture data based on rules, the method comprising:
acquiring image annotation data;
judging whether the original data exists in the marked data, wherein the original data comprises the existing marked information;
if the original data does not exist, marking all the marking frames on the marking data, and selecting one marking frame as a parent frame;
labeling the child frame according to the association degree of the parent frame and other labeling frames around, and outputting a labeling result;
if the original data exists, classifying the labeled data according to the original relation of the labeling frame in the labeling information;
calculating the association degree between the labeling frames, and establishing a bipartite graph for matching according to the relationship between the labeling frames;
and outputting a labeling result according to the matching result of the bipartite graph.
2. The method for labeling picture data based on rules according to claim 1, wherein the formula for calculating the degree of association is:
F = (IoU - A·d^(1/2) - B·p^(1/3)) × k
wherein F is the degree of the relationship between parents and children, IoU is the intersection ratio of the parent frame and the child frame, d represents the area ratio of the area of the child frame to the area of the parent frame, p represents the volume ratio of the child frame cube to the parent frame cube, and k represents the ratio of the physical distance between the vehicle corresponding to the parent frame and the vehicle corresponding to the child frame; a and B are preset weights.
3. The method for rule-based annotation of picture data according to claim 2, wherein p is obtained from the point cloud data if there is corresponding point cloud data;
if there is no corresponding point cloud data, p is determined by the ratio of the diagonal distance of the child frame on the picture data to the diagonal distance of the parent frame on the picture.
4. The method for labeling picture data based on rules according to claim 1, further comprising:
after the parent-child relationship between parent frames and child frames is established, if at least two parent frames are overlapped, calculating the number of child frames of the same class having the parent-child relationship with the parent frames, and comparing the number of child frames of the same class with preset values of corresponding classes;
if the number of the children frames in the same class is more than the preset value of the corresponding class, recalculating the relative positions between the children frames in the class and the at least two father frames, and re-determining the father-son relationship between the children frames and the father frames.
5. The method for labeling picture data based on rules according to claim 1, further comprising:
acquiring a trigger signal, and judging whether the trigger signal belongs to a child frame or a father frame;
if the trigger signal belongs to the child frame, the parent-child relationship between the child frame and the parent frame is released;
and if the trigger signal belongs to the parent frame, removing the parent-child relationship between the parent frame and all child frames belonging to the parent frame.
6. The method according to claim 1, wherein an ID is set for the parent frame, the ID of the parent frame is set for the child frame, and the parent-child relationship between the parent frame and the child frame is established.
7. The method according to claim 6, wherein the ID of the parent frame is set to the child frame, and establishing the parent-child relationship between the parent frame and the child frame comprises:
receiving a parent frame selection signal and displaying child frames within a predetermined distance of the selected parent frame;
and setting the ID of the parent frame to any one or two or more of the child frames, and establishing a parent-child relationship between the parent frame and the corresponding child frame.
8. The method for labeling picture data based on rules according to claim 1, wherein the maximum weight matching algorithm of the bipartite graph adopts a KM algorithm or a minimum cost maximum flow algorithm.
9. An apparatus for labeling picture data based on rules, the apparatus comprising:
the data acquisition module is used for acquiring the image annotation data;
the judging module is used for judging whether the labeled data has original data or not, and the original data comprises the existing labeled information;
the marking module is used for marking all marking frames on the marking data if the original data does not exist, and selecting one marking frame as a parent frame;
the first output module is used for labeling the child frame according to the association degree of the parent frame and other surrounding labeling frames and outputting a labeling result;
the classification module classifies the marked data according to the original relation of the marked frame in the marked information if the original data exists;
the bipartite graph matching module is used for calculating the association degree between the labeling frames and establishing a bipartite graph for matching according to the relation between the labeling frames;
and the second output module is used for outputting the labeling result according to the bipartite graph matching result.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, carries out the steps of the method for rule-based annotation of picture data according to any one of claims 1 to 8.
CN202210427984.6A 2022-04-22 2022-04-22 Method, device and storage medium for labeling picture data based on rules Active CN114723940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210427984.6A CN114723940B (en) 2022-04-22 2022-04-22 Method, device and storage medium for labeling picture data based on rules

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210427984.6A CN114723940B (en) 2022-04-22 2022-04-22 Method, device and storage medium for labeling picture data based on rules

Publications (2)

Publication Number Publication Date
CN114723940A true CN114723940A (en) 2022-07-08
CN114723940B CN114723940B (en) 2024-09-06

Family

ID=82244988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210427984.6A Active CN114723940B (en) 2022-04-22 2022-04-22 Method, device and storage medium for labeling picture data based on rules

Country Status (1)

Country Link
CN (1) CN114723940B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796201A (en) * 2019-10-31 2020-02-14 深圳前海达闼云端智能科技有限公司 Method for correcting label frame, electronic equipment and storage medium
CN111368927A (en) * 2020-03-06 2020-07-03 广州文远知行科技有限公司 Method, device and equipment for processing labeling result and storage medium
CN111814885A (en) * 2020-07-10 2020-10-23 云从科技集团股份有限公司 Method, system, device and medium for managing image frames
CN111931727A (en) * 2020-09-23 2020-11-13 深圳市商汤科技有限公司 Point cloud data labeling method and device, electronic equipment and storage medium
DE202020105558U1 (en) * 2020-04-14 2020-12-16 Beijing Xiaomi Mobile Software Co., Ltd. Encoder device and decoder device
CN112505652A (en) * 2021-02-04 2021-03-16 知行汽车科技(苏州)有限公司 Target detection method, device and storage medium
CN112884055A (en) * 2021-03-03 2021-06-01 歌尔股份有限公司 Target labeling method and target labeling device
CN113111708A (en) * 2021-03-10 2021-07-13 北京爱笔科技有限公司 Vehicle matching sample generation method and device, computer equipment and storage medium
CN113643324A (en) * 2020-04-27 2021-11-12 初速度(苏州)科技有限公司 Target association method and device


Also Published As

Publication number Publication date
CN114723940B (en) 2024-09-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant