CN113392230A — Method for processing and operating annotation data, annotation platform and database
- Publication number: CN113392230A
- Application number: CN202010177214.1A
- Authority: CN (China)
- Legal status: Granted
- Prior art keywords: annotation, item, labeling, entry, specific
Classifications
- G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40 — Information retrieval of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/41 — Indexing; Data structures therefor; Storage structures
- G06F16/48 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

(All within G — Physics / G06 — Computing; Calculating or Counting / G06F — Electric Digital Data Processing.)
Abstract
A processing method and an operation method for annotation data are disclosed, together with a corresponding annotation platform and annotation database. The processing method comprises: generating an annotation entry corresponding to an annotation item, wherein the annotation entry contains information on the annotation objects within that item; acquiring the annotation object information of an annotation object contained in an annotation item; and generating an object entry for that annotation object according to the annotation object information. The scheme describes changes to an annotation object through metadata, which compresses the data and reduces both the volume transmitted and the number of modifications required. Further, the scheme can be combined with a suitable annotation platform to give annotators automatic cross-item (for example, cross-frame) tracking of annotation objects, improving the efficiency and accuracy of annotation work.
Description
Technical Field
The present disclosure relates to the field of data processing, and in particular to a method for processing and a method for operating annotation data, together with a corresponding annotation platform and annotation database.
Background
In recent years, artificial intelligence technology has grown explosively, making great strides in image recognition, speech processing, and intelligent human-machine interaction. In artificial intelligence, various types of models (e.g., neural network models) are trained to obtain models usable for various kinds of prediction. Training such a model requires a large amount of training data, which may be images, audio, or text depending on the field. To make the model converge, and converge in the desired direction, the training data typically needs to be labeled.
Currently, the labeling of training data is mostly done manually. For example, given a set of image frames, an annotator manually selects particular objects in each image to annotate. In some applications, one or more particular objects must be tracked across a series of consecutive frames. Relying entirely on manual frame-by-frame operation, or on modifying database records one by one, greatly reduces annotation precision and efficiency.
For this reason, an improved scheme for processing and operating annotation data is needed.
Disclosure of Invention
The technical problem addressed by the present disclosure is to provide a processing method and an operation method for annotation data, together with a corresponding annotation platform and annotation database. The scheme describes changes to an annotation object through metadata, which compresses the data and reduces both the volume transmitted and the number of modifications required. Further, the scheme can be combined with a suitable annotation platform to give annotators automatic cross-item (for example, cross-frame) tracking of annotation objects, improving the efficiency and accuracy of annotation work.
According to a first aspect of the present disclosure, there is provided an annotation data processing method, comprising: generating an annotation entry corresponding to an annotation item, wherein the annotation entry contains partial information of the annotation objects within the annotation item; acquiring the annotation object information of an annotation object contained in the annotation item; and generating an object entry for the annotation object according to the annotation object information.
According to a second aspect of the present disclosure, there is provided an annotation data operation method, comprising:
based on a user's selection of a specific annotation item, acquiring the associated object entries that contain the annotation item ID of that specific item, wherein each object entry comprises an annotation object ID and the IDs of the annotation items containing that object; and acquiring the specific annotation entry corresponding to the specific annotation item, wherein the annotation entry contains partial information of the annotation objects within the item.
According to a third aspect of the present disclosure, there is provided an image annotation data operation method, comprising: receiving a user's selection of a particular annotation object on a particular image data item; generating common attributes of the particular annotation object; generating a specific object entry, corresponding to the particular annotation object, that contains the common attributes; and mapping the particular annotation object to adjacent image data items based on the specific object entry.
According to a fourth aspect of the present disclosure, there is provided an annotation platform for receiving user operations, interacting with an annotation database, and presenting image data items, annotation items, and annotation objects to a user, the annotation platform performing the method of the third aspect.
According to a fifth aspect of the present disclosure, there is provided an annotation database for receiving user instructions and interacting with an annotation platform, the annotation database performing the method of the first aspect described above.
According to a sixth aspect of the present disclosure, there is provided a computing device comprising: a processor; and a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform the method of the second or third aspect above.
According to a seventh aspect of the present disclosure, there is provided a non-transitory machine-readable storage medium having executable code stored thereon which, when executed by a processor of an electronic device, causes the processor to perform the method of the second or third aspect above.
Thus, by storing object entries as meta-information in the annotation database, the invention enables convenient linked modification across data items. Accordingly, the invention provides annotation-tracking capability on the annotation platform, improving efficiency and accuracy on both the database side and the client side.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in greater detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.
FIG. 1 shows a schematic flow diagram of an annotation data processing method according to one embodiment of the invention.
FIG. 2 shows a schematic flow diagram of an annotation data operation method according to one embodiment of the invention.
FIGS. 3A-3D show examples of add, delete, modify, and query operations using metaInfo.
FIG. 4 shows a schematic flow diagram of an image annotation data operation method according to one embodiment of the invention.
FIG. 5 illustrates an example of an operational screenshot of an annotation platform according to the present invention.
FIG. 6 illustrates an example of an operational screenshot of an annotation platform according to the present invention.
FIG. 7 illustrates an example of an operational screenshot of an annotation platform according to the present invention.
FIG. 8 is a schematic structural diagram of a computing device that can be used to implement the above annotation data operation method according to an embodiment of the invention.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As noted in the Background above, training artificial intelligence models requires large amounts of labeled data, and object-tracking applications in particular require one or more objects to be tracked across a series of consecutive frames; relying entirely on manual frame-by-frame annotation, or on modifying database records one by one, greatly reduces precision and efficiency.
The invention therefore provides a processing method and an operation method for annotation data, together with a corresponding annotation platform and annotation database. The scheme describes changes to an annotation object through metadata, which compresses the data and reduces both the volume transmitted and the number of modifications required. Further, the scheme can be combined with a suitable annotation platform to give annotators automatic cross-item (for example, cross-frame) tracking of annotation objects, improving the efficiency and accuracy of annotation work.
For ease of exposition, the basic concepts involved in the invention are first described using image annotation and training in the autonomous driving field as an example.
Object tracking is a key technology in autonomous driving; it covers, among other things, predicting the next action of a pedestrian or vehicle and following that pedestrian or vehicle over time. During tracking, the object must also be identified.
Object tracking involves two annotation scenarios: 3D tracking annotation and 2D tracking annotation. In 3D tracking annotation for an autonomous vehicle, for example in a 3D point cloud scene, the vehicle must continuously track objects over a certain time range, so the generated annotation data must be able to describe these continuously changing objects. This type of annotation provides important training data for the perception and identification of continuously changing objects in the vehicle's 3D scene, and is an important source of material for model training. In 2D tracking annotation, for example in a 2D point cloud and/or picture scene, the vehicle likewise needs to track objects continuously over a time range, and the generated annotation data must describe the continuously changing objects. This type of annotation data mainly targets 2D scenes and provides important model-training material for object recognition and action prediction in 2D.
To make annotators' work easier, a dedicated annotation platform can be provided for adding, deleting, modifying, and querying annotation content. Here, the annotation platform is a platform for data annotation, used for image annotation, 2D/3D point cloud annotation, and the like. For convenience of operation, the annotation platform usually has a visual interface in which the data item to be annotated is displayed, for example a 2D picture, a 3D point cloud image, or a combination of the two. The annotator annotates the data item on the platform, and the annotation information associated with the data item (e.g., the 3D point cloud information of one frame) forms an annotation item. Annotation items can correspond one-to-one to data items. Both can be stored in a database and fetched by the annotation platform installed on the client when needed. In one embodiment, the annotation item is stored in association with the data item in one database. In a preferred embodiment, however, since annotation items and data items are of different data types, they are stored in different databases: the annotation item is stored as an annotation entry in an annotation database, while the data item is stored in a specialized database such as an object storage (OSS) database. Corresponding annotation items and data items can be associated via the same or corresponding IDs. When the annotation platform needs to annotate and display, it fetches the annotation entry from the annotation database and the data item from the OSS database and merges them for display.
A data item typically contains objects of interest, such as vehicles or pedestrians that need to be identified in an image frame or a 3D point cloud. When such an object is marked within a data item, an annotation object is formed. That is, within an annotation item, an annotation object is a labeled object, for example a rectangular box drawn in a picture to identify a vehicle.
Since object tracking typically involves following one or more particular objects through a series of frames, the data used for training usually comes in groups, for example 5000 pictures of consecutive frames or the 5000 3D point clouds generated from them. Such grouped data requires grouped annotation. In grouped annotation, the annotation entries are ordered and related to one another: modifying an annotation object in any one annotation item affects the corresponding objects in the other annotation items associated with it.
In autonomous-vehicle data annotation, a series of consecutive frames needs to be tracked; the frames may be 3D point clouds, 2D images, or a combination of the two. The frames are annotated as one group annotation covering multiple consecutive frames. An identified vehicle or pedestrian is an annotation object and is assigned a unique objId; whenever that vehicle or pedestrian appears in another frame, it is marked as the same annotation object.
The basic concepts of the annotation domain have been introduced above in the context of autonomous-driving image annotation. It should be understood that a data item may also be data other than images, such as a piece of speech or text, and the annotation may correspondingly be a speech or text annotation; the annotation platform can support or be made compatible with such annotation and be equipped with a matching database. Although the principles of the invention are mainly explained below in an autonomous-driving image annotation scenario, they apply to a wide variety of data annotation scenarios, including speech and text, as long as object tracking across data items is involved.
In a grouped-annotation scene used for object-tracking training, the annotation items and annotation objects are numerous: for example, up to 5000 associated images with up to 300 objects in each image. In this situation, keeping the annotation data synchronized is a very difficult problem.
In the existing approach, the annotation content of the 5000 pictures is stored separately, with each picture's annotation content stored as an annotation entry in a corresponding database record. The problem with this approach is that it cannot cope with linked add/delete/modify operations: modifying one picture may require modifying the following 1000 pictures to update the same annotation object, so a single user operation may require 1000 database modifications. This is clearly undesirable, as each modification has a significant system impact.
To this end, the invention proposes the concept of the object entry. Object entries are stored in the annotation database as meta-information, and operations on an object entry achieve linked modification across data items. Accordingly, the invention provides annotation-tracking capability on the annotation platform, improving efficiency and accuracy on both the database side and the client side.
In one embodiment, the invention may be embodied as an annotation data processing method. FIG. 1 shows a schematic flow diagram of an annotation data processing method according to one embodiment of the invention. The method may be implemented within a corresponding annotation database. It can be combined with client-side (annotation platform) operations, described below in conjunction with FIGS. 2 and 3, to give a comprehensive scheme that improves both database efficiency and user annotation efficiency.
In step S110, an annotation entry corresponding to an annotation item is generated. The annotation entry includes partial information (e.g., unique, per-item information, as detailed below) of the annotation objects in the item. In step S120, the annotation object information of an annotation object contained in the annotation item is acquired, and in step S130, an object entry for that annotation object is generated based on the information.
In step S110, the generated annotation entry corresponds to a specific record in the annotation database associated with an annotation item (and its corresponding data item). For example, after an original data item (e.g., a 2D image frame or a 3D point cloud) is acquired, annotation information about it can be generated by automatic processing or manual annotation. This annotation information constitutes the annotation item corresponding to the data item and is stored in the annotation database, in step S110, as an annotation entry in a prescribed format. The annotation entry includes at least the ID of the annotation item (in the case of group annotation, an ID unique within the group) and may include partial information of the annotation objects within the item — for example, the position and size of the selection box of an annotation object already marked in the image data.
Specifically, in the case of grouped annotation — for example, 5000 consecutive frames annotated as one group — a corresponding annotation entry may be generated for each annotation item in the group, where each annotation item is assigned an annotation item ID unique within the group and each annotation object is assigned an annotation object ID unique within the group.
Similarly, in step S120, the annotation object information of an annotation object contained in an annotation item may be acquired by automatic processing or manual annotation. In different embodiments the information may be acquired directly, or from the annotation entry (which may involve subsequently modifying that entry); this application places no limit on this. Once the annotation object information is acquired, an object entry for the annotation object is generated in step S130. Specifically, the annotation object ID and the IDs of the annotation items containing the object may be stored as the object entry of that annotation object. Thus, by providing a specialized object entry (hereinafter also called a "metadata entry" or "meta entry") in addition to the conventional annotation entry, cross-item object tracking becomes possible.
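As a concrete illustration, the two record types might be shaped as follows. This is a minimal sketch, not the patent's prescribed schema: the class names are assumptions, while the field names follow the metaInfo description given later (id, contains, commonLabels).

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AnnotationEntry:
    """Per-item record: one entry per annotation item (e.g., per frame)."""
    item_id: int  # annotation item ID, unique within the group
    # annotation object ID -> unique (per-item) info, e.g. box position/size
    objects: dict[str, dict[str, Any]] = field(default_factory=dict)

@dataclass
class ObjectEntry:
    """Meta record ("meta entry"): one entry per annotation object in a group."""
    obj_id: str  # annotation object ID, unique within the group
    contains: list[int] = field(default_factory=list)  # IDs of items containing the object
    common_labels: dict[str, Any] = field(default_factory=dict)  # e.g. {"type": "vehicle"}
```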
As described above, information about an annotation object is stored in both the annotation entry and the object entry. To improve the storage efficiency of the database and reduce the volume of data transmitted and stored, the annotation entry and the object entry should — apart from the ID of the associated annotation object — each store a different, non-overlapping portion of the annotation object information. In one embodiment, the common information of the annotation object is preferably stored in the object entry.
To this end, step S120 may include acquiring the common attribute information of the annotation object contained in the annotation item, and this common attribute information is then stored in the associated object entry. Common attribute information is attribute information that appears unchanged across the multiple annotation items containing the object. For example, for a vehicle appearing in 20 adjacent annotation items (20 consecutive image frames), the attributes of the vehicle (e.g., type "vehicle", color "white", model "car") do not change across the 20 items and may therefore be stored as common attribute information in the object entry. Common attribute information may also refer to attribute information that would not change from one annotation item to another — color and type, say, as opposed to the vehicle's displacement between frames. This covers the case where an annotation object appears in only a single annotation item, such as a pedestrian who flashes past: an object entry can still be created for it, storing the pedestrian object's ID, its common attributes, and the ID of the one annotation item in which it appears. As seen above, the object entry stores the IDs of all the annotation items in which the object appears. In one embodiment, to further reduce transmission and storage, runs of consecutive item IDs may be represented as ranges. For example, suppose a certain object (id: 0000-0001) appears in annotation items 1001-1100 of a group. In the object entry, the item IDs are then stored not as [1001, 1002, 1003 … 1100] but directly as [1001, 1100] (start and end frames) or [1001, 100] (start frame and number of consecutive frames), reducing the number of stored values from 100 to two.
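The range compression can be sketched as follows; the [start, end] pair encoding is one of the two options mentioned above, and the function name is illustrative:

```python
def compress_contains(item_ids: list[int]) -> list[list[int]]:
    """Collapse runs of consecutive item IDs into [start, end] ranges."""
    ranges: list[list[int]] = []
    for i in sorted(item_ids):
        if ranges and i == ranges[-1][1] + 1:
            ranges[-1][1] = i      # extend the current run
        else:
            ranges.append([i, i])  # start a new run
    return ranges

# 100 consecutive frames collapse to a single two-value range:
assert compress_contains(list(range(1001, 1101))) == [[1001, 1100]]
```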
Alternatively or additionally, the annotation entry stores the information specific to the annotation object within the corresponding data item. To this end, the generated annotation entry includes: the annotation item ID, the IDs of the annotation objects in the item, and the unique information of each annotation object. Unique information is information that generally varies from one annotation item to the next, such as the position and/or size of the annotation box.
With the annotation database arranged as described above, an operation that would normally touch a large number of annotation entries can be turned into an operation touching a single object entry, giving faster response to user operations.
In the invention, when a user needs the information of a specific annotation item, not only must the annotation entry corresponding to the item itself be acquired, but also all object entries containing that item's ID.
To this end, the processing method of the invention further includes acquiring the information of a specific annotation item based on a user selection instruction. Specifically, upon receiving from a user of the annotation platform an instruction to fetch a specific annotation item, the associated object entries containing the item's ID may be read first, followed by the specific annotation entry corresponding to the item. The content read can be transmitted to the annotation platform, where the information from the associated object entries and the specific annotation entry is merged and presented to the user as the information of that annotation item.
When an annotation object needs to be deleted, the processing method of the invention further includes deleting the specific annotation object based on a user deletion instruction. Thanks to the introduction of object entries, deleting an annotation object may involve only that object's entry. The user can choose to delete the object entirely — the specific object entry corresponding to the object is located and deleted — or, based on the deletion instruction, only specified annotation item IDs are removed from the object entry. In either case the unique information of the annotation object remains stored in the corresponding annotation entries; but since the object entry no longer references those annotation item IDs, that unique information is simply never read.
When an annotation object needs to be modified, the processing method of the invention further includes modifying the specific annotation object based on a user modification instruction. The modification can involve the annotation entry and/or the object entry: when the modification concerns a specific (per-item) attribute, the annotation entry containing that attribute is modified; when it concerns a common attribute, the object entry of the annotation object is modified.
When an annotation object needs to be added, the processing method of the invention further includes adding the specific annotation object based on a user addition instruction. Such an addition may likewise involve both the annotation entry and the object entry; accordingly, adding a specific annotation object may include adding annotation object information to the annotation entries containing the object, and either creating a new object entry or adding annotation item IDs to the object's existing entry.
On top of such an annotation database, the invention can also be realized as an annotation data operation method. FIG. 2 shows a schematic flow diagram of an annotation data operation method according to one embodiment of the invention. The method may be performed by the annotation platform against the annotation database.
In step S210, based on the user's selection of a specific annotation item, the associated object entries containing that item's ID are acquired. The associated object entries are the entries for each of the objects (associated objects) appearing in the specific annotation item. Each object entry includes the annotation object ID and the IDs of the annotation items containing the object, and preferably also the object's common attribute information. In step S220, the specific annotation entry corresponding to the specific annotation item is acquired; the annotation entry contains partial information (e.g., unique information) of the annotation objects within the item.
Once this information is obtained, the contents of the associated object entries and the specific annotation entry may be merged and presented to the user. Specifically, the common attribute information of each annotation object from its object entry is merged with that object's unique information from the specific annotation entry.
In particular, an annotation operation on an object is usually performed against a data item, for example on a specific image frame or 3D point cloud. The operation method of the invention may therefore further include acquiring the specific data item corresponding to the specific annotation item; the merged information can then be presented to the user on the retrieved data item.
By exploiting this arrangement of database entries, the operation method of the invention further simplifies add, delete, modify, and query operations on the annotation platform.
Specifically, the operation method may further include: acquiring the user's marking of a new object on the specific data item; mapping the mark into an adjacent frame to generate a mapped mark; and sending the mark and the mapping information to create or update the corresponding object entry and to update the specific annotation entry and the adjacent annotation entries. Here, for example, when the user selects a new object on an image or 3D point cloud with a rectangular box, the object can be selected in the adjacent frame by the user or automatically by the system. For instance, the object's annotation box in the current frame may be mapped into an adjacent frame (e.g., the previous or next frame). Since an object's displacement between adjacent frames is usually limited, its position in the adjacent frame can be confirmed conveniently by, for example, a neighborhood search around the mapped position, so that annotation in the adjacent frame is completed automatically, as sketched below. The annotation box information can then be sent to the corresponding entries in the database to update the annotation object's unique information; and when common attributes of the object are selected by the user or determined by the system, the existing object entry can be updated or a corresponding object entry added.
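A minimal sketch of the box mapping with neighborhood search, assuming axis-aligned 2D boxes and an application-supplied match score (how well a candidate box fits the object in the adjacent frame); all names are hypothetical:

```python
from typing import Callable

Box = tuple[float, float, float, float]  # (x, y, width, height)

def map_box_to_adjacent_frame(box: Box,
                              score: Callable[[Box], float],
                              radius: int = 8, step: float = 2.0) -> Box:
    """Seed the adjacent frame with the current frame's box, then refine it by
    neighborhood search: try small displacements and keep the best-scoring one."""
    x, y, w, h = box
    best, best_score = box, score(box)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            cand = (x + dx * step, y + dy * step, w, h)
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
    return best
```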
Similarly, the operation method of the invention may further include: acquiring the user's modification of an annotation object on the specific data item; and sending the modification to update the corresponding object entry and/or annotation entry. The modification may concern a common attribute of the object or a unique attribute of it; accordingly, the common attributes in the corresponding object entry are modified, and/or the unique attributes in the annotation entry are modified.
Likewise, the operation method of the invention may further include: acquiring the user's deletion of an annotation object on the specific data item; and sending the deletion instruction to modify or delete the corresponding object entry. In other words, with the data processing and operation scheme of the invention, a deletion may involve only deleting the whole object entry, or deleting annotation item IDs within it, without operating on the related annotation entries at all.
A preferred embodiment of the annotation data processing and operation of the invention is described below in conjunction with a concrete implementation. Specifically, the annotation information for an entire group can be expressed as a group-level meta-information record, metaInfo. MetaInfo consists mainly of the identified objects: an entry is added for every object box-selected within the group. MetaInfo may follow a schema format in which each object contains:
id: a globally unique id identifying an annotation object; if the object appears in multiple annotation items, the id is shared across all of them;
contains: this field describes which annotation entries in the group contain the object;
commonLabels: this field describes the common attributes of the object, i.e., the attributes it shares across every annotation item in which it appears.
Described in json string:
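The json string from the original filing is not preserved in this text; the following is a reconstruction consistent with the schema described above, in which the concrete ids, frame numbers, and label values are illustrative assumptions:

```json
{
  "timestamp": 1584000000,
  "objects": [
    {
      "id": "000-xxx0",
      "contains": [1, 2, 3],
      "commonLabels": { "type": "vehicle", "color": "white" }
    },
    {
      "id": "000-xxx1",
      "contains": [1998, 1999, 2000],
      "commonLabels": { "type": "pedestrian" }
    }
  ]
}
```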
With metaInfo as the form of the object entries as above, the example describes two objects (ids 000-xxx0 and 000-xxx1) together with the annotation items in which each appears. The metaInfo may also include a timestamp field that records the modification time.
If runs of consecutive indices appear, such as "contains": [1, 2, 3, 4, 5, 6], they may be further written as "contains": [1-6] or [1, 6] to reduce data transmission and storage.
Annotation operations can generally be divided into add, delete, modify, and query operations. FIGS. 3A-3D show examples of these operations using metaInfo.
FIG. 3A shows the operations involved in adding an object. When an object is added, an object with the same id is added in the next frame; the added object is generated from the previous one according to certain rules (such as annotation box mapping and neighborhood search).
As shown in the figure, suppose the newly added object has objId 000-xxx1 in frame 2000. An object with the same objId 000-xxx1 may then be added in frame 2001, generated from the previous frame according to rules such as annotation box mapping and neighborhood search. The object entry then needs to be modified: the new value 2001 is appended to the "contains" field of metaInfo.
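Over a dict-based metaInfo of the shape shown earlier, the add path can be sketched as follows (the helper name is an assumption); the per-frame box information itself would go into frame 2001's annotation entry:

```python
def add_object_to_frame(meta_info: dict, obj_id: str, frame: int) -> None:
    """Record that an existing annotation object also appears in `frame`,
    touching only its metaInfo object entry."""
    for obj in meta_info["objects"]:
        if obj["id"] == obj_id:
            if frame not in obj["contains"]:
                obj["contains"].append(frame)
            return
    raise KeyError(f"unknown annotation object: {obj_id}")

# e.g. after mapping obj 000-xxx1 from frame 2000 into frame 2001:
# add_object_to_frame(meta_info, "000-xxx1", 2001)
```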
FIG. 3B illustrates the operations involved in deleting an object. When an object is deleted in a frame, the related objects in all subsequent frames are deleted as well.
As shown, suppose an object in some middle frame is deleted, for example the object with objId 000-xxx1 in frame 2000. The index numbers of frame 2000 and all subsequent frames are then erased from the "contains" field — only the corresponding metaInfo field is modified, and the annotation entries of the subsequent frames are left untouched.
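A corresponding sketch of the delete path, again touching only the metaInfo (assumed dict shape as above):

```python
def delete_object_from_frame_onward(meta_info: dict, obj_id: str, frame: int) -> None:
    """Delete an object from `frame` and all later frames by pruning only the
    'contains' field of its object entry; annotation entries stay untouched."""
    for obj in meta_info["objects"]:
        if obj["id"] == obj_id:
            obj["contains"] = [f for f in obj["contains"] if f < frame]
            return
```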
FIG. 3C illustrates the operations involved in modifying an object. When an object is modified, the change must be reflected in all relevant entries in the group.
As shown, suppose a piece of data in the middle needs to be modified, for example the object with objId 000-xxx1 in frame 2000. The type of the modification must first be determined. If the modification concerns a specific attribute, the content of the related annotation entry is modified, i.e., the annotation result of frame 2000 (e.g., the position of the object's annotation box in that frame). If the modification concerns a common attribute, the corresponding field of metaInfo is modified, i.e., the relevant content in "commonLabels".
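The modify path then branches on the attribute type; a sketch under the same assumed dict shapes:

```python
def modify_object(meta_info: dict, annotation_entry: dict,
                  obj_id: str, changes: dict, common: bool) -> None:
    """Route a modification to the right record: common attributes go to the
    object's metaInfo entry, specific attributes to the frame's annotation entry."""
    if common:
        for obj in meta_info["objects"]:
            if obj["id"] == obj_id:
                obj["commonLabels"].update(changes)  # e.g. {"color": "black"}
                return
    else:
        # e.g. {"box": [x, y, w, h]} for this frame only
        annotation_entry["objects"].setdefault(obj_id, {}).update(changes)
```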
FIG. 3D shows the operations involved in playing back a frame. When the specific content of a frame is played back, the metaInfo and the frame's specific annotation entry must be read and merged for display.
As shown, suppose the annotation content of frame 2000 is to be played back. The original data and annotation content of frame 2000 must be read: the image frame or 3D point cloud data corresponding to frame 2000 is read from the OSS database, the annotation entry for frame 2000 is read from the annotation database, and the metaInfo of the group to which the frame belongs is read as well.
Specifically, each object entry in the metaInfo is traversed to determine whether its "contains" field includes 2000 (or, under compressed storage, a range covering 2000). When 2000 is included in the "contains" field of an object entry, the object's common attributes are merged with the annotation object information (e.g., the unique attribute information) in the annotation entry of frame 2000. This process is repeated until the common attribute information of every annotation object appearing in frame 2000 has been merged, and the merged result of the metaInfo and the frame-2000 annotation result is displayed, for example, on the image frame or 3D point cloud data of frame 2000.
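The merge step for playback can be sketched as follows, under the same assumed dict shapes; the range check for compressed storage is noted in a comment:

```python
def merge_for_playback(meta_info: dict, annotation_entry: dict, frame: int) -> list[dict]:
    """Build the display list for one frame: for every object whose 'contains'
    covers this frame, merge its common attributes with the frame-specific info."""
    merged = []
    for obj in meta_info["objects"]:
        if frame in obj["contains"]:  # or: any(a <= frame <= b for a, b in ranges)
            unique = annotation_entry["objects"].get(obj["id"], {})
            merged.append({"id": obj["id"], **obj["commonLabels"], **unique})
    return merged
```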
According to another aspect of the invention, a method of operating image annotation data is also realized. FIG. 4 shows a schematic flow diagram of an image annotation data operation method according to one embodiment of the invention. The method can be carried out by an annotator on an annotation platform of the invention that includes the image data annotation function.
In step S410, the user selects a particular annotation object on a particular image data item. In step S420, common attributes of the particular annotation object are generated. In step S430, a specific object entry, corresponding to the particular annotation object and containing the common attributes, is generated. Subsequently, in step S440, the particular annotation object and its common attributes are mapped to adjacent image data items based on the specific object entry.
FIG. 5 illustrates an example screenshot of an annotation platform according to the invention. As shown, a current frame 10 (corresponding to a particular image data item), realized as 3D point cloud data, is displayed in the platform. The image data may be fetched by the platform from the OSS database based on the image ID. Since the data is a 3D point cloud, the displayed current frame 10 can be adjusted by perspective transformation, scaling, and so on (see the operation options at the top of frame 10).
A user (e.g., an annotator) can then mark an object in the current frame 10, for example by clicking on or box-selecting a vehicle object in the view. The system computes, from the user's click position and box size, an annotation box 11 of appropriate size and position framing the vehicle. As shown on the left of the figure, top, side, and rear views of the vehicle object may also be shown upon selection, to help the user judge the object's characteristics. Based on those characteristics, the system determines whether this is a new object not yet recorded in the database or an existing object appearing in a new frame, and assigns it a new or existing ID accordingly. The updated object entry information can then be returned to the annotation database. In addition, the selected annotation object and its box position can be sent back to the annotation database to update the annotation entry corresponding to the image data item.
Common attributes for the particular annotation object may then be generated from user input. Alternatively, the common attributes may be generated automatically based on the system's assessment of the object's appearance. In the example of FIG. 5, a common-attribute input box 20 pops up automatically when the user selects the object, and the user selects the object's common attribute information in it. In other application scenarios the user may also type in the common attribute information directly; no limitation is placed on this here.
after obtaining the common attribute, a specific object entry corresponding to the specific markup object may be generated that includes the common attribute. At this time, the specific object entry generated includes the object ID newly assigned to the object, the common attribute acquired as above, and the frame number (corresponding to the image frame 10) at which it appears. Whereas in the case of an existing object entry, the common attributes may be merged and the frame number increased.
Cross-item tracking of annotation objects can be conveniently realized through the annotation platform's built-in mapping function. FIG. 6 illustrates an example screenshot of an annotation platform according to the invention; it can be viewed as a follow-on to the screenshot of FIG. 5. For example, once the box selection and common-attribute selection for annotation box 11 are complete, the "map next frame" button 40 below the current frame 10 is enabled. After the user clicks it, the button is grayed out (becomes non-clickable), the same vehicle object is automatically mapped into the next frame 30, and an annotation box 31 is added to it. Since the same object does not move very far between adjacent frames, the position of annotation box 31 can be determined from the position of annotation box 11 in frame 10, the mapped position, and a neighborhood search. If the mapping succeeds, the number of frame 30 is added to the object entry of the annotation object.
The user may then click a "next frame" button next to the "map next frame" button to continue the same operation — e.g., automatic mapping — for the frame after frame 30. In an alternative embodiment, the mapping can be carried out by the system itself. In that case, mapping the particular annotation object and its common attributes to adjacent image data items based on the specific object entry includes: automatically locating the position of the object in a plurality of adjacent image data items; and mapping the object and its common attributes to those items. For example, the user may invoke a "map subsequent frames" option in the menu, whereupon the system repeats the mapping of the previous frame's annotation box for each subsequent frame and adds each frame number to the corresponding object entry. In other embodiments the automatic marking may run backwards instead, e.g., via a "map previous frame" menu option; the invention is not limited in this respect.
Alternatively or additionally, the user may choose not to perform cross-frame object association at all. Based on such a selection, the particular annotation object and its common attributes can be written into the annotation entry corresponding to the particular image data item instead of generating a specific object entry.
To facilitate the mapping operation, data items adjacent to the particular image data item may be displayed to the user simultaneously; for example, as shown in FIG. 6, the current frame 10 and the next frame 30 are displayed together in the same interface.
In other embodiments, the same interface may also include more data item information. FIG. 7 illustrates an example screenshot of an annotation platform according to the invention. Compared with FIG. 6, FIG. 7 also shows the 2D image frames corresponding to the 3D point cloud images: 2D image frame 12 corresponding to the current 3D point cloud 10, and 2D image frame 32 corresponding to the next-frame 3D point cloud 30. In a preferred embodiment, box-selecting an object in the 3D point cloud also automatically generates an annotation box in the corresponding 2D image frame, helping the annotator and the system to position the object more accurately.
Further, for convenience of operation, a list of the annotation objects contained in the particular image data item may also be displayed in the interface, as shown at 50 in FIGS. 6 and 7. The user may then operate on an annotation object either on an image data item containing it or in the annotation object list.
Based on the user's operation on an annotation object, the corresponding object entry and/or annotation entry is updated. Specifically: the corresponding object entry is deleted upon the user's deletion of the object; the corresponding object entry is modified upon the user's modification of the object's common attributes; and the corresponding annotation entry is modified upon the user's modification of the object's specific attributes within a given image data item.
In addition, the operation platform of the invention can play back an image data item with its annotation information based on a user playback operation. Playing back the image data item with its annotation information comprises: reading the image data item from the image database; reading from the annotation database the associated object entries containing the annotation item ID of the item; reading from the annotation database the specific annotation entry corresponding to the item; merging the information of the associated object entries and the specific annotation entry; and presenting to the user the image data item with the merged information superimposed on it.
According to one aspect of the present invention, there is also provided an annotation platform for receiving user operations, interacting with an annotation database, and presenting image data items, annotation items, and annotation objects to a user, the platform performing the method described above in conjunction with FIG. 2 or FIGS. 4-7.
According to another aspect of the invention, there is also provided an annotation database for receiving user instructions and interacting with an annotation platform, the database performing the method described above in conjunction with FIG. 1 or FIG. 3.
The annotation data processing and operation methods, the corresponding database, and the corresponding annotation platform of the invention are particularly suitable for the following scenarios. 1. 3D tracking annotation for autonomous vehicles: in a 3D point cloud scene, the vehicle must continuously track objects over a certain time range, and the generated annotation data can describe the continuously changing objects. 2. 2D tracking annotation for autonomous vehicles: in 2D point cloud and picture scenes, the vehicle must continuously track objects over a certain time range, and the generated annotation data can describe the continuously changing objects. 3. Face/object recognition annotation for video: video annotation divides a video into frames and marks the related objects in each frame, and the object entries of the invention can describe such multi-frame associated object annotations, for example continuously changing gestures, vehicles, or expressions. The approach suits not only object recognition but all scenes requiring continuous annotation, such as faces, gestures, and body postures. It should be understood that the principles of the invention also apply to associated-object annotation of other annotation items such as speech and text.
FIG. 8 is a schematic structural diagram of a computing device that can be used to implement the above annotation data operation method according to an embodiment of the invention.
Referring to FIG. 8, computing device 800 includes memory 810 and processor 820.
The processor 820 may be a multi-core processor or may include multiple processors. In some embodiments, processor 820 may include a general-purpose host processor and one or more special coprocessors such as a Graphics Processor (GPU), a Digital Signal Processor (DSP), or the like. In some embodiments, processor 820 may be implemented using custom circuitry, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA).
The memory 810 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions needed by the processor 820 or other modules of the computer. The permanent storage may be a readable and writable, non-volatile storage device that does not lose its stored instructions and data even when the computer is powered off. In some embodiments the permanent storage is a mass storage device (e.g., a magnetic or optical disk, or flash memory); in other embodiments it is a removable storage device (e.g., a floppy disk or optical drive). The system memory may be a readable and writable volatile memory device, such as dynamic random-access memory, and may store the instructions and data that some or all of the processors need at runtime. In addition, the memory 810 may comprise any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory) and magnetic and/or optical disks. In some embodiments, memory 810 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini-SD card, Micro-SD card), or a magnetic floppy disk. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted wirelessly or over wires.
The memory 810 has executable code stored thereon which, when processed by the processor 820, causes the processor 820 to perform the annotation data operation methods described above.
The annotation data processing and operation methods and the corresponding annotation platform and annotation database of the invention have been described in detail above with reference to the drawings. The invention introduces dedicated object entries as meta-information in grouped annotation, so that every add/delete/modify/query operation involves only two kinds of data operation — (1) on the annotation entry itself and (2) on the metaInfo — greatly improving performance. Specifically: 1. the amount of data transmitted over the network is greatly reduced; 2. being based on database operations, data consistency is guaranteed; 3. availability is improved: every modification is persisted to the database, so grouped annotation data cannot be lost through accidents; 4. extensibility is improved: no add/delete/modify/query operation needs to touch the full data of the group, so extensibility is theoretically unlimited; 5. data storage is reduced: common attributes are extracted into commonLabels and need not be stored repeatedly in the annotation entries.
Furthermore, the method according to the invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the steps defined in the above-described method of the invention.
Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the invention.
Those of skill in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary rather than exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (46)
1. An annotation data processing method, comprising:
generating an annotation entry corresponding to an annotation item, wherein the annotation entry contains partial information of an annotation object in the annotation item;
acquiring annotation object information of the annotation object contained in the annotation item; and
generating an object entry based on the annotation object according to the annotation object information.
2. The method of claim 1, wherein generating an object entry based on an annotation object according to the annotation object information comprises:
storing the ID of the annotation object and the ID of the annotation item containing the annotation object as the object entry of the annotation object.
3. The method of claim 2, wherein acquiring the annotation object information of the annotation object contained in the annotation item comprises:
acquiring common attribute information of the annotation object contained in the annotation item,
and wherein the associated object entry also stores the common attribute information of the annotation object.
4. The method of claim 3, wherein the common attribute information of the annotation object comprises at least one of:
attribute information that appears in common across the plurality of annotation items containing the annotation object; and
attribute information that does not change from one annotation item to another.
5. The method of claim 1, wherein the generated annotation entry comprises:
an annotation item ID, the ID of an annotation object in the annotation item, and unique information of the annotation object.
6. The method of claim 1, wherein generating an annotation entry corresponding to an annotation item comprises:
generating a corresponding annotation entry for each annotation item in the grouped annotation data, wherein each annotation entry is assigned an annotation entry ID unique within the group, and each annotation object is assigned an annotation object ID unique within the group.
7. The method of claim 1, further comprising:
acquiring information of a specific annotation item based on a user selection instruction.
8. The method of claim 7, wherein acquiring the information of the specific annotation item comprises:
reading the associated object entry containing the annotation item ID corresponding to the specific annotation item; and
reading the specific annotation entry corresponding to the specific annotation item, and merging the information of the associated object entry and the specific annotation entry so as to present the merged information to the user as the information of the specific annotation item.
9. The method of claim 1, further comprising:
deleting a specific annotation object based on a user deletion instruction.
10. The method of claim 9, wherein deleting the specific annotation object comprises at least one of:
searching for the specific object entry corresponding to the specific annotation object; and
deleting, based on the user deletion instruction, the annotation item ID specified in the specific object entry.
11. The method of claim 1, further comprising:
modifying a specific annotation object based on a user modification instruction.
12. The method of claim 11, wherein modifying the specific annotation object comprises at least one of:
when the modification relates to a unique attribute, modifying the annotation entry containing the unique attribute; and
when the modification relates to a common attribute, modifying the object entry of the annotation object.
13. The method of claim 1, further comprising:
adding a specific annotation object based on a user add instruction.
14. The method of claim 13, wherein adding the specific annotation object comprises:
adding annotation object information to the annotation entry containing the specific annotation object; and
adding an annotation item ID to the object entry of the specific annotation object.
15. An annotation data operation method, comprising:
acquiring, based on a user's selection of a specific annotation item, the associated object entry containing the annotation item ID corresponding to the specific annotation item, wherein an object entry contains an annotation object ID and the IDs of the annotation items containing the annotation object; and
acquiring the specific annotation entry corresponding to the specific annotation item, wherein an annotation entry contains partial information of an annotation object in the annotation item.
16. The method of claim 15, further comprising:
merging the information contained in the associated object entry and the specific annotation entry; and
presenting the merged information to the user.
17. The method of claim 16, wherein merging the information contained in the associated object entry and the specific annotation entry comprises:
merging the common attribute information of the annotation object contained in the associated object entry with the unique information of the annotation object contained in the specific annotation entry.
18. The method of claim 16, further comprising:
acquiring the specific data item corresponding to the specific annotation item.
19. The method of claim 18, further comprising:
presenting the merged information to the user on the acquired specific data item.
20. The method of claim 18, further comprising:
acquiring the user's mark of a new object on the specific data item;
mapping the mark to adjacent frames to generate mapped marks; and
sending the mark and the mapping information for generating or updating the corresponding object entry and for updating the specific annotation entry and the adjacent annotation entries.
21. The method of claim 18, further comprising:
acquiring the user's modification of a marked object on the specific data item; and
sending the modification for modification of the corresponding object entry and/or annotation entry.
22. The method of claim 21, wherein the modification comprises at least one of:
a modification of a common attribute of the marked object; and
a modification of a unique attribute of the marked object,
and wherein the modification of the corresponding object entry and/or annotation entry comprises at least one of:
modifying the common attribute in the corresponding object entry; and
modifying the unique attribute in the annotation entry.
23. The method of claim 18, further comprising:
acquiring the user's deletion of a marked object on the specific data item; and
sending the deletion instruction for modifying or deleting the corresponding object entry.
24. An image annotation data operation method, comprising:
a user selecting a specific marked object on a specific image data item;
generating common attributes of the specific marked object;
generating a specific object entry, containing the common attributes, corresponding to the specific marked object; and
mapping the specific marked object to adjacent image data items based on the specific object entry.
25. The method of claim 24, wherein generating the common attributes of the specific marked object comprises at least one of:
generating the common attributes of the specific marked object based on content input by the user; and
generating the common attributes of the specific marked object based on a shape determination for the specific marked object.
26. The method of claim 25, wherein generating the common attributes of the specific marked object based on content input by the user comprises:
popping up a common attribute input box based on the user's selection of the specific marked object; and
the user selecting and/or typing the common attribute information of the specific marked object in the common attribute input box.
27. The method of claim 24, wherein the specific object entry contains the ID information of the specific annotation item containing the specific marked object, and image data items correspond one-to-one with annotation items,
and the method further comprises:
adding the annotation item IDs corresponding to the adjacent image data items to the specific object entry.
28. The method of claim 27, further comprising:
transmitting the specific object entry back to an annotation database.
29. The method of claim 24, wherein the user selecting a specific marked object on a specific image data item comprises:
the user box-selecting the specific marked object on the specific image data item.
30. The method of claim 29, wherein mapping the specific marked object and its common attributes to adjacent image data items comprises:
automatically box-selecting the specific marked object in adjacent image data items based on the location box-selected by the user.
31. The method of claim 29, further comprising:
transmitting the specific marked object and the location box-selected by the user back to the annotation database for updating the annotation entry corresponding to the specific image data item.
32. The method of claim 24, further comprising:
displaying image data items of different forms or numbers to the user.
33. The method of claim 32, wherein the image data item is a 3D point cloud item,
and the method further comprises:
simultaneously displaying the image frame item corresponding to the 3D point cloud item.
34. The method of claim 32, further comprising:
simultaneously displaying to the user the image data items adjacent to the specific image data item.
35. The method of claim 32, further comprising:
simultaneously displaying to the user a list of the marked objects contained within the specific image data item.
36. The method of claim 24, further comprising:
updating the corresponding object entry and/or annotation entry based on the user's operation on a marked object.
37. The method of claim 36, wherein the user's operation on the marked object comprises at least one of:
the user operating on the marked object in an image data item containing the marked object; and
the user operating on the marked object in a marked object list.
38. The method of claim 36, wherein updating the corresponding object entry and/or annotation entry based on the user's operation on the marked object further comprises at least one of:
deleting the corresponding object entry based on the user's deletion operation on the marked object;
modifying the corresponding object entry based on the user's modification operation on a common attribute of the marked object; and
modifying the corresponding annotation entry based on the user's modification operation on a unique attribute of the marked object in a certain image data item.
39. The method of claim 24, further comprising:
playing back the image data items containing the corresponding annotation information based on the user's playback operation on the image data items.
40. The method of claim 39, wherein playing back the image data items containing the corresponding annotation information comprises:
reading the image data item from an image database;
reading, from an annotation database, the associated object entry containing the annotation item ID corresponding to the specific annotation item;
reading, from the annotation database, the specific annotation entry corresponding to the specific annotation item;
merging the information of the associated object entry and the specific annotation entry; and
presenting to the user the image data item with the merged information superimposed thereon.
41. The method of claim 24, further comprising:
based on the user's choice not to perform association processing, updating the specific marked object and its common attributes into the annotation entry corresponding to the specific image data item instead of generating a specific object entry.
42. The method of claim 24, wherein mapping the specific marked object and its common attributes to adjacent image data items based on the specific object entry comprises:
automatically locating the position of the specific marked object in a plurality of adjacent image data items; and
mapping the specific marked object and its common attributes to the plurality of adjacent image data items.
43. An annotation platform for receiving user actions, interacting with an annotation database, and presenting image data items, annotation items, and annotation objects to a user, the annotation platform performing the method of any one of claims 24 to 42.
44. An annotation database for receiving user instructions and interacting with an annotation platform, the annotation database performing the method of any one of claims 1 to 14.
45. A computing device, comprising:
a processor; and
a memory having stored thereon executable code which, when executed by the processor, causes the processor to perform the method of any one of claims 15 to 42.
46. A non-transitory machine-readable storage medium having stored thereon executable code which, when executed by a processor of an electronic device, causes the processor to perform the method of any one of claims 15 to 42.
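As a further reading aid, the following hedged Python sketch ties together the image-annotation flow of claims 24 to 28 and 42: the user box-selects an object once, its common attributes go into a single object entry, and the mark is mapped to adjacent frames. The helper locate_in_adjacent_frame is a hypothetical placeholder; the claims do not prescribe any particular automatic-positioning algorithm.

```python
# Hedged sketch of the claimed image-annotation flow; helper names and the
# identity-mapping "tracker" are assumptions made for illustration only.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]        # (x, y, width, height) in image coordinates

@dataclass
class ObjectEntry:
    object_id: int
    common_attrs: dict                          # generated once: typed in by the user or derived from shape
    item_ids: List[int] = field(default_factory=list)
    boxes: Dict[int, Box] = field(default_factory=dict)   # per-frame box positions (unique info)

def locate_in_adjacent_frame(box: Box, frame_id: int) -> Optional[Box]:
    """Hypothetical stand-in for the automatic positioning of claim 42; a real
    system might use optical flow or a single-object tracker. Here we simply
    assume the object barely moves between adjacent frames."""
    return box

def mark_and_propagate(object_id: int, frame_id: int, box: Box,
                       common_attrs: dict, n_adjacent: int = 2) -> ObjectEntry:
    """The user box-selects the object once; the mark and its common
    attributes are then mapped to the following adjacent frames."""
    entry = ObjectEntry(object_id, common_attrs, [frame_id], {frame_id: box})
    for fid in range(frame_id + 1, frame_id + 1 + n_adjacent):
        mapped = locate_in_adjacent_frame(box, fid)
        if mapped is None:                      # object left the scene; stop propagating
            break
        entry.item_ids.append(fid)              # claim 27: record the adjacent annotation item IDs
        entry.boxes[fid] = mapped
    return entry                                # claim 28: transmit back to the annotation database

# Example: one box drawn on frame 10 yields an object entry covering frames 10 to 12.
entry = mark_and_propagate(1, 10, (100.0, 80.0, 40.0, 30.0), {"category": "pedestrian"})
assert entry.item_ids == [10, 11, 12]
```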
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010177214.1A CN113392230B (en) | 2020-03-13 | 2020-03-13 | Labeling data processing and operating method, labeling platform and database |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113392230A true CN113392230A (en) | 2021-09-14 |
CN113392230B CN113392230B (en) | 2024-09-06 |
Family
ID=77616234
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010177214.1A Active CN113392230B (en) | 2020-03-13 | 2020-03-13 | Labeling data processing and operating method, labeling platform and database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113392230B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102204238A (en) * | 2008-09-02 | 2011-09-28 | 瑞士联邦理工大学,洛桑(Epfl) | Image annotation on portable devices |
CN107704162A (en) * | 2016-08-08 | 2018-02-16 | 法乐第(北京)网络科技有限公司 | One kind mark object control method |
CN108875510A (en) * | 2017-11-28 | 2018-11-23 | 北京旷视科技有限公司 | Method, apparatus, system and the computer storage medium of image procossing |
KR20190078800A (en) * | 2017-12-27 | 2019-07-05 | 인하대학교 산학협력단 | Annotation method for interactive 360 video using cubic projection |
CN109493934A (en) * | 2018-11-09 | 2019-03-19 | 医渡云(北京)技术有限公司 | Data processing method, device and medium |
KR102030027B1 (en) * | 2019-05-09 | 2019-10-08 | (주)엠폴시스템 | Method, apparatus and program for refining labeling data |
CN110458226A (en) * | 2019-08-08 | 2019-11-15 | 上海商汤智能科技有限公司 | Image labeling method and device, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
LIU Qian et al., "Research on logical structure annotation methods for streaming documents oriented to machine learning," Journal of Chinese Information Processing (中文信息学报), vol. 33, no. 9 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117174261A (en) * | 2023-11-03 | 2023-12-05 | 神州医疗科技股份有限公司 | Multi-type labeling flow integrating system for medical images |
CN117174261B (en) * | 2023-11-03 | 2024-03-01 | 神州医疗科技股份有限公司 | Multi-type labeling flow integrating system for medical images |
Also Published As
Publication number | Publication date |
---|---|
CN113392230B (en) | 2024-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110286773B (en) | Information providing method, device, equipment and storage medium based on augmented reality | |
CN102890699B (en) | The GEOGRAPHICAL INDICATION of audio recording | |
CN112203122A (en) | Artificial intelligence-based similar video processing method and device and electronic equipment | |
WO2019144850A1 (en) | Video content-based video search method and video search apparatus | |
WO2021120818A1 (en) | Methods and systems for managing image collection | |
US11676395B2 (en) | Automated capture of image data for points of interest | |
US20150205843A1 (en) | Population and/or animation of spatial visualization(s) | |
US20160012136A1 (en) | Simultaneous Local and Cloud Searching System and Method | |
US8719687B2 (en) | Method for summarizing video and displaying the summary in three-dimensional scenes | |
CN110914872A (en) | Navigating video scenes with cognitive insights | |
US20100054600A1 (en) | Tagging Images With Labels | |
CN113453040A (en) | Short video generation method and device, related equipment and medium | |
CN105430211A (en) | Content management system, management content generating method, and management content regeneration method | |
KR20130050995A (en) | Consolidating metadata relating to duplicate images | |
US8457407B2 (en) | Electronic apparatus and image display method | |
CN103562911A (en) | Gesture-based visual search | |
CN113010703A (en) | Information recommendation method and device, electronic equipment and storage medium | |
CN112800255A (en) | Data labeling method, data labeling device, object tracking method, object tracking device, equipment and storage medium | |
US20180124271A1 (en) | Sensory and cognitive milieu in photographs and videos | |
Maybury | Multimedia information extraction: Advances in video, audio, and imagery analysis for search, data mining, surveillance and authoring | |
CN105027207A (en) | Method and system for recording information about rendered assets | |
CN102982054A (en) | Information processor, information processing method and program | |
CN109116718B (en) | Method and device for setting alarm clock | |
CN113392230B (en) | Labeling data processing and operating method, labeling platform and database | |
TW201523421A (en) | Determining images of article for extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40059857 ||
GR01 | Patent grant | ||