CN117765217A - Method and system for generating high-precision labels of image data set - Google Patents

Method and system for generating high-precision labels of image data set Download PDF

Info

Publication number
CN117765217A
CN117765217A CN202311492552.4A
Authority
CN
China
Prior art keywords
virtual reality
circumscribed
custom object
rectangular frame
reality environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311492552.4A
Other languages
Chinese (zh)
Inventor
王骏
杨琦
黄继盛
何海清
陈超
李凯恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lincang Power Supply Bureau of Yunnan Power Grid Co Ltd
Original Assignee
Lincang Power Supply Bureau of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lincang Power Supply Bureau of Yunnan Power Grid Co Ltd filed Critical Lincang Power Supply Bureau of Yunnan Power Grid Co Ltd
Priority to CN202311492552.4A priority Critical patent/CN117765217A/en
Publication of CN117765217A publication Critical patent/CN117765217A/en
Pending legal-status Critical Current

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The application provides a method and a system for generating high-precision labels for an image data set. The method comprises the following steps: creating a virtual reality environment by using a virtual reality platform, the virtual reality environment including terrain, objects, illumination and materials; importing a custom object to be annotated into the virtual reality environment; acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform; generating annotation data based on the target circumscribed rectangular frame or circumscribed polygon information; and generating an image data set based on the annotation data and the custom object. The method addresses the problems that current data annotation methods usually involve a large amount of manual work, are time-consuming and labor-intensive, and struggle to achieve high-precision annotation, which limits the performance of deep learning models in key applications such as dangerous object identification on power transmission lines of a power system.

Description

Method and system for generating high-precision labels of image data set
Technical Field
The present disclosure relates to the field of image data processing technologies, and in particular, to a method and system for generating high-precision labels of an image dataset.
Background
In the context of current technological development, deep learning and virtual reality (VR) have become two core drivers in the fields of information processing and simulation. Deep learning, a sub-field of artificial intelligence that simulates the working principles of neural networks, has achieved great success in image processing, natural language processing, automatic driving and other fields. Its core advantage is its ability to learn and extract features from large amounts of data, enabling computers to make breakthrough progress in sensing and understanding the world.
A virtual reality engine, such as UE4, enables a user to interact with a computer-generated virtual world by simulating a three-dimensional environment and virtual interactions. The technology has been widely used in the fields of game development, medical simulation, virtual training and the like.
However, the integration of deep learning and virtual reality has not been fully achieved in many fields, and creating high-quality, high-precision deep learning data sets remains challenging. Traditional data labeling methods typically involve a large amount of manual work, are time-consuming and labor-intensive, and struggle to achieve high-precision labeling. This limits the performance of deep learning models in some critical applications, such as dangerous object identification on power transmission lines of electrical power systems.
Disclosure of Invention
The embodiments of the application provide a method and a system for generating high-precision annotations of an image dataset. They address the technical problems that existing data labeling methods generally involve a large amount of manual work, are time-consuming and labor-intensive, and struggle to achieve high-precision labeling, which limits the performance of deep learning models in key applications such as dangerous object identification on power transmission lines of a power system.
A first aspect of the present application provides a method of generating high-precision annotations of an image dataset, comprising:
creating a virtual reality environment by using a virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials;
importing the custom object to be annotated into the virtual reality environment;
acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform;
generating annotation data based on the target circumscribed rectangular frame or the circumscribed polygon information;
and generating an image data set based on the annotation data and the custom object.
In some embodiments, the custom object is a three-dimensional model containing information for labeling; the noted information includes: the outline, position and size of the three-dimensional model.
In some embodiments, the annotation data comprises: the position coordinates, shape information and vertex coordinates of the target circumscribed rectangular frame or circumscribed polygon of the custom object.
In some embodiments, the obtaining, based on the virtual reality platform, the target bounding rectangle or bounding polygon information of the custom object in the virtual reality environment includes:
based on the virtual reality platform, executing preprocessing operation on the custom object by using an application program interface and a tool provided by the virtual reality platform and writing a program or a script; the preprocessing operation includes: selecting, moving, rotating and scaling the custom object;
constructing a target circumscribed rectangular frame or a circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation;
and acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the circumscribed rectangular frame or circumscribed polygon.
In some embodiments, the constructing a target bounding rectangle or bounding polygon of the custom object in the virtual reality environment based on the preprocessing operation includes:
based on the preprocessing operation, constructing an external rectangular frame or an external polygon of the custom object in the virtual reality environment;
dividing the circumscribed rectangular frame or the circumscribed polygon based on the circumscribed rectangular frame or the circumscribed polygon to generate a plurality of divided rectangles;
generating a minimum circumscribed rectangle frame or circumscribed polygon of the custom object in a plurality of the divided rectangles based on the divided rectangles;
and generating a target circumscribed rectangular frame or a circumscribed polygon based on the minimum circumscribed rectangular frame or the circumscribed polygon.
In some embodiments, the generating the image dataset based on the annotation data and the custom object comprises:
based on the labeling data, performing a sorting operation on the labeling data; the sorting operation includes: converting the format of the annotation data, cleaning the annotation data and removing unqualified annotation data;
and associating the marking data after the sorting operation with the custom object based on the marking data after the sorting operation and the custom object to generate an image dataset.
In some embodiments, the method further comprises:
performing a data enhancement operation on the image dataset based on the image dataset; the data enhancement operation includes: performing rotation, flipping and noise-adding operations on the image dataset.
A second aspect of the present application provides a system for generating high-precision annotations of an image dataset, comprising:
the creation module is used for creating a virtual reality environment by utilizing the virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials;
the importing module is used for importing the custom object into the virtual reality environment;
the acquisition module is configured to acquire circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform;
the first generation module is configured to generate annotation data based on the circumscribed rectangular frame or the circumscribed polygon information;
and a second generation module configured to generate an image dataset based on the annotation data and the custom object.
In some embodiments, the acquisition module comprises:
the execution unit is configured to execute a preprocessing operation on the custom object based on the virtual reality platform, by using an application program interface and tools provided by the virtual reality platform and writing a program or a script; the preprocessing operation includes: selecting, moving, rotating and scaling the custom object;
a construction unit configured to construct an circumscribed rectangular frame or an circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation;
and the acquisition unit is configured to acquire the information of the circumscribed rectangular frame or the circumscribed polygon of the custom object in the virtual reality environment based on the circumscribed rectangular frame or the circumscribed polygon.
The embodiments of the application provide a method and a system for generating high-precision labels of an image dataset. The method comprises: creating a virtual reality environment by using a virtual reality platform, the virtual reality environment including terrain, objects, illumination and materials; importing the custom object to be annotated into the virtual reality environment; acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform; generating annotation data based on the target circumscribed rectangular frame or circumscribed polygon information; and generating an image dataset based on the annotation data and the custom object, so as to simulate real-world scenes in a virtual environment and automatically generate high-precision annotation data.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a method of generating high-precision annotations of an image dataset in the present application;
FIG. 2 is a schematic diagram of a system for generating high-precision annotations of an image dataset in the present application;
fig. 3 is a schematic structural diagram of the acquisition module in the present application;
FIG. 4 is a schematic structural diagram of a custom object labeled in the present application under an embodiment;
fig. 5 is a schematic structural diagram of a custom object labeled in the present application in another embodiment.
Reference numerals illustrate:
1-creating a module; 2-an import module; 3-an acquisition module; 31-an execution unit; 32-a building unit; 33-an acquisition unit; 4-a first generation module; 5-a second generation module.
Detailed Description
In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
Computer vision annotation technology automatically annotates images and video. It relies on computer vision algorithms, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), to generate labeling information, and typically uses key elements such as image segmentation and object recognition to achieve automatic labeling. Although computer vision labeling technology has made some progress, it still suffers from the following significant drawbacks. Precision limitation: computer vision labeling techniques have accuracy limits when generating high-precision labels for deep learning image datasets; because of complex scenes and changes in the image, they may produce inaccurate annotations in some cases. Complexity and efficiency issues: these techniques typically require complex data preprocessing and post-processing steps, as well as extensive computational resources, which increase the complexity and computation time of labeling tasks and limit their application to large-scale datasets. Insufficient integration with virtual reality: current computer vision labeling techniques fail to adequately integrate virtual reality engines, such as UE4, to achieve a higher degree of interoperability and scalability, which limits their potential in virtual reality applications. To solve these technical problems, the present application provides a method and a system for generating high-precision annotations of an image dataset, described below.
as can be seen from fig. 1, a first aspect of the present application provides a method of generating high-precision annotations of an image dataset, comprising:
S100: creating a virtual reality environment by using a virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials. In this application, an applicable virtual reality platform, such as Unreal Engine 4 (UE4) or another virtual reality development environment, is first selected and used to create the virtual environment. A virtual scene, comprising terrain, objects, illumination, materials and the like, is created with the tools of the virtual reality platform. The scene should include one or more custom objects, which will be the targets of annotation.
S200: importing the custom object to be annotated into the virtual reality environment; the custom object comprises a three-dimensional model with information for labeling; the noted information includes: the outline, position and size of the three-dimensional model. Custom objects to be annotated are introduced into the virtual environment, and the custom objects can be three-dimensional models of various shapes and types. The custom object should contain information for labeling, such as the outline, location, size, etc. of the object.
S300: acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform; and constructing a target circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment by using the API and the tool provided by the virtual reality platform, so as to acquire information of the target circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment.
The obtaining the target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform comprises the following substeps:
s310: based on the virtual reality platform, executing preprocessing operation on the custom object by using an application program interface and a tool provided by the virtual reality platform and writing a program or a script; the preprocessing operation includes: selecting, moving, rotating and scaling the custom object; and writing a program or a script by using an API and tools provided by the virtual reality platform to realize the operation on the custom object. This includes selection, movement, rotation, zooming in and out, etc. of the object.
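The preprocessing operations of S310 (selecting, moving, rotating and scaling the object) can be illustrated with a small, self-contained sketch. This is not the patent's code: a virtual reality platform such as UE4 would apply these transforms through its own API, and the pure-Python function below only demonstrates the geometry on a 2-D vertex list; all names are our assumptions.

```python
import math

# Illustrative 2-D version of the S310 preprocessing operations: scale,
# rotate (about the origin), then translate a custom object's vertex list.
# A real VR platform (e.g. UE4) would perform this through its own API.
def transform(vertices, dx=0.0, dy=0.0, angle=0.0, scale=1.0):
    out = []
    for x, y in vertices:
        xs, ys = x * scale, y * scale                      # scaling
        xr = xs * math.cos(angle) - ys * math.sin(angle)   # rotation
        yr = xs * math.sin(angle) + ys * math.cos(angle)
        out.append((xr + dx, yr + dy))                     # movement
    return out

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = transform(square, dx=2, dy=3, scale=2.0)  # moved and scaled object
print(moved)
```

The same composition order (scale, then rotate, then translate) is what a scene graph typically applies to an actor's transform.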
S320: constructing a target circumscribed rectangular frame or a circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation; and extracting an circumscribed rectangular frame or an circumscribed polygon of the custom object in the virtual scene through selecting, moving, rotating, zooming-in and zooming-out operations on the custom object.
Based on the preprocessing operation, constructing a target circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment comprises the following substeps:
s321: based on the preprocessing operation, constructing an external rectangular frame or an external polygon of the custom object in the virtual reality environment; as shown in fig. 4, the custom object is first constructed as a whole to form an circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment.
S322: dividing the circumscribed rectangular frame or the circumscribed polygon to generate a plurality of divided rectangles. In order to locate the custom object more accurately, the circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment is first divided.
S323: generating a minimum circumscribed rectangular frame or circumscribed polygon of the custom object in each of the plurality of divided rectangles. The minimum circumscribed rectangular frame or circumscribed polygon of the custom object within each divided rectangle is extracted, so that the circumscribed rectangle or polygon fits the custom object more closely.
S324: generating a target circumscribed rectangular frame or circumscribed polygon based on the minimum circumscribed rectangular frames or circumscribed polygons. The minimum circumscribed rectangular frames or circumscribed polygons in the divided rectangles are connected in sequence; the connected result is the target circumscribed rectangular frame or circumscribed polygon, as shown in fig. 5. In this way, the vertex coordinates of the target circumscribed rectangular frame or circumscribed polygon of the custom object are more accurate.
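Steps S321 to S324 (build a global box, divide it, take the minimum box per division, and keep the joined result) can be illustrated with a hypothetical sketch. The patent does not publish code; the helper below works on a 2-D point set and splits along one axis only, which is one plausible reading of the segmentation step, and all helper names are ours.

```python
# Hypothetical reading of S321-S324: build the global circumscribed box of a
# point set, divide its x-range into strips, take the minimal box of the
# points inside each strip, and keep that chain of boxes as the tighter
# "target" outline.
def global_bbox(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def strip_bboxes(points, n_strips):
    x_min, _, x_max, _ = global_bbox(points)
    width = (x_max - x_min) / n_strips
    boxes = []
    for i in range(n_strips):
        lo, hi = x_min + i * width, x_min + (i + 1) * width
        last = (i == n_strips - 1)
        strip = [p for p in points if lo <= p[0] < hi or (last and p[0] == hi)]
        if strip:
            boxes.append(global_bbox(strip))
    return boxes

def covered_area(boxes):
    return sum((x1 - x0) * (y1 - y0) for (x0, y0, x1, y1) in boxes)

# An L-shaped object sampled on a grid: the per-strip boxes (area 6) hug it
# far more tightly than the single global box (area 16).
l_shape = ([(x, y) for x in range(5) for y in range(2)] +
           [(x, y) for x in range(2) for y in range(5)])
gx0, gy0, gx1, gy1 = global_bbox(l_shape)
global_area = (gx1 - gx0) * (gy1 - gy0)
strips = strip_bboxes(l_shape, n_strips=2)
print(global_area, covered_area(strips))
```

The shrinking coverage area is the point of the divide-then-fit step: a non-convex object like the L-shape wastes most of a single circumscribed rectangle.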
S330: and acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the circumscribed rectangular frame or circumscribed polygon.
S400: generating annotation data based on the target circumscribed rectangular frame or circumscribed polygon information. The annotation data comprise the position coordinates, shape information and vertex coordinates of the target circumscribed rectangular frame or circumscribed polygon of the custom object. The target circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual image is extracted automatically by a program or script of the virtual reality platform, and this information is recorded as annotation data, which may include the coordinates of the object, the coordinates of boundary points, shape information and the like.
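The annotation data described in S400 might be packed into a record as follows. The field names are illustrative assumptions; the patent only specifies that position coordinates, shape information and vertex coordinates are included.

```python
# Illustrative packing of the S400 annotation data; the field names are our
# assumptions -- the patent only specifies position coordinates, shape
# information and vertex coordinates.
def make_annotation(object_id, box, polygon=None):
    x0, y0, x1, y1 = box
    return {
        "object_id": object_id,
        "position": ((x0 + x1) / 2, (y0 + y1) / 2),  # box center
        "shape": "polygon" if polygon else "rectangle",
        "vertices": polygon if polygon
                    else [(x0, y0), (x1, y0), (x1, y1), (x0, y1)],
    }

ann = make_annotation("custom_object_1", (10, 20, 110, 80))
print(ann)
```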
S500: generating an image data set based on the annotation data and the custom object. The image data set can be used to train a deep learning model for tasks such as identifying and tracking objects, for example a pile driver.
The generating an image dataset based on the annotation data and the custom object comprises the sub-steps of:
S510: performing a sorting operation on the labeling data; the sorting operation includes: converting the format of the annotation data, cleaning the annotation data and removing unqualified annotation data. Sorting the image dataset comprises organizing and recording the generated image data and the corresponding labeling information, including processing steps such as data format conversion, data cleaning and removal of unqualified data.
S520: and associating the marking data after the sorting operation with the custom object based on the marking data after the sorting operation and the custom object to generate an image dataset.
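Steps S510 and S520 (cleaning out unqualified annotation data, then associating the remainder with the custom objects' images) can be sketched as follows. The record layout and the "unqualified" criterion (a degenerate box) are our assumptions made for illustration.

```python
# Sketch of S510/S520: drop unqualified records (here: degenerate boxes),
# then pair each surviving annotation with its image to form the dataset
# index. The record layout is an assumption made for illustration.
def clean(annotations):
    kept = []
    for a in annotations:
        x0, y0, x1, y1 = a["box"]
        if x1 > x0 and y1 > y0:          # remove zero-width/height boxes
            kept.append(a)
    return kept

def build_index(annotations):
    return [(a["image"], a["box"]) for a in clean(annotations)]

anns = [
    {"image": "img_001.png", "box": (10, 10, 50, 60)},
    {"image": "img_002.png", "box": (30, 30, 30, 80)},  # degenerate: removed
]
index = build_index(anns)
print(index)
```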
The method further comprises the steps of:
S600: performing a data enhancement operation on the image dataset; the data enhancement operation includes: performing rotation, flipping and noise-adding operations on the image dataset. Applying data enhancement techniques such as rotation, flipping and adding noise increases the diversity of the image dataset and the robustness of the model.
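The S600 enhancement operations (rotation, flipping, adding noise) admit a minimal NumPy sketch; the 90-degree rotation, horizontal flip and Gaussian noise level below are illustrative choices, not values from the patent.

```python
import numpy as np

# Minimal NumPy sketch of the S600 enhancement operations; the 90-degree
# rotation, horizontal flip and noise standard deviation are illustrative.
def augment(image, rng):
    rotated = np.rot90(image)                          # rotation
    flipped = np.fliplr(image)                         # horizontal flip
    noisy = image + rng.normal(0.0, 5.0, image.shape)  # additive noise
    return rotated, flipped, noisy

rng = np.random.default_rng(0)
img = np.arange(12, dtype=float).reshape(3, 4)
rot, flip, noisy = augment(img, rng)
print(rot.shape, flip[0, 0], noisy.shape)
```

In practice each augmented variant would be saved alongside a correspondingly transformed copy of its box annotations.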
The application provides a specific embodiment of the method for generating high-precision labels of an image dataset. Virtual reality modeling: a virtual scene is built with Unreal Engine 4 (UE4) to simulate pile driver operation in a real scene; a suitable scene is selected, including the ground, buildings, sky and other elements, to ensure the realism of the virtual environment. Pile driver model import: a pile driver model, including the body, arm, hammer and so on, is imported into the virtual environment and positioned in the scene to simulate an actual working situation. Virtual reality operation: through the computer control interface, the pile driver model can be freely operated in the virtual environment, including moving, rotating, height adjustment and other operations, which may be performed with the handle or controller of a VR device. Simulated operation: the user can simulate pile driver operations in the virtual reality environment, for example lowering the hammer head to the ground or driving a pile, while the virtual environment records image data of the pile driver model at different angles. Circumscribed rectangle extraction: using the API and programs provided by the virtual reality platform, circumscribed rectangular frame data of the pile driver model are extracted from the images at each angle; these rectangles accurately describe the position and boundaries of the pile driver at different angles. Data conversion and YOLO labeling: the extracted circumscribed rectangular frame data are converted into the YOLO (You Only Look Once) annotation format, including coordinate conversion, category definition and other necessary labeling information. The YOLO annotation data for each image accurately describe the location and boundaries of the pile driver in the image. Dataset generation and management: the generated YOLO annotation data are associated with the corresponding images and organized into a dataset, which is used to train a deep learning model for tasks such as pile driver identification and tracking.
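The pixel-box to YOLO conversion mentioned in the embodiment follows the standard YOLO text format: one line per object with a class id and a center/size box normalized to [0, 1]. The function below is a common implementation of that conversion, not code from the patent.

```python
# Standard pixel-box -> YOLO text-format conversion, as referenced in the
# embodiment: "class_id x_center y_center width height", with the box values
# normalized by the image size. The function itself is ours, not the patent's.
def to_yolo(box, img_w, img_h, class_id=0):
    x0, y0, x1, y1 = box
    xc = (x0 + x1) / 2 / img_w
    yc = (y0 + y1) / 2 / img_h
    w = (x1 - x0) / img_w
    h = (y1 - y0) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

line = to_yolo((100, 200, 300, 400), img_w=1000, img_h=800)
print(line)  # one line of a YOLO .txt label file
```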
As can be seen from FIG. 2, a second aspect of the present application provides a system for generating high-precision annotations of an image dataset, comprising: the creation module 1 is used for creating a virtual reality environment by utilizing a virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials; the importing module 2 is used for importing the custom object into the virtual reality environment; an obtaining module 3, configured to obtain information of an external rectangular frame or an external polygon of the custom object in the virtual reality environment based on the virtual reality platform; a first generation module 4 configured to generate labeling data based on the circumscribed rectangular frame or circumscribed polygon information; a second generation module 5 configured to generate an image dataset based on the annotation data and the custom object. The functional effects of the above-mentioned system in executing the above-mentioned method can be seen in the above-mentioned method embodiments, and are not described herein.
As can be seen from fig. 3, the acquisition module 3 comprises: an execution unit 31 configured to execute a preprocessing operation on the custom object based on the virtual reality platform, using an application program interface and tools provided by the virtual reality platform and a written program or script; the preprocessing operation includes: selecting, moving, rotating and scaling the custom object; a construction unit 32 configured to construct a circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation; and an obtaining unit 33 configured to obtain information of the circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment based on the circumscribed rectangular frame or circumscribed polygon. The functional effects of the above system in executing the above method can be seen in the method embodiments, and are not repeated here.
The method and the system for generating high-precision annotations of an image dataset have the following beneficial effects. 1. High-precision data labeling: objects in the virtual reality scene are labeled in a highly accurate manner; through the interactivity and precision of virtual reality, accurate labeling information such as object positions, boundaries and attributes can be generated, providing high-quality training data for deep learning models. 2. Improved model performance: the generated high-quality annotation dataset can be used to train deep learning models and improve their performance; in tasks such as dangerous object identification, the model can identify and classify objects more accurately, reducing the false alarm rate and the missed detection rate. 3. Savings in labor and time: compared with traditional manual labeling, the method saves a great deal of manpower and time; the automatic labeling process is not only quicker but also more reliable, reducing human errors and inconsistencies. 4. Cross-domain applicability: the technical scheme can be widely applied in many fields, including power system safety, medical image analysis, automatic driving, virtual training and game development, providing a more efficient and accurate data labeling method for these areas. 5. Improved safety and reliability: in dangerous object identification on power transmission lines, the safety and reliability of the power system are improved; by identifying potentially dangerous objects more accurately, accidents can be reduced and personnel and assets protected. 6. Promotion of the fusion of deep learning and virtual reality: the application makes full use of the interactivity and image generation capability of the virtual reality engine, providing a brand-new data source and application scenario for deep learning. 7. Lower technical threshold: with this method, more researchers and developers can easily obtain high-quality annotation data without deep expertise in deep learning or virtual reality, which lowers the technical threshold and promotes the wide application and innovation of the technology.
The application effects of the method and the system for generating high-precision annotations of an image dataset are as follows. 1. Use of a virtual reality environment: the interactivity and image generation capability of the virtual reality environment are fully exploited; by simulating a real scene in a virtual environment, objects can be manipulated virtually and image data generated in real time, reducing the need for field data acquisition. 2. Automated labeling flow: the annotation of virtual objects is automated; through the API provided by the virtual reality platform and interactive operations by users or programs, labeling information such as the circumscribed rectangular frames or polygons of objects in the images is extracted in real time, greatly reducing the effort and error rate of manual labeling. 3. High-precision annotation data: through the high-resolution images of the virtual reality environment and the accurate positions of the virtual objects, the method provides more accurate annotation data than traditional manual annotation, which is of great importance to the training of deep learning models. 4. Cross-domain applicability: the method is not limited to dangerous object identification on power transmission lines; it can be widely applied in fields such as medical image analysis, automatic driving, virtual training and game development, providing a more efficient and accurate data labeling method for these fields. 5. Promotion of technology fusion: the method successfully fuses deep learning and virtual reality technology, opening a new road for cross-innovation between the two fields, offering new business opportunities and promoting further development and exploration. The core innovation of the method is the combination of virtual reality and deep learning to realize automatic generation of high-precision annotation data, expanding applicability across many application fields and advancing the frontier of technology fusion. It is of great significance for improving data labeling efficiency and deep learning model performance.
The foregoing detailed description has further explained the objects, technical solutions, and advantageous effects of the embodiments of the present application. It should be understood that the foregoing is merely a specific implementation of the embodiments and is not intended to limit their scope; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the embodiments shall fall within the scope of the embodiments of the present application.

Claims (9)

1. A method of generating high-precision annotations of an image dataset, comprising:
creating a virtual reality environment by using a virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials;
importing the custom object to be annotated into the virtual reality environment;
acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform;
generating annotation data based on the target circumscribed rectangular frame or the circumscribed polygon information;
and generating an image data set based on the annotation data and the custom object.
2. The method of generating high-precision annotations of an image dataset as claimed in claim 1, wherein the custom object is a three-dimensional model containing information for annotation; the information for annotation includes: the outline, position, and size of the three-dimensional model.
3. The method of generating high-precision annotations of an image dataset as claimed in claim 1, wherein the annotation data comprises: the position coordinates, shape information, and vertex coordinates of the target circumscribed rectangular frame or circumscribed polygon of the custom object.
4. The method of claim 1, wherein the obtaining, based on the virtual reality platform, target bounding rectangular boxes or bounding polygon information of the custom object in the virtual reality environment comprises:
based on the virtual reality platform, executing a preprocessing operation on the custom object by writing a program or a script using the application program interfaces and tools provided by the virtual reality platform; the preprocessing operation includes: selecting, moving, rotating, and scaling the custom object;
constructing a target circumscribed rectangular frame or a circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation;
and acquiring target circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the circumscribed rectangular frame or circumscribed polygon.
5. The method of claim 4, wherein constructing a target bounding rectangle or bounding polygon of the custom object in the virtual reality environment based on the preprocessing operation comprises:
based on the preprocessing operation, constructing a circumscribed rectangular frame or circumscribed polygon of the custom object in the virtual reality environment;
dividing the circumscribed rectangular frame or circumscribed polygon to generate a plurality of divided rectangles;
generating, from the plurality of divided rectangles, a minimum circumscribed rectangular frame or circumscribed polygon of the custom object;
and generating the target circumscribed rectangular frame or circumscribed polygon based on the minimum circumscribed rectangular frame or circumscribed polygon.
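The merging step of claim 5 — obtaining a minimum rectangle circumscribing a set of divided sub-rectangles — reduces to taking coordinate-wise extremes. This is a minimal sketch with assumed (x_min, y_min, x_max, y_max) tuples, not the claimed implementation:

```python
# Illustrative sketch: given several divided sub-rectangles that cover the
# object's visible parts, compute the minimum rectangle circumscribing them all.
# Rectangle format (x_min, y_min, x_max, y_max) is an assumption for this example.

def min_circumscribed(rects):
    """Minimum axis-aligned rectangle enclosing every rectangle in `rects`."""
    xs_min, ys_min, xs_max, ys_max = zip(*rects)
    return (min(xs_min), min(ys_min), max(xs_max), max(ys_max))

parts = [(10, 20, 50, 60), (40, 15, 90, 55), (30, 50, 70, 100)]
target = min_circumscribed(parts)  # -> (10, 15, 90, 100)
```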
6. The method of generating high-precision annotations for an image dataset of claim 1, wherein said generating an image dataset based on said annotation data and said custom object comprises:
performing a sorting operation on the annotation data; the sorting operation includes: converting the format of the annotation data, cleaning the annotation data, and removing unqualified annotation data;
and associating the sorted annotation data with the custom object to generate an image dataset.
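The sorting step of claim 6 — format conversion, cleaning, and removal of unqualified records, then association with the object's image — might look like the following. The record fields and the COCO-style bbox layout are illustrative assumptions; the claim fixes no concrete format:

```python
# Illustrative sketch: convert raw box records into a COCO-like annotation
# format, dropping degenerate (zero-area) boxes and attaching the image id
# that associates each label with its rendered image. Field names are assumed.

def tidy_annotations(raw, image_id):
    cleaned = []
    for rec in raw:
        x0, y0, x1, y1 = rec["box"]
        w, h = x1 - x0, y1 - y0
        if w <= 0 or h <= 0:           # remove unqualified annotation data
            continue
        cleaned.append({
            "image_id": image_id,       # association with the custom object's image
            "category": rec["label"],
            "bbox": [x0, y0, w, h],     # COCO-style x, y, width, height
        })
    return cleaned

raw = [{"box": (10, 10, 60, 40), "label": "kite"},
       {"box": (5, 5, 5, 30), "label": "bad"}]   # zero width: rejected
anns = tidy_annotations(raw, image_id=1)
```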
7. A method of generating high-precision annotations for an image dataset as claimed in claim 1, further comprising:
performing a data enhancement operation on the image dataset; the data enhancement operation includes: rotating, flipping, and adding noise to the image dataset.
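One subtlety of the data enhancement in claim 7 is that geometric transforms such as flipping must be applied to the annotations in step with the pixels. A minimal sketch of the bounding-box side of a horizontal flip, under an assumed (x_min, y_min, x_max, y_max) convention:

```python
# Illustrative sketch: when an image of width `image_width` is mirrored
# horizontally, each bounding box must be mirrored about the vertical axis
# as well, swapping the roles of its left and right edges.

def hflip_bbox(bbox, image_width):
    """Flip an (x_min, y_min, x_max, y_max) box for a horizontally mirrored image."""
    x0, y0, x1, y1 = bbox
    return (image_width - x1, y0, image_width - x0, y1)

flipped = hflip_bbox((10, 20, 50, 60), image_width=100)  # -> (50, 20, 90, 60)
```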
8. A system for generating high-precision annotations of an image dataset, comprising:
a creation module (1) for creating a virtual reality environment using a virtual reality platform; the virtual reality environment includes: terrain, objects, illumination, materials;
the importing module (2) is used for importing the custom object into the virtual reality environment;
an acquisition module (3) configured to acquire circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the virtual reality platform;
a first generation module (4) configured to generate annotation data based on the circumscribed rectangular frame or circumscribed polygon information;
a second generation module (5) configured to generate an image dataset based on the annotation data and the custom object.
9. A system for generating high-precision annotations of an image dataset as claimed in claim 8, wherein the acquisition module (3) comprises:
an execution unit (31) configured to execute a preprocessing operation on the custom object based on the virtual reality platform, by writing a program or a script using the application program interfaces and tools provided by the virtual reality platform; the preprocessing operation includes: selecting, moving, rotating, and scaling the custom object;
a construction unit (32) configured to construct an circumscribed rectangular frame or an circumscribed polygon of the custom object in the virtual reality environment based on the preprocessing operation;
and an acquisition unit (33) configured to acquire circumscribed rectangular frame or circumscribed polygon information of the custom object in the virtual reality environment based on the circumscribed rectangular frame or circumscribed polygon.
CN202311492552.4A 2023-11-10 2023-11-10 Method and system for generating high-precision labels of image data set Pending CN117765217A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311492552.4A CN117765217A (en) 2023-11-10 2023-11-10 Method and system for generating high-precision labels of image data set


Publications (1)

Publication Number Publication Date
CN117765217A (en) 2024-03-26

Family

ID=90313296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311492552.4A Pending CN117765217A (en) 2023-11-10 2023-11-10 Method and system for generating high-precision labels of image data set

Country Status (1)

Country Link
CN (1) CN117765217A (en)

Similar Documents

Publication Publication Date Title
CN109816725B (en) Monocular camera object pose estimation method and device based on deep learning
Nishida et al. Interactive sketching of urban procedural models
Zheng et al. Virtual prototyping-and transfer learning-enabled module detection for modular integrated construction
CN100407798C (en) Three-dimensional geometric mode building system and method
CN110838105B (en) Business process model image recognition and reconstruction method
Turpin et al. Grasp’d: Differentiable contact-rich grasp synthesis for multi-fingered hands
CN105427293A (en) Indoor scene scanning reconstruction method and apparatus
CN112597876A (en) Calligraphy Chinese character judging method based on feature fusion
Kong et al. Enhanced facade parsing for street-level images using convolutional neural networks
KR20200136723A (en) Method and apparatus for generating learning data for object recognition using virtual city model
Lin et al. 3D environmental perception modeling in the simulated autonomous-driving systems
Parente et al. Integration of convolutional and adversarial networks into building design: A review
Li et al. Deep-learning-based 3D reconstruction: a review and applications
Pintus et al. Geometric analysis in cultural heritage
Leymarie et al. The SHAPE Lab: New technology and software for archaeologists
CN117392289A (en) Method and system for automatically generating case field video based on AI (advanced technology attachment) voice
WO2023209563A1 (en) Machine learning for generative geometric modelling
CN117765217A (en) Method and system for generating high-precision labels of image data set
Laviola Advances in mathematical sketching: Moving toward the paradigm's full potential
Cao et al. A novel augmented reality guidance system for future informatization experimental teaching
Kelly et al. VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Tian et al. Augmented Reality Animation Image Information Extraction and Modeling Based on Generative Adversarial Network
CN117351173B (en) Three-dimensional building parameterization modeling method and device based on text driving
CN117437366B (en) Method for constructing multi-mode large-scale scene data set
Luo et al. Automatic Generation Algorithm for Digital Sculpture Based on CAD and Machine Vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination